The Hello World of deep learning!
The “Hello World” of deep learning is a simple program that trains a model to classify images of handwritten digits (from 0 to 9). It is called the “Hello World” of deep learning because it is often the first program that people write when learning about deep learning.
To write this program, we will use a popular deep learning library called TensorFlow. TensorFlow is an open-source library developed by Google for deep learning applications. It is widely used in industry and academia, and it bundles Keras, the high-level API (tf.keras) that we will use throughout this tutorial.
The first step in writing our “Hello World” program is to install TensorFlow. You can install it using pip, the Python package manager. Here is the command to install TensorFlow:
pip install tensorflow
Once TensorFlow is installed, we can import it into our Python program using the following line of code:
import tensorflow as tf
Next, we need to get some data to train our model on. For this program, we will use the MNIST dataset, which is a collection of 70,000 images of handwritten digits. The MNIST dataset is built into TensorFlow, so we can easily access it using the following code:
mnist = tf.keras.datasets.mnist
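This line does not download anything yet: it only gives us a short name, mnist, for the dataset module. The images themselves are downloaded the first time we load them. The MNIST dataset is divided into a training set of 60,000 images and a test set of 10,000 images. We will use the training set to train our model, and the test set to evaluate the accuracy of our model.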
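The dataset comes already split, so loading it gives us both sets at once. We also scale the pixel values from the 0-255 range down to 0-1, which generally helps the model train more smoothly:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values from 0-255 to 0-1
This downloads the dataset if needed and gives us four arrays: x_train and y_train, which contain the training images and labels, respectively, and x_test and y_test, which contain the test images and labels, respectively. Each image is a 28x28 grid of pixel values, and each label is an integer from 0 to 9.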
Now that we have our data, we can start building our model. In TensorFlow's Keras API, the simplest kind of model is a sequence of layers. To build one, we will create an instance of the Sequential class, which represents a linear stack of layers. Then, we will add layers to the model using the add method.
For this program, we will use a simple model with a single trainable layer. We will use the Flatten layer to flatten the input images into a 1D array, and the Dense layer to perform the actual classification. Here's the code to build our model:
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
The Flatten layer will flatten each 28x28 input image into a 1D array of 28 x 28 = 784 pixels. The Dense layer will have 10 units, one for each possible digit (0 to 9). The activation argument specifies the activation function to use. We will use the softmax activation function, which is a popular choice for classification tasks because it maps the output of the model to a probability distribution over the classes. In other words, the output of the model will be a list of probabilities, one for each class, that sum to 1. The class with the highest probability will be the model’s prediction.
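To make the two layers concrete, here is a minimal pure-Python sketch of what Flatten, Dense and softmax compute for a single image. It illustrates the idea and is not how TensorFlow implements these layers. The 2x2 "image", the 3 output classes, and the weight and bias values are made-up assumptions chosen to keep the numbers readable (the real model has 784 inputs and 10 outputs).

```python
import math

def flatten(image):
    """Turn a 2D grid of pixels (a list of rows) into one flat list."""
    return [pixel for row in image for pixel in row]

def dense(inputs, weights, biases):
    """Compute one weighted sum per output unit.

    In this sketch, weights[j] holds the input weights of unit j.
    """
    return [sum(w * x for w, x in zip(unit_weights, inputs)) + b
            for unit_weights, b in zip(weights, biases)]

def softmax(scores):
    """Map raw scores to probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy input: a 2x2 image with pixel values in the 0-1 range.
image = [[0.0, 1.0],
         [0.5, 0.0]]

# Hypothetical parameters for 3 output classes (4 weights each, one per pixel).
weights = [[0.2, -0.1, 0.0, 0.3],
           [0.0, 0.8, 0.4, -0.2],
           [-0.5, 0.1, 0.1, 0.0]]
biases = [0.0, 0.1, -0.1]

flat = flatten(image)                  # [0.0, 1.0, 0.5, 0.0]
scores = dense(flat, weights, biases)  # raw, unnormalized scores
probs = softmax(scores)                # probabilities over the 3 classes

print(flat)
print(probs)
```

Training is what turns arbitrary weights like these into values that give the correct digit the highest probability.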
For example, if our model is trained to classify images of handwritten digits, the output of the model might look something like this:
[0.1, 0.2, 0.3, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
This output corresponds to a probability distribution over the 10 classes (digits 0 to 9). The class with the highest probability (in this case, class 3 with probability 0.4) will be the model’s prediction.
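Picking the prediction from such a list is an argmax: the index of the largest probability. A minimal sketch using the example output above:

```python
# Example output from the text: one probability per digit, 0 to 9.
probabilities = [0.1, 0.2, 0.3, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# The predicted digit is the index of the largest probability.
predicted_digit = max(range(len(probabilities)), key=lambda i: probabilities[i])

print(predicted_digit)                 # 3
print(probabilities[predicted_digit])  # 0.4
```

With a trained Keras model, model.predict returns one such row of probabilities per input image, and the same argmax step applies to each row.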
Now that we have our model defined, we need to compile it before we can start training it. Compiling a model in TensorFlow involves specifying a loss function and an optimizer. The loss function measures how well the model is doing on the training data, and the optimizer adjusts the model’s parameters in order to minimize the loss.
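As a rough picture of what "adjusting the parameters to minimize the loss" means, the sketch below runs plain gradient descent on a made-up one-parameter loss, (w - 3)^2. This is a deliberate simplification: it is neither the optimizer used in this tutorial nor TensorFlow code, and the loss function, starting value, and learning rate are assumptions picked for illustration.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0              # hypothetical starting parameter
learning_rate = 0.1  # hypothetical step size

history = [loss(w)]
for step in range(50):
    # Move the parameter a small step against the gradient.
    w -= learning_rate * gradient(w)
    history.append(loss(w))

print(w)            # close to 3, the minimum of the toy loss
print(history[0])   # loss before any update
print(history[-1])  # loss after 50 updates
```

A real optimizer does the same kind of update for every weight in the model at once, using gradients of the loss computed on the training data.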
For this program, we will use the sparse_categorical_crossentropy loss function and the Adam optimizer. The "sparse" variant fits here because our labels are plain integers (0 to 9) rather than one-hot vectors. We will also specify that we want to track the accuracy of our model during training. Here's the code to compile our model:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
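For a single example, this loss is the negative log of the probability the model assigned to the correct class. Here is a minimal pure-Python sketch, assuming the model output is already a list of probabilities. The eps clamp is an assumption of this sketch to avoid taking log(0), and the function is an illustration rather than the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(label, probabilities, eps=1e-7):
    """Loss for one example: -log(probability assigned to the true class).

    label is an integer class index; eps is a hypothetical clamp against log(0).
    """
    p_true = min(max(probabilities[label], eps), 1.0)
    return -math.log(p_true)

probabilities = [0.1, 0.2, 0.3, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

loss_if_true_is_3 = sparse_categorical_crossentropy(3, probabilities)  # -log(0.4)
loss_if_true_is_0 = sparse_categorical_crossentropy(0, probabilities)  # -log(0.1)

print(loss_if_true_is_3)
print(loss_if_true_is_0)
```

The loss is smaller when the true class received a higher probability, which is why minimizing it pushes the model toward correct and confident predictions.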
Now that our model is compiled, we can start training it on the training data. To train the model, we will call the fit method and pass in the training data (x_train and y_train) and the number of epochs (full passes over the training set) to train for. We will also specify a batch size, which is the number of samples the model processes before each update of its parameters.
Here’s the code to train our model:
model.fit(x_train, y_train, epochs=5, batch_size=32)
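To see how epochs and batch size interact, the sketch below counts the parameter updates for the call above, assuming the standard 60,000-image MNIST training set. The batching loop is a simplified stand-in for what fit does internally (it omits shuffling, for example), and iterate_batches is a hypothetical helper written for this illustration.

```python
import math

num_samples = 60000  # size of the MNIST training set
batch_size = 32
epochs = 5

def iterate_batches(n, size):
    """Yield (start, end) index pairs covering n samples in chunks of `size`."""
    for start in range(0, n, size):
        yield start, min(start + size, n)

steps_per_epoch = math.ceil(num_samples / batch_size)
batches = list(iterate_batches(num_samples, batch_size))
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_updates)    # 9375
```

The 1875 steps per epoch match the step counter that Keras shows in its progress bar while training with these settings.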
After training the model for 5 epochs, we can evaluate it on the test data using the evaluate method. This returns the loss followed by the accuracy on the test set, which tells us how well the model handles images it did not see during training.
model.evaluate(x_test, y_test)
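Accuracy here is simply the fraction of test images whose predicted digit matches the true label. A minimal sketch with made-up predictions and labels (the values are hypothetical, not real MNIST results):

```python
def accuracy(predicted, actual):
    """Fraction of positions where the prediction equals the true label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical predicted digits and true labels for 8 test images.
predicted = [7, 2, 1, 0, 4, 1, 4, 9]
actual = [7, 2, 1, 0, 4, 1, 9, 9]

score = accuracy(predicted, actual)
print(score)  # 0.875
```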
That’s it! We’ve now written the “Hello World” of deep learning. Of course, this is just a simple example, and there are many more advanced techniques that we can use to improve the accuracy of our model. But this simple program should give you a good foundation for learning more about deep learning and TensorFlow.
I hope you found this tutorial helpful! If you have any questions or suggestions, feel free to leave a comment below.