How to download the MNIST images?

The MNIST dataset is a frequently used image dataset for studying neural networks and deep learning. How do you download it?

import tensorflow as tf
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

You can load the MNIST dataset with one line of code. You do not need to specify the URL of the dataset; the function knows where to download it from. tf.keras.datasets.mnist.load_data has a parameter path with the default value 'mnist.npz'. This is a file of about 11 MB hosted on a remote server (an Amazon AWS server in older releases; newer TensorFlow releases fetch it from Google Cloud Storage). The parameter specifies the cache path name for the downloaded file. If you do not give a full path name, the file is saved in ~/.keras/datasets/, which is under your home directory.

The first time you load the MNIST images, the function looks for the cache file, does not find it, and downloads it from the server. After downloading, the file is saved to ~/.keras/datasets/mnist.npz. The next time you call the function, it finds the file in the cache directory and uses it instead of downloading it again.
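To see the caching behavior for yourself, you can check whether the cache file exists before and after the first call to load_data. The sketch below uses only the Python standard library. The helper name expected_cache_file is made up for this post, and it simply mirrors the rule described above (relative names go under ~/.keras/datasets/, a full path is used as given); it is not the code Keras itself runs.

```python
import os


def expected_cache_file(path="mnist.npz"):
    """Guess where load_data(path=...) keeps its cached copy.

    Assumption: relative names are placed under ~/.keras/datasets/ and a
    full (absolute) path is used unchanged, as described in the text.
    This is an illustration, not the actual Keras implementation.
    """
    if os.path.isabs(path):
        return path
    return os.path.join(os.path.expanduser("~"), ".keras", "datasets", path)


cache_file = expected_cache_file()
if os.path.exists(cache_file):
    # A cached copy is already there, so load_data() should not download again.
    size_mb = os.path.getsize(cache_file) / 1e6
    print(f"cached copy found: {cache_file} ({size_mb:.1f} MB)")
else:
    # Nothing cached yet, so the first load_data() call should download the file.
    print(f"no cached copy yet; expected location is {cache_file}")
```

If you run this once before and once after calling load_data, the message should change from "no cached copy yet" to "cached copy found".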

The MNIST dataset has a training set and a test set. The training set has 60000 images of size 28×28, and each image has a label (its class). The test set has 10000 images, and again each test image has a label. So the shape of X_train is (60000, 28, 28), the shape of y_train is (60000,), the shape of X_test is (10000, 28, 28), and the shape of y_test is (10000,). The dtype of all these numpy arrays is uint8. Note that the images in both the training set and the test set are not sorted by class.
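A quick way to confirm these numbers is to compare the loaded arrays against the shapes and dtype listed above. The following is a minimal sketch; the names EXPECTED_SHAPES, check_mnist_arrays and load_and_check are invented for this post. TensorFlow is imported inside load_and_check only, so the checking helper itself works on anything that has .shape and .dtype attributes.

```python
# Shapes stated in the text above.
EXPECTED_SHAPES = {
    "X_train": (60000, 28, 28),
    "y_train": (60000,),
    "X_test": (10000, 28, 28),
    "y_test": (10000,),
}


def check_mnist_arrays(arrays):
    """Return a list of mismatch messages; an empty list means all checks passed.

    `arrays` maps the four names above to array-like objects that expose
    `.shape` and `.dtype` (numpy arrays do).
    """
    problems = []
    for name, expected in EXPECTED_SHAPES.items():
        arr = arrays[name]
        if tuple(arr.shape) != expected:
            problems.append(f"{name}: shape {tuple(arr.shape)}, expected {expected}")
        if str(arr.dtype) != "uint8":
            problems.append(f"{name}: dtype {arr.dtype}, expected uint8")
    return problems


def load_and_check():
    """Load MNIST (downloads on first use) and run the checks."""
    import tensorflow as tf  # imported here so the helper above needs no TensorFlow

    (X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
    return check_mnist_arrays(
        {"X_train": X_train, "y_train": y_train, "X_test": X_test, "y_test": y_test}
    )
```

Calling load_and_check() should return an empty list if the arrays match the description. To see that the images are not sorted by class, you can also print the first few entries of y_train after loading.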
