.. _sec_kaggle_cifar10:
Image Classification (CIFAR-10) on Kaggle
=========================================
So far, we have been using high-level APIs of deep learning frameworks
to directly obtain image datasets in tensor format. However, custom
image datasets often come in the form of image files. In this section,
we will start from raw image files and organize, read, and transform
them into tensor format step by step.
We experimented with the CIFAR-10 dataset in
:numref:`sec_image_augmentation`, which is an important dataset in
computer vision. In this section, we will apply the knowledge we learned
in previous sections in order to take part in the Kaggle competition on
CIFAR-10 image classification. The competition’s web address is
https://www.kaggle.com/c/cifar-10.
:numref:`fig_kaggle_cifar10` shows the information on the
competition’s webpage. In order to submit the results, you need to
register a Kaggle account.
.. _fig_kaggle_cifar10:
.. figure:: ../img/kaggle-cifar10.png
:width: 600px
CIFAR-10 image classification competition webpage information. The
competition dataset can be obtained by clicking the “Data” tab.
First, we import the packages and modules needed for this section. Where
the implementations differ, the PyTorch version is shown first, followed
by its MXNet Gluon counterpart.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
import collections
import math
import os
import shutil
import pandas as pd
import torch
import torchvision
from torch import nn
from d2l import torch as d2l
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
import collections
import math
import os
import shutil
import pandas as pd
from mxnet import gluon, init, npx
from mxnet.gluon import nn
from d2l import mxnet as d2l
npx.set_np()
Obtaining and Organizing the Dataset
------------------------------------
The competition dataset is divided into a training set and a test set,
which contain 50000 and 300000 images, respectively. In the test set,
10000 images will be used for evaluation, while the remaining 290000
images will not be evaluated: they are included just to make it hard to
cheat with *manually* labeled results of the test set. The images in
this dataset are all color PNG (RGB channels) image files, whose height
and width are both 32 pixels. The images cover a total of 10 categories,
namely airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships,
and trucks. The upper-left corner of :numref:`fig_kaggle_cifar10`
shows some images of airplanes, cars, and birds in the dataset.
Downloading the Dataset
~~~~~~~~~~~~~~~~~~~~~~~
After logging in to Kaggle, we can click the “Data” tab on the CIFAR-10
image classification competition webpage shown in
:numref:`fig_kaggle_cifar10` and download the dataset by clicking the
“Download All” button. After unzipping the downloaded file in
``../data``, and unzipping ``train.7z`` and ``test.7z`` inside it, you
will find the entire dataset in the following paths:
- ``../data/cifar-10/train/[1-50000].png``
- ``../data/cifar-10/test/[1-300000].png``
- ``../data/cifar-10/trainLabels.csv``
- ``../data/cifar-10/sampleSubmission.csv``
where the ``train`` and ``test`` directories contain the training and
testing images, respectively, ``trainLabels.csv`` provides labels for
the training images, and ``sampleSubmission.csv`` is a sample
submission file.
To make it easier to get started, we provide a small-scale sample of the
dataset that contains the first 1000 training images and 5 random
testing images. To use the full dataset of the Kaggle competition, you
need to set the following ``demo`` variable to ``False``.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
#@save
d2l.DATA_HUB['cifar10_tiny'] = (d2l.DATA_URL + 'kaggle_cifar10_tiny.zip',
'2068874e4b9a9f0fb07ebe0ad2b29754449ccacd')
# If you use the full dataset downloaded for the Kaggle competition, set
# `demo` to False
demo = True
if demo:
data_dir = d2l.download_extract('cifar10_tiny')
else:
data_dir = '../data/cifar-10/'
.. raw:: latex
\diilbookstyleoutputcell
.. parsed-literal::
:class: output
Downloading ../data/kaggle_cifar10_tiny.zip from http://d2l-data.s3-accelerate.amazonaws.com/kaggle_cifar10_tiny.zip...
Organizing the Dataset
~~~~~~~~~~~~~~~~~~~~~~
We need to organize datasets to facilitate model training and testing.
Let’s first read the labels from the CSV file. The following function
returns a dictionary that maps the non-extension part of the filename to
its label.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
#@save
def read_csv_labels(fname):
"""Read `fname` to return a filename to label dictionary."""
with open(fname, 'r') as f:
# Skip the file header line (column name)
lines = f.readlines()[1:]
tokens = [l.rstrip().split(',') for l in lines]
return dict(((name, label) for name, label in tokens))
labels = read_csv_labels(os.path.join(data_dir, 'trainLabels.csv'))
print('# training examples:', len(labels))
print('# classes:', len(set(labels.values())))
.. raw:: latex
\diilbookstyleoutputcell
.. parsed-literal::
:class: output
# training examples: 1000
# classes: 10
Next, we define the ``reorg_train_valid`` function to split the
validation set out of the original training set. The argument
``valid_ratio`` in this function is the ratio of the number of examples
in the validation set to the number of examples in the original training
set. More concretely, let :math:`n` be the number of images of the class
with the least examples, and :math:`r` be the ratio. The validation set
will split out :math:`\max(\lfloor nr\rfloor,1)` images for each class.
Let’s use ``valid_ratio=0.1`` as an example. Since the original training
set has 50000 images, there will be 45000 images used for training in
the path ``train_valid_test/train``, while the other 5000 images will be
split out as the validation set in the path ``train_valid_test/valid``.
After organizing the dataset, images of the same class will be placed
under the same folder.
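For example, for a hypothetical class ``dog``, copies of its training
images will end up under paths such as the following (the filename is
illustrative):

- ``../data/cifar-10/train_valid_test/train_valid/dog/45.png`` (every training image)
- ``../data/cifar-10/train_valid_test/valid/dog/45.png`` (if selected for the validation set)
- ``../data/cifar-10/train_valid_test/train/dog/45.png`` (otherwise)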
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
#@save
def copyfile(filename, target_dir):
"""Copy a file into a target directory."""
os.makedirs(target_dir, exist_ok=True)
shutil.copy(filename, target_dir)
#@save
def reorg_train_valid(data_dir, labels, valid_ratio):
"""Split the validation set out of the original training set."""
# The number of examples of the class that has the fewest examples in the
# training dataset
n = collections.Counter(labels.values()).most_common()[-1][1]
# The number of examples per class for the validation set
n_valid_per_label = max(1, math.floor(n * valid_ratio))
label_count = {}
for train_file in os.listdir(os.path.join(data_dir, 'train')):
label = labels[train_file.split('.')[0]]
fname = os.path.join(data_dir, 'train', train_file)
copyfile(fname, os.path.join(data_dir, 'train_valid_test',
'train_valid', label))
if label not in label_count or label_count[label] < n_valid_per_label:
copyfile(fname, os.path.join(data_dir, 'train_valid_test',
'valid', label))
label_count[label] = label_count.get(label, 0) + 1
else:
copyfile(fname, os.path.join(data_dir, 'train_valid_test',
'train', label))
return n_valid_per_label
The ``reorg_test`` function below organizes the testing set for data
loading during prediction.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
#@save
def reorg_test(data_dir):
"""Organize the testing set for data loading during prediction."""
for test_file in os.listdir(os.path.join(data_dir, 'test')):
copyfile(os.path.join(data_dir, 'test', test_file),
os.path.join(data_dir, 'train_valid_test', 'test',
'unknown'))
Finally, we use a function to invoke the ``read_csv_labels``,
``reorg_train_valid``, and ``reorg_test`` functions defined above.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
def reorg_cifar10_data(data_dir, valid_ratio):
labels = read_csv_labels(os.path.join(data_dir, 'trainLabels.csv'))
reorg_train_valid(data_dir, labels, valid_ratio)
reorg_test(data_dir)
Here we only set the batch size to 32 for the small-scale sample of the
dataset. When training and testing the complete dataset of the Kaggle
competition, ``batch_size`` should be set to a larger integer, such as
128. We split out 10% of the training examples as the validation set for
tuning hyperparameters.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
batch_size = 32 if demo else 128
valid_ratio = 0.1
reorg_cifar10_data(data_dir, valid_ratio)
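As a quick sanity check (a small sketch added here for convenience; it
is not required for the competition workflow), we can count how many
images ended up in each split:

.. code:: python

    # Count the images in each split created by `reorg_cifar10_data`
    # (the exact numbers depend on whether `demo` is True)
    for folder in ['train', 'valid', 'train_valid', 'test']:
        path = os.path.join(data_dir, 'train_valid_test', folder)
        num_images = sum(len(files) for _, _, files in os.walk(path))
        print(f'{folder}: {num_images} images')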
Image Augmentation
------------------
We use image augmentation to address overfitting. For example, images
can be flipped horizontally at random during training. We can also
perform standardization for the three RGB channels of color images.
Below we list some of these operations, which you can tweak.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
transform_train = torchvision.transforms.Compose([
# Scale the image up to a square of 40 pixels in both height and width
torchvision.transforms.Resize(40),
# Randomly crop a square image of 40 pixels in both height and width to
# produce a small square of 0.64 to 1 times the area of the original
# image, and then scale it to a square of 32 pixels in both height and
# width
torchvision.transforms.RandomResizedCrop(32, scale=(0.64, 1.0),
ratio=(1.0, 1.0)),
torchvision.transforms.RandomHorizontalFlip(),
torchvision.transforms.ToTensor(),
# Standardize each channel of the image
torchvision.transforms.Normalize([0.4914, 0.4822, 0.4465],
[0.2023, 0.1994, 0.2010])])
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
transform_train = gluon.data.vision.transforms.Compose([
# Scale the image up to a square of 40 pixels in both height and width
gluon.data.vision.transforms.Resize(40),
# Randomly crop a square image of 40 pixels in both height and width to
# produce a small square of 0.64 to 1 times the area of the original
# image, and then scale it to a square of 32 pixels in both height and
# width
gluon.data.vision.transforms.RandomResizedCrop(32, scale=(0.64, 1.0),
ratio=(1.0, 1.0)),
gluon.data.vision.transforms.RandomFlipLeftRight(),
gluon.data.vision.transforms.ToTensor(),
# Standardize each channel of the image
gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
[0.2023, 0.1994, 0.2010])])
During testing, we only perform standardization on images so as to
remove randomness in the evaluation results.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
transform_test = torchvision.transforms.Compose([
torchvision.transforms.ToTensor(),
torchvision.transforms.Normalize([0.4914, 0.4822, 0.4465],
[0.2023, 0.1994, 0.2010])])
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
transform_test = gluon.data.vision.transforms.Compose([
gluon.data.vision.transforms.ToTensor(),
gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
[0.2023, 0.1994, 0.2010])])
Reading the Dataset
-------------------
Next, we read the organized dataset consisting of raw image files. Each
example includes an image and a label.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
train_ds, train_valid_ds = [torchvision.datasets.ImageFolder(
os.path.join(data_dir, 'train_valid_test', folder),
transform=transform_train) for folder in ['train', 'train_valid']]
valid_ds, test_ds = [torchvision.datasets.ImageFolder(
os.path.join(data_dir, 'train_valid_test', folder),
transform=transform_test) for folder in ['valid', 'test']]
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
train_ds, valid_ds, train_valid_ds, test_ds = [
gluon.data.vision.ImageFolderDataset(
os.path.join(data_dir, 'train_valid_test', folder))
for folder in ['train', 'valid', 'train_valid', 'test']]
During training, we need to specify all the image augmentation
operations defined above. When the validation set is used for model
evaluation during hyperparameter tuning, no randomness from image
augmentation should be introduced. Before final prediction, we train the
model on the combined training set and validation set to make full use
of all the labeled data.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
train_iter, train_valid_iter = [torch.utils.data.DataLoader(
dataset, batch_size, shuffle=True, drop_last=True)
for dataset in (train_ds, train_valid_ds)]
valid_iter = torch.utils.data.DataLoader(valid_ds, batch_size, shuffle=False,
drop_last=True)
test_iter = torch.utils.data.DataLoader(test_ds, batch_size, shuffle=False,
drop_last=False)
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
train_iter, train_valid_iter = [gluon.data.DataLoader(
dataset.transform_first(transform_train), batch_size, shuffle=True,
last_batch='discard') for dataset in (train_ds, train_valid_ds)]
valid_iter = gluon.data.DataLoader(
valid_ds.transform_first(transform_test), batch_size, shuffle=False,
last_batch='discard')
test_iter = gluon.data.DataLoader(
test_ds.transform_first(transform_test), batch_size, shuffle=False,
last_batch='keep')
Defining the Model
------------------
For PyTorch, we define the ResNet-18 model described in :numref:`sec_resnet`.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
def get_net():
num_classes = 10
net = d2l.resnet18(num_classes, 3)
return net
loss = nn.CrossEntropyLoss(reduction="none")
For MXNet, we build the residual blocks based on the ``HybridBlock``
class, which differs slightly from the implementation described in
:numref:`sec_resnet`; hybridization improves computational efficiency.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
class Residual(nn.HybridBlock):
def __init__(self, num_channels, use_1x1conv=False, strides=1, **kwargs):
super(Residual, self).__init__(**kwargs)
self.conv1 = nn.Conv2D(num_channels, kernel_size=3, padding=1,
strides=strides)
self.conv2 = nn.Conv2D(num_channels, kernel_size=3, padding=1)
if use_1x1conv:
self.conv3 = nn.Conv2D(num_channels, kernel_size=1,
strides=strides)
else:
self.conv3 = None
self.bn1 = nn.BatchNorm()
self.bn2 = nn.BatchNorm()
def hybrid_forward(self, F, X):
Y = F.npx.relu(self.bn1(self.conv1(X)))
Y = self.bn2(self.conv2(Y))
if self.conv3:
X = self.conv3(X)
return F.npx.relu(Y + X)
Next, we define the ResNet-18 model.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
def resnet18(num_classes):
net = nn.HybridSequential()
net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),
nn.BatchNorm(), nn.Activation('relu'))
def resnet_block(num_channels, num_residuals, first_block=False):
blk = nn.HybridSequential()
for i in range(num_residuals):
if i == 0 and not first_block:
blk.add(Residual(num_channels, use_1x1conv=True, strides=2))
else:
blk.add(Residual(num_channels))
return blk
net.add(resnet_block(64, 2, first_block=True),
resnet_block(128, 2),
resnet_block(256, 2),
resnet_block(512, 2))
net.add(nn.GlobalAvgPool2D(), nn.Dense(num_classes))
return net
We use Xavier initialization described in :numref:`subsec_xavier`
before training begins.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
def get_net(devices):
num_classes = 10
net = resnet18(num_classes)
net.initialize(ctx=devices, init=init.Xavier())
return net
loss = gluon.loss.SoftmaxCrossEntropyLoss()
Defining the Training Function
------------------------------
We will select models and tune hyperparameters according to the model’s
performance on the validation set. In the following, we define the model
training function ``train``.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
def train(net, train_iter, valid_iter, num_epochs, lr, wd, devices, lr_period,
lr_decay):
trainer = torch.optim.SGD(net.parameters(), lr=lr, momentum=0.9,
weight_decay=wd)
scheduler = torch.optim.lr_scheduler.StepLR(trainer, lr_period, lr_decay)
num_batches, timer = len(train_iter), d2l.Timer()
legend = ['train loss', 'train acc']
if valid_iter is not None:
legend.append('valid acc')
animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs],
legend=legend)
net = nn.DataParallel(net, device_ids=devices).to(devices[0])
for epoch in range(num_epochs):
net.train()
metric = d2l.Accumulator(3)
for i, (features, labels) in enumerate(train_iter):
timer.start()
l, acc = d2l.train_batch_ch13(net, features, labels,
loss, trainer, devices)
metric.add(l, acc, labels.shape[0])
timer.stop()
if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
animator.add(epoch + (i + 1) / num_batches,
(metric[0] / metric[2], metric[1] / metric[2],
None))
if valid_iter is not None:
valid_acc = d2l.evaluate_accuracy_gpu(net, valid_iter)
animator.add(epoch + 1, (None, None, valid_acc))
scheduler.step()
measures = (f'train loss {metric[0] / metric[2]:.3f}, '
f'train acc {metric[1] / metric[2]:.3f}')
if valid_iter is not None:
measures += f', valid acc {valid_acc:.3f}'
print(measures + f'\n{metric[2] * num_epochs / timer.sum():.1f}'
f' examples/sec on {str(devices)}')
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
def train(net, train_iter, valid_iter, num_epochs, lr, wd, devices, lr_period,
lr_decay):
trainer = gluon.Trainer(net.collect_params(), 'sgd',
{'learning_rate': lr, 'momentum': 0.9, 'wd': wd})
num_batches, timer = len(train_iter), d2l.Timer()
legend = ['train loss', 'train acc']
if valid_iter is not None:
legend.append('valid acc')
animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs],
legend=legend)
for epoch in range(num_epochs):
metric = d2l.Accumulator(3)
if epoch > 0 and epoch % lr_period == 0:
trainer.set_learning_rate(trainer.learning_rate * lr_decay)
for i, (features, labels) in enumerate(train_iter):
timer.start()
l, acc = d2l.train_batch_ch13(
net, features, labels.astype('float32'), loss, trainer,
devices, d2l.split_batch)
metric.add(l, acc, labels.shape[0])
timer.stop()
if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
animator.add(epoch + (i + 1) / num_batches,
(metric[0] / metric[2], metric[1] / metric[2],
None))
if valid_iter is not None:
valid_acc = d2l.evaluate_accuracy_gpus(net, valid_iter,
d2l.split_batch)
animator.add(epoch + 1, (None, None, valid_acc))
measures = (f'train loss {metric[0] / metric[2]:.3f}, '
f'train acc {metric[1] / metric[2]:.3f}')
if valid_iter is not None:
measures += f', valid acc {valid_acc:.3f}'
print(measures + f'\n{metric[2] * num_epochs / timer.sum():.1f}'
f' examples/sec on {str(devices)}')
Training and Validating the Model
---------------------------------
Now, we can train and validate the model. All the following
hyperparameters can be tuned. For example, we can increase the number of
epochs. When ``lr_period`` and ``lr_decay`` are set to 4 and 0.9,
respectively, the learning rate of the optimization algorithm will be
multiplied by 0.9 after every 4 epochs. For ease of demonstration, we
train for only 20 epochs here.
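To see the effect of this step decay schedule concretely, the following
short sketch (added purely for illustration, with hypothetical values)
prints the effective learning rate over the first few epochs:

.. code:: python

    # Step decay: the learning rate is multiplied by `lr_decay` once
    # every `lr_period` epochs (values here are hypothetical)
    lr, lr_period, lr_decay = 0.02, 4, 0.9
    for epoch in range(9):
        print(f'epoch {epoch}: lr = {lr * lr_decay ** (epoch // lr_period):.5f}')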
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
devices, num_epochs, lr, wd = d2l.try_all_gpus(), 20, 2e-4, 5e-4
lr_period, lr_decay, net = 4, 0.9, get_net()
net(next(iter(train_iter))[0])
train(net, train_iter, valid_iter, num_epochs, lr, wd, devices, lr_period,
lr_decay)
.. raw:: latex
\diilbookstyleoutputcell
.. parsed-literal::
:class: output
train loss 0.654, train acc 0.789, valid acc 0.438
958.1 examples/sec on [device(type='cuda', index=0), device(type='cuda', index=1)]
.. figure:: output_kaggle-cifar10_42a34e_126_1.svg
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
devices, num_epochs, lr, wd = d2l.try_all_gpus(), 20, 0.02, 5e-4
lr_period, lr_decay, net = 4, 0.9, get_net(devices)
net.hybridize()
train(net, train_iter, valid_iter, num_epochs, lr, wd, devices, lr_period,
lr_decay)
.. raw:: latex
\diilbookstyleoutputcell
.. parsed-literal::
:class: output
train loss 0.807, train acc 0.723, valid acc 0.422
486.9 examples/sec on [gpu(0), gpu(1)]
.. figure:: output_kaggle-cifar10_42a34e_129_1.svg
Classifying the Testing Set and Submitting Results on Kaggle
------------------------------------------------------------
After obtaining a promising model and tuned hyperparameters, we use all
the labeled data (including the validation set) to retrain the model and
classify the test set.
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
net, preds = get_net(), []
net(next(iter(train_valid_iter))[0])
train(net, train_valid_iter, None, num_epochs, lr, wd, devices, lr_period,
lr_decay)
for X, _ in test_iter:
y_hat = net(X.to(devices[0]))
preds.extend(y_hat.argmax(dim=1).type(torch.int32).cpu().numpy())
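# `ImageFolder` reads the test images in lexicographic filename order
# ('1.png', '10.png', '100.png', ...), so the numeric ids are sorted as
# strings below to align each id with its prediction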
sorted_ids = list(range(1, len(test_ds) + 1))
sorted_ids.sort(key=lambda x: str(x))
df = pd.DataFrame({'id': sorted_ids, 'label': preds})
df['label'] = df['label'].apply(lambda x: train_valid_ds.classes[x])
df.to_csv('submission.csv', index=False)
.. raw:: latex
\diilbookstyleoutputcell
.. parsed-literal::
:class: output
train loss 0.608, train acc 0.786
1040.8 examples/sec on [device(type='cuda', index=0), device(type='cuda', index=1)]
.. figure:: output_kaggle-cifar10_42a34e_135_1.svg
.. raw:: latex
\diilbookstyleinputcell
.. code:: python
net, preds = get_net(devices), []
net.hybridize()
train(net, train_valid_iter, None, num_epochs, lr, wd, devices, lr_period,
lr_decay)
for X, _ in test_iter:
y_hat = net(X.as_in_ctx(devices[0]))
preds.extend(y_hat.argmax(axis=1).astype(int).asnumpy())
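# `ImageFolderDataset` also reads the test images in lexicographic
# filename order, so the numeric ids are likewise sorted as strings
# below to align each id with its prediction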
sorted_ids = list(range(1, len(test_ds) + 1))
sorted_ids.sort(key=lambda x: str(x))
df = pd.DataFrame({'id': sorted_ids, 'label': preds})
df['label'] = df['label'].apply(lambda x: train_valid_ds.synsets[x])
df.to_csv('submission.csv', index=False)
.. raw:: latex
\diilbookstyleoutputcell
.. parsed-literal::
:class: output
train loss 1.053, train acc 0.616
1148.8 examples/sec on [gpu(0), gpu(1)]
.. figure:: output_kaggle-cifar10_42a34e_138_1.svg
The above code will generate a ``submission.csv`` file, whose format
meets the requirement of the Kaggle competition. The method for
submitting results to Kaggle is similar to that in
:numref:`sec_kaggle_house`.
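For reference, the first few lines of the generated file look like this
(the labels are hypothetical; the ids appear in lexicographic order
because they were sorted as strings above)::

    id,label
    1,cat
    10,ship
    100,airplane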
Summary
-------
- We can read datasets containing raw image files after organizing them
into the required format.
- We can use convolutional neural networks and image augmentation in an
  image classification competition; with MXNet, hybrid programming can
  additionally be used to improve computational efficiency.
Exercises
---------
1. Use the complete CIFAR-10 dataset for this Kaggle competition. Set
hyperparameters as ``batch_size = 128``, ``num_epochs = 100``,
``lr = 0.1``, ``lr_period = 50``, and ``lr_decay = 0.1``. See what
accuracy and ranking you can achieve in this competition. Can you
further improve them?
2. What accuracy can you get when not using image augmentation?
`Discussions `__