Building a Pokemon Classifier
Here I try to build a Pokemon classifier to understand transfer learning for image classification.
In this notebook, I used the Pokemon images dataset from here, but unfortunately it is no longer available.
from fastai.vision import *
from fastai.metrics import error_rate
from fastai.callbacks.tracker import ReduceLROnPlateauCallback, SaveModelCallback
from fastai.callbacks import CSVLogger
path = Path(".")
Form a DataBunch object from the image folders.
data = ImageDataBunch.from_folder(path, train=".",
ds_tfms=get_transforms(),
size=128, bs=64, valid_pct=0.2).normalize(imagenet_stats)
Check how many different Pokemon classes we have.
len(data.classes)
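As an optional sanity check (not part of the original run), we can peek at a few of the class names and display a batch of the transformed, normalized images.
# Peek at the first few class (folder) names
data.classes[:5]
# Show a sample batch of transformed, normalized images
data.show_batch(rows=3, figsize=(7, 6))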
learn = cnn_learner(data, models.resnet18, metrics=error_rate).mixup().to_fp16()
Add callbacks to monitor the training process:
- Reduce the learning_rate using the ReduceLROnPlateauCallback.
- Save the model on every improvement in error_rate.
- Log the training stats to a CSV file.
callbacks_list = [
ReduceLROnPlateauCallback(learn=learn, monitor='error_rate', factor=1e-6, patience=5, min_delta=1e-5),
SaveModelCallback(learn, mode="min", every='improvement', monitor='error_rate', name='best'),
CSVLogger(learn=learn, append=True)
]
Now that all the setup is done, let's train the model with default parameters for 15 epochs.
learn.fit_one_cycle(15, callbacks=callbacks_list)
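If you want to see how the training progressed (this cell is an optional extra, not part of the original run), you can plot the losses recorded during fit_one_cycle.
# Plot the training and validation losses recorded by the Recorder
learn.recorder.plot_losses()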
Now that we have some decent accuracy, let's save the model and interpret its results.
In the following cell, I:
- Load the best weights saved by the callbacks during training.
- Convert the model back to 32-bit precision.
- Export the model as a whole.
- Export the weights alone.
learn.load("best");
learn.to_fp32()
learn.export("pokemon_resnet18_st1.pkl")
learn.save("pokemon_resnet18_st1_wgts")
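As a rough sketch of how the exported model could later be used for inference (the image path below is just a placeholder), we can load the .pkl back with load_learner and predict on a single image.
# Load the exported Learner back (no training data is needed for inference)
inference_learn = load_learner(path, "pokemon_resnet18_st1.pkl")
# Run a prediction on one image; the file name here is only a placeholder
img = open_image("some_pokemon.png")
pred_class, pred_idx, probs = inference_learn.predict(img)
pred_class, probs[pred_idx]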
It is important to understand what the model has learnt during training. We can do that with the help of the ClassificationInterpretation class from the fastai library.
interp = ClassificationInterpretation.from_learner(learn)
# Get the instances in the validation set where the model made the largest errors (by loss value).
losses,idxs = interp.top_losses()
# Check that the returned values have the same length as the validation set
len(data.valid_ds)==len(losses)==len(idxs)
Interpret the images where the model made the largest errors during validation.
The cell below shows
- the image.
- the model's prediction of that image.
- the actual label of that image.
- the loss and the probability (how confident the model is about its prediction).
You may notice that some regions of each image are highlighted; as far as I know, these are the regions the model looked at to make its prediction for that image.
interp.plot_top_losses(9, figsize=(15,11))
Let us also see which pokemon have confused the model the most.
interp.most_confused(min_val=3)
Apart from the second entry in this list, you can generally see why the model was confused: most of its confusion stems from evolved forms of the same Pokemon.
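For the full picture (an optional extra, not part of the original run), we can also plot the confusion matrix; with this many classes it gets quite large, so a bigger figure size helps.
# Plot the full confusion matrix (large, since there are many Pokemon classes)
interp.plot_confusion_matrix(figsize=(12, 12), dpi=60)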
Let's try to train the model a little bit differently this time.
learn.load('best');
Till now we have been training only the head of the model, i.e. only the last two or three layers, so essentially this model is almost the same as the model pretrained on the 1000 categories of the ImageNet dataset, with some minor tweaks for our problem here. We have some options to improve the model:
- Train all the layers so that the model can adapt to the current classification problem. We do that with unfreeze().
- Train with a very low learning rate so that the model doesn't forget what it learned from the pretrained weights.
Let's see how well we can improve the model.
learn.to_fp16()
learn.unfreeze()
Before we start training again, we need to figure out how fast the neural network should learn. This is controlled by the learning rate, and finding a good value for it is crucial to the training process.
Luckily, fastai's lr_find method will help us do just that.
learn.lr_find(start_lr=1e-20)
# Plot the learning rates and the corresponding losses.
learn.recorder.plot(suggestion=True)
# Get the suggested learning rate
min_grad_lr = learn.recorder.min_grad_lr
Use the same callbacks as before and train for 30 epochs.
learn.fit_one_cycle(30, min_grad_lr, callbacks=callbacks_list)
We can see that the model has improved slightly, but not by much. Other things we could try (sketched below) are:
- Using a different architecture instead of resnet18.
- Adding more image augmentation methods (even though fastai has some reasonable defaults).
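As a rough sketch of both ideas (the variable names and transform values here are only illustrative, not what I actually ran), we could rebuild the data bunch with heavier augmentation and swap in a deeper backbone such as resnet34.
# Heavier augmentation than the defaults; these values are illustrative only
tfms = get_transforms(max_rotate=20., max_zoom=1.2, max_lighting=0.3, max_warp=0.2)
data_aug = ImageDataBunch.from_folder(path, train=".",
                                      ds_tfms=tfms,
                                      size=128, bs=64, valid_pct=0.2).normalize(imagenet_stats)
# A deeper backbone than resnet18
learn_r34 = cnn_learner(data_aug, models.resnet34, metrics=error_rate)
learn_r34.fit_one_cycle(15)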
Persist the environment so that we can deploy the model later without any problems.
!pip freeze > resnet18.txt
That's it for this post. Please share it if you found it useful, and don't hesitate to leave a comment if any of my explanations need clarification.