Ajinkya Koshti


Hi! My name is Ajinkya and I'm a data scientist located in Ames, IA.

I specialize in unlocking the business value of data and breaking down complex problems with state-of-the-art data science techniques. For more details, check out the About Me section.

View My LinkedIn Profile

View My GitHub Profile

This is a fun project I did to try out pretrained models on a data stream created from Google Images. It was good practice in building data pipelines, creating image transforms, importing pretrained models, and optimizing the learning rate.

I searched Google Images for some famous football players and fed the first 200 results for each player to the model to see how well it can train on uncurated, noisy data.

Recognizing players can be a useful application in sports analytics. Many in-game metrics, such as xG and xA scores and heatmaps, rely on deep learning algorithms running in the backend.

Setting up the libraries

This step mounts Google Drive in the working environment, which is used as backend storage.

```python
from google.colab import drive
drive.mount('/content/gdrive')
```
```
Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
```
```python
from fastai.vision import *
import warnings
warnings.filterwarnings("ignore")
```

Creating data pipeline

```python
ls
```
```
gdrive/  sample_data/
```
```python
cd gdrive/My Drive/fastai_player_recog/
```
```
/content/gdrive/My Drive/fastai_player_recog
```
```python
# The next three cells are re-run once per class: set `folder`/`file` for one
# player, then run the download cell below. Each <name>.csv holds one image
# URL per line, exported from a Google Images search for that player.
folder = 'cr'
file = 'cr.csv'
```
```python
folder = 'messi'
file = 'messi.csv'
```
```python
folder = 'suarez'
file = 'suarez.csv'
```
```python
path = Path('data/players')
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)
```
```python
# Note: path.ls (without parentheses) returns the bound method;
# call path.ls() to actually list the directory contents.
path.ls
```
```
<bound method <lambda> of PosixPath('data/players')>
```
```python
classes = ['cr','messi','suarez']
download_images(path/file, dest, max_pics=200)
```
```python
# Drop any files that cannot be opened as images and cap image size at 500px.
for c in classes:
    print(c)
    verify_images(path/c, delete=True, max_size=500)
```
```
cr
messi
suarez
cannot identify image file <_io.BufferedReader name='data/players/suarez/00000070.jpg'>
cannot identify image file <_io.BufferedReader name='data/players/suarez/00000096.png'>
```
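The cell-per-class workflow above (edit `folder`/`file`, re-run the download cell) mirrors the original notebook. As a consolidated sketch using the same fastai v1 calls (not what was actually run), the whole download-and-verify step could be written as a single loop:

```python
# A consolidated sketch of the per-class download/verify cells above.
# Each data/players/<name>.csv is assumed to hold one image URL per line.
path = Path('data/players')
classes = ['cr', 'messi', 'suarez']

for name in classes:
    dest = path/name
    dest.mkdir(parents=True, exist_ok=True)
    download_images(path/f'{name}.csv', dest, max_pics=200)  # download up to 200 images per class
    verify_images(dest, delete=True, max_size=500)           # drop unreadable files, cap size at 500px
```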

Modeling - Data Transformation

Transforms

```python
path = Path('data/players/')
path
```
```
PosixPath('data/players')
```
```python
# Standard fastai augmentations, 224px images, batch size 32, a random 20%
# validation split, and ImageNet normalization (to match the pretrained weights).
transforms = get_transforms()
np.random.seed(42)
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
        ds_tfms=transforms, size=224, bs=32).normalize(imagenet_stats)
```
```python
data.classes
```
```
['cr', 'messi', 'suarez']
```
```python
data.show_batch(rows=3, figsize=(7,8))
```

As you can see above, the dataset is uncurated, with a lot of noisy pictures; for example, the last picture is a sketch rather than a photograph. There are several such inaccurate or improper images, but we keep them so the model has to cope with noisy data.

```python
data.classes, data.c, len(data.train_ds), len(data.valid_ds)
```
```
(['cr', 'messi', 'suarez'], 3, 383, 95)
```
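As a quick sanity check (using only the counts printed above), the split matches the `valid_pct=0.2` passed to `ImageDataBunch.from_folder`:

```python
# Sanity check on the train/validation split using the counts reported above.
n_train, n_valid = 383, 95
total = n_train + n_valid
print(total, round(n_valid / total, 3))  # 478 0.199 -> roughly the 20% validation split
```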

Training

As seen below, we achieve around 87%-89% accuracy by optimizing the learning rate. If we removed the noise from the data, we could quite easily reach an even higher accuracy. It will also be interesting to see which images are misclassified the most.

```python
# ResNet-152 backbone with ImageNet-pretrained weights; the backbone starts out
# frozen, so this first stage only trains the newly added head.
learn = cnn_learner(data, models.resnet152, metrics=error_rate)
learn.fit_one_cycle(30)
```
```
epoch train_loss valid_loss error_rate time
0 0.153895 0.486600 0.147368 00:08
1 0.176932 0.453679 0.147368 00:08
2 0.186371 0.438741 0.136842 00:08
3 0.176964 0.498095 0.126316 00:09
4 0.187868 0.495250 0.157895 00:08
5 0.173565 0.450809 0.147368 00:09
6 0.167043 0.444867 0.115789 00:09
7 0.161417 0.477105 0.178947 00:08
8 0.164916 0.773907 0.168421 00:08
9 0.157456 0.554971 0.189474 00:08
10 0.157130 0.592353 0.136842 00:09
11 0.168478 0.647615 0.136842 00:08
12 0.172660 0.529567 0.168421 00:09
13 0.178256 0.557514 0.157895 00:08
14 0.172853 0.423964 0.168421 00:09
15 0.170725 0.382557 0.115789 00:09
16 0.156468 0.471120 0.147368 00:09
17 0.133140 0.533500 0.136842 00:09
18 0.115539 0.595399 0.147368 00:09
19 0.108522 0.631604 0.136842 00:09
20 0.105572 0.566845 0.147368 00:09
21 0.103867 0.629708 0.157895 00:09
22 0.096924 0.590672 0.147368 00:09
23 0.091466 0.573798 0.136842 00:09
24 0.080457 0.582823 0.136842 00:09
25 0.077717 0.551564 0.136842 00:09
26 0.075651 0.545016 0.126316 00:09
27 0.064027 0.561220 0.126316 00:09
28 0.060929 0.569121 0.126316 00:09
29 0.054775 0.578649 0.126316 00:09
```
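The "87%-89%" figure quoted earlier follows directly from the `error_rate` column above (accuracy = 1 − error_rate). A quick check with two representative values from the table:

```python
# Convert the reported error rates into accuracies.
for err in (0.126316, 0.115789):
    print(f"error_rate {err:.6f} -> accuracy {1 - err:.1%}")
# error_rate 0.126316 -> accuracy 87.4%
# error_rate 0.115789 -> accuracy 88.4%
```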
```python
learn.save('stage-1')
```
```python
# Unfreeze the pretrained backbone and run the LR finder to pick a
# learning-rate range for fine-tuning the whole network.
learn.unfreeze()
learn.lr_find(stop_div=False, num_it=200)
```
```
epoch train_loss valid_loss error_rate time
0 0.029879 #na# 00:09
1 0.039062 #na# 00:09
2 0.040193 #na# 00:09
3 0.038014 #na# 00:09
4 0.036690 #na# 00:09
5 0.050070 #na# 00:10
6 0.065584 #na# 00:10
7 0.091199 #na# 00:10
8 0.370187 #na# 00:10
9 0.688983 #na# 00:10
10 0.859709 #na# 00:09
11 0.963872 #na# 00:09
12 1.173712 #na# 00:09
13 1.679814 #na# 00:09
14 2.981434 #na# 00:09
15 14.075550 #na# 00:09
16 116.549461 #na# 00:09
17 650.312622 #na# 00:09
```

```
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
```
```python
learn.recorder.plot()
```
```python
learn.fit_one_cycle(10, max_lr=slice(1e-03,1e-06))
```
```
epoch train_loss valid_loss error_rate time
0 0.067066 0.845543 0.178947 00:11
1 0.042421 1.085019 0.231579 00:11
2 0.106151 1.190116 0.284211 00:11
3 0.217121 5.310340 0.610526 00:11
4 0.304309 3.054987 0.578947 00:11
5 0.348517 1.629803 0.368421 00:11
6 0.353958 0.877336 0.200000 00:11
7 0.319123 0.604664 0.168421 00:11
8 0.277269 0.568268 0.136842 00:11
9 0.257772 0.594630 0.126316 00:11
```
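A hedged note on the fine-tuning call above: in fastai v1, `max_lr=slice(lower, upper)` assigns the lower rate to the earliest layer groups and the upper rate to the head, so after unfreezing the conventional call looks more like the sketch below (the bounds are placeholders to be read off `learn.recorder.plot()`). The `slice(1e-03,1e-06)` used above reverses that order, giving the pretrained backbone the largest learning rate, which may explain the validation-loss spike around epochs 3-5.

```python
# A sketch, not what was run above: conventional discriminative learning rates,
# smallest for the early (pretrained) layers, largest for the head.
# The bounds 1e-6 and 1e-3 are placeholders chosen from the LR-finder plot.
learn.fit_one_cycle(10, max_lr=slice(1e-6, 1e-3))
```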
```python
learn.save('stage-2')
```
```python
# freeze_to(n) freezes the first n layer groups before training again.
learn.freeze_to(100)
```
```python
learn.fit_one_cycle(30)
```
```
epoch train_loss valid_loss error_rate time
0 0.199813 0.610031 0.136842 00:08
1 0.169803 0.593857 0.136842 00:08
2 0.145744 0.590429 0.136842 00:08
3 0.139687 0.596360 0.147368 00:08
4 0.137717 0.611183 0.136842 00:08
5 0.122393 0.663416 0.136842 00:08
6 0.124853 0.701337 0.126316 00:08
7 0.136023 0.690178 0.136842 00:08
8 0.134810 0.711177 0.136842 00:08
9 0.133105 0.759611 0.136842 00:08
10 0.122329 0.687816 0.157895 00:08
11 0.121954 0.623489 0.157895 00:08
12 0.108721 0.719655 0.157895 00:08
13 0.099113 0.656762 0.147368 00:08
14 0.094180 0.653667 0.157895 00:08
15 0.089522 0.598895 0.115789 00:08
16 0.082002 0.629596 0.126316 00:08
17 0.093078 0.576757 0.115789 00:08
18 0.085515 0.558922 0.115789 00:08
19 0.082467 0.533377 0.115789 00:08
20 0.076356 0.555016 0.126316 00:08
21 0.073800 0.568873 0.126316 00:08
22 0.078990 0.541078 0.115789 00:08
23 0.076296 0.547799 0.115789 00:08
24 0.077476 0.534094 0.126316 00:08
25 0.071813 0.546575 0.115789 00:08
26 0.065507 0.554435 0.115789 00:09
27 0.063554 0.564250 0.126316 00:08
28 0.058297 0.568706 0.126316 00:08
29 0.055142 0.556144 0.126316 00:08
```
```python
interp = ClassificationInterpretation.from_learner(learn)
```

Shown below are the images with the highest losses. As you can see, most of them are noise; for example, the third image is an edited version, and the 6th and 9th images are graphic designs rather than raw photographs.

```python
interp.plot_top_losses(9, figsize=(15,11))
```

```python
interp.plot_confusion_matrix()
```

The confusion matrix shows that the model is quite adept at classifying the images, with an accuracy of around 87%-89%. The accuracy can be improved further by:

- increasing the amount of data
- increasing the computation effort
- removing the noise from the data
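As a follow-up to the last point, fastai v1 ships notebook widgets for pruning the noisiest images by hand. A minimal sketch, assuming the `fastai.widgets` API is available in this notebook environment:

```python
# A sketch of manually cleaning the dataset with fastai v1's widgets (not run above).
# DatasetFormatter surfaces the highest-loss images from the trained learner, and
# ImageCleaner renders a widget for deleting or relabelling them; the kept items
# are written to a cleaned.csv under `path` for rebuilding the DataBunch.
from fastai.widgets import DatasetFormatter, ImageCleaner

ds, idxs = DatasetFormatter().from_toplosses(learn)
ImageCleaner(ds, idxs, path)
```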