Neural Network From Scratch / Image Recognition
Image Recognition is the new black. It seamlessly identifies people, animals, places, buildings, and any other objects you configure. Already used by companies like Google, Shutterstock, eBay, Salesforce, and Pinterest, it's expected to grow even more.

The world of image recognition moves fast. Here's the proof: facial recognition technology became 20 times more accurate from 2014 to 2018.

Taking all this into account, we decided to build our own neural network from scratch. Our goal was to recognize medical masks in real-life street footage from webcams around the world. Here's how we did it.
Table of contents
1. Technology Stack
2. The Basics
3. Training Models
4. Anchors
5. Labels
6. Models
7. Sizes
8. Jitter
9. Datasets - Actual images and their description
10. Tools for Labelling
11. Commands for training models
The Essential Tech Stack:
The basics
We decided to start simple. Earlier we had discovered a system that worked great for detecting fire. It used the ready-made ImageAI module in concert with the special repository. This exact system could recognize fire in images. We grasped the idea from the standard models and the training pictures that were available online.
Challenge 1: Older Technologies
However, this system used an old-school tech stack. We had to install Python 3.6, TensorFlow 1, and an older version of OpenCV.

Struggling with older technologies wasn't convenient. The only decent solution in this case is Anaconda, but this also didn't work out. After we tried to compile everything and debug it, we simply gave up.

As expected, using older technologies resulted in failures and bugs. We also couldn't train models at all. Typically, we would receive various errors during the training process.
Challenge 2: Trying to understand the basics
At this stage, we decided to dive right in and understand how recognition works from the inside. We used ImageAI, rewriting everything in Node.js. This is much more convenient than using Python.

Keep in mind, this is Linux-based. If you are using Windows or macOS, you'll need to install a Linux subsystem. Yes, it's not going to work well on Node.js. But the ultimate goal was to understand how the recognition worked.

So we chose the most powerful object recognition module, called YOLOv3. This is a truly universal technology that can recognize anything in a short time.
Challenge 3: Old-school TensorFlow versions
Nevertheless, we soon faced another problem: YOLOv3 was based on Darknet (https://pjreddie.com/darknet/yolo/).

After we put everything together on Linux, a new issue came up. The versions weren't compatible, even though we used the latest version of each piece of software.

It turned out that the problem was an old-school version of TensorFlow. It's the framework used for training the neural network, the 'brains' behind our software. So we had to fix this issue for everything to start working.
The solution
The solution was to convert the code between TensorFlow versions using tf_upgrade_v2, the conversion script that ships with TensorFlow 2. We launched it using the following command:

tf_upgrade_v2.py --infile yolo.py --outfile yolo_v2.py

The script converted the older TF1 code into TF2. This didn't work perfectly, and occasionally we had to rewrite the code by hand. But the problems were partially gone.

What's more, it was pretty fun to change Keras to TensorFlow throughout the entire code. Keras became part of TensorFlow in its newest version, so we had to completely remove the older Keras code from the project.

So after all this struggle, our custom ImageAI still couldn't train models. But it worked pretty well for recognizing photos and videos (shots). So the standard pre-trained model could recognize fire on video, great!
Training models
Finally, we found YOLO (version 3) on a web forum. It doesn't require Linux and supports the nightly version of TensorFlow 2. We used this library.

The forum user had completely rewritten the latest original version of YOLO so that it could support TF2 and Windows. Finally, all versions were compatible. This library was just perfect for training models.

Here's the data you need to initialize the program:

"anchors": [31,29,86,119,34,27],
"labels": ["mask"],
"net_size": 288,
"pretrained": {
    "keras_format": "configsmask_500weights.h5",
    "darknet_format": "yolov3.weights"
},
"train": {
    "min_size": 288,
    "max_size": 288,
    "num_epoch": 30,
    "train_image_folder": "dataset/mask_500/train/images",
    "train_annot_folder": "dataset/mask_500/train/annotations",
    "valid_image_folder": "dataset/mask_500/train/images",
    "valid_annot_folder": "dataset/mask_500/train/annotations",
    "batch_size": 8,
    "learning_rate": 1e-4,
    "save_folder": "configs/mask_500",
    "jitter": false
}

So we started to dig deeper and found out the following:
1. Anchors
Anchors are basically reference boxes: width and height pairs that set how much a predicted box can widen or narrow, and how far it can shift towards the centre of the object. Basically, we take a central point on a picture and move it left or right. In our case, we started with the simplest solution here. Here it goes:

const kmeans = require("node-kmeans");

// Cluster the annotation dimensions (width, height pairs) into anchor_num groups.
export function k_means(ann_dims, anchor_num) {
  return kmeans.clusterize(ann_dims, { k: anchor_num }, (err, res) => {
    if (err) console.error(err);
    // else console.log("%o", res);
  });
}

Here's the basic logic for calculating 'central' points:

const clusters = k_means(annotation_dims, num_anchors);
const centroids = [];
clusters.groups.forEach(group => {
  centroids.push(group.centroid);
});
const anchors: any = centroids;
const widths = anchors.map(c => c[0]);
// Indices of the anchors, ordered by ascending width.
const sorted_indices: any = widths
  .map((item, index) => {
    return { item, index };
  })
  .sort((a, b) => a.item - b.item)
  .map(i => i.index);
const anchor_array = [];
let out_string = "";
for (let i = 0; i < sorted_indices.length; i++) {
  // Read the anchors in sorted order and scale them to a 416-pixel net size.
  const anchor = anchors[sorted_indices[i]];
  anchor_array.push(Math.trunc(anchor[0] * 416));
  anchor_array.push(Math.trunc(anchor[1] * 416));
  out_string +=
    Math.trunc(anchor[0] * 416) + "," + Math.trunc(anchor[1] * 416) + ", ";
}
const reverse_anchor_array = anchor_array;

It's likely that there are better solutions, but we decided not to get caught up with this. We already had the 'magical' numbers, so we decided to move on.

2. Labels
This is simple. Labels are the names of the objects that we are looking for. In our case, it's "mask". In the perfect-world scenario, we would also need to distinguish faces without masks for comparison.
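Going back to the anchors from section 1: for readers who want to see the idea without node-kmeans, here is a minimal, self-contained sketch of k-means over (width, height) pairs. The names kmeansAnchors and toAnchorList, the fixed iteration count, and the sample boxes are all hypothetical. It uses plain Euclidean distance and seeds the centroids from the first k boxes, which is a simplification and not necessarily what the library does.

```typescript
type Box = [number, number]; // [width, height], normalized to 0..1

// Plain k-means on box dimensions. Assumptions (hypothetical simplifications):
// Euclidean distance, centroids seeded from the first k boxes.
function kmeansAnchors(boxes: Box[], k: number, iterations = 20): Box[] {
  if (boxes.length < k) throw new Error("need at least k boxes");
  let centroids: Box[] = boxes.slice(0, k).map(b => [b[0], b[1]] as Box);
  for (let it = 0; it < iterations; it++) {
    // Per-centroid accumulators: [sum of widths, sum of heights, count].
    const sums: number[][] = centroids.map(() => [0, 0, 0]);
    for (const b of boxes) {
      let best = 0;
      let bestDist = Infinity;
      for (let i = 0; i < centroids.length; i++) {
        const dw = b[0] - centroids[i][0];
        const dh = b[1] - centroids[i][1];
        const d = dw * dw + dh * dh;
        if (d < bestDist) { bestDist = d; best = i; }
      }
      sums[best][0] += b[0];
      sums[best][1] += b[1];
      sums[best][2] += 1;
    }
    // Move each centroid to the mean of its members; keep empty clusters as-is.
    centroids = centroids.map((c, i) =>
      sums[i][2] > 0 ? ([sums[i][0] / sums[i][2], sums[i][1] / sums[i][2]] as Box) : c);
  }
  return centroids;
}

// Scale to pixels for a given net size, sort by width, and flatten to the
// [w1, h1, w2, h2, ...] layout used in the config.
function toAnchorList(centroids: Box[], netSize: number): number[] {
  const sorted = centroids.slice().sort((a, b) => a[0] - b[0]);
  const out: number[] = [];
  for (const c of sorted) {
    out.push(Math.trunc(c[0] * netSize));
    out.push(Math.trunc(c[1] * netSize));
  }
  return out;
}

// Made-up sample data: two small boxes and two large ones.
const sampleBoxes: Box[] = [[0.10, 0.10], [0.50, 0.60], [0.12, 0.08], [0.48, 0.62]];
console.log(toAnchorList(kmeansAnchors(sampleBoxes, 2), 416));
```

Sorting by width before flattening mirrors the sorted_indices step in the original snippet, so the smallest anchors come first in the resulting list.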
3. Models
We strictly need yolov3.weights for training models. This is a standard pre-defined model, needed for the initial training. It's critical that this default model isn't trained any further; it's used for the structure and annotations.

4. Sizes
In a nutshell, we need to shrink pictures for the training process (for YOLOv3, their size should be a multiple of 32). Ideally, the picture should be shrunk proportionally. Therefore, we need to define its min size, max size and net size.

Obviously, all pictures have different sizes. Therefore, a new issue came up: we couldn't combine different shapes. This means we need to use the same value in all three parameters.

[Figure labels: Man, Bear, Dog, Kitty cat, Monkey, Bird]
Weak model - 288
Strong model - 41

The strong model takes a long time to train. It's pretty accurate; however, it doesn't always recognize all objects. After we played around with it, it turned out that it also requires many images.
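The size rules from this section can be checked before a training run is started. The sketch below is only an illustration: the SizeConfig interface and the checkSizes function are hypothetical names, and the default stride of 32 is an assumption based on how YOLOv3 downsamples its input.

```typescript
interface SizeConfig {
  netSize: number;
  minSize: number;
  maxSize: number;
}

// Returns a list of problems; an empty list means the sizes look consistent.
// Assumption: the network needs an input size divisible by `stride`.
function checkSizes(cfg: SizeConfig, stride = 32): string[] {
  const problems: string[] = [];
  if (cfg.minSize !== cfg.netSize || cfg.maxSize !== cfg.netSize) {
    problems.push("min_size, max_size and net_size should all be equal");
  }
  if (cfg.netSize % stride !== 0) {
    problems.push("net_size should be a multiple of " + stride);
  }
  return problems;
}

// The values from the config shown earlier pass both checks.
console.log(checkSizes({ netSize: 288, minSize: 288, maxSize: 288 }));
// A mismatched, non-aligned configuration reports both problems.
console.log(checkSizes({ netSize: 300, minSize: 288, maxSize: 416 }));
```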
5. Batch size
Batch size is the number of pictures processed together in one training step. This number should be a multiple of 2. If you go for a bigger quantity, this will put a high load on your system.

6. Jitter
This value, in the range [0-1], is used for cropping images. As a rule, we use false or 0.3.
7. Datasets - Actual images and their description
Since our task is to recognize masks on human faces, we simply headed over to Google and searched for the relevant images: "Medical masks on streets".

To speed things up, we used a simple utility, Picture Google Grabber. It retrieves images from Google using relevant keywords.

There is a slight issue. The previews in Google are low-quality. We needed to follow each URL and download the original image from the site.
Picture Google Grabber
Just enter a few search queries and you're done. You'll receive the full collection of images.

[Screenshot: file manager showing the downloaded files, 535.jpg to 544.jpg]

You can make it even more convenient and rename all the images. Select all of them and press F2; the pictures will be renamed automatically.
Next step: add annotations to your images by highlighting all the masks. Below you will find an annotation for YOLOv3 in XML format.

Earlier we rewrote ImageAI using Node.js, and now this came in handy.

<annotation>
  <folder>images</folder>
  <filename>img (3).jpg</filename>
  <path>C:GitaaaaaaaaaaaaaaaaaaaaaFIre-detectionsrcmasktrainannotationsimg (3).jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>1280</width>
    <height>720</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>mask</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>715</xmin>
      <ymin>445</ymin>
      <xmax>722</xmax>
      <ymax>448</ymax>
    </bndbox>
  </object>
</annotation>
Even though ImageAI failed to train models, it beautifully broke the data down into folders. They had the following structure:

mask
  cache
  json
  logs
  models
  train
    annotations
    images

The annotations backup is located in cache. Meanwhile, we have labels and anchors in JSON:

{"labels":["mask"],"anchors":[31,29,86,119,34,27,25,29,71,124,44,48,67,69,37,30,45,45]}

We really don't need logs and models. At the same time, the train and validation folders store pictures and annotations under similar names. This structure is pretty convenient for training multiple models and storing data.

The solution worked well with standard models. Nevertheless, we need to provide our own images and train on specific pictures with masks.
Labeling
We discovered 4 apps for labelling images on the web. Here they are, rated.

1) LabelImg
At first sight, the program seems convenient to use. However, it requires Qt, a full-featured application framework that includes many modules. The tool has advantages too. For instance, it saves annotations straight into Pascal VOC XML format, which is exactly what we need. Unfortunately, it's extremely complicated to install on Windows, so we decided to try other solutions.
2) VGG Image Annotator
This is just a simple webpage on the Internet. However, as we later found out, it's pretty sluggish and hard to use. If you do decide to use it, get ready to suffer. The major downside here is that the data is saved as JSON, so we had to convert it into Pascal VOC. The webpage also doesn't use SSL, which is disappointing.

Taking all this into account, we decided to go with alternative options.
3) Supervise.ly
We gave up on this solution from the start. With Supervise.ly you need to highlight the 'needed' object every time. The truth is, we don't need such power at the moment. What's more, users have to clearly define all the objects/classes/types in advance. If you forget any of them, you will receive errors and the system stops working.

Naturally, we weren't too excited about this fact. This tool is recommended for more accurate, complex labelling. In our case, we can just use a basic square.
4) Labelbox
Labelbox is excellent software. It has a perfect hotkey system and it quickly highlights lots of objects. Besides, it's convenient for teams: multiple people can label objects in real time.

This tool also generates essential statistics, like the number of missed pictures, and it features many other cool tidbits. It's little surprise that we decided to use Labelbox.
Labelbox has 2 downsides:

1. The structure of information in JSON is too customized.

[
  {
    "ID": "",
    "DataRow ID": "",
    "Labeled Data": "",
    "Label": {
      "mask": [
        {
          "geometry": [
            { "x": 208, "y": 276 },
            { "x": 307, "y": 276 },
            { "x": 307, "y": 352 },
            { "x": 208, "y": 352 }
          ]
        }
      ]
    },
    "Created By": "@gmail.com",
    "Project Name": "masks 500",
    "Created At": "2020-03-26T07:40:26.000Z",
    "Updated At": "2020-03-26T07:40:26.000Z",
    "Seconds to Label": 7.069,
    "External ID": "img2 (124).jpg",
    "Agreement": null,
    "Benchmark Agreement": null,
    "Benchmark ID": null,
    "Benchmark Reference ID": null,
    "Dataset Name": "masks 500",
    "Reviews": [],
    "View Label": "",
    "Masks": {
      "mask": ""
    }
  }
]
2. As you can see, it DOES NOT include image dimensions.
However, there's an image title at least. Anyway, we had to use opencv4nodejs in order to load the image. We took the dimensions from there.

const cv = require("opencv4nodejs");

// Load the image named in the annotation to read its dimensions.
const image = cv.imread(`${img_dir}${ann["External ID"]}`);
const size = image.sizes;

3. New issue that follows: the geometry is stored as a list of corner points.

{ "x": 208, "y": 276 },
{ "x": 307, "y": 276 },
{ "x": 307, "y": 352 },
{ "x": 208, "y": 352 }

Changed this into

<xmin>258</xmin><ymin>208</ymin><xmax>322</xmax><ymax>244</ymax>

We fixed this with the help of the following code:

// Collect the corner coordinates, then take the extremes as the box edges.
const xArr: number[] = [];
const yArr: number[] = [];
object.geometry.forEach(e => {
  xArr.push(e.x);
  yArr.push(e.y);
});
obj["xmin"] = [Math.min(...xArr)];
obj["ymin"] = [Math.min(...yArr)];
obj["xmax"] = [Math.max(...xArr)];
obj["ymax"] = [Math.max(...yArr)];

TypeScript saves the day.
Next step
We used ElementTree to compose the structure of the XML tree and set the parameters. Then we created a loop, so that it would work for many images.

All results are kept in the Annotations folder. Therefore, we have a full dataset with annotations and a convenient structure. The only thing left is to use our software for training.
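The conversion described here, from Labelbox corner points to a Pascal VOC annotation, can be sketched end to end as follows. This is a simplified illustration and not the code used in the project: it builds the XML as a plain string instead of using an XML library, the names polygonToBox and toVocXml are hypothetical, the depth of 3 assumes colour images, and the 1280x720 image size is a placeholder for the dimensions that would be read from the image file.

```typescript
interface Point { x: number; y: number; }
interface VocObject { name: string; xmin: number; ymin: number; xmax: number; ymax: number; }

// Collapse a polygon (list of corner points) into an axis-aligned box.
function polygonToBox(name: string, geometry: Point[]): VocObject {
  const xs = geometry.map(p => p.x);
  const ys = geometry.map(p => p.y);
  return {
    name,
    xmin: Math.min(...xs),
    ymin: Math.min(...ys),
    xmax: Math.max(...xs),
    ymax: Math.max(...ys),
  };
}

// Build a minimal Pascal VOC annotation as a plain string.
// Assumption: names contain no characters that need XML escaping.
function toVocXml(filename: string, width: number, height: number, objects: VocObject[]): string {
  const lines: string[] = [
    "<annotation>",
    `  <filename>${filename}</filename>`,
    "  <size>",
    `    <width>${width}</width>`,
    `    <height>${height}</height>`,
    "    <depth>3</depth>",
    "  </size>",
  ];
  for (const o of objects) {
    lines.push(
      "  <object>",
      `    <name>${o.name}</name>`,
      "    <bndbox>",
      `      <xmin>${o.xmin}</xmin>`,
      `      <ymin>${o.ymin}</ymin>`,
      `      <xmax>${o.xmax}</xmax>`,
      `      <ymax>${o.ymax}</ymax>`,
      "    </bndbox>",
      "  </object>",
    );
  }
  lines.push("</annotation>");
  return lines.join("\n");
}

// Corner points taken from the Labelbox export shown earlier.
const maskBox = polygonToBox("mask", [
  { x: 208, y: 276 }, { x: 307, y: 276 }, { x: 307, y: 352 }, { x: 208, y: 352 },
]);
// 1280x720 is a placeholder size for illustration only.
console.log(toVocXml("img2 (124).jpg", 1280, 720, [maskBox]));
```

Running the same two functions inside a loop over every entry of the export gives one annotation per image, which is the loop step mentioned above.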
Training Models
The software responds to the following commands:

// =============== read ===================
python src/pred.py -c configs/mask.json -i imgs/1.jpg

This command recognizes an image. We just need to prepare our JSON file with the required parameters (described at the beginning) and set the path to the image.

// ================= test ================
python src/eval.py -c configs/mask.json

This is basically the benchmark of the model. The command shows 3 different parameters that describe the quality of the model.

{'fscore': 0.21052631578947367, 'precision': 0.8461538461538461, 'recall': 0.12021857923497267}

1) F-score combines precision and recall into a single number (their harmonic mean).
2) Precision is the share of the model's detections that are correct, i.e. how often it finds the right object without making a mistake.
3) Recall is the share of the objects actually present in the images that the model finds.
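To make the relationship between the three numbers concrete, here is a small sketch that derives them from detection counts. The function name metricsFromCounts is hypothetical, and the counts (22 correct detections, 4 wrong detections, 161 missed objects) are not from the project; they were chosen only because they reproduce the ratios printed above.

```typescript
interface Metrics { precision: number; recall: number; fscore: number; }

// tp: correct detections, fp: wrong detections, fn: objects that were missed.
function metricsFromCounts(tp: number, fp: number, fn: number): Metrics {
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  // F-score is the harmonic mean of precision and recall.
  const fscore = precision + recall === 0
    ? 0
    : (2 * precision * recall) / (precision + recall);
  return { precision, recall, fscore };
}

// Hypothetical counts chosen so the ratios match the output above.
const metrics = metricsFromCounts(22, 4, 161);
console.log(metrics);
```

Read this way, the benchmark says that the model is rarely wrong when it reports a mask (high precision) but misses most of the masks in the images (low recall), and the low F-score reflects that imbalance.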
// ================== train ===================
python src/train_eager.py -c configs/mask.json

We use this command for training our neural network. After lots of mistakes, we finally achieved the first results. In this instance, we used 90 images.

Size: 288, batch_size: 8, num_epoch: 20, training time: 1.5 hours.

At this point, we can already process images and videos. Potentially, with a more powerful model, this system could process real-life webcam footage.

Here's the result:
Thanks for watching
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel AraĂșjo
 

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Neural network image recognition

  • 2. Neural Network From Scratch. Image Recognition is the new black. It seamlessly identifies people, animals, places, buildings, and any objects you configure. Already used by companies like Google, Shutterstock, eBay, Salesforce, and Pinterest, it's expected to grow even more. The world of image recognition moves fast. Here's the proof: facial recognition technology became 20 times more accurate from 2014 to 2018. Taking all this into account, we decided to build our own neural network from scratch. Our goal was to recognize medical masks in real-life street footage from webcams around the world. Here's how we did it.
  • 3. Table of contents: 1. Technology Stack; 2. The Basics; 3. Training Models; 4. Anchors; 5. Labels; 6. Models; 7. Sizes; 8. Jitter; 9. Datasets (actual images and their description); 10. Tools for Labelling; 11. Commands for Training Models.
  • 4. The Essential Tech Stack:
  • 5. The basics. We decided to start simple. Earlier we discovered a system that worked great for detecting fire. It used the ready-made ImageAI module in concert with a special repository. This exact system could recognize fire in images. We took that idea from standard modules and training pictures that were available online.
  • 6. Challenge 1: Older technologies. However, this system used an old-school tech stack. We had to install Python 3.6, TensorFlow 1, and an older version of OpenCV. Struggling with older technologies wasn't convenient. The only decent solution in this case is Anaconda, but this also didn't work out. After we tried to compile everything and debug it, we simply gave up. As expected, using older technologies resulted in failures and bugs. We also couldn't train models at all: we would typically receive various errors during the training process.
  • 7. Challenge 2: Trying to understand the basics. At this stage, we decided to dive right in and understand how recognition works from the inside. We used ImageAI, rewriting everything in Node.js. This is much more convenient than using Python. Keep in mind, this is Linux-based. If you are using Windows or macOS, you'll need to install a Linux subsystem. Yes, it's not going to work well on Node.js. But the ultimate goal was to understand how the recognition worked. So we chose the most powerful object recognition module, called YOLOv3. This is a truly universal technology that can recognize anything in a short time.
  • 8. Challenge 3: Old-school TensorFlow versions. Nevertheless, we soon faced another problem: YOLOv3 was based on Darknet (https://pjreddie.com/darknet/yolo/). After we put everything together on Linux, a new issue came up. The versions weren't compatible, even though we used the latest version of each piece of software. It turns out that the problem was an old-school version of TensorFlow. It's the neural network library used for training, the 'brains' behind our software. So we had to fix this issue for everything to start working.
  • 9. The solution was to convert TensorFlow versions using a special conversion utility, tf_upgrade_v2, which ships with TensorFlow. We launched it using the following command:
      tf_upgrade_v2.py --infile yolo.py --outfile yolo_v2.py
    The utility converted the older TF1 code into TF2. This didn't work perfectly, and occasionally we had to rewrite the code by hand. But the problems were partially gone. What's more, it was pretty fun to change Keras to TensorFlow throughout the entire code. Keras became part of TensorFlow in its newest version, so we had to completely remove the older Keras code from the project. After all this struggle, our custom ImageAI still couldn't train models. But it worked pretty well for recognizing photos and videos (shots). So the standard pre-trained model could recognize fire on video, great!
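The manual Keras clean-up described above mostly comes down to repointing import lines at the copy of Keras bundled with TensorFlow 2. A minimal sketch of that idea follows, assuming the project only uses plain `import keras` and `from keras... import ...` statements. The function name `rewriteKerasImports` is hypothetical; this is not what tf_upgrade_v2 does internally, nor the authors' actual script.

```typescript
// Rewrites standalone-Keras import lines in Python source text so that they
// point at tensorflow.keras. Only handles the two simplest import forms.
function rewriteKerasImports(source: string): string {
  return source
    .split("\n")
    .map((line: string): string => {
      // "from keras.layers import X" -> "from tensorflow.keras.layers import X"
      if (/^(\s*)from\s+keras(\.|\s)/.test(line)) {
        return line.replace(/^(\s*)from\s+keras/, "$1from tensorflow.keras");
      }
      // "import keras" -> "from tensorflow import keras"
      if (/^\s*import\s+keras\s*$/.test(line)) {
        return line.replace(/import\s+keras/, "from tensorflow import keras");
      }
      return line; // everything else is left untouched
    })
    .join("\n");
}

const legacySource: string = [
  "import keras",
  "from keras.layers import Conv2D",
  "from keras import backend as K",
  "x = 1",
].join("\n");

const upgradedSource: string = rewriteKerasImports(legacySource);
```

Anything beyond import lines (renamed functions, removed arguments) still needs the kind of manual rewriting the slide mentions.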
  • 10. Training models. Finally, we found YOLO (version 3) on a web forum. It doesn't require Linux and supports the nightly version of TensorFlow 2. The forum user had completely rewritten the original latest version of YOLO so that it could support TF2 and Windows. We used this library. Finally, all versions were compatible. This library was just perfect for training models. Here's the data you need to initialize the program:
      "anchors": [31,29,86,119,34,27],
      "labels": ["mask"],
      "net_size": 288,
      "pretrained": {
        "keras_format": "configs/mask_500/weights.h5",
        "darknet_format": "yolov3.weights"
      },
      "train": {
        "min_size": 288,
        "max_size": 288,
        "num_epoch": 30,
        "train_image_folder": "dataset/mask_500/train/images",
        "train_annot_folder": "dataset/mask_500/train/annotations",
        "valid_image_folder": "dataset/mask_500/train/images",
        "valid_annot_folder": "dataset/mask_500/train/annotations",
        "batch_size": 8,
        "learning_rate": 1e-4,
        "save_folder": "configs/mask_500",
        "jitter": false
      }
    So we started to dig deeper and found out the following:
  • 11. 1. Anchors. Anchors are reference box sizes, stored as width/height pairs: they set how much a detected box can widen or narrow, and how far it can move toward the centre of the object. Basically, we take a central point on a picture and move it left or right. In our case, we started with the simplest solution. Here it goes:
      const kmeans = require("node-kmeans");
      export function k_means(ann_dims, anchor_num) {
        return kmeans.clusterize(ann_dims, { k: anchor_num }, (err, res) => {
          if (err) console.error(err);
          // else console.log("%o", res);
        });
      }
  • 12. Here's the basic logic for calculating the 'central' points:
      const clasters = k_means(annotation_dims, num_anchors);
      const centroids = [];
      clasters.groups.forEach(Group => {
        centroids.push(Group.centroid);
      });
      const anchors: any = centroids;
      const widths = anchors.map(c => c[0]);
      // Indices of the anchors, ordered by ascending width
      const sorted_indices: any = widths
        .map((item, index) => { return { item, index }; })
        .sort((a, b) => a.item - b.item)
        .map(i => i.index);
      const anchor_array = [];
      let out_string = "";
      for (let i = 0; i < sorted_indices.length; i++) {
        // Visit the anchors in sorted order (the original indexed anchors[i] directly)
        const anchor = anchors[sorted_indices[i]];
        anchor_array.push(Math.trunc(anchor[0] * 416));
        anchor_array.push(Math.trunc(anchor[1] * 416));
        out_string += Math.trunc(anchor[0] * 416) + "," + Math.trunc(anchor[1] * 416) + ", ";
      }
      const reverse_anchor_array = anchor_array;
    It's likely that there are better solutions, but we decided not to get caught up in this. We already had the 'magical' numbers, so we decided to move on.
    2. Labels. This is simple. Labels are the names of the objects that we are looking for. In our case, it's "mask". In a perfect-world scenario, we would also need to distinguish faces without masks for comparison.
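The anchor calculation above can also be sketched without the node-kmeans dependency. The block below is a minimal, self-contained version under stated assumptions: box dimensions are normalized to the 0..1 range, the first k boxes seed the centroids so the run is deterministic, and the names `clusterBoxDims` and `toAnchorArray` are hypothetical. It is plain Lloyd's k-means with Euclidean distance, not necessarily what node-kmeans does internally.

```typescript
// A (width, height) pair, normalized to the 0..1 range of the image size.
type Dims = [number, number];

// Plain Lloyd's k-means on box dimensions, seeded with the first k boxes.
function clusterBoxDims(boxes: Dims[], k: number, iterations: number): Dims[] {
  let centroids: Dims[] = boxes.slice(0, k).map((b: Dims): Dims => [b[0], b[1]]);
  for (let iter = 0; iter < iterations; iter++) {
    // Per centroid: sum of widths, sum of heights, number of members.
    const sums: number[][] = centroids.map((): number[] => [0, 0, 0]);
    for (let i = 0; i < boxes.length; i++) {
      let best = 0;
      let bestDist = Infinity;
      for (let c = 0; c < centroids.length; c++) {
        const dw = boxes[i][0] - centroids[c][0];
        const dh = boxes[i][1] - centroids[c][1];
        const dist = dw * dw + dh * dh;
        if (dist < bestDist) {
          bestDist = dist;
          best = c;
        }
      }
      sums[best][0] += boxes[i][0];
      sums[best][1] += boxes[i][1];
      sums[best][2] += 1;
    }
    // Move each centroid to the mean of its members; keep empty ones in place.
    centroids = centroids.map((old: Dims, c: number): Dims =>
      sums[c][2] > 0 ? [sums[c][0] / sums[c][2], sums[c][1] / sums[c][2]] : old);
  }
  return centroids;
}

// Sort centroids by width and scale them to pixels for the given net size.
function toAnchorArray(centroids: Dims[], netSize: number): number[] {
  const sorted = centroids.slice().sort((a: Dims, b: Dims): number => a[0] - b[0]);
  const out: number[] = [];
  for (let i = 0; i < sorted.length; i++) {
    out.push(Math.floor(sorted[i][0] * netSize));
    out.push(Math.floor(sorted[i][1] * netSize));
  }
  return out;
}

// Made-up sample data: two large boxes and two small ones.
const sampleBoxes: Dims[] = [[0.5, 0.5], [0.1, 0.1], [0.52, 0.48], [0.12, 0.08]];
const sampleAnchors: number[] = toAnchorArray(clusterBoxDims(sampleBoxes, 2, 10), 288);
```

With this sample data the two clusters settle at (0.11, 0.09) and (0.51, 0.49), so `sampleAnchors` is `[31, 25, 146, 141]`. Passing the net size as a parameter, rather than hard-coding 416, keeps the anchors in step with whatever `net_size` the config uses.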
  • 13. 3. Models. We strictly need yolov3.weights for training models. This is a standard pre-defined model, needed for the initial training. It's critical that this default model isn't trained any further; it's used for the structure and annotations. 4. Sizes. In a nutshell, we need to shrink pictures for the training process (their size should be a multiple of 32 for YOLOv3, which 288 satisfies). Ideally, the picture should be shrunk proportionally. Therefore, we need to define its min size, max size and net size. Obviously, all pictures have different sizes, so a new issue came up: we couldn't combine different shapes. This means we need to use the same value in all three parameters. Weak model: 288. Strong model: 41. (Illustration labels: Man, Bear, Dog, Kitty cat, Monkey, Bird.)
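The size rule above is easy to get wrong when editing the config by hand, so a small sanity check can help. This is a minimal sketch: the `SizeSettings` interface and `findSizeProblems` function are hypothetical names, the field names follow the config shown earlier, and the multiple-of-32 check is an assumption based on YOLOv3's coarsest detection scale downsampling the input by a factor of 32.

```typescript
// Hypothetical subset of the config fields discussed in the text.
interface SizeSettings {
  net_size: number;
  min_size: number;
  max_size: number;
  anchors: number[];
}

// Returns a list of human-readable problems; an empty list means the sizes look fine.
function findSizeProblems(cfg: SizeSettings): string[] {
  const problems: string[] = [];
  if (cfg.min_size !== cfg.net_size || cfg.max_size !== cfg.net_size) {
    problems.push("net_size, min_size and max_size should all be equal");
  }
  if (cfg.net_size % 32 !== 0) {
    problems.push("net_size should be a multiple of 32");
  }
  if (cfg.anchors.length % 2 !== 0) {
    problems.push("anchors should be a flat list of width/height pairs");
  }
  return problems;
}

// Values taken from the config shown earlier in the deck.
const maskSettings: SizeSettings = {
  net_size: 288,
  min_size: 288,
  max_size: 288,
  anchors: [31, 29, 86, 119, 34, 27],
};
const maskProblems: string[] = findSizeProblems(maskSettings);
```

For the mask config the returned list is empty; a config that mixes sizes comes back with one message per broken rule.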
  • 14. The strong model takes a long time to train. It's pretty accurate; however, it doesn't always recognize all objects. After we played around with it, it turned out that it also requires many images. 5. Batch size. Batch size is the number of pictures processed together in one training step. This number should be a multiple of 2. If you go for a bigger quantity, this will put a high load on your system. 6. Jitter. This value, in the range [0-1], is used for cropping images. As a rule, we use false or 0.3.
  • 15. 7. Datasets: actual images and their description. Since our task is to recognize masks on human faces, we simply headed over to Google and searched for relevant images ("medical masks on streets"). To speed things up, we used a simple utility, Picture Google Grabber. It retrieves images from Google using relevant keywords. There is a slight issue: the previews in Google are low-quality. We needed to follow each URL and download the original image from its site.
  • 16. Picture Google Grabber. Just enter a few search queries and you're done. You'll receive the full collection of images (the slide shows a file manager listing 535.jpg through 544.jpg). You can make it even more convenient and rename all the images: select all of them and press F2, and the pictures will be renamed automatically.
  • 17. Next step: add annotations to your images, then highlight all masks. Below you will find the annotations for YOLOv3 in XML format. Earlier we rewrote ImageAI using Node.js, and now this came in handy.
      <annotation>
        <folder>images</folder>
        <filename>img (3).jpg</filename>
        <path>C:GitaaaaaaaaaaaaaaaaaaaaaFIre-detectionsrcmasktrainannotationsimg (3).jpg</path>
        <source>
          <database>Unknown</database>
        </source>
        <size>
          <width>1280</width>
          <height>720</height>
          <depth>3</depth>
        </size>
        <segmented>0</segmented>
        <object>
          <name>mask</name>
          <pose>Unspecified</pose>
          <truncated>0</truncated>
          <difficult>0</difficult>
          <bndbox>
            <xmin>715</xmin>
            <ymin>445</ymin>
            <xmax>722</xmax>
            <ymax>448</ymax>
          </bndbox>
        </object>
      </annotation>
  • 18. Even though ImageAI failed to train models, it beautifully broke the data down into folders. They had the following structure (folders shown on the slide: Mask, cache, json, logs, models, annotations, images, train). The annotations backup is located in cache. Meanwhile, we have the labels and anchors in JSON:
      {"labels":["mask"],"anchors":[31,29,86,119,34,27,25,29,71,124,44,48,67,69,37,30,45,45]}
    We really don't need logs and models. At the same time, 'Train' and 'Validation' store pictures and annotations using similar titles. This structure is pretty convenient for training multiple models and storing data. The solution worked well with standard models. Nevertheless, we need to provide our own images and train on specific pictures with masks.
  • 19. Labelling. We discovered 4 apps for labelling images on the web. Here they are, rated. 1) LabelImg. At first sight, the program seems convenient to use. However, it requires Qt, a full-featured application framework that includes many modules. The tool has advantages too. For instance, it saves annotations straight into Pascal VOC XML format, which is exactly what we need. Unfortunately, it's extremely complicated to install on Windows, which is why we decided to try other solutions.
  • 20. 2) VGG Image Annotator. This is just a simple webpage on the Internet. However, as we later found out, it's pretty sluggish and hard to use. If you do decide to use it, get ready to suffer. The major downside here is that the data is formatted in JSON, so we had to convert it into Pascal VOC. The webpage also isn't served over SSL, which is disappointing. Taking all this into account, we decided to go with alternative options.
  • 21. 3) Supervise.ly. We gave up on this solution from the start. With Supervise.ly you need to highlight the 'needed' object every time. This is recommended for more accurate, complex labelling; in our case, we can just use a basic square. The truth is, we don't need such power at the moment. What's more, users have to clearly define all the objects/classes/types in advance. If you forget any of them, you will receive errors and the system stops working. Naturally, we weren't too excited about this.
  • 22. 4) Labelbox. Labelbox is excellent software. It has a perfect hotkey system and it quickly highlights lots of objects. Besides, it's convenient for teams: multiple people can label objects in real time. This tool also generates essential statistics, like the number of missed pictures, and it features many other cool tidbits. It's little surprise that we decided to use Labelbox.
  • 23. Labelbox has 2 downsides: 1. The structure of information in JSON is too customized.
      [
        {
          "ID": "",
          "DataRow ID": "",
          "Labeled Data": "",
          "Label": {
            "mask": [
              {
                "geometry": [
                  { "x": 208, "y": 276 },
                  { "x": 307, "y": 276 },
                  { "x": 307, "y": 352 },
                  { "x": 208, "y": 352 }
                ]
              }
            ]
          },
          "Created By": "@gmail.com",
          "Project Name": "masks 500",
          "Created At": "2020-03-26T07:40:26.000Z",
          "Updated At": "2020-03-26T07:40:26.000Z",
          "Seconds to Label": 7.069,
          "External ID": "img2 (124).jpg",
          "Agreement": null,
          "Benchmark Agreement": null,
          "Benchmark ID": null,
          "Benchmark Reference ID": null,
          "Dataset Name": "masks 500",
          "Reviews": [],
          "View Label": "",
          "Masks": {
            "mask": ""
          }
        }
      ]
  • 24. 2. As you can see, it DOES NOT include image dimensions. However, there's an image title at least. Anyway, we had to use opencv4nodejs in order to load an image. We took the dimensions from there:
      const image = cv.imread(`${img_dir}${ann["External ID"]}`);
      const size = image.sizes;
    3. New issue that follows: the objects are stored as corner points,
      { "x": 208, "y": 276 },
      { "x": 307, "y": 276 },
      { "x": 307, "y": 352 },
      { "x": 208, "y": 352 }
    and we changed this into
      <xmin>258</xmin><ymin>208</ymin><xmax>322</xmax><ymax>244</ymax>
    We fixed this with the help of the following code. TypeScript saves the day.
      const xArr: number[] = [];
      const yArr: number[] = [];
      object.geometry.forEach(e => {
        xArr.push(e.x);
        yArr.push(e.y);
      });
      obj["xmin"] = [Math.min(...xArr)];
      obj["ymin"] = [Math.min(...yArr)];
      obj["xmax"] = [Math.max(...xArr)];
      obj["ymax"] = [Math.max(...yArr)];
  • 25. Next step. We used ElementTree to compose the structure of the XML tree and set the parameters. Then we created a loop, so that it would work for many images. All results are kept in the Annotations folder. Therefore, we have a full dataset with annotations and a convenient structure. The only thing left is to use our software for training.
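The conversion described above, from corner points to a Pascal VOC annotation, can be sketched end to end without any XML library. This is a minimal illustration under stated assumptions: the names `toBndbox`, `buildVocXml` and `escapeXml` are hypothetical, only the fields shown in the deck's XML example are emitted, and the 1280x720 dimensions in the usage lines are placeholders rather than the real size of that image. It builds the XML as a string and is not the authors' ElementTree-based code.

```typescript
interface Point { x: number; y: number; }
interface LabeledObject { name: string; geometry: Point[]; }
interface Bndbox { xmin: number; ymin: number; xmax: number; ymax: number; }

// Collapse a polygon into the axis-aligned box that Pascal VOC expects.
function toBndbox(geometry: Point[]): Bndbox {
  const xs: number[] = geometry.map((p: Point): number => p.x);
  const ys: number[] = geometry.map((p: Point): number => p.y);
  return {
    xmin: Math.min(...xs),
    ymin: Math.min(...ys),
    xmax: Math.max(...xs),
    ymax: Math.max(...ys),
  };
}

// Escape the characters that would break XML text content.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build one annotation document for one image and its labeled objects.
function buildVocXml(filename: string, width: number, height: number, objects: LabeledObject[]): string {
  const lines: string[] = [
    "<annotation>",
    "  <filename>" + escapeXml(filename) + "</filename>",
    "  <size>",
    "    <width>" + width + "</width>",
    "    <height>" + height + "</height>",
    "    <depth>3</depth>",
    "  </size>",
  ];
  for (let i = 0; i < objects.length; i++) {
    const box: Bndbox = toBndbox(objects[i].geometry);
    lines.push(
      "  <object>",
      "    <name>" + escapeXml(objects[i].name) + "</name>",
      "    <bndbox>",
      "      <xmin>" + box.xmin + "</xmin>",
      "      <ymin>" + box.ymin + "</ymin>",
      "      <xmax>" + box.xmax + "</xmax>",
      "      <ymax>" + box.ymax + "</ymax>",
      "    </bndbox>",
      "  </object>"
    );
  }
  lines.push("</annotation>");
  return lines.join("\n");
}

// Corner points from the Labelbox export shown earlier; dimensions are placeholders.
const maskObject: LabeledObject = {
  name: "mask",
  geometry: [{ x: 208, y: 276 }, { x: 307, y: 276 }, { x: 307, y: 352 }, { x: 208, y: 352 }],
};
const sampleXml: string = buildVocXml("img2 (124).jpg", 1280, 720, [maskObject]);
```

Looping over the exported records and calling `buildVocXml` once per image gives one annotation file per picture, which matches the folder layout described earlier.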
  • 26. Training models. The software responds to the following commands:
      // =============== read ===================
      python src/pred.py -c configs/mask.json -i imgs/1.jpg
    This command recognizes objects in an image. We just need to prepare our JSON file with the required parameters (described in the beginning) and configure the path to the needed image.
      // ================= test ================
      python src/eval.py -c configs/mask.json
    This is basically the benchmark of the model. The command shows 3 different parameters that describe the quality of the model:
      {'fscore': 0.21052631578947367, 'precision': 0.8461538461538461, 'recall': 0.12021857923497267}
    1) F-score is the harmonic mean of precision and recall, a single number that summarizes overall detection quality. 2) Precision is the share of the model's detections that are correct, i.e. how rarely it makes a mistake when it reports an object. 3) Recall is the share of the objects actually present in the images that the model finds.
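To see how the three numbers above relate, here is a short sketch that computes them from detection counts using the standard definitions. The counts 22 / 4 / 161 are a reconstruction: they are one set of integers that reproduces the reported values, not figures from the deck, and eval.py may arrive at its numbers differently. The names `DetectionCounts` and `scoreDetections` are hypothetical.

```typescript
interface DetectionCounts {
  truePositives: number;  // detections that match a real object
  falsePositives: number; // detections with no matching object
  falseNegatives: number; // real objects the model missed
}

interface DetectionScores { precision: number; recall: number; fscore: number; }

// Standard precision / recall / F1. Assumes at least one detection and one
// real object, so none of the denominators is zero.
function scoreDetections(c: DetectionCounts): DetectionScores {
  const precision = c.truePositives / (c.truePositives + c.falsePositives);
  const recall = c.truePositives / (c.truePositives + c.falseNegatives);
  const fscore = (2 * precision * recall) / (precision + recall);
  return { precision: precision, recall: recall, fscore: fscore };
}

// Reconstructed counts (an assumption, see the note above).
const reconstructedScores: DetectionScores = scoreDetections({
  truePositives: 22,
  falsePositives: 4,
  falseNegatives: 161,
});
```

Read this way, the result says the model is usually right when it does report a mask (precision of about 0.85) but misses most of the masks in the images (recall of about 0.12), which is what pulls the F-score down to about 0.21.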
  • 27. Training models.
      // ================== train ===================
      python src/train_eager.py -c configs/mask.json
    We use this command for training our neural network. After lots of mistakes, we finally achieved the first results. In this instance, we used 90 images, with size 288, batch_size 8, num_epoch 20, and a training time of 1.5 hours. At this point, we can already process images and videos. And potentially, with a more powerful model, this system could process real-life webcam footage. Here's the result: