Deep Learning OCR using Nimbix
POWER8
INTRODUCTION
• Optical character recognition (OCR) is the transformation of
images of text into machine-encoded text.
• A simple API to an OCR library might provide a
function which takes as input an image and
outputs a string.
• In this project we apply a deep neural network to the
problem of optical character recognition.
• We use TensorFlow and a convolutional neural network (CNN).
MOTIVATION
• Optical character recognition is needed when information
must be readable by both humans and machines and
alternative inputs cannot be predefined.
• The basic OCR system was invented to convert data
available on paper into computer-processable
documents, so that the documents can be edited and
reused.
• Traditional OCR techniques are typically multi-stage
processes. For example, first the image may be divided into
smaller regions that contain the individual characters,
second the individual characters are recognized, and finally
the result is pieced back together. A difficulty with this
approach is obtaining a good division of the original image.
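The segmentation stage of such a multi-stage pipeline can be illustrated with a minimal sketch. It assumes a simple column-projection method (a common textbook technique, not one named in these slides) and a hypothetical helper `segment_columns` that works on a boolean image where True marks ink. The second call shows the difficulty mentioned above: as soon as two characters touch, the division merges them into one region.

```python
import numpy as np


def segment_columns(binary: np.ndarray) -> list[tuple[int, int]]:
    """Return (start, end) column ranges of ink runs; end is exclusive."""
    # True for every column that contains at least one ink pixel.
    ink = binary.any(axis=0)
    # Pad with False on both sides so each run has a rising and a falling edge.
    padded = np.concatenate(([False], ink, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))


# Two well-separated "characters" occupying columns 1-2 and 5-7.
image = np.zeros((5, 10), dtype=bool)
image[1:4, 1:3] = True
image[1:4, 5:8] = True
separated = segment_columns(image)

# A single stroke bridging the gap makes the two characters one region.
image[2, 3:5] = True
touching = segment_columns(image)
```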
Sample Architecture for CNN
What Is a Convolutional Neural Network?
• Step 1 – Convolution Operation
• Step 1(b) – ReLU layer (Rectified Linear Unit)
• Step 2 – Pooling
• Step 3 – Flattening
• Step 4 – Full Connection
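The steps listed above can be sketched in plain NumPy. This is only an illustration under stated assumptions: a single-channel 6x6 input, one 3x3 filter, 2x2 max pooling, and random weights. The helper names (`conv2d`, `relu`, `max_pool`, `dense`) are hypothetical and are not taken from the project's code, which uses TensorFlow.

```python
import numpy as np


def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Step 1: slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x: np.ndarray) -> np.ndarray:
    """Step 1(b): replace negative activations with zero."""
    return np.maximum(x, 0.0)


def max_pool(x: np.ndarray, size: int = 2) -> np.ndarray:
    """Step 2: keep the maximum of each non-overlapping size x size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))


def dense(x: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Step 4: fully connected layer."""
    return x @ weights + bias


rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.standard_normal((3, 3))

feature_map = relu(conv2d(image, kernel))   # shape (4, 4)
pooled = max_pool(feature_map)              # shape (2, 2)
flat = pooled.reshape(-1)                   # Step 3: flattening, shape (4,)
scores = dense(flat, rng.standard_normal((4, 3)), np.zeros(3))  # shape (3,)
```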
Fully Connected Layer of CNN model
Source : Created by Kirill Eremenko, Hadelin de Ponteves, SuperDataScience Team
[Figure: pipeline overview. Labels: OCRGen.py; STRING; POWER AI; CSV file; Fully Connected Layer; Predicted Output. Captions: "Dataset is generated using the Python Imaging Library (PIL)"; "A fully convolutional network is presented which transforms the input volume into a sequence of character predictions."]
Deep OCR Architecture
• A fully convolutional network is presented
which transforms the input volume into a
sequence of character predictions. These
character predictions can then be transformed
into a string. The architecture of the network
is shown in the figure below.
• Here N is the number of possible characters. In this example,
there are 63 possible characters: uppercase and lowercase
letters, digits, and a blank character. The parenthesized values in
the convolutional layers are the filter sizes and stride values, from
top to bottom respectively. The values in the reshape layer are the
reshaped dimensions.
• The input volume is a rectangular RGB image. The height and
width of this volume are reduced across the convolutional layers
using striding. The third dimension of this volume increases from 3
channels (RGB) to N channels, one for each possible character. Thus, the
volume is transformed from an RGB image into a sequence of
vectors. Applying argmax across the channel dimension gives a
sequence of character indices (equivalently, 1-hot encoded vectors),
which can be transformed into a string.
SOURCE
https://github.com/nicholastoddsmith/pythonml/blob/master/DeepOCR/TFModel/_classes.txt
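The decoding step described above (argmax over the channel dimension, then mapping indices to characters) can be sketched as follows. The ordering of the 63 classes and the use of a space as the blank character are assumptions made for illustration; the actual ordering is defined by the class file referenced above. `CHARS` and `decode` are hypothetical names.

```python
import string

import numpy as np

# Assumed class ordering: uppercase, lowercase, digits, then a space as the blank.
CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits + " "


def decode(scores: np.ndarray) -> str:
    """Turn a (sequence_length, N) array of per-character scores into a string."""
    indices = np.argmax(scores, axis=-1)  # most likely class at each position
    return "".join(CHARS[i] for i in indices)


# Build scores whose argmax spells "Ab3" followed by one blank.
example = np.zeros((4, len(CHARS)))
for position, ch in enumerate("Ab3 "):
    example[position, CHARS.index(ch)] = 1.0

decoded = decode(example)
```

Trailing blanks are left in the returned string here; whether to strip them depends on how the blank character is used as padding.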
Result
• To facilitate training this network, a dataset is generated using the Python
Imaging Library (PIL). Random strings consisting of alphanumeric
characters are generated. Using PIL, images are generated for each
random string. A CSV file is also generated which contains the file name
and the associated random string. Some examples from the generated
dataset are shown in the figure below.
[Figure panels: Training Data; Generating Data; Test Data]
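The dataset generation described above can be sketched with Pillow (the maintained fork of PIL). This is not the project's OCRGen.py: the image size, string lengths, default font, file naming, and the CSV name `labels.csv` are all assumptions chosen for illustration, and `generate_dataset` is a hypothetical helper.

```python
import csv
import random
import string
import tempfile
from pathlib import Path

from PIL import Image, ImageDraw, ImageFont

ALPHANUMERIC = string.ascii_letters + string.digits


def random_string(rng: random.Random, min_len: int = 1, max_len: int = 8) -> str:
    """Random alphanumeric string with a length in [min_len, max_len]."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(ALPHANUMERIC) for _ in range(length))


def generate_dataset(out_dir, count, size=(128, 32), seed=0):
    """Write `count` text images plus a CSV of (file name, string) rows."""
    rng = random.Random(seed)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    font = ImageFont.load_default()  # assumed font; any readable font would do
    rows = []
    for i in range(count):
        text = random_string(rng)
        image = Image.new("RGB", size, "white")
        ImageDraw.Draw(image).text((4, 8), text, fill="black", font=font)
        name = f"{i:05d}.png"
        image.save(out_dir / name)
        rows.append((name, text))
    with open(out_dir / "labels.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    rows = generate_dataset(tmp, count=3)
    with open(Path(tmp) / "labels.csv", newline="") as f:
        parsed = [tuple(row) for row in csv.reader(f)]
    with Image.open(Path(tmp) / rows[0][0]) as first:
        first_size, first_mode = first.size, first.mode
```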
• Training and cross-validation results are
shown
Training the Network
• To train the network, the CSV file is parsed and the images are loaded into
memory. Each target value for the training data is a sequence of 1-hot
vectors. Thus the target matrix is a 3D matrix with the three dimensions
corresponding to sample, character, and 1-hot encoding respectively.
• Next, the neural network is constructed using the artificial neural network
classifier (ANNC) class from TFANN. The architecture described above is
expressed in code using ANNC.
• Softmax cross-entropy, computed over the third dimension of the output,
is used as the loss function.
• Fitting the network and performing predictions is simple using the ANNC
class. The prediction is split up using array_split from NumPy to prevent
out-of-memory errors.
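The three ideas in this slide (the 3D one-hot target array, softmax cross-entropy over the third dimension, and chunked prediction with array_split) can be sketched in NumPy alone. The sketch deliberately does not use TFANN or ANNC, whose exact API is not shown here. Padding short strings with a blank (a space), averaging the loss over samples and positions, and the helper names are assumptions made for illustration.

```python
import numpy as np


def one_hot_targets(strings, chars, max_len):
    """Build a (sample, character position, class) one-hot array.

    Assumption: strings shorter than max_len are padded with a space,
    which must be present in `chars` as the blank character.
    """
    index = {c: i for i, c in enumerate(chars)}
    targets = np.zeros((len(strings), max_len, len(chars)), dtype=np.float32)
    for s, text in enumerate(strings):
        for p, ch in enumerate(text.ljust(max_len)[:max_len]):
            targets[s, p, index[ch]] = 1.0
    return targets


def softmax_cross_entropy(logits, targets):
    """Softmax over the last (third) axis; mean loss over samples and positions."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=-1).mean())


def predict_in_batches(predict_fn, images, batch_size=64):
    """Run predict_fn on chunks from np.array_split to bound peak memory."""
    n_batches = max(1, int(np.ceil(len(images) / batch_size)))
    return np.concatenate([predict_fn(b) for b in np.array_split(images, n_batches)])


chars = "ab "
targets = one_hot_targets(["ab", "a"], chars, max_len=3)
uniform_loss = softmax_cross_entropy(np.zeros_like(targets), targets)
doubled = predict_in_batches(lambda b: b * 2, np.arange(10).reshape(10, 1), batch_size=4)
```

With all-zero logits every class is equally likely, so the loss equals the natural logarithm of the number of classes, which gives a quick sanity check for the loss function.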
System Details
• Distributed Deep Learning (DDL) environment
on POWER8 system with IBM PowerAI ML/DL
frameworks.
• 40-thread POWER8, 256 GB RAM, 1 x K80 GPU
• PushToCompute for compiling POWER8
applications and deploying directly to the
Nimbix Cloud
