Team Name       : PATTERN CODER
Members         : Amit Kumar
Contact Address : Room No. 272, Kapili Hostel
                  IIT Guwahati
                  North Guwahati,
                  Assam-781039.
Email id        : amit.k@iitg.ernet.in, amit.k203@gmail.com
Institute       : Indian Institute of Technology, Guwahati

An improved algorithm for locating texts in camera captured images

Table of Contents

1. Introduction
2. Text detection algorithm
3. Flow diagram of the algorithm
4. Experimental results
5. Conclusion
6. References

Abstract:


Text in images carries useful information. In this paper, we present an approach to
detect text in color images. The proposed approach is based on a combination of edge
detection and connected component analysis at multiple resolutions. First, we use an
image edge detection algorithm to extract all possible text edge pixels. The edge map is
dilated with a specific structuring element and then eroded with a second structuring
element. After applying some geometrical constraints, we obtain initial bounding boxes
containing text regions. Connected component analysis is then performed on the
corresponding binarized image to recover whole text portions. Finally, a multiresolution
approach makes the method applicable to a large range of font sizes.


1. Introduction:
The retrieval of text information from color images has gained increasing attention in
recent years. Text appearing in images can provide very useful semantic information and
may be a good key to describe the image content. Text detection is used in many
applications, such as road sign detection, map interpretation, and the interpretation of
engineering drawings. Many papers about text detection in images have been
published [2,4,5,6,7]. Text detection methods can generally be classified into two
categories:
Bottom-up methods: they segment images into regions and group character regions into
words [1]. Because it is difficult to develop an efficient segmentation algorithm for
text on a complex background, these methods are not robust for detecting text in many
camera-based images.
Top-down methods: they first detect text regions in images using filters and then perform
bottom-up techniques inside the text regions [2]. These methods can process more
complex images than bottom-up approaches. Top-down methods are further divided into
two categories:
  Heuristic methods: they use heuristic filters.
  Machine learning methods: they use trained filters.
Shortcomings of many current methods include their inability to perform well when text
varies in orientation, size, or language, and on low-resolution images, where characters
may be touching.
2. Text detection algorithm:

2.1 Conversion of color image to grayscale image:

Colors in an image can be converted to shades of gray by calculating the effective
brightness, or luminance, of each color and using this value to create a shade of gray
that matches that brightness.
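
As a minimal sketch of this conversion: the paper does not state which luminance weights
it uses, so the ITU-R BT.601 weights below (0.299, 0.587, 0.114) are an assumption, and
the function name to_grayscale is only illustrative.

```python
# Assumed luminance weights (ITU-R BT.601); the paper does not say which weights it uses.
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.299, 0.587, 0.114


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples (0-255) into rows of gray levels (0-255)."""
    return [
        [round(R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# White, black, pure red, and pure blue pixels.
image = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
print(to_grayscale(image))  # [[255, 0], [76, 29]]
```

Green contributes most to the gray level and blue least, which roughly follows how
bright each primary appears to the eye.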

2.2 Edge detection:

Edge detection is an important pre-processing step of our method. Using edges as the
prominent feature lets us detect characters with different fonts and colors, since every
character must present strong edges, regardless of its font or color, in order to be
readable. We used the Canny edge detector for this purpose. The Canny edge detector takes
a grayscale image as input and returns a bi-level image in which non-zero pixels mark
detected edges. Canny uses Sobel masks to find the edge magnitude of the grayscale image
and then applies non-maxima suppression and hysteresis thresholding. With these two
post-processing operations, the Canny edge detector removes non-maxima pixels while
preserving the connectivity of the contours.

2.3 Dilation:

Dilation is one of the two basic operators in the area of mathematical morphology, the
other being erosion. It is typically applied to binary images. The basic effect of the
operator on a binary image is to gradually enlarge the boundaries of regions of
foreground pixels (i.e. white pixels, typically). Thus areas of foreground pixels grow in
size while holes within those regions become smaller. Here, we use a 5x21 cross-shaped
structuring element. Dilation by this structuring element connects the character
contours of every text line.
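
The dilation step can be sketched in plain Python as follows. This is an illustration,
not the authors' MATLAB code. It assumes that "5x21" means 5 rows by 21 columns (the
erosion step's "height less than 11 or width less than 45" suggests height comes first)
and that a cross-shaped element consists of the centre row plus the centre column. The
names cross_offsets and dilate are hypothetical.

```python
def cross_offsets(height, width):
    """Offsets (dy, dx) of a cross-shaped element: centre column plus centre row."""
    half_h, half_w = height // 2, width // 2
    vertical = [(dy, 0) for dy in range(-half_h, half_h + 1)]
    horizontal = [(0, dx) for dx in range(-half_w, half_w + 1) if dx != 0]
    return vertical + horizontal


def dilate(binary, offsets):
    """Binary dilation: every white pixel stamps the structuring element onto the output."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if binary[y][x]:
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        out[ny][nx] = 1
    return out


# Two isolated edge pixels 15 columns apart on the same row.
edge_map = [[0] * 30 for _ in range(5)]
edge_map[2][5] = 1
edge_map[2][20] = 1
dilated = dilate(edge_map, cross_offsets(5, 21))
print(all(dilated[2]))  # True: the long horizontal arm bridges the gap
```

The wide horizontal arm is what joins neighbouring characters of a text line, while the
short vertical arm keeps separate lines from merging.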


2.4 Erosion:

Erosion is one of the two basic operators in the area of mathematical morphology, the
other being dilation. It is typically applied to binary images. The basic effect of the
operator on a binary image is to erode away the boundaries of regions of foreground
pixels (i.e. white pixels, typically). Thus areas of foreground pixels shrink in size, and
holes within those areas become larger. Here, we use an 11x45 cross-shaped
structuring element.
Erosion removes noise and smooths the shape of the candidate text areas. In this
erosion step, every component with height less than 11 or width less than 45
is suppressed.
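
A matching sketch of the erosion step, under the same assumptions as for dilation (11
rows by 45 columns, cross made of the centre row and centre column). Pixels outside the
image are treated as background, which is also an assumption. The names cross_offsets
and erode are hypothetical.

```python
def cross_offsets(height, width):
    """Offsets (dy, dx) of a cross-shaped element: centre column plus centre row."""
    half_h, half_w = height // 2, width // 2
    vertical = [(dy, 0) for dy in range(-half_h, half_h + 1)]
    horizontal = [(0, dx) for dx in range(-half_w, half_w + 1) if dx != 0]
    return vertical + horizontal


def erode(binary, offsets):
    """Binary erosion: a pixel stays white only if the whole element fits on white pixels."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            out[y][x] = int(all(
                0 <= y + dy < rows and 0 <= x + dx < cols and binary[y + dy][x + dx]
                for dy, dx in offsets
            ))
    return out


def filled_rectangle(rows, cols, top, left, height, width):
    """Helper for the demo: a black image with one white rectangle."""
    return [[int(top <= y < top + height and left <= x < left + width)
             for x in range(cols)] for y in range(rows)]


cross = cross_offsets(11, 45)
big_enough = erode(filled_rectangle(15, 60, 2, 5, 11, 45), cross)
too_short = erode(filled_rectangle(15, 60, 2, 5, 10, 45), cross)
print(sum(map(sum, big_enough)), sum(map(sum, too_short)))  # 1 0
```

An 11x45 rectangle survives as a single pixel, while one that is only 10 pixels high
disappears, which is the suppression rule stated above.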
2.5 Computation of initial bounding boxes of the candidate text areas:

After the erosion step, we compute the bounding boxes of the white pixel regions of the
image. Each bounding box encloses one 8-connected component of white pixels. We then
place these bounding boxes on the corresponding color image.
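
One way to compute such boxes is a breadth-first search over 8-connected white pixels.
The sketch below is an illustration under that assumption. The function name
bounding_boxes is hypothetical, and the (x, y, width, height) ordering follows the
convention used in section 2.7.

```python
from collections import deque


def bounding_boxes(binary):
    """Return one (x, y, width, height) box per 8-connected white component."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y0 in range(rows):
        for x0 in range(cols):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                y, x = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                # Visit all 8 neighbours (the centre pixel is already marked as seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


mask = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
print(bounding_boxes(mask))  # [(0, 0, 3, 2), (4, 2, 1, 2)]
```

The pixel at row 1, column 2 touches the first row only diagonally, so it joins that
component only because 8-connectivity is used.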

2.6 Applying geometrical constraints:
We now discard a box if it meets any of the following geometrical conditions:
1) Its height is lower than a threshold (set to 12)
2) Its height is greater than a threshold (set to 48)
3) Its ratio of width to height is lower than a threshold (set to 1.5)
This step reduces the number of bounding boxes.
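
The three rules above translate directly into a filter. In this sketch the thresholds
are the values given in the list, while the constant and function names are
illustrative only.

```python
MIN_HEIGHT = 12    # rule 1: discard boxes lower than this
MAX_HEIGHT = 48    # rule 2: discard boxes higher than this
MIN_ASPECT = 1.5   # rule 3: discard boxes whose width/height is below this


def keep_box(box):
    """Return True if an (x, y, width, height) box passes all three constraints."""
    _, _, width, height = box
    if height < MIN_HEIGHT or height > MAX_HEIGHT:
        return False
    return width / height >= MIN_ASPECT


def apply_geometric_constraints(boxes):
    return [box for box in boxes if keep_box(box)]


candidates = [
    (0, 0, 100, 20),  # plausible text line: kept
    (0, 0, 100, 10),  # too low
    (0, 0, 100, 60),  # too high
    (0, 0, 20, 20),   # too narrow for its height
]
print(apply_geometric_constraints(candidates))  # [(0, 0, 100, 20)]
```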

2.7 Multiresolution analysis:
The whole algorithm up to this point is applied in a multiresolution fashion to ensure
text detection across varying sizes [9]. The sizes of the structuring elements for the
morphological operations (dilation, erosion) and the geometrical constraints restrict
the algorithm to text in a specific range of character sizes (12-48 pixels). To overcome
this limitation, we apply the algorithm above to the image at different resolutions and
finally fuse the results at the initial resolution. In this way we get a set of bounding
boxes on the color image for each resolution. We use resolutions from 0.1 to 1.5 in
steps of 0.1. For a resolution parameter m, fusing results to the original resolution
means that a bounding box (x coordinate, y coordinate, width, height) is resized to
(x coordinate/m, y coordinate/m, width/m, height/m).
We do this for all resolutions in the range, resize the bounding boxes, and then
fuse them on the original image.
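
The resizing rule can be sketched as below. The function names are illustrative, and
generating the scale list with rounding is just one way to avoid floating-point drift
in the 0.1 steps.

```python
def scale_factors(start=0.1, stop=1.5, step=0.1):
    """Resolution parameters m = 0.1, 0.2, ..., 1.5."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 10) for i in range(count)]


def to_original_resolution(box, m):
    """Map an (x, y, width, height) box found at scale m back to the original image."""
    x, y, width, height = box
    return (x / m, y / m, width / m, height / m)


factors = scale_factors()
print(len(factors), factors[0], factors[-1])          # 15 0.1 1.5
print(to_original_resolution((10, 20, 30, 12), 0.5))  # (20.0, 40.0, 60.0, 24.0)
```

Since a box of height h found at scale m has height h/m in the original image, the
12-48 pixel range covers text of roughly 120-480 pixels at m = 0.1 and roughly 8-32
pixels at m = 1.5.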

2.8 Selection of final bounding boxes:

We discard a smaller bounding box if it lies inside a bigger one. This drastically
reduces the number of bounding boxes, and the remaining boxes constitute the final
regions of interest. The benefit of this step is running time: we have fewer bounding
boxes, and therefore fewer objects to process, without missing any significant text
regions.
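
A simple quadratic-time sketch of this selection follows. The paper does not say how
exact duplicates are handled, so keeping the first copy is an assumption here, and the
names contains and discard_nested are hypothetical.

```python
def contains(outer, inner):
    """True if box `inner` lies completely inside box `outer` (x, y, width, height)."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def discard_nested(boxes):
    """Drop every box that lies inside another box; keep the first of identical boxes."""
    kept = []
    for i, box in enumerate(boxes):
        nested = any(
            contains(other, box) and (other != box or j < i)
            for j, other in enumerate(boxes)
            if j != i
        )
        if not nested:
            kept.append(box)
    return kept


boxes = [
    (0, 0, 100, 40),   # large box
    (10, 5, 20, 12),   # inside the large box: discarded
    (90, 10, 30, 15),  # overlaps but sticks out: kept
    (0, 0, 100, 40),   # exact duplicate of the first: discarded
]
print(discard_nested(boxes))  # [(0, 0, 100, 40), (90, 10, 30, 15)]
```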

2.9 Binarization:
We now binarize the grayscale image. We use Otsu's method [3] to perform thresholding,
that is, the reduction of a gray-level image to a binary image.
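
For reference, Otsu's method picks the gray level that maximises the between-class
variance of the two resulting pixel classes. The sketch below is a plain-Python
illustration for 8-bit images. Treating pixels strictly above the threshold as white is
an assumption, and the function names are illustrative.

```python
def otsu_threshold(gray):
    """Return the gray level (0-255) that maximises the between-class variance."""
    histogram = [0] * 256
    for row in gray:
        for value in row:
            histogram[value] += 1
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    count_low, sum_low = 0, 0
    for level in range(256):
        count_low += histogram[level]
        sum_low += level * histogram[level]
        count_high = total - count_low
        if count_low == 0:
            continue
        if count_high == 0:
            break
        mean_low = sum_low / count_low
        mean_high = (weighted_total - sum_low) / count_high
        # Pixel counts stand in for class probabilities; the maximiser is the same.
        variance = count_low * count_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(gray, threshold):
    """Pixels brighter than the threshold become white (1), the rest black (0)."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


gray = [[10, 12, 200], [210, 10, 200]]
t = otsu_threshold(gray)
print(binarize(gray, t))  # [[0, 0, 1], [1, 0, 1]]
```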
2.10 Connected component analysis:
Using the bounding boxes obtained in step 2.8, we perform connected component analysis
to recover the whole text regions. When the bounding boxes are computed, some part of a
character may fall outside its bounding box. To obtain the whole character from the part
that lies inside the bounding box, we look up the corresponding connected component in
the Otsu-binarized image: for any non-background pixel that falls inside a bounding box,
we generate the connected component containing that pixel from the Otsu-binarized image.
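
This recovery step can be sketched as a flood fill seeded from every foreground pixel
inside a box and allowed to grow beyond the box. The paper does not state the
connectivity for this step, so 8-connectivity is an assumption, as are integer box
coordinates. The function name recover_components is hypothetical.

```python
from collections import deque


def recover_components(binary, boxes):
    """Keep every white component of `binary` that has at least one pixel inside a box.

    Boxes are (x, y, width, height) with integer pixel coordinates.
    """
    rows, cols = len(binary), len(binary[0])
    recovered = [[0] * cols for _ in range(rows)]
    for x, y, width, height in boxes:
        for seed_y in range(max(0, y), min(rows, y + height)):
            for seed_x in range(max(0, x), min(cols, x + width)):
                if not binary[seed_y][seed_x] or recovered[seed_y][seed_x]:
                    continue
                # Grow the whole component, even where it leaves the box.
                recovered[seed_y][seed_x] = 1
                queue = deque([(seed_y, seed_x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] and not recovered[ny][nx]):
                                recovered[ny][nx] = 1
                                queue.append((ny, nx))
    return recovered


binary = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
]
# The box covers only the lower part of the left-hand stroke.
for row in recover_components(binary, [(1, 1, 1, 2)]):
    print(row)
# [0, 1, 1, 1, 0, 0]
# [0, 1, 0, 0, 0, 0]
# [0, 1, 0, 0, 0, 0]
```

The stroke's upper bar lies outside the box but is recovered, while the unrelated
component in the last column is left out.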

2.11 Discarding some connected components on the basis of area:
Here, the area of a connected component is simply the number of pixels that constitute
it. The area of the image is taken as width*length, both in pixels.
We discard a component if its area is greater than an upper threshold or less than a
lower threshold. The threshold values are taken as suitable fractions of the whole image
area. This refines our areas of interest.
Binarization introduces one problem: some text portions, such as those against a white
or more intense background, are lost. They become black in the binarization process and
do not participate in further processing. To solve this problem we invert the binarized
image obtained after the Otsu binarization step.
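
A sketch of the area-based filtering follows. The paper does not give the fractions it
uses, so min_fraction and max_fraction are placeholder parameters, 8-connectivity is
assumed, and the function name filter_by_area is hypothetical.

```python
from collections import deque


def filter_by_area(binary, min_fraction, max_fraction):
    """Keep components whose pixel count lies within the given fractions of the image area."""
    rows, cols = len(binary), len(binary[0])
    image_area = rows * cols
    seen = [[False] * cols for _ in range(rows)]
    kept = [[0] * cols for _ in range(rows)]
    for y0 in range(rows):
        for x0 in range(cols):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # The area of a component is its number of pixels.
            if min_fraction * image_area <= len(pixels) <= max_fraction * image_area:
                for y, x in pixels:
                    kept[y][x] = 1
    return kept


binary = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# Image area is 20 pixels: keep components of 2 to 10 pixels.
for row in filter_by_area(binary, 0.1, 0.5):
    print(row)
# [0, 0, 0, 0, 0]
# [0, 0, 1, 1, 0]
# [0, 0, 1, 1, 0]
# [0, 0, 0, 0, 0]
```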
2.12 Inverting the binarized image obtained after step 2.9 and
performing steps 2.10 and 2.11 on it:


By inverting the image we simply mean making the white pixels black and the black
pixels white. We then perform the operations of steps 2.10 and 2.11 on the inverted
image.


2.13 Adding the images obtained in steps 2.11 and 2.12:

We now add the images obtained after steps 2.11 and 2.12 to get the final result
image. By adding the images we mean that if either of the corresponding pixels in the
two images is white, the pixel is white in the resulting image, and if neither is white,
it is black. This gives a final black and white image in which text appears as white
pixels against a black background.
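
The inversion of step 2.12 and the addition of step 2.13 amount to a pixel-wise NOT and
a pixel-wise OR on 0/1 images. A minimal sketch, with illustrative function names:

```python
def invert(binary):
    """Swap black (0) and white (1) pixels."""
    return [[1 - pixel for pixel in row] for row in binary]


def add_images(first, second):
    """Pixel-wise OR: white if either input pixel is white, black otherwise."""
    return [
        [1 if (a or b) else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]


result_a = [[1, 0, 0], [0, 0, 0]]  # e.g. components kept from the binarized image
result_b = [[0, 0, 1], [0, 1, 0]]  # e.g. components kept from the inverted image
print(invert([[1, 0], [0, 1]]))        # [[0, 1], [1, 0]]
print(add_images(result_a, result_b))  # [[1, 0, 1], [0, 1, 0]]
```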
3. Flow diagram of the algorithm:

Original image -> Gray scale image -> Canny edge detection -> Dilation -> Erosion
-> Bounding boxes selection -> Geometrical constraint

These steps are performed for each resolution value
(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5).
After this we get bounding boxes for each resolution. We then resize them in order to
fuse the results to the original resolution, as explained in step 2.7.

Binarize gray scale image -> Connected component analysis (a) -> Invert binarized image
-> Connected component analysis (b) -> Add two images, C = a + b (C is the final image)

4. Experimental results:


We implemented this algorithm in MATLAB 6.1 under Microsoft Windows XP
Professional (5.1, Build 2600).
Processor: Intel(R) Pentium(R) D CPU 2.80 GHz (2 CPUs)
Memory: 1014MB RAM

We tested the algorithm on many color images containing different types of text. It
successfully detected text locations in these images, for Indian-language scripts as
well as English script. We show two example images, both natural scene images, and
their outputs. The text in the first example image is in Bangla; the text in the second
is in English. In the output images, the detected text appears in white against a black
background.




Figure 1. Example image 1

Figure 2. Output image 1

Figure 3. Example image 2

Figure 4. Output image 2




5. Conclusion:
The results contain some false alarms, i.e. white regions that are not actually text.
These can be removed in the text recognition step: because these regions contain no
text, they are not recognized.
The algorithm works well on images with good contrast, especially where the text has
good contrast against the background.
Acknowledgement
This work was done at the Computer Vision and Pattern Recognition Unit, Indian
Statistical Institute, Kolkata, under the direct supervision of Ujjwal Bhattacharya.


6. References:

[1] R. Lienhart and F. Stuber, "Automatic text recognition in digital videos,"
Technical Report TR-1995-036, Department for Mathematics and Computer Science,
University of Mannheim.
[2] Y. Du, C.-I Chang, and P. D. Thouin, "Automated system for text detection in
individual video images," Journal of Electronic Imaging, vol. 12, no. 3, pp. 410-422,
2003.
[3] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," IEEE Trans.
Systems, Man, and Cybernetics, vol. 9, pp. 62-66, 1979.
[4] C. Li, X. Ding, and Y. Wu, "Automatic text location in natural scene images,"
Proc. Sixth International Conference on Document Analysis and Recognition,
pp. 1069-1073, Sept. 2001.
[5] K. In Kim, K. Jung, and J. Hyung, "Texture-based approach for text detection in
images using support vector machines and continuously adaptive mean shift algorithm,"
IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 12, pp. 1631-1639, Dec. 2003.
[6] X. Tang, X. Gao, J. Liu, and H. Zhang, "A spatial-temporal approach for video
caption detection and recognition," IEEE Trans. Neural Netw., vol. 13, no. 4,
pp. 961-971, July 2002.
[7] O. Hori and T. Mita, "A robust video text extraction method for character
recognition," IEICE Trans. Inf. & Syst. (Japanese Edition), vol. J84-D-II, no. 8,
pp. 1800-1808, Aug. 2001.
[8] Y. Liu, S. Goto, and T. Ikenaga, "A Contour-Based Robust Algorithm for Text
Detection in Color Images," IEICE Trans. Inf. & Syst., vol. E89-D, no. 3, March 2006.
[9] M. Anthimopoulos, M. Gatos, and I. Pratikakis, "Multiresolution text detection in
video frames," Second International Conference on Computer Vision Theory and
Applications (VISAPP), Barcelona, Spain, March 8-11, 2007.

Mais conteúdo relacionado

Mais procurados

Interpolation Technique using Non Linear Partial Differential Equation with E...
Interpolation Technique using Non Linear Partial Differential Equation with E...Interpolation Technique using Non Linear Partial Differential Equation with E...
Interpolation Technique using Non Linear Partial Differential Equation with E...CSCJournals
 
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONS
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONSOBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONS
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONSijcseit
 
An implementation of novel genetic based clustering algorithm for color image...
An implementation of novel genetic based clustering algorithm for color image...An implementation of novel genetic based clustering algorithm for color image...
An implementation of novel genetic based clustering algorithm for color image...TELKOMNIKA JOURNAL
 
Comparative Study and Analysis of Image Inpainting Techniques
Comparative Study and Analysis of Image Inpainting TechniquesComparative Study and Analysis of Image Inpainting Techniques
Comparative Study and Analysis of Image Inpainting TechniquesIOSR Journals
 
Noise tolerant color image segmentation using support vector machine
Noise tolerant color image segmentation using support vector machineNoise tolerant color image segmentation using support vector machine
Noise tolerant color image segmentation using support vector machineeSAT Publishing House
 
Digital Image Inpainting: A Review
Digital Image Inpainting: A ReviewDigital Image Inpainting: A Review
Digital Image Inpainting: A ReviewIRJET Journal
 
Resolution Independent 2D Cartoon Video Conversion
Resolution Independent 2D Cartoon Video ConversionResolution Independent 2D Cartoon Video Conversion
Resolution Independent 2D Cartoon Video ConversionEswar Publications
 
Probabilistic model based image segmentation
Probabilistic model based image segmentationProbabilistic model based image segmentation
Probabilistic model based image segmentationijma
 
Hangul Recognition Using Support Vector Machine
Hangul Recognition Using Support Vector MachineHangul Recognition Using Support Vector Machine
Hangul Recognition Using Support Vector MachineEditor IJCATR
 
An application of morphological
An application of morphologicalAn application of morphological
An application of morphologicalNaresh Chilamakuri
 
Block Classification Scheme of Compound Images: A Hybrid Extension
Block Classification Scheme of Compound Images: A Hybrid ExtensionBlock Classification Scheme of Compound Images: A Hybrid Extension
Block Classification Scheme of Compound Images: A Hybrid ExtensionDR.P.S.JAGADEESH KUMAR
 
Study of Image Inpainting Technique Based on TV Model
Study of Image Inpainting Technique Based on TV ModelStudy of Image Inpainting Technique Based on TV Model
Study of Image Inpainting Technique Based on TV Modelijsrd.com
 
Performance of Efficient Closed-Form Solution to Comprehensive Frontier Exposure
Performance of Efficient Closed-Form Solution to Comprehensive Frontier ExposurePerformance of Efficient Closed-Form Solution to Comprehensive Frontier Exposure
Performance of Efficient Closed-Form Solution to Comprehensive Frontier Exposureiosrjce
 

Mais procurados (19)

C04741319
C04741319C04741319
C04741319
 
Interpolation Technique using Non Linear Partial Differential Equation with E...
Interpolation Technique using Non Linear Partial Differential Equation with E...Interpolation Technique using Non Linear Partial Differential Equation with E...
Interpolation Technique using Non Linear Partial Differential Equation with E...
 
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONS
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONSOBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONS
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONS
 
An implementation of novel genetic based clustering algorithm for color image...
An implementation of novel genetic based clustering algorithm for color image...An implementation of novel genetic based clustering algorithm for color image...
An implementation of novel genetic based clustering algorithm for color image...
 
Comparative Study and Analysis of Image Inpainting Techniques
Comparative Study and Analysis of Image Inpainting TechniquesComparative Study and Analysis of Image Inpainting Techniques
Comparative Study and Analysis of Image Inpainting Techniques
 
Noise tolerant color image segmentation using support vector machine
Noise tolerant color image segmentation using support vector machineNoise tolerant color image segmentation using support vector machine
Noise tolerant color image segmentation using support vector machine
 
Digital Image Inpainting: A Review
Digital Image Inpainting: A ReviewDigital Image Inpainting: A Review
Digital Image Inpainting: A Review
 
Resolution Independent 2D Cartoon Video Conversion
Resolution Independent 2D Cartoon Video ConversionResolution Independent 2D Cartoon Video Conversion
Resolution Independent 2D Cartoon Video Conversion
 
Probabilistic model based image segmentation
Probabilistic model based image segmentationProbabilistic model based image segmentation
Probabilistic model based image segmentation
 
I07015261
I07015261I07015261
I07015261
 
C1803011419
C1803011419C1803011419
C1803011419
 
B42020710
B42020710B42020710
B42020710
 
Hangul Recognition Using Support Vector Machine
Hangul Recognition Using Support Vector MachineHangul Recognition Using Support Vector Machine
Hangul Recognition Using Support Vector Machine
 
An application of morphological
An application of morphologicalAn application of morphological
An application of morphological
 
Block Classification Scheme of Compound Images: A Hybrid Extension
Block Classification Scheme of Compound Images: A Hybrid ExtensionBlock Classification Scheme of Compound Images: A Hybrid Extension
Block Classification Scheme of Compound Images: A Hybrid Extension
 
Study of Image Inpainting Technique Based on TV Model
Study of Image Inpainting Technique Based on TV ModelStudy of Image Inpainting Technique Based on TV Model
Study of Image Inpainting Technique Based on TV Model
 
H017416670
H017416670H017416670
H017416670
 
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
 
Performance of Efficient Closed-Form Solution to Comprehensive Frontier Exposure
Performance of Efficient Closed-Form Solution to Comprehensive Frontier ExposurePerformance of Efficient Closed-Form Solution to Comprehensive Frontier Exposure
Performance of Efficient Closed-Form Solution to Comprehensive Frontier Exposure
 

Semelhante a Sample Paper Techscribe

A binarization technique for extraction of devanagari text from camera based ...
A binarization technique for extraction of devanagari text from camera based ...A binarization technique for extraction of devanagari text from camera based ...
A binarization technique for extraction of devanagari text from camera based ...sipij
 
Using A Application For A Desktop Application
Using A Application For A Desktop ApplicationUsing A Application For A Desktop Application
Using A Application For A Desktop ApplicationTracy Huang
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...
Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...
Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...Dibya Jyoti Bora
 
A Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In VideosA Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In VideosCSCJournals
 
A Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In VideosA Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In VideosCSCJournals
 
A Novel Edge Detection Technique for Image Classification and Analysis
A Novel Edge Detection Technique for Image Classification and AnalysisA Novel Edge Detection Technique for Image Classification and Analysis
A Novel Edge Detection Technique for Image Classification and AnalysisIOSR Journals
 
An effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognitionAn effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognitionijaia
 
A Study of Image Compression Methods
A Study of Image Compression MethodsA Study of Image Compression Methods
A Study of Image Compression MethodsIOSR Journals
 
Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...
Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...
Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...ijcisjournal
 
Finding similarities between structured documents as a crucial stage for gene...
Finding similarities between structured documents as a crucial stage for gene...Finding similarities between structured documents as a crucial stage for gene...
Finding similarities between structured documents as a crucial stage for gene...Alexander Decker
 
Texture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal DimensionTexture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal Dimensionijsc
 
Texture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal Dimension  Texture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal Dimension ijsc
 

Semelhante a Sample Paper Techscribe (20)

A binarization technique for extraction of devanagari text from camera based ...
A binarization technique for extraction of devanagari text from camera based ...A binarization technique for extraction of devanagari text from camera based ...
A binarization technique for extraction of devanagari text from camera based ...
 
Using A Application For A Desktop Application
Using A Application For A Desktop ApplicationUsing A Application For A Desktop Application
Using A Application For A Desktop Application
 
I010634450
I010634450I010634450
I010634450
 
Assignment-1-NF.docx
Assignment-1-NF.docxAssignment-1-NF.docx
Assignment-1-NF.docx
 
E1803012329
E1803012329E1803012329
E1803012329
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...
Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...
Multispectral Satellite Color Image Segmentation Using Fuzzy Based Innovative...
 
G04544346
G04544346G04544346
G04544346
 
A Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In VideosA Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In Videos
 
A Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In VideosA Survey On Thresholding Operators of Text Extraction In Videos
A Survey On Thresholding Operators of Text Extraction In Videos
 
Ed34785790
Ed34785790Ed34785790
Ed34785790
 
A Novel Edge Detection Technique for Image Classification and Analysis
A Novel Edge Detection Technique for Image Classification and AnalysisA Novel Edge Detection Technique for Image Classification and Analysis
A Novel Edge Detection Technique for Image Classification and Analysis
 
An effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognitionAn effective approach to offline arabic handwriting recognition
An effective approach to offline arabic handwriting recognition
 
A Study of Image Compression Methods
A Study of Image Compression MethodsA Study of Image Compression Methods
A Study of Image Compression Methods
 
Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...
Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...
Enhanced Optimization of Edge Detection for High Resolution Images Using Veri...
 
Ijetcas14 372
Ijetcas14 372Ijetcas14 372
Ijetcas14 372
 
Finding similarities between structured documents as a crucial stage for gene...
Finding similarities between structured documents as a crucial stage for gene...Finding similarities between structured documents as a crucial stage for gene...
Finding similarities between structured documents as a crucial stage for gene...
 
Himadeep
HimadeepHimadeep
Himadeep
 
Texture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal DimensionTexture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal Dimension
 
Texture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal Dimension  Texture Segmentation Based on Multifractal Dimension
Texture Segmentation Based on Multifractal Dimension
 

Sample Paper Techscribe

  • 1. Team Name :PATTERN CODER Members :Amit Kumar Contact Address : Room No. 272 , Kapili Hostel IIT Guwahati North Guwahati, Assam-781039. Email id : amit.k@iitg.ernet.in,amit.k203@gmail.com Institute : Indian Institute Of Technology ,Guwahati
  • 2. An improved algorithm for locating texts in camera captured images
  • 3. Table of Contents 1.Introduction 2.Text detection algorithm 3.Flow diagram of the algorithm 4.Experimental results 5.Conclusion 6.References
  • 4. Abstract: Text data in images contain useful information. In this paper, we present an approach to detect text in color images. The proposed approach is based on combination of edge detection, connected component analysis at multiple resolutions. First, we utilize an image edge detection algorithm to extract all possible text edge pixels. Dilation by a specific structuring element is performed on the edge map. The dilation is followed by erosion by a specific structuring element. Following some geometrical constraints we get initial bounding boxes containing text regions. Then connected component analysis is performed on corresponding binarized image to recover whole text portions.Finally, multiresolution approach is used to make the approach applicable for large range of font sizes. 1. Introduction: The retrieval of text information from color images has gained increasing attention in recent years. Text appearing in images can provide very useful semantic information and may be a good key to describe the image content. Text detection can be found in many applications, such as road sign detection, map interpretation and engineering drawings interpretations etc. Many papers about text detection from images have been published[2,4,5,6,7]. Text detection generally can be classified into two categories: Bottom-up methods: they segment images into regions and group character region into words[1]. Due to the difficulty of developing efficient segmentation algorithm for text in complex background, the methods are not robust for detecting text in many camera based images. Top-down methods: they first detect text regions in images using filters and then perform bottom- up techniques inside the text regions[2]. These methods are able to process more complex images than bottom–up approaches. Top down methods are also divided into two categories: Heuristic methods: they use heuristic filters Machine learning methods: they use trained filters. 
Shortcomings of many current methods include their inability to perform well in the case of variant text orientation, size, language and low resolution image, where characters may be touching.
2. Text detection algorithm:

2.1 Conversion of color image to grayscale image:
Colors in the image are converted to shades of gray by calculating the effective brightness (luminance) of each color and using this value as the gray level.

2.2 Edge detection:
Edge detection is an important pre-processing step of our method. Using edges as the prominent feature lets us detect characters with different fonts and colors, since every character must present strong edges, whatever its font or color, in order to be readable. We used the Canny edge detector for this purpose. It takes a grayscale image as input and returns a bi-level image in which non-zero pixels mark detected edges. Canny uses Sobel masks to find the edge magnitude of the grayscale image and then applies non-maximum suppression and hysteresis thresholding. With these two post-processing operations the Canny edge detector removes non-maximum pixels while preserving the connectivity of the contours.

2.3 Dilation:
Dilation is one of the two basic operators of mathematical morphology, the other being erosion. It is typically applied to binary images. Its basic effect on a binary image is to gradually enlarge the boundaries of regions of foreground pixels (typically white pixels). Areas of foreground pixels therefore grow in size, while holes within those regions become smaller. Here we use a 5x21 cross-shaped structuring element. Dilation by this structuring element connects the character contours of every text line.

2.4 Erosion:
Erosion is the second basic operator of mathematical morphology. It is also typically applied to binary images. Its basic effect on a binary image is to erode away the boundaries of regions of foreground pixels (typically white pixels). Areas of foreground pixels therefore shrink in size, and holes within those areas become larger. Here we use an 11x45 cross-shaped structuring element. The erosion removes noise and smooths the shape of the candidate text areas. Every component with height less than 11 or width less than 45 is suppressed by this step.
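The dilation and erosion of steps 2.3 and 2.4 can be illustrated with a minimal pure-Python sketch. It is an illustration only, not the MATLAB implementation used for the experiments. The function names, the treatment of pixels outside the image as background, and the small 3x5 element in the demo are assumptions made for this example; the same functions accept the 5x21 and 11x45 sizes given above.

```python
def cross_offsets(height, width):
    """Offsets (dy, dx) of a cross-shaped structuring element:
    one vertical bar of `height` pixels and one horizontal bar of
    `width` pixels, both passing through the centre."""
    ry, rx = height // 2, width // 2
    offsets = {(dy, 0) for dy in range(-ry, ry + 1)}
    offsets |= {(0, dx) for dx in range(-rx, rx + 1)}
    return sorted(offsets)


def dilate(img, offsets):
    """Binary dilation: stamp the element on every foreground pixel."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if img[y][x]:
                for dy, dx in offsets:
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        out[yy][xx] = 1
    return out


def erode(img, offsets):
    """Binary erosion: a pixel survives only if the whole element fits
    inside the foreground (assumption: outside the image is background)."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            out[y][x] = int(all(
                0 <= y + dy < rows and 0 <= x + dx < cols and img[y + dy][x + dx]
                for dy, dx in offsets))
    return out


# Two isolated edge pixels on the same row, four columns apart.
img = [[0] * 9 for _ in range(5)]
img[2][2] = 1
img[2][6] = 1

small_cross = cross_offsets(3, 5)   # demo size, not the paper's 5x21
dilated = dilate(img, small_cross)  # the gap between the two pixels is bridged
eroded = erode(dilated, small_cross)
```

In this toy image the dilation turns row 2 into one unbroken run of foreground pixels, which is the effect used to join the character contours of a text line. Eroding with an element wider than the image, for example `cross_offsets(3, 11)`, removes the run entirely, which mirrors how components smaller than the element are suppressed.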
2.5 Computation of initial bounding boxes of the candidate text areas:
After the erosion step we compute the bounding boxes of the white regions of the image. Each bounding box encloses one 8-connected component of white pixels. We then place these bounding boxes on the corresponding color image.

2.6 Applying geometrical constraints:
We discard a box if any of the following holds:
1) Height is lower than a threshold (set to 12)
2) Height is greater than a threshold (set to 48)
3) Ratio of width to height is lower than a threshold (set to 1.5)
This step reduces the number of bounding boxes.

2.7 Multiresolution analysis:
The size of the structuring elements used for the morphological operations (dilation, erosion) and the geometrical constraints restrict the algorithm to a specific range of character sizes (12-48 pixels). To overcome this limitation, the whole algorithm up to this point is applied in a multiresolution fashion to ensure text detection with size variability[9]. The methodology described above is applied to the image at different scales, which gives a set of bounding boxes on the color image for each resolution, and the results are finally fused at the initial resolution. We used scale factors from 0.1 to 1.5 in steps of 0.1. Fusing the results at the original resolution means that a bounding box (x coordinate, y coordinate, width, height) found with resolution parameter m is resized to (x coordinate/m, y coordinate/m, width/m, height/m). We do this for all resolutions in the range and then place the resized bounding boxes on the original image.
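The constraints of step 2.6 and the rescaling of step 2.7 amount to a few lines of arithmetic, sketched below in Python. The function names, the (x, y, width, height) tuple layout and the rounding to whole pixels are assumptions of this sketch; the thresholds and the scale range are the ones stated above.

```python
def passes_constraints(box, min_h=12, max_h=48, min_ratio=1.5):
    """Keep a box only if its height lies in [min_h, max_h] and its
    width-to-height ratio is at least min_ratio (step 2.6)."""
    x, y, w, h = box
    return min_h <= h <= max_h and w / h >= min_ratio


def to_original_scale(box, m):
    """Map a box found at resolution parameter m back to the original
    image by dividing every coordinate by m (step 2.7).
    Rounding to whole pixels is an assumption of this sketch."""
    x, y, w, h = box
    return (round(x / m), round(y / m), round(w / m), round(h / m))


# Resolution parameters 0.1, 0.2, ..., 1.5
scales = [round(0.1 * k, 1) for k in range(1, 16)]

# Hypothetical boxes detected on the image resized with m = 0.5
found_at_half = [(10, 20, 60, 20),   # wide, text-like: kept
                 (10, 20, 20, 20),   # square: ratio 1.0 < 1.5, discarded
                 (0, 0, 100, 60)]    # too tall: height 60 > 48, discarded

kept = [b for b in found_at_half if passes_constraints(b)]
fused = [to_original_scale(b, 0.5) for b in kept]
```

Because the height window is fixed at 12 to 48 pixels, a character 96 pixels tall in the original image is only caught at m = 0.5 or below, where it appears at most 48 pixels tall.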
2.8 Selection of final bounding boxes:
We discard a smaller bounding box if it lies inside a bigger one. This drastically reduces the number of bounding boxes, and the remaining boxes constitute the final regions of interest. The benefit of this step is running time: there are fewer boxes to process, and no significant text region is lost.

2.9 Binarization:
We now binarize the grayscale image. We used Otsu's method to perform the thresholding, that is, the reduction of a gray level image to a binary image[3].
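Steps 2.8 and 2.9 can be sketched in plain Python as follows. This is a minimal illustration, not the MATLAB code used for the experiments. The function names and the (x, y, width, height) box layout are assumptions, and the Otsu routine is the textbook formulation that maximises between-class variance over an 8-bit histogram.

```python
def contains(outer, inner):
    """True if box `inner` lies completely inside box `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow and iy + ih <= oy + oh)


def drop_nested(boxes):
    """Discard every box that lies inside another box (step 2.8).
    Of several identical boxes only the first is kept."""
    kept = []
    for i, b in enumerate(boxes):
        nested = any(j != i and contains(o, b) and (o != b or j < i)
                     for j, o in enumerate(boxes))
        if not nested:
            kept.append(b)
    return kept


def otsu_threshold(pixels):
    """Threshold t in 0..255 maximising the between-class variance of
    the two classes {p <= t} and {p > t} (step 2.9)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


boxes = [(0, 0, 100, 40), (10, 5, 30, 20), (90, 0, 50, 40)]
final_boxes = drop_nested(boxes)    # the second box is inside the first

gray = [10] * 50 + [200] * 50       # toy bimodal "image" as a flat list
t = otsu_threshold(gray)
binary = [1 if p > t else 0 for p in gray]
```

The third box overlaps the first but is not contained in it, so it is kept. Only full containment triggers removal in this sketch.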
2.10 Connected component analysis:
Starting from the final bounding boxes (step 2.8), we perform connected component analysis to recover the whole text regions. When the bounding boxes are computed, some part of a character may fall outside its box. To recover the whole character from the part that lies inside the box, we use the Otsu-binarized image: for every non-background pixel that falls inside a bounding box, we generate the connected component of the Otsu-binarized image that contains that pixel.

2.11 Discarding some connected components on the basis of area:
By area we simply mean the number of pixels that constitute a connected component. The area of the image is taken as width*length, both measured in pixels. We discard a component if its area is greater than an upper threshold or less than a lower threshold. Both thresholds are taken as suitable percentages (fractions) of the whole image area. This refines our areas of interest.

A problem remains that is due to binarization. Some text portions, such as those against a white or more intense background, become black during binarization and take no part in the further processing. To handle this problem we invert the binarized image obtained after Otsu's binarization step.
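The recovery of whole characters in step 2.10 and the area filter of step 2.11 can be sketched as a breadth-first flood fill. The sketch below is illustrative only. The function names, the use of 8-connectivity in this step and the two area fractions in the demo are assumptions, since the text does not state the threshold values.

```python
from collections import deque


def components_touching_box(binary, box):
    """Collect every 8-connected foreground component of `binary` that
    has at least one pixel inside `box` = (x, y, width, height).
    The component is grown over the whole image, so parts lying
    outside the box are recovered as well."""
    rows, cols = len(binary), len(binary[0])
    x0, y0, w, h = box
    seen = set()
    components = []
    for y in range(max(y0, 0), min(y0 + h, rows)):
        for x in range(max(x0, 0), min(x0 + w, cols)):
            if not binary[y][x] or (y, x) in seen:
                continue
            comp = []
            queue = deque([(y, x)])
            seen.add((y, x))
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            components.append(comp)
    return components


def filter_by_area(components, image_area, min_frac, max_frac):
    """Keep components whose pixel count lies between the two
    fractions of the image area (fractions are hypothetical here)."""
    return [c for c in components
            if min_frac * image_area <= len(c) <= max_frac * image_area]


binary = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
box = (1, 1, 3, 2)   # covers rows 1-2, columns 1-3 only

comps = components_touching_box(binary, box)
image_area = len(binary) * len(binary[0])
kept = filter_by_area(comps, image_area, min_frac=0.05, max_frac=0.5)
```

The stroke in column 1 extends two rows below the box, yet the whole shape is returned because the fill is not clipped to the box. The isolated pixel in the last column has no pixel inside the box and is never visited.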
2.12 Inverting the binarized image obtained after step 2.9 and performing steps 2.10 and 2.11 on it:
By inverting the image we simply mean making the white pixels black and the black pixels white. Then we perform the operations of steps 2.10 and 2.11 on the inverted image.

2.13 Adding the images obtained in steps 2.11 and 2.12:
We add the images obtained after steps 2.11 and 2.12 to get the final result image. By adding the images we simply mean that a pixel of the resulting image is white if the corresponding pixel is white in either of the two images, and black if it is white in neither. The final image is therefore black and white, with text in white pixels against a black background.
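The inversion of step 2.12 and the addition of step 2.13 are pixel-wise operations; the addition is a logical OR. A minimal Python sketch follows, with images stored as lists of rows of 0 (black) and 1 (white). The function names and the tiny 2x2 inputs are assumptions for illustration.

```python
def invert(binary):
    """Swap black and white pixels (step 2.12)."""
    return [[1 - p for p in row] for row in binary]


def add_images(a, b):
    """Pixel-wise OR of two binary images of equal size (step 2.13):
    white where either input is white, black otherwise."""
    return [[1 if (pa or pb) else 0 for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


# Hypothetical results of steps 2.10-2.11 on the binarized image (a)
# and on its inverse (b)
a = [[1, 0],
     [0, 0]]
b = [[0, 0],
     [0, 1]]
c = add_images(a, b)   # final image C = a + b
```

Text that is lighter than its background survives in one of the two images and text that is darker survives in the other, so the OR keeps both.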
3. Flow diagram of the algorithm:
Original image -> Gray scale image -> Canny edge detection -> Dilation -> Erosion -> Bounding boxes selection -> Geometrical constraint

These steps are performed for each resolution value (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5). After this we get bounding boxes for each resolution. We then resize them in order to fuse the results at the original resolution, as explained in step 2.7.

Binarize gray scale image -> Connected component analysis (a)
Invert binarized image -> Connected component analysis (b)
Add two images: C = a + b, where C is the final image
4. Experimental results:
We implemented this algorithm in MATLAB 6.1 on the following system:
Operating system: Microsoft Windows XP Professional (5.1, Build 2600)
Processor: Intel(R) Pentium(R) D CPU 2.80 GHz (2 CPUs)
Memory: 1014MB RAM

We tested many color images containing different types of text. Our algorithm successfully detects the text locations in these images, for Indian language scripts as well as for English script. Here we show two example images and their outputs. In the first example image the text is in Bangla. In the second example image the text is in English. In the output images the detected text appears in white against a black background. Both example images are natural scene images.

Figure 1. Example image 1
Figure 2. Output image 1
Figure 3. Example image 2
Figure 4. Output image 2

5. Conclusion:
The results contain some false alarms, i.e. white regions that are not actually text. These can be removed in the text recognition step: since these regions contain no text, nothing is recognized in them. The algorithm works well on images with good contrast, especially where the text has good contrast against the background.
Acknowledgement:
This work has been done at the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, under the direct supervision of Ujjwal Bhattacharya.

6. References:
[1] Rainer Lienhart and Frank Stuber, "Automatic text recognition in digital videos," Technical Report TR-1995-036, Department for Mathematics and Computer Science, University of Mannheim.
[2] Yingzi Du, Chein-I Chang, and Paul D. Thouin, "Automated system for text detection in individual video images," Journal of Electronic Imaging, 12(3), pp.410-422, 2003.
[3] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," IEEE Trans. Systems, Man, and Cybernetics, vol.9, pp.62-66, 1979.
[4] C. Li, X. Ding, and Y. Wu, "Automatic text location in natural scene images," Proc. Sixth International Conference on Document Analysis and Recognition, pp.1069-1073, Sept. 2001.
[5] K. In Kim, K. Jung, and J. Hyung, "Texture-based approach for text detection in images using support vector machines and continuously adaptive mean shift algorithm," IEEE Trans. Pattern Anal. Mach. Intell., vol.25, no.12, pp.1631-1639, Dec. 2003.
[6] X. Tang, X. Gao, J. Liu, and H. Zhang, "A spatial-temporal approach for video caption detection and recognition," IEEE Trans. Neural Netw., vol.13, no.4, pp.961-971, July 2002.
[7] O. Hori and T. Mita, "A robust video text extraction method for character recognition," IEICE Trans. Inf. & Syst. (Japanese Edition), vol.J84-D-II, no.8, pp.1800-1808, Aug. 2001.
[8] Yangxing Liu, Satoshi Goto, and Takeshi Ikenaga, "A Contour-Based Robust Algorithm for Text Detection in Color Images," IEICE Trans. Inf. & Syst., vol.E89-D, no.3, March 2006.
[9] M. Anthimopoulos, M. Gatos, and I. Pratikakis, "Multiresolution text detection in video frames," Second International Conference on Computer Vision Theory and Applications (VISAPP), Barcelona, Spain, March 8-11, 2007.