SlideShare uma empresa Scribd logo
1 de 12
Baixar para ler offline
7. Page Layout Analysis
Finding text regions on pages from
books, magazines, and newspapers.
Ray Smith, Google Inc.
Tesseract Tutorial: DAS 2014 Tours France
Background
● Historically Tesseract had no page layout analysis, but did have
text-line finding, assuming a single column of text.
● Cube relies on Tesseract's page-layout/line finding.
● Tesseract's existing text-line finding is also weak wrt diacritics,
especially for Arabic and Thai.
● Past methods tend to be:
○(Bottom-up) Insufficiently aware of page-layout rules or
○(Top-down) Insufficiently general.
Tesseract Tutorial: DAS 2014 Tours France
Past Methods: Bottom-Up
● Analyze groups of pixels or connected components to classify
into text/image/graphic/blank/line.
● Spread/smear/anneal groups of pixels by some neighborhood
voting scheme, morphology or voronoi/graph algorithms.
● Find connected components of labels to group pixels into typed
regions.
● Box-up regions into rectangles where possible.
● Morphological approach is very similar.
● Hard to include knowledge like "Columns should usually be the
same size."
Tesseract Tutorial: DAS 2014 Tours France
Past Methods: Top Down
● Often starts with a (possibly pre-trained) model of layout, eg 2-
column journal page.
● Attempts to cut the image into the required parts, either with
recursive vertical/horizontal cuts, or finding rectangles of
whitespace.
● Methods usually fail on non-rectangular regions.
● Methods can often only deal with pages that fit the model.
Tesseract Tutorial: DAS 2014 Tours France
New Method: Hybrid
Bottom-Up-Style
Pixel labelling
Column Finding
via Tab-Stops
Find Cross-Column
and Non-rectangular
Regions
Image
Line-Spacing
Model
Regions/
Text Lines
Tesseract Tutorial: DAS 2014 Tours France
Image-Level Page Layout Analysis
Input Image Detected Lines Detected Images
Tesseract Tutorial: DAS 2014 Tours France
Connected Component Analysis
Text Components Candidate Tab-Stop Detected Tab
Components Segments
Tesseract Tutorial: DAS 2014 Tours France
Writing direction detection: Got Japanese?
Detect Local writing
direction and do tab
finding again...
Tesseract Tutorial: DAS 2014 Tours France
Column Finding
Connected Tabs Validated Tab Candidate Column
Segments Partitions
Tesseract Tutorial: DAS 2014 Tours France
Block Finding
Detected Typed Column Detected
Columns Partitions Blocks
Tesseract Tutorial: DAS 2014 Tours France
Live demo of Page Layout Analysis
Simple magazine page with vertical text:
api/tesseract unlv/mag.3B/1/8001_044.3B.tif test1 inter segdemo strokewidth
Non-rectangular images:
api/tesseract unlv/mag.3B/0/8050_078.3B.tif test1 inter strokewidth
Multi-orientation Japanese:
api/tesseract -l jpn unlv/jpn.tif test1 inter segdemo strokewidth
Text on image:
api/tesseract unlv/mp00052bw.tif test1 inter strokewidth
Tesseract Tutorial: DAS 2014 Tours France
Thanks for Listening!
Questions?

Mais conteúdo relacionado

Semelhante a 7 layout analysis

Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...
Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...
Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...BookNet Canada
 
graphs of tangent and cotangent function
graphs of tangent and cotangent functiongraphs of tangent and cotangent function
graphs of tangent and cotangent functionRomualdoDayrit1
 
Duarte ms template_download
Duarte ms template_downloadDuarte ms template_download
Duarte ms template_downloadartecxt
 
Goodbye Nightmare : Tops and Tricks for creating Layouts
Goodbye Nightmare : Tops and Tricks for creating LayoutsGoodbye Nightmare : Tops and Tricks for creating Layouts
Goodbye Nightmare : Tops and Tricks for creating LayoutsLuc Bors
 
Method of Dealing with Outline Assignment
Method of Dealing with Outline AssignmentMethod of Dealing with Outline Assignment
Method of Dealing with Outline AssignmentLesa Cote
 

Semelhante a 7 layout analysis (11)

Tasract OCR
Tasract OCRTasract OCR
Tasract OCR
 
Page Layout
Page LayoutPage Layout
Page Layout
 
Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...
Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...
Faking a Grid - Franco Alvarado (Macmillan Learning), Betsy Granger (Macmilla...
 
Calsification.pptx
Calsification.pptxCalsification.pptx
Calsification.pptx
 
Midterm review
Midterm reviewMidterm review
Midterm review
 
graphs of tangent and cotangent function
graphs of tangent and cotangent functiongraphs of tangent and cotangent function
graphs of tangent and cotangent function
 
Duarte ms template_download
Duarte ms template_downloadDuarte ms template_download
Duarte ms template_download
 
Goodbye Nightmare : Tops and Tricks for creating Layouts
Goodbye Nightmare : Tops and Tricks for creating LayoutsGoodbye Nightmare : Tops and Tricks for creating Layouts
Goodbye Nightmare : Tops and Tricks for creating Layouts
 
Goodbye Nightmare: Tips and Tricks for Creating Complex Layouts with Oracle A...
Goodbye Nightmare: Tips and Tricks for Creating Complex Layouts with Oracle A...Goodbye Nightmare: Tips and Tricks for Creating Complex Layouts with Oracle A...
Goodbye Nightmare: Tips and Tricks for Creating Complex Layouts with Oracle A...
 
Method of Dealing with Outline Assignment
Method of Dealing with Outline AssignmentMethod of Dealing with Outline Assignment
Method of Dealing with Outline Assignment
 
Lesson 4
Lesson 4Lesson 4
Lesson 4
 

Mais de Solin TEM

CS_rapport_final_fr_v3_1
CS_rapport_final_fr_v3_1CS_rapport_final_fr_v3_1
CS_rapport_final_fr_v3_1Solin TEM
 
cs15 slide_v2
cs15 slide_v2cs15 slide_v2
cs15 slide_v2Solin TEM
 
4 downloading
4 downloading4 downloading
4 downloadingSolin TEM
 
2 architecture anddatastructures
2 architecture anddatastructures2 architecture anddatastructures
2 architecture anddatastructuresSolin TEM
 
1 intro history
1 intro history1 intro history
1 intro historySolin TEM
 
0 tutorial contents
0 tutorial contents0 tutorial contents
0 tutorial contentsSolin TEM
 
Khmer ocr using gfd_seminar_day
Khmer ocr using gfd_seminar_dayKhmer ocr using gfd_seminar_day
Khmer ocr using gfd_seminar_daySolin TEM
 
Khmer ocr using gfd
Khmer ocr using gfdKhmer ocr using gfd
Khmer ocr using gfdSolin TEM
 
Khmer ocr scientificday_itc
Khmer ocr scientificday_itcKhmer ocr scientificday_itc
Khmer ocr scientificday_itcSolin TEM
 
Khmer ocr itc
Khmer ocr itcKhmer ocr itc
Khmer ocr itcSolin TEM
 

Mais de Solin TEM (10)

CS_rapport_final_fr_v3_1
CS_rapport_final_fr_v3_1CS_rapport_final_fr_v3_1
CS_rapport_final_fr_v3_1
 
cs15 slide_v2
cs15 slide_v2cs15 slide_v2
cs15 slide_v2
 
4 downloading
4 downloading4 downloading
4 downloading
 
2 architecture anddatastructures
2 architecture anddatastructures2 architecture anddatastructures
2 architecture anddatastructures
 
1 intro history
1 intro history1 intro history
1 intro history
 
0 tutorial contents
0 tutorial contents0 tutorial contents
0 tutorial contents
 
Khmer ocr using gfd_seminar_day
Khmer ocr using gfd_seminar_dayKhmer ocr using gfd_seminar_day
Khmer ocr using gfd_seminar_day
 
Khmer ocr using gfd
Khmer ocr using gfdKhmer ocr using gfd
Khmer ocr using gfd
 
Khmer ocr scientificday_itc
Khmer ocr scientificday_itcKhmer ocr scientificday_itc
Khmer ocr scientificday_itc
 
Khmer ocr itc
Khmer ocr itcKhmer ocr itc
Khmer ocr itc
 

Último

➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachBoston Institute of Analytics
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...amitlee9823
 

Último (20)

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 

7 layout analysis

  • 1. 7. Page Layout Analysis Finding text regions on pages from books, magazines, and newspapers. Ray Smith, Google Inc.
  • 2. Tesseract Tutorial: DAS 2014 Tours France Background ● Historically Tesseract had no page layout analysis, but did have text-line finding, assuming a single column of text. ● Cube relies on Tesseract's page-layout/line finding. ● Tesseract's existing text-line finding is also weak wrt diacritics, especially for Arabic and Thai. ● Past methods tend to be: ○(Bottom-up) Insufficiently aware of page-layout rules or ○(Top-down) Insufficiently general.
  • 3. Tesseract Tutorial: DAS 2014 Tours France Past Methods: Bottom-Up ● Analyze groups of pixels or connected components to classify into text/image/graphic/blank/line. ● Spread/smear/anneal groups of pixels by some neighborhood voting scheme, morphology or voronoi/graph algorithms. ● Find connected components of labels to group pixels into typed regions. ● Box-up regions into rectangles where possible. ● Morphological approach is very similar. ● Hard to include knowledge like "Columns should usually be the same size."
  • 4. Tesseract Tutorial: DAS 2014 Tours France Past Methods: Top Down ● Often starts with a (possibly pre-trained) model of layout, eg 2- column journal page. ● Attempts to cut the image into the required parts, either with recursive vertical/horizontal cuts, or finding rectangles of whitespace. ● Methods usually fail on non-rectangular regions. ● Methods can often only deal with pages that fit the model.
  • 5. Tesseract Tutorial: DAS 2014 Tours France New Method: Hybrid Bottom-Up-Style Pixel labelling Column Finding via Tab-Stops Find Cross-Column and Non-rectangular Regions Image Line-Spacing Model Regions/ Text Lines
  • 6. Tesseract Tutorial: DAS 2014 Tours France Image-Level Page Layout Analysis Input Image Detected Lines Detected Images
  • 7. Tesseract Tutorial: DAS 2014 Tours France Connected Component Analysis Text Components Candidate Tab-Stop Detected Tab Components Segments
  • 8. Tesseract Tutorial: DAS 2014 Tours France Writing direction detection: Got Japanese? Detect Local writing direction and do tab finding again...
  • 9. Tesseract Tutorial: DAS 2014 Tours France Column Finding Connected Tabs Validated Tab Candidate Column Segments Partitions
  • 10. Tesseract Tutorial: DAS 2014 Tours France Block Finding Detected Typed Column Detected Columns Partitions Blocks
  • 11. Tesseract Tutorial: DAS 2014 Tours France Live demo of Page Layout Analysis Simple magazine page with vertical text: api/tesseract unlv/mag.3B/1/8001_044.3B.tif test1 inter segdemo strokewidth Non-rectangular images: api/tesseract unlv/mag.3B/0/8050_078.3B.tif test1 inter strokewidth Multi-orientation Japanese: api/tesseract -l jpn unlv/jpn.tif test1 inter segdemo strokewidth Text on image: api/tesseract unlv/mp00052bw.tif test1 inter strokewidth
  • 12. Tesseract Tutorial: DAS 2014 Tours France Thanks for Listening! Questions?