IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the
Netherlands.




ABBYY & OCR Improvements for IMPACT


 Michael Fuchs
 Senior Product Marketing Manager
 ABBYY Europe
 fuchs@abbyy.com




Agenda
          Who is ABBYY?
           Company Overview
           (Short) Product Overview
           ABBYY Technology in the IMPACT project


          OCR & Processing – IMPACT improvements
           Binarisation, Segmentation,
           Recognition
           Dictionary API, Export Formats

          Lessons Learned, Pricing, Pre-Announcement, Q&A



ABBYY & IMPACT




ABBYY & OCR for IMPACT                    3
ABBYY Group




   Overview ABBYY Group
       Founded in 1989 as BIT Software
       > 1000 employees in 14 offices worldwide
       Headquarters/R&D in Moscow, Russia

ABBYY OCR Products – Usage View

   OCR & Document Conversion

   Desktop/Workgroup (user-driven processing, ready to use)
       FineReader (Professional, Corporate, Site Licence Edition)
        Note: No Gothic/Fraktur OCR!
       PDF Transformer
       FotoReader
       ScreenshotReader
       Users: end users, companies, (libraries)

   Server/Backend (automated processing, ready to use)
       Recognition Server (Professional, Extended Edition)
        Gothic/Fraktur OCR & XML Export Support!
       Users: companies, scan service providers, libraries

   SDK/Integration (automated processing, development needed)
       FineReader Engines (Windows, Linux, Mac OS X, FreeBSD, Embedded Systems)
       Mobile OCR Engine (Android, Symbian, Linux, Windows, Windows Mobile, iOS)
       Users: developers, scan service providers, IMPACT research

What (ABBYY) OCR can read...
   Recognition Languages
       Almost 200 OCR languages
       34 languages with dictionary support and spell check
       Alphabets: Cyrillic, Latin, Greek, Armenian, Hebrew, Thai
       Chinese, Japanese, Korean (CJK) - 4 character sets
        (Chinese (traditional and simplified), Japanese, Korean)
       Arabic (Technical Preview in the SDK)



   Font Types
       Recognition of mixed font types
        (dot-matrix printer, typewriter, Gothic, etc.)
       OCR-A
       OCR-B
       MICR (E13B)
       CMC-7



IMPACT & ABBYY
   ABBYY is the OCR technology provider for IMPACT members

   ABBYY also improved the core technologies for the recognition
    of old documents in IMPACT; focus areas are/were:
        Image pre-processing
        Segmentation
        Character recognition
        Export

   IMPACT members work with the Software Development Kit (SDK)
    FineReader Engine – not the desktop application

   IMPACT focus is/was on research and not on setting up a
    production system ;o)
   Improved technologies are/will be added to current/future products




Designed not to be OCRed




Why ABBYY? -                 OCR …

   Original Image
    [perfect quality :o) ]




   Std. OCR *




   ABBYY
    Fraktur OCR*



                              *Recognition Server 3.0 R1 – Gothic/Fraktur disabled and enabled

ABBYY “History” and Old Fonts Recognition

   FineReader XIX (V7 Technology) 2003
    (METAe result 2000-2003)

   FineReader Engine 9.0 (Release 1) 2008
    (Pre-IMPACT – “State of the Art”)

   FineReader Engine 10            2010
    IMPACT Project Optimizations




ABBYY and Old European Fonts
Accuracy Comparison:

[Chart: recognition accuracy on old European fonts for the 2003, 2008 and 2010 technology versions; up to 98.2% on good-quality images]

ABBYY Technology Version 10 recognition of old European fonts:
     25% more accurate than FRE 9.0
     38% more accurate than FR XIX

OCR Processing Steps
                               &
                  ABBYY Improvements for IMPACT




Processing Steps
   Step 1. Scanning, Image Loading, Pre-Processing and
    Modification
       Compensating image defects and making the document suitable for automatic OCR

   Step 2. Document Layout Analysis
       Layout analysis, detection of document sections like text, images and barcodes

   Step 3. (Optical) Character Recognition
       Automatic recognition of characters, applying the selected recognition languages &
        dictionaries

   Step 4. (optional) Verification - by operators or automated post-correction
       Manual validation of suspicious characters and words

   Step 5. Document Synthesis and Export
       Generating an output document in the selected format
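
The five steps above can be sketched as a simple chain of stages. This is only an illustration of the ordering and of step 4 being optional; the `Page` type and every function name below are hypothetical placeholders and not part of any ABBYY SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """State of one page as it moves through the pipeline (hypothetical type)."""
    image: str                                  # stand-in for real pixel data
    blocks: List[str] = field(default_factory=list)
    text: str = ""
    log: List[str] = field(default_factory=list)


def preprocess(page: Page) -> Page:
    page.log.append("step 1: pre-processing")
    return page


def analyse_layout(page: Page) -> Page:
    page.blocks = ["text"]                      # placeholder: one text block
    page.log.append("step 2: layout analysis")
    return page


def recognise(page: Page) -> Page:
    page.text = "recognised text"               # placeholder OCR result
    page.log.append("step 3: character recognition")
    return page


def verify(page: Page) -> Page:
    page.log.append("step 4: verification")
    return page


def export(page: Page) -> Page:
    page.log.append("step 5: synthesis and export")
    return page


def build_pipeline(with_verification: bool) -> List[Callable[[Page], Page]]:
    steps = [preprocess, analyse_layout, recognise]
    if with_verification:                       # step 4 is optional
        steps.append(verify)
    steps.append(export)
    return steps


def run(page: Page, steps: List[Callable[[Page], Page]]) -> Page:
    for step in steps:
        page = step(page)
    return page


result = run(Page(image="scan-0001"), build_pipeline(with_verification=False))
print(result.log)
```

Keeping each step as a separate function makes it easy to swap one stage (for example a newer binarisation) without touching the others.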



Step 1: Image pre-processing




Step 1: Image pre-processing
Image Loading, Pre-Processing and Modification



     Intelligent background filtering




     Adaptive Binarisation




    General binarisation on an image level cannot
    deliver good results for OCR
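
A small sketch of why a single image-level threshold can fail where a local one succeeds. It uses plain local-mean thresholding, a textbook form of adaptive binarisation, and is not ABBYY's algorithm; the synthetic page, the window radius and the offset are assumptions chosen for the illustration.

```python
from typing import List, Set, Tuple

Image = List[List[int]]          # grayscale rows, 0 = black, 255 = white
Pixels = Set[Tuple[int, int]]    # (row, column) positions classified as ink


def global_binarise(img: Image, threshold: int) -> Pixels:
    """One threshold for the whole image."""
    return {(y, x) for y, row in enumerate(img)
            for x, value in enumerate(row) if value < threshold}


def adaptive_binarise(img: Image, radius: int = 2, offset: int = 30) -> Pixels:
    """A pixel is ink if it is clearly darker than the mean of its neighbourhood."""
    height, width = len(img), len(img[0])
    ink = set()
    for y in range(height):
        for x in range(width):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            local_mean = sum(window) / len(window)
            if img[y][x] < local_mean - offset:
                ink.add((y, x))
    return ink


# Synthetic page: the background darkens from left (220) to right (88),
# and two ink pixels are each 70 levels darker than their local background.
HEIGHT, WIDTH = 5, 12
page = [[220 - 12 * x for x in range(WIDTH)] for _ in range(HEIGHT)]
true_ink = {(2, 2), (2, 9)}
for y, x in true_ink:
    page[y][x] -= 70

adaptive_ok = adaptive_binarise(page) == true_ink
global_ok = any(global_binarise(page, t) == true_ink for t in range(256))
print(adaptive_ok, global_ok)   # prints: True False
```

On this page the left ink pixel (126) is brighter than the right-hand background (88), so no single threshold separates ink from paper, while the local comparison does.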

Step 1: Image pre-processing
New V10: Binarisation, Textured Background optimisations


                                                Original scan




V9 binarisation




       New V10 binarisation


Step 1: Image pre-processing
New V10: Binarisation, Textured Background optimisations
                                            Original scan




V9 binarisation

                         V10 binarisation

Step 1: Image pre-processing
New V10: Binarisation for the IMPACT project



       Original             State of Art (V9)        New (V10)




                                                   No text from the
                                                    other page!


Step 2: Document Layout Analysis




Step 2: Document Layout Analysis
Analyze layout and find text, images, tables and barcodes




Step 2: Document Layout Analysis               (old Newspapers)
Segmentation Improvements: Image/Text detection – Example 1/3
        V9 Technology                          V10 Technology




 Part of the column was detected as an image

Step 2: Document Layout Analysis (old Newspapers)
Segmentation Improvements: Word Order Detection– Example 2/3
        V9 Technology                      V10 Technology




                                     Fewer linear word order errors

Step 2: Document Layout Analysis (old Newspapers)
Segmentation Improvements: Lost text (no Detection) – Example 3/3
        V9 Technology                      V10 Technology




                                               Less lost text
Step 2: Document Layout Analysis
Segmentation Improvements: IMPACT Results over time

       Before IMPACT:
        Overall segmentation improvements
        ●   Better picture detection
        ●   Better separators
        ●   Better page layout reconstruction
        Only a random set of old newspapers available

       After IMPACT:
       IMPACT Segmentation Ground Truth available
       New (internal) DA model for historic newspapers
       New segmentation evaluation methodology
       Evaluation results on newspapers
        ● 40% fewer split/merge errors
        ● 25% less garbage and lost text
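
To make "split/merge errors" concrete, here is one simple way such errors could be counted from region bounding boxes. The slides do not describe the IMPACT evaluation methodology, so this is an assumed, simplified definition for illustration only: a ground-truth region covered by more than one detected region counts as a split, and a detected region covering more than one ground-truth region counts as a merge. The `min_share` cut-off is likewise an assumption.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def area(box: Box) -> int:
    return (box[2] - box[0]) * (box[3] - box[1])


def overlap_area(a: Box, b: Box) -> int:
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, width) * max(0, height)


def significant(a: Box, b: Box, min_share: float) -> bool:
    """Overlap counts only if it covers at least min_share of the smaller box."""
    return overlap_area(a, b) >= min_share * min(area(a), area(b))


def count_split_merge(truth: List[Box], detected: List[Box],
                      min_share: float = 0.1) -> Tuple[int, int]:
    """Return (splits, merges) under the simplified definition above."""
    splits = sum(
        1 for t in truth
        if sum(significant(t, d, min_share) for d in detected) > 1
    )
    merges = sum(
        1 for d in detected
        if sum(significant(t, d, min_share) for t in truth) > 1
    )
    return splits, merges


# Three newspaper columns in the ground truth.
truth = [(0, 0, 100, 300), (120, 0, 220, 300), (240, 0, 340, 300)]
# Detection cut the first column in two and fused the other two.
detected = [(0, 0, 100, 150), (0, 150, 100, 300), (120, 0, 340, 300)]

print(count_split_merge(truth, detected))   # prints: (1, 1)
```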




Step 3: Text/Character Recognition




Step 3: Text/Character Recognition
   Samples for Classifiers used in ABBYY technologies
    After line detection, character recognition is applied with different classifiers
        Raster classifier                                Contour classifier




        Structure classifier                    Feature differentiating classifier




Step 3: Text/Character Recognition
Optimization and new Developments

   Improved Gothic Classifiers
       A significant amount of time was invested in gothic classifier training
       Ground truth material selected by the libraries (for historical relevance) was used
       New gothic graphemes were added



   Results
       Good quality images: 2.8% (total) error rate on the test set used, about a
        20% improvement over the “state of the art” (V9) = almost comparable to modern
        documents

       Bad quality images: 7% (total) error rate on the test set used, about a
        30% improvement over the “state of the art” (V9)

       Most of the improvements are available in current ABBYY products:
        ABBYY FineReader Engine 10 (SDK) & Recognition Server 3.0
        Quality optimization will be continued in future releases and technology cycles
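
A quick worked calculation of what these figures would imply for V9. The slide does not say how "improvement" is measured; the sketch assumes it means a relative reduction of the error rate, so the resulting V9 numbers are an inference and not figures from the presentation.

```python
def implied_previous_error(new_error_pct: float, relative_improvement: float) -> float:
    """Error rate before the improvement, assuming a relative error reduction."""
    return new_error_pct / (1.0 - relative_improvement)


# Figures from the slide: V10 error rate and stated improvement over V9.
good_v9 = implied_previous_error(2.8, 0.20)
bad_v9 = implied_previous_error(7.0, 0.30)

print(f"good quality images: V9 about {good_v9:.1f}% -> V10 2.8%")
print(f"bad quality images:  V9 about {bad_v9:.1f}% -> V10 7.0%")
```

Under this reading, V9 would have had roughly 3.5% errors on the good-quality set and roughly 10% on the bad-quality set.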

Step 3: Text/Character Recognition
Optimization and new Developments


   Old Slavonic as new OCR Language
    New Development

    Before




    Now




Quality-Test-Comparison:
           Binarisation & Recognition Improvements




Binarisation & Recognition Improvements

   How to evaluate the recognition improvements of binarisation?
    Binarisation & recognition quality go hand in hand!

          Binarisation   Recognition technology   # Errors
          V9             V9                       100% (baseline)
          V9             V10                      -5%
          V10            V9                       -11%
          V10            V10                      -15%
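
The four error figures above show that the two improvements do not simply add up. A short calculation compares the measured combined reduction with two naive predictions; treating the two effects as independent is an assumption made here for illustration and is not a claim from the slides.

```python
recognition_only = 0.05    # V9 binarisation + V10 recognition
binarisation_only = 0.11   # V10 binarisation + V9 recognition
measured_combined = 0.15   # V10 binarisation + V10 recognition

# Prediction 1: reductions simply add.
additive = recognition_only + binarisation_only

# Prediction 2: each improvement removes its share of the remaining errors.
multiplicative = 1 - (1 - recognition_only) * (1 - binarisation_only)

print(f"additive:       -{additive:.1%}")
print(f"multiplicative: -{multiplicative:.2%}")
print(f"measured:       -{measured_combined:.0%}")
```

The multiplicative estimate (about 15.5%) lies closer to the measured 15% than the additive one (16%), which fits the point that part of what better binarisation fixes is also fixed by better recognition.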

Step 3-5: Dictionaries & Export




Step 3 – 5: Other Optimizations

   External Dictionary API Tuning
       External Dictionary API was available in the FineReader Engine (SDK)
       Support for any language, any time period
       API was/is heavily used by IMPACT language partners to run quality tests



   New ALTO XML Export Formats
       FineReader Engine 10 R2, December 2010
       Recognition Server 3.0, July 2011
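
For readers who have not worked with ALTO output, a minimal sketch of reading it with the Python standard library. The sample is a hand-written, heavily trimmed fragment (not schema-complete and not actual FineReader output); it relies only on the standard ALTO elements `TextLine` and `String` and the attributes `CONTENT` and `WC` (word confidence). The 0.7 confidence cut-off is an arbitrary assumption.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hand-written fragment for illustration; real files carry coordinates,
# a Description section and more attributes.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="B1">
          <TextLine>
            <String CONTENT="Die" WC="0.98"/>
            <SP/>
            <String CONTENT="Zeitung" WC="0.91"/>
          </TextLine>
          <TextLine>
            <String CONTENT="von" WC="0.99"/>
            <SP/>
            <String CONTENT="heute" WC="0.62"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag: str) -> str:
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text: str) -> List[str]:
    """Return the recognised text, one string per TextLine."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [child.get("CONTENT", "") for child in element
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return lines


def low_confidence_words(xml_text: str, threshold: float = 0.7) -> List[str]:
    """Return words whose WC (word confidence) is below the threshold."""
    root = ET.fromstring(xml_text)
    return [element.get("CONTENT", "") for element in root.iter()
            if local_name(element.tag) == "String"
            and element.get("WC") is not None
            and float(element.get("WC")) < threshold]


print(alto_lines(SAMPLE_ALTO))
print(low_confidence_words(SAMPLE_ALTO))
```

Listing low-confidence words is one way to pick candidates for the optional verification step.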




Additional Notes




Further Information & Trial Versions

   The ABBYY Gothic/Fraktur OCR Portal:
    www.frakturschrift.com




What IMPACT taught ABBYY about
Libraries & Mass Digitization projects…
       The Reality
        Masses of books/documents are available & already scanned
        It is unclear if Antiqua and/or Gothic/Fraktur fonts are used in the documents
        Pre-sorting is impossible: it would cost too much time and money

       ABBYY Europe's Answer
              Reduced the pricing for mixed “Old” + “Modern” font OCR projects
              The pricing is now ready for “mass processing”

       Examples: Recognition Server 3.0 with “Gothic” enabled
        10,000 pages – 299 euros – available online
        500,000 pages* – 5,000 euros = 1 euro cent per page = ca. 2,000 books of 250
         pages each
        Over 3 million pages* – ca. 0.52 euro cent per page = 12,000 books at 1.25 € each
         (250 pages)
        Over 10 million pages* – ca. 40,000 books = ca. 0.5 € per book

              ... No more excuses for not OCRing :o)
       * page size is A4, bigger formats are counted as multiple pages
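
The per-page and per-book figures can be checked with a few lines of arithmetic. The sketch uses only the two tiers for which the slide gives a total price, and the slide's own assumption of 250 pages per book; the function and variable names are illustrative.

```python
from typing import Tuple

PAGES_PER_BOOK = 250  # book size used on the slide


def cost_breakdown(pages: int, price_eur: float) -> Tuple[float, int, float]:
    """Return (euro cents per page, number of books, euros per book)."""
    cents_per_page = price_eur * 100 / pages
    books = pages // PAGES_PER_BOOK
    eur_per_book = price_eur / books
    return cents_per_page, books, eur_per_book


tiers = {
    "10,000 pages": (10_000, 299.0),
    "500,000 pages": (500_000, 5_000.0),
}

for name, (pages, price) in tiers.items():
    cents, books, per_book = cost_breakdown(pages, price)
    print(f"{name}: {cents:.2f} cent/page, {books} books, {per_book:.2f} EUR/book")
```

For the 500,000-page tier this reproduces the slide's 1 euro cent per page and 2,000 books, which works out to 2.50 euros per book.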
Pre-Announcement
ABBYY Online OCR Services with Gothic/Fraktur

       The ABBYY Gothic/Fraktur OCR Portal:
        finereader.abbyyonline.com
        Historic OCR added just last week
        Web GUI to upload documents and get results
        Simple to use
        Low Volume, ad hoc Usage
        Instant results, quality evaluation
        Pay as you go

       ABBYY Online OCR SDK
        OCR Service with API and XML Output
        Runs on Windows Azure
        Currently Closed Beta Test
        Public Beta Test Q1/2012


Summary




The whole is greater than the
                 sum of its parts
                         (Aristotle)

Thank you for your attention!

                                  Questions?




       Michael Fuchs
       Senior Product Marketing Manager
       ABBYY Europe
       fuchs@abbyy.com


 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 

IMPACT Final Conference - Michael Fuchs

  • 1. IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands. ABBYY & OCR Improvements for IMPACT. Michael Fuchs, Senior Product Marketing Manager, ABBYY Europe, fuchs@abbyy.com
  • 2. Agenda. Who is ABBYY?: Company Overview; (Short) Product Overview; ABBYY Technology in the IMPACT project. OCR & Processing, IMPACT improvements: Binarisation, Segmentation, Recognition; Dictionary API, Export Formats. Lessons Learned, Pricing, Pre-Announcement, Q&A.
  • 3. ABBYY & IMPACT ABBYY & OCR for IMPACT 3
  • 4. ABBYY Group. Overview: founded in 1989 as BIT Software; > 1000 employees in 14 offices worldwide; headquarters/R&D in Moscow, Russia.
  • 5. ABBYY OCR Products: Usage View (OCR & Document Conversion). Desktop/Workgroup (user driven processing, ready to use): FineReader (Professional, Corporate, Site Licence Edition; note: no Gothic/Fraktur OCR!), PDF Transformer, FotoReader, ScreenshotReader; users are end users, companies, (libraries). Server/Backend (automated processing, ready to use): Recognition Server (Professional, Extended Edition) with Gothic/Fraktur OCR & XML export support; users are companies, scan service providers, libraries. SDK/Integration (automated processing, development needed): FineReader Engines (Windows, Linux, Mac OS X, FreeBSD, embedded systems) and Mobile OCR Engine (Android, Symbian, Linux, Windows, Windows Mobile, iOS); users are developers, scan service providers, IMPACT research.
  • 6. What (ABBYY) OCR can read... Recognition languages: almost 200 OCR languages; 34 languages with dictionary support and spell check; alphabets: Cyrillic, Latin, Greek, Armenian, Hebrew, Thai; Chinese, Japanese, Korean (CJK): 4 character sets (Chinese traditional and simplified, Japanese, Korean); Arabic (Technical Preview in the SDK). Font types: recognition of mixed font types (dot-matrix printer, typewriter, Gothic, etc.); OCR-A; OCR-B; MICR (E13B); CMC-7.
  • 7. IMPACT & ABBYY. ABBYY is the OCR technology provider for IMPACT members. ABBYY also improved the core technologies for the recognition of old documents in IMPACT; the focus areas are/were image pre-processing, segmentation, character recognition and export. IMPACT members work with the Software Development Kit (SDK) FineReader Engine, not the desktop application. The IMPACT focus is/was on research and not on setting up a production system ;o) Improved technologies are/will be added to current/future products.
  • 8. Designed not to be OCRed
  • 9. Why ABBYY? - OCR … Original image [perfect quality :o) ]; Std. OCR*; ABBYY Fraktur OCR*. *Recognition Server 3.0 R1, with Gothic/Fraktur disabled and enabled.
  • 10. ABBYY “History” and Old Fonts Recognition. FineReader XIX (V7 technology), 2003 (METAe result 2000-2003); FineReader Engine 9.0 (Release 1), 2008 (pre-IMPACT, “state of the art”); FineReader Engine 10, 2010 (IMPACT project optimizations).
  • 11. ABBYY and Old European Fonts. Accuracy comparison (2003, 2008, 2010): up to 98.2% on good quality images. ABBYY technology version 10 recognition of old European fonts: 25% more accurate than FRE 9.0; 38% more accurate than FR XIX.
  • 12. OCR Processing Steps & ABBYY Improvements for IMPACT
  • 13. Processing Steps. Step 1: Scanning, Image Loading, Pre-Processing and Modification (compensating for image defects and making the document suitable for automatic OCR). Step 2: Document Layout Analysis (layout analysis, detection of document sections such as text, images and barcodes). Step 3: (Optical) Character Recognition (automatic recognition of characters, applying the selected recognition languages and dictionaries). Step 4 (optional): Verification by operators or automated post-correction (manual validation of suspicious characters and words). Step 5: Document Synthesis and Export (generating an output document in the selected format).
  • 14. Step 1: Image pre-processing
  • 15. Step 1: Image pre-processing. Image Loading, Pre-Processing and Modification: intelligent background filtering; adaptive binarisation. General binarisation at the image level cannot deliver good results for OCR.
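Why a single image-level threshold struggles can be illustrated with a small sketch. This is not ABBYY's algorithm, which the slides do not describe; it is a plain local-mean threshold on a synthetic page, and the function names, window radius and offset are assumptions chosen only for illustration.

```python
def global_binarise(img, threshold=128):
    """One threshold for the whole image: 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in img]


def adaptive_binarise(img, radius=2, offset=10):
    """Local-mean threshold: a pixel counts as ink if it is clearly
    darker than the average of its (2*radius+1)^2 neighbourhood."""
    height, width = len(img), len(img[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            row.append(1 if img[y][x] < local_mean - offset else 0)
        out.append(row)
    return out


# Synthetic page: the background darkens from left (220) to right (90),
# as on an unevenly lit or stained scan. Two vertical strokes are each
# 60 grey levels darker than the background around them.
WIDTH, HEIGHT = 14, 5
background = [220 - 10 * x for x in range(WIDTH)]
page = [list(background) for _ in range(HEIGHT)]
STROKE_COLUMNS = (2, 11)
for x in STROKE_COLUMNS:
    for y in range(HEIGHT):
        page[y][x] = background[x] - 60

global_result = global_binarise(page)
adaptive_result = adaptive_binarise(page)

for name, result in (("global", global_result), ("adaptive", adaptive_result)):
    ink_columns = [x for x, value in enumerate(result[0]) if value]
    print(name, "ink columns:", ink_columns)
```

On this synthetic page the global threshold misses the stroke on the bright side and turns the dark end of the background into ink, while the local threshold keeps only the two strokes.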
  • 16. Step 1: Image pre-processing. New V10: binarisation, textured background optimisations. Images: original scan; V9 binarisation; new V10 binarisation.
  • 17. Step 1: Image pre-processing. New V10: binarisation, textured background optimisations. Images: original scan; V9 binarisation; V10 binarisation.
  • 18. Step 1: Image pre-processing. New V10: binarisation for the IMPACT project. Images: original; state of the art (V9); new (V10). No text from the other page!
  • 19. Step 2: Document Layout Analysis
  • 20. Step 2: Document Layout Analysis. Analyze the layout and find text, images, tables and barcodes.
  • 21. Step 2: Document Layout Analysis (old newspapers). Segmentation improvements: image/text detection, example 1/3. V9 technology vs. V10 technology: with V9, part of the column was detected as an image.
  • 22. Step 2: Document Layout Analysis (old newspapers). Segmentation improvements: word order detection, example 2/3. V9 technology vs. V10 technology: fewer linear word order errors.
  • 23. Step 2: Document Layout Analysis (old newspapers). Segmentation improvements: lost text (no detection), example 3/3. V9 technology vs. V10 technology: less lost text.
  • 24. Step 2: Document Layout Analysis. Segmentation improvements: IMPACT results over time. Before IMPACT: overall segmentation improvements (better picture detection, better separators, better page layout reconstruction); only a random set of old newspapers available. After IMPACT: IMPACT segmentation ground truth available; new (internal) DA model for historic newspapers; new segmentation evaluation methodology; evaluation results on newspapers: 40% fewer split/merge errors, 25% less garbage and lost text.
  • 25. Step 3: Text/Character Recognition
  • 26. Step 3: Text/Character Recognition. Samples of classifiers used in ABBYY technologies: after line detection, character recognition is applied with different classifiers (raster classifier, contour classifier, structure classifier, feature differentiating classifier).
  • 27. Step 3: Text/Character Recognition. Optimization and new developments. Improved Gothic classifiers: a significant amount of time was invested in Gothic classifier training; the library selection of ground truth material (historical relevance) was used; new Gothic graphemes were added. Results: good quality images: 2.8% (total) error rate on the test set used, about a 20% improvement over the “state of the art” (V9) and almost comparable to modern documents; bad quality images: 7% (total) error rate on the test set used, about a 30% improvement over the “state of the art” (V9). Most of the improvements are available in current ABBYY products: ABBYY FineReader Engine 10 (SDK) and Recognition Server 3.0. Quality optimization will continue in future releases and technology cycles.
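The error rates on slide 27 can be turned back into implied V9 baselines with a little arithmetic. The sketch below assumes that "improvement" means a relative reduction of the error rate; the slide does not say so explicitly, and the function name is made up for this example.

```python
def implied_baseline(error_rate, relative_improvement):
    """Error rate before an improvement, assuming the improvement is a
    relative reduction of the error rate (an assumption, see lead-in)."""
    return error_rate / (1.0 - relative_improvement)


# V10 figures quoted on the slide.
good_v10, bad_v10 = 0.028, 0.07

# Implied V9 ("state of the art") figures under the assumption above.
good_v9 = implied_baseline(good_v10, 0.20)
bad_v9 = implied_baseline(bad_v10, 0.30)

for label, before, after in (
    ("good quality", good_v9, good_v10),
    ("bad quality", bad_v9, bad_v10),
):
    print(f"{label}: V9 error {before:.1%} -> V10 error {after:.1%} "
          f"(character accuracy {1 - after:.1%})")
```

Under that reading, V9 would have been at roughly 3.5% errors on the good quality set and roughly 10% on the bad quality set.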
  • 28. Step 3: Text/Character Recognition. Optimization and new developments: Old Slavonic as a new OCR language (new development; images: before, now).
  • 29. Quality Test Comparison: Binarisation & Recognition Improvements
  • 30. Binarisation & Recognition Improvements. How to evaluate the recognition improvements of binarisation? Binarisation and recognition quality go hand in hand! Number of errors by technology combination: 100% (baseline) with V9 binarisation & V9 recognition; 5% fewer with V9 binarisation & V10 recognition; 11% fewer with V10 binarisation & V9 recognition; 15% fewer with V10 binarisation & V10 recognition.
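A quick check of how the two improvements on slide 30 combine. The sketch reads the percentages as error counts relative to the V9/V9 baseline and compares the reported combined figure with two simple models; both models are illustrative assumptions, not something stated in the talk.

```python
# Error counts relative to the V9 binarisation + V9 recognition baseline.
recognition_only = 0.95   # V9 binarisation + V10 recognition: 5% fewer errors
binarisation_only = 0.89  # V10 binarisation + V9 recognition: 11% fewer errors
reported_combined = 0.85  # V10 binarisation + V10 recognition: 15% fewer errors

# Model A: the two gains act independently and multiply.
independent_estimate = recognition_only * binarisation_only

# Model B: the two gains simply add up.
additive_estimate = 1.0 - ((1.0 - recognition_only) + (1.0 - binarisation_only))

print(f"independent: {independent_estimate:.4f}")
print(f"additive:    {additive_estimate:.4f}")
print(f"reported:    {reported_combined:.4f}")
```

The slide figures are rounded, so neither model can be ruled out, but the reported 15% sits slightly closer to the multiplicative estimate than to the additive one.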
  • 31. Step 3-5: Dictionaries & Export
  • 32. Step 3-5: Other Optimizations. External Dictionary API tuning: the External Dictionary API was already available in the FineReader Engine (SDK); support for any language, any time period; the API was/is heavily used by IMPACT language partners to run quality tests. New ALTO XML export formats: FineReader Engine 10 R2, December 2010; Recognition Server 3.0, July 2011.
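For readers unfamiliar with ALTO, the sketch below builds a minimal ALTO-like document for one text line using Python's standard library. It is not the output of FineReader Engine or Recognition Server: the element and attribute names follow the public ALTO schema, the XML namespace declaration and most optional elements are omitted for brevity, and the function name, sample words and coordinates are made up.

```python
import xml.etree.ElementTree as ET


def words_to_alto(words, page_width, page_height):
    """Build a minimal ALTO-like tree for a single text line.

    words: iterable of (text, hpos, vpos, width, height) tuples.
    """
    alto = ET.Element("alto")
    layout = ET.SubElement(alto, "Layout")
    page = ET.SubElement(layout, "Page", ID="P1",
                         WIDTH=str(page_width), HEIGHT=str(page_height))
    print_space = ET.SubElement(page, "PrintSpace")
    block = ET.SubElement(print_space, "TextBlock", ID="B1")
    line = ET.SubElement(block, "TextLine")
    for text, hpos, vpos, width, height in words:
        # Each recognised word becomes a String element with its text
        # and its bounding box on the page.
        ET.SubElement(line, "String", CONTENT=text,
                      HPOS=str(hpos), VPOS=str(vpos),
                      WIDTH=str(width), HEIGHT=str(height))
    return alto


# Made-up recognition result for illustration only.
sample_words = [
    ("Allgemeine", 120, 300, 410, 60),
    ("Zeitung", 550, 300, 290, 60),
]
alto_root = words_to_alto(sample_words, page_width=2480, page_height=3508)
xml_text = ET.tostring(alto_root, encoding="unicode")
print(xml_text)
```

Keeping word coordinates next to the recognised text is what makes a format of this kind useful for highlighting search hits on the page image.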
  • 33. Additional Notes
  • 34. Further Information & Trial Versions. The ABBYY Gothic/Fraktur OCR Portal: www.frakturschrift.com
  • 35. What IMPACT taught ABBYY about Libraries & Mass Digitisation projects… The reality: masses of books/documents are available and already scanned; it is unclear whether Antiqua and/or Gothic/Fraktur fonts are used in the documents; pre-sorting is impossible, as it would be too time-consuming and costly. ABBYY Europe's answer: reduced pricing for mixed “old” + “modern” font OCR projects; the pricing is now ready for “mass processing”. Examples, Recognition Server 3.0 with “Gothic” enabled: 10,000 pages: 299 Euro, available online; 500,000 pages*: 5,000 Euro = 1 Euro cent per page = approx. 2,000 books of 250 pages; over 3 million pages*: approx. 0.52 Euro cent per page = 12,000 books at 1.25 € each (250 pages); over 10 million pages*: approx. 40,000 books = approx. 0.5 € per book. ... No more excuses for not OCRing :o) * Page size is A4; bigger formats are counted as multiple pages.
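The per-page and per-book figures on slide 35 can be cross-checked with a few lines of arithmetic. The sketch below uses only the prices and the 250-pages-per-book convention quoted on the slide; the helper names are made up for this example.

```python
PAGES_PER_BOOK = 250  # book size used on the slide


def cents_per_page(total_euro, pages):
    """Price per page in Euro cents for a package of `pages` pages."""
    return total_euro * 100.0 / pages


def euro_per_book(cents_page, pages_per_book=PAGES_PER_BOOK):
    """Price per book in Euro, given a per-page price in Euro cents."""
    return cents_page * pages_per_book / 100.0


small_package = cents_per_page(299, 10_000)       # 10,000 pages for 299 Euro
medium_package = cents_per_page(5_000, 500_000)   # 500,000 pages for 5,000 Euro
books_medium = 500_000 // PAGES_PER_BOOK

books_3m = 3_000_000 // PAGES_PER_BOOK
book_cost_3m = euro_per_book(0.52)                # at approx. 0.52 cent per page

books_10m = 10_000_000 // PAGES_PER_BOOK
page_cost_10m = 0.5 * 100.0 / PAGES_PER_BOOK      # from approx. 0.5 Euro per book

print(f"10,000 pages:  {small_package:.2f} cent/page")
print(f"500,000 pages: {medium_package:.2f} cent/page, {books_medium} books")
print(f"3 million:     {books_3m} books at {book_cost_3m:.2f} Euro each")
print(f"10 million:    {books_10m} books, {page_cost_10m:.2f} cent/page")
```

At 0.52 cent per page a 250-page book comes to about 1.30 €, slightly above the 1.25 € on the slide; both slide figures are marked as approximate.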
  • 36. Pre-Announcement: ABBYY Online OCR Services with Gothic/Fraktur. The ABBYY Gothic/Fraktur OCR Portal: finereader.abbyyonline.com; historic OCR added just last week; web GUI to upload documents and get results; simple to use; low volume, ad hoc usage; instant results, quality evaluation; pay as you go. ABBYY Online OCR SDK: OCR service with API and XML output; runs on Windows Azure; currently in closed beta test; public beta test Q1/2012.
  • 37. Summary
  • 38. The whole is greater than the sum of its parts (Aristotle)
  • 39. Thank you for your attention! Questions? Michael Fuchs, Senior Product Marketing Manager, ABBYY Europe, fuchs@abbyy.com