An Introduction to Document Scanning
Business Document Scanning 101:
From the Data Capture Perspective
So you have a lot of this?
And you’ve decided this is the answer.
So you need a crash course in scanning.
Lessons:
Lesson 1: Simplex or Duplex
Lesson 2: Resolution
Lesson 3: Color Depth
Lesson 4: File Formats
Lesson 5: Indexing
Lesson 6: Document Prep and Estimating Volumes
Homework: Learn More About Data Capture and Document Management
Lesson 1: Simplex or Duplex
Are the documents single or double-sided?
This may seem obvious but…
You may not want documents such as
purchase invoices scanned in duplex where
the back of the document only contains terms
and conditions.
On the other hand, if the documents have
high legal importance, you may want every
conceivable item of information captured,
such as small signatures or notes on the back.
Duplex scanning requires
more scanning
time/processing and
results in larger files.
And you don’t have to be
a genius to know that is
more costly.
Lesson 2: Resolution
So what is resolution and why does it matter?
Resolution is expressed as the number of dots
per inch (dpi) or, less frequently, pixels per inch
(ppi). A pixel (“picture element”) is one of the
samples that make up the image, so resolution
really describes how finely the image was sampled.
What is Resolution?
Implications of Resolution
This graphic contains
two images, a “0” as a
grayscale image and an
“x” as black and white.
Implications of Resolution
• If we halved the size of the grid squares horizontally and
vertically (doubling the resolution), the edges would
appear smoother and produce a better quality image.
The inverse would be true if we doubled the size of the
squares.
• If we kept the squares the same size but reduced the
size of the characters significantly, the resolution would be
insufficient to capture them.
Implications of Resolution
• The higher the resolution, the better the image
quality.
• For small characters, increase the resolution to
capture them effectively
So:
And, the higher the resolution,
the slower the scan and the
larger the file.
Which means higher scanning
and file storage costs, Einstein.
Typical Scanning Resolutions
• Web graphic – 96 dpi
• Standard archive document – 200 dpi
• Document required for optical character
recognition (OCR) – 300 dpi
• Plans/drawings for vectorization – 400 dpi
• Documents required for historical archiving –
600 dpi
Resolution is generally determined by intended
use.
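To see what those numbers mean in practice, here is a minimal sketch that turns a dpi setting into pixel dimensions. The dpi values come from the list above; the function name and the US Letter page size (8.5 x 11 inches) are assumptions made for illustration.

```python
# Typical resolutions from the list above, keyed by intended use.
TYPICAL_DPI = {
    "web graphic": 96,
    "standard archive document": 200,
    "document for OCR": 300,
    "plans/drawings for vectorization": 400,
    "historical archiving": 600,
}


def pixel_dimensions(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumption: a US Letter page, 8.5 x 11 inches.
for use, dpi in TYPICAL_DPI.items():
    width, height = pixel_dimensions(8.5, 11, dpi)
    print(f"{use:34s} {dpi:4d} dpi -> {width} x {height} pixels")
```

Doubling the dpi doubles both dimensions, so the pixel count (and the work the scanner has to do) goes up fourfold.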
Lesson 3: Color Depth
Documents scanned in black and white are
always scanned as grayscale within the
scanner. The scanner then applies a process
known as thresholding to the image to produce
the black and white image.
Thresholding simply determines when a pixel
should be black or white.
Understanding Black and White
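A minimal sketch of thresholding, assuming a single fixed cutoff applied to every pixel. The cutoff of 128 and the sample values are illustrative assumptions; actual scanners may choose the threshold differently, for example by adapting it to the page.

```python
def threshold(gray_rows, cutoff=128):
    """Convert 8-bit grayscale rows (0 = black, 255 = white) to 1-bit rows.

    Pixels darker than the cutoff become 1 (black); the rest become 0 (white).
    """
    return [[1 if value < cutoff else 0 for value in row] for row in gray_rows]


# Illustrative grayscale samples, not taken from a real scan.
gray = [
    [250, 240, 30, 245],
    [235, 20, 25, 240],
    [130, 127, 200, 10],
]

for row in threshold(gray):
    print("".join("#" if bit else "." for bit in row))
```

Note how the values 130 and 127 in the last row land on opposite sides of the cutoff: the choice of threshold decides whether faint marks survive as black or vanish into white.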
Grayscale is used when the image contains
color or grayscale data and the tone of the
image needs to be retained, i.e. photographs or
shaded graphics.
Understanding Grayscale
Color is used when the image contains color
data. Some users wish to retain only important
color information (for example, land
boundaries or graphical data) and not
letterhead logos, highlighter marks, etc.
Understanding Color
File Storage Requirements
Bits per pixel:
• Color – 24
• Grayscale – 8
• Black and white – 1
So, before compression, the storage requirements for a grayscale image
are 8 times larger than for black and white, and color requirements
are 24 times larger than for black and white.
And, remember Einstein, larger files equal higher costs.
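The arithmetic can be sketched as follows. This assumes an uncompressed image with no file headers or row padding, and a US Letter page at 300 dpi; the function name is made up for illustration.

```python
BITS_PER_PIXEL = {"black and white": 1, "grayscale": 8, "color": 24}


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw image size in bytes, ignoring headers, padding and compression."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return (pixels * bits_per_pixel + 7) // 8  # round up to whole bytes


# Assumption: one side of a US Letter page (8.5 x 11 inches) at 300 dpi.
for mode, bits in BITS_PER_PIXEL.items():
    size = uncompressed_bytes(8.5, 11, 300, bits)
    print(f"{mode:16s} {bits:2d} bits/pixel -> {size / 1_000_000:.1f} MB")
```

Compression changes the real file sizes considerably, so treat these figures as the starting point rather than what ends up on disk.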
Lesson 4: File Formats
TIFF
JPEG
PDF
For an in-depth look visit: PDF v. TIFF
• Well established format
• Most often used for black and white documents
• Supports multiple pages
• Interpreted correctly by most applications, although
certain color implementations call for caution
• “Group 4” format refers to the compression method
used on black and white images which is a “lossless”
compression where original data is not lost in
compression/decompression.
Understanding TIFF*
TIFF
*Tagged Image File Format
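To illustrate what “lossless” means, here is a minimal sketch using simple run-length encoding on one row of black and white pixels. This is not the Group 4 algorithm, which is more elaborate and also uses the previous scan line; run-length encoding is swapped in only because it shows the same property: decoding gives back exactly the original data.

```python
from itertools import groupby


def rle_encode(bits):
    """Encode a row of 0/1 pixels as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(bits)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    return [value for value, length in runs for _ in range(length)]


# Illustrative row: white margin, a short black stroke, more white.
row = [0] * 12 + [1] * 3 + [0] * 9
encoded = rle_encode(row)

print(encoded)
print("lossless round trip:", rle_decode(encoded) == row)
```

Long runs of white are why black and white text pages compress so well.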
• Well established format by Adobe
• Supports color, grayscale, and black and white
• Supports multiple pages
• Images are generally stored using Group 4 or JPEG
compression, although other methods are supported too.
• Used when more advanced features are needed
within the file such as embedded Optical Character
Recognition (OCR), hyperlinking, digital signing
and other security features.
Understanding PDF*
PDF
*Portable Document Format
Searchable PDF:
Understanding PDF Variations
PDF
Many scanning applications can create searchable
PDF files. Here, OCR technology is applied
to make the file text searchable. Your application
may label this as “make searchable”, “apply OCR”,
“text-under-image” or “searchable PDF.” If selected,
your file will be text searchable or text selectable
within the Acrobat viewer and many other programs
that search PDF files.
PDF/A:
Understanding PDF Variations
PDF
PDF/A is an ISO standard for digital preservation or
archiving of electronic documents.
It differs from standard PDF by omitting features not
suited to long-term archiving, such as font
linking.
Adoption is growing in international government and industry
segments, including legal systems, libraries,
newspapers, and regulated industries.
Understanding JPEG
JPEG
*Joint Photographic Experts Group
• Well established format
• Most often used for photographs and graphics
• Supports single page only
• A “lossy” compression format, that is, some of the
data is lost during compression. However, it provides
good compression ratios for grayscale and color
images.
Compression and File Size
*Comparison courtesy of Wikipedia
OMG,
right?
The bottom line: experiment with your
images and file size. A middle quality
scan may meet your needs and save
tremendous file space.
Lesson 5: Indexing
For an in-depth look visit: What is Document Indexing?
What is Indexing?
Document indexing (sometimes referred to as
metadata) enables users to quickly and
efficiently locate their documents, either
through a folder structure, database or
electronic document management system.
Avoid a disaster
Great care should be taken to design an efficient indexing
scheme. If the design is not devised correctly at the outset,
trying to rectify it later can be both difficult and costly.
Sometimes it makes sense to replicate the current manual
method for document location to create a familiar, but faster
system.
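A minimal sketch of what an indexing scheme amounts to: a fixed set of fields attached to every scanned file, plus a way to search on them. The field names (customer, document type, year) and the sample records are hypothetical; your own scheme should mirror how people already look for documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """Hypothetical index fields stored alongside each scanned file."""
    file_path: str
    customer: str
    doc_type: str
    year: int


def find(records, **criteria):
    """Return the records whose fields match every given criterion."""
    return [
        record for record in records
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


# Made-up sample records for illustration.
records = [
    IndexRecord("scans/0001.pdf", "Acme Corp", "invoice", 2013),
    IndexRecord("scans/0002.pdf", "Acme Corp", "contract", 2013),
    IndexRecord("scans/0003.pdf", "Bolt Ltd", "invoice", 2014),
]

for record in find(records, customer="Acme Corp", doc_type="invoice"):
    print(record.file_path)
```

Adding or renaming a field later means revisiting every record already indexed, which is why the design deserves care at the outset.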
Don’t worry, there is automation
Technologies such as
• Barcode recognition
• OCR
• Batch processing
• Data Mining, Text Mining
can save time and money by automating indexing and
more.
Using Barcodes for Indexing
Intelligent data
capture software
can extract data
from barcodes to
create and send
index information
to a document
management
system.
For an in-depth look at barcodes in data capture
visit: What Can Barcodes Do For Me?
With OCR, make your image-based file fully
text searchable or extract data from a zone for
indexing.
Using OCR for Indexing
With zonal OCR, document
areas are identified for
automatic OCR capture.
Additionally, drag-and-drop
OCR allows an operator to
highlight document text
which is automatically OCR'd
and dropped into index
fields.
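A minimal sketch of the step after zonal OCR: turning the text read from each zone into clean index fields. No OCR engine is called here; the zone names, the sample text and the patterns are all hypothetical stand-ins for what a capture product would produce from the pixels inside each zone.

```python
import re

# Hypothetical OCR output, one string per zone defined on the page.
zone_text = {
    "invoice_number": "Invoice No: INV-20431",
    "invoice_date": "Date 03/14/2014",
}

# Hypothetical patterns describing what each index field should look like.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"INV-\d+"),
    "invoice_date": re.compile(r"\d{2}/\d{2}/\d{4}"),
}


def extract_index_fields(zones, patterns):
    """Pull one index value out of each zone; None means nothing matched."""
    fields = {}
    for name, text in zones.items():
        match = patterns[name].search(text)
        fields[name] = match.group(0) if match else None
    return fields


print(extract_index_fields(zone_text, FIELD_PATTERNS))
```

A field that comes back as None is a natural candidate for operator review, for example with the drag-and-drop approach described above.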
TIPS for OCR
• Scan at 300 dpi for greater accuracy
and to ensure that small text is captured.
• Limit the use of color on documents.
• Pre-process the image with image
enhancement software (available in
many data capture products).
Intelligent data capture solutions often use batch processing that
lets you process a whole folder of documents at a time. Some
products can “watch folders,” and process files as they are
scanned into the folder.
What is Batch Processing?
For an in-depth look visit: What is Batch Document Processing?
Processing can include indexing, file routing, file splitting,
and cleaning/enhancing the scans.
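A minimal sketch of the “watch folder” idea, assuming a simple polling approach: each pass looks for files that have not been handled yet and hands them to a processing step. The function name, the `*.pdf` pattern and the file names are illustrative assumptions; commercial products have their own mechanisms.

```python
import tempfile
from pathlib import Path


def process_new_files(folder, seen, handler, pattern="*.pdf"):
    """Run handler on each matching file not processed before; return them."""
    new_files = sorted(p for p in Path(folder).glob(pattern) if p not in seen)
    for path in new_files:
        handler(path)   # e.g. index, route, split or clean up the scan
        seen.add(path)  # remember it so the next pass skips this file
    return new_files


# Demo in a temporary folder: two "scans" arrive between two passes.
with tempfile.TemporaryDirectory() as tmp:
    seen = set()
    processed = []
    (Path(tmp) / "scan_001.pdf").write_bytes(b"")
    process_new_files(tmp, seen, processed.append)
    (Path(tmp) / "scan_002.pdf").write_bytes(b"")
    process_new_files(tmp, seen, processed.append)
    print([p.name for p in processed])
```

In a real setup the passes would run on a timer, and the handler would do the actual indexing or routing.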
Lesson 6: Document Prep and Estimating Volumes
Preparation, quality control and indexing are the
most time consuming elements of any scanning
job and usually the most costly.
Typically a good operator can prepare 750-1000
documents per hour; however, a number of
factors may drop throughput to 300 or 500.
Factors that Influence Document Prep
• Odd size document types: sales receipts, photos, plans/drawings
• Bindings: three ring, spiral, glue, folder
• Fasteners: staples, paper clips, binder clips, rubber bands
• Attachments: Post-its, tabs
Estimating Volumes and Storage
Average numbers by storage type (simplex / duplex):
• Paper folders: 30 to 100 / 60 to 200
• Ring binder: 200 / 400
• Lever arch folder: 500 / 1000
• Transfer cases: 500 / 1000
• Bankers boxes: 500 / 1000
• Archive boxes: 2500 / 5000
• Filing cabinets: 3000 per drawer / 6000 per drawer
Learn more about estimating volumes
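A minimal sketch of a volume estimate built from the averages in the table above. The figures are read positionally from that table (one value per storage type, in order); the function name and the sample inventory are made up for illustration.

```python
# Average simplex counts per container as (low, high), from the table above.
AVG_SIMPLEX = {
    "paper folder": (30, 100),
    "ring binder": (200, 200),
    "lever arch folder": (500, 500),
    "transfer case": (500, 500),
    "bankers box": (500, 500),
    "archive box": (2500, 2500),
    "filing cabinet drawer": (3000, 3000),
}


def estimate_volume(inventory, duplex=False):
    """Return a (low, high) estimate for an inventory of containers."""
    factor = 2 if duplex else 1  # the duplex row is double the simplex row
    low = sum(AVG_SIMPLEX[kind][0] * count for kind, count in inventory.items())
    high = sum(AVG_SIMPLEX[kind][1] * count for kind, count in inventory.items())
    return low * factor, high * factor


# Hypothetical inventory taken during a walk through the file room.
inventory = {
    "paper folder": 50,
    "ring binder": 40,
    "archive box": 12,
    "filing cabinet drawer": 8,
}

print("simplex:", estimate_volume(inventory))
print("duplex: ", estimate_volume(inventory, duplex=True))
```

These are averages, so a quick sample count of a few real containers is a sensible check before committing to a budget.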
Homework: Learn More About
Data Capture and Document
Management
More
Document Management
Determine whether you require a full document
management system or just need a
simple search and retrieval system.
Could a simple system serve as a stepping stone
while you evaluate a full document management
system?
More
Learn More
Call us for information on:
How to digitize medical or dental records.
The best way to scan medical or dental records.
Scanning paper records.
Document scanning for medical or dental records.
Going paperless at the medical or dental office.
How to capture medical or dental records efficiently.
Scanning medical or dental records with Fujitsu ScanSnap.
Touchscreen scanning of medical or dental records.
How to improve your medical or dental workflow with document scanning.
Scanning to EMR or scanning to EDR
How to maximize your Fujitsu ScanSnap
Using your ScanSnap for a basic document management system
Using barcodes and the Fujitsu ScanSnap
Scanning with the Fujitsu ScanSnap
Automating workflow with the Fujitsu ScanSnap
Automating document management capture
Scanning into Dentrix
Indexing into Dentrix
Understanding basic Document Scanning
Things your teacher never told you about Document Scanning
An introduction to Document Scanning
Scanning Fundamentals for the average Joe
By DocuFi
Makers of ImageRamp Data Capture Solutions
30 years’ Experience in the Document Imaging
Market
Proven Fujitsu ISV Partner
Find out more at ImageRamp and
www.docufi.com
Image Credits
All images are owned or licensed by DocuFi with acknowledgement given to:
• Pjohnkeane, “Requirements, requirements, requirements”, http://bit.ly/1fcULDf
• Doug Waldron, “Files (85)”, http://bit.ly/1bfciII
• UBC Learning Commons, “Scanner_icon-1024x671”, http://bit.ly/1eewI4P
• Knile Lucy, “you have some sorting to do!”, http://bit.ly/19bSgjF
• Michael 1952, “SJSA Fifth Grade - I Fell in Love With The Teacher”, http://bit.ly/1eevu9A
• Ton Haex, “Einstein show....”, http://bit.ly/LVqeBi
• Loco Steve, “Sunrise under scrutiny”, http://bit.ly/1eevSVv
• Tax Credits, “Coins”, http://bit.ly/1mtQj5j
• j_baer, “Ubuntu Color Wheel”, http://bit.ly/1jARikx
• Marcin Wichary, “Alphabetical”, http://bit.ly/1aILOku
• David Erickson e-strategyblog.com, “Hindenburg Disaster”, http://bit.ly/1jASeFF
• William Warby wwarby, “Gears”, http://bit.ly/1dwtU1S
• Alan Cleaver, “watching”, http://bit.ly/1h1k9k7
• Zoetnet, “overflowing”, http://bit.ly/KHW9Em
• Seattle Municipal Archives, “Comptroller's Office employees, 1960”, http://bit.ly/1eBvLGE
• Seattle Municipal Archives, “City Light worker with office machine, 1954”, http://bit.ly/1eBw3NM
• Patrick Hoesly, “Thank you”, http://bit.ly/17xKErE

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

An Introduction to Document Scanning, Understanding Your Requirements

  • 1. An Introduction to Document Scanning Business Document Scanning 101: From the Data Capture Perspective
  • 2. So you have a lot of this?
  • 3. And you’ve decided this is the answer.
  • 4. So you need a crash course in scanning
  • 5. Lessons: • Lesson 1: Simplex or Duplex • Lesson 2: Resolution • Lesson 3: Color Depth • Lesson 4: File Formats • Lesson 5: Indexing • Lesson 6: Document Prep and Estimating Volumes • Homework: Learn More About Data Capture and Document Management
  • 6. Lesson 1: Simplex or Duplex Are the documents single or double-sided? This may seem obvious but…
  • 7. You may not want documents such as purchase invoices scanned in duplex when the back of the document contains only terms and conditions. On the other hand, if the documents have high legal importance, you may want every conceivable item of information captured, such as small signatures or notes on the back.
  • 8. Duplex scanning requires more scanning time/processing and results in larger files.
  • 9. And you don’t have to be a genius to know that is more costly.
  • 11. So what is resolution and why does it matter?
  • 12. What is Resolution? Resolution is expressed as the number of dots per inch (dpi) or, less frequently, pixels per inch (ppi). A pixel is a “picture element”; resolution describes how many of these elements make up each inch of the image, or more precisely the density at which the image was sampled.
  • 13. Implications of Resolution This graphic contains two images, a “0” as a grayscale image and an “x” as black and white.
  • 14. Implications of Resolution • If we halved the size of the grid squares horizontally and vertically (doubled the resolution), the pixels would appear smoother and produce a better-quality image; the inverse would be true if we doubled the size of the squares. • If we kept the squares the same size but significantly reduced the size of the characters, the resolution would be insufficient to capture them.
  • 15. Implications of Resolution So: • The higher the resolution, the better the image quality. • For small characters, increase the resolution to capture them effectively.
  • 16. And, the higher the resolution, the slower the scan and the larger the file.
  • 17. And, the higher the resolution, the slower the scan and the larger the file. Which means higher scanning and file storage costs, Einstein.
  • 18. Typical Scanning Resolutions • Web graphic – 96 dpi • Standard archive document – 200 dpi • Document required for optical character recognition (OCR) – 300 dpi • Plans/drawings for vectorization – 400 dpi • Documents required for historical archiving – 600 dpi Resolution is generally determined by intended use.
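To make the link between resolution and image size concrete, here is a minimal sketch that converts the typical resolutions listed above into pixel dimensions. The page size (US Letter, 8.5 x 11 inches) and the function names are assumptions chosen for illustration; they do not come from the slides.

```python
# Typical resolutions from the slide above (dpi by intended use).
TYPICAL_DPI = {
    "web graphic": 96,
    "standard archive document": 200,
    "document for OCR": 300,
    "plans/drawings for vectorization": 400,
    "historical archiving": 600,
}

# Assumption: a US Letter page, 8.5 x 11 inches.
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def pixel_dimensions(dpi, width_in=PAGE_WIDTH_IN, height_in=PAGE_HEIGHT_IN):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def pixel_count(dpi, width_in=PAGE_WIDTH_IN, height_in=PAGE_HEIGHT_IN):
    """Return the total number of pixels sampled from the page."""
    width_px, height_px = pixel_dimensions(dpi, width_in, height_in)
    return width_px * height_px


for use, dpi in TYPICAL_DPI.items():
    width_px, height_px = pixel_dimensions(dpi)
    print(f"{use}: {dpi} dpi -> {width_px} x {height_px} px")
```

Because resolution applies in both directions, doubling the dpi quadruples the number of pixels, which is why higher resolutions slow the scan and enlarge the file so quickly.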
  • 20. Understanding Black and White: Documents scanned in black and white are always scanned as grayscale within the scanner. The scanner then applies a process known as thresholding to produce the black and white image. Thresholding simply determines whether each pixel should be black or white.
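The thresholding step described above can be sketched in a few lines. This is an illustration only: it assumes 8-bit grayscale values (0 is black, 255 is white) and a single fixed cutoff of 128, whereas real scanners may choose the cutoff adaptively. The function name and sample data are hypothetical.

```python
def threshold(gray_rows, cutoff=128):
    """Convert rows of 8-bit grayscale values to 1-bit values.

    Assumption: 0 is black and 255 is white. Returns 1 for a black
    pixel and 0 for a white pixel.
    """
    return [[1 if value < cutoff else 0 for value in row] for row in gray_rows]


# A tiny hypothetical grayscale patch: light paper with a dark stroke.
sample = [
    [250, 240, 30, 245],
    [248, 20, 25, 251],
    [252, 130, 120, 249],
]

bitonal = threshold(sample)
# bitonal == [[0, 0, 1, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
```

Note how the two mid-gray pixels in the last row (130 and 120) land on opposite sides of the cutoff; this is where the choice of threshold affects the legibility of faint text.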
  • 21. Understanding Grayscale: Grayscale is used when the image contains color or grayscale data and the tone of the image needs to be retained, e.g. photographs or shaded graphics.
  • 22. Understanding Color: Color is used when the image contains color data that must be kept. Some users wish to retain important color information, for example land boundaries or graphical data, but not letterhead logos, highlighter marks, etc.
  • 23. File Storage Requirements (bits per pixel): color 24, grayscale 8, black and white 1
  • 24. File Storage Requirements (bits per pixel): color 24, grayscale 8, black and white 1. So the uncompressed storage requirement for a grayscale image is 8 times that of a black and white image, and the requirement for a color image is 24 times that of black and white. And remember, Einstein, larger files equal higher costs.
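The 1 : 8 : 24 ratio can be checked with a short calculation. The sketch below assumes a US Letter page scanned at 300 dpi and ignores compression and file headers, so the figures are raw pixel data only; the function name is hypothetical.

```python
# Bits per pixel for each color depth, as given on the slide.
BITS_PER_PIXEL = {"black and white": 1, "grayscale": 8, "color": 24}


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw image size in bytes, before any compression or file overhead."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    total_bits = pixels * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumption: 8.5 x 11 inch page at 300 dpi.
sizes = {
    depth: uncompressed_bytes(8.5, 11, 300, bits)
    for depth, bits in BITS_PER_PIXEL.items()
}

for depth, size in sizes.items():
    print(f"{depth}: {size / 1_000_000:.1f} MB uncompressed")
```

Under these assumptions a single page is roughly 1 MB in black and white, 8.4 MB in grayscale and 25.2 MB in color, before compression reduces any of them.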
  • 25. Lesson 4: File Formats TIFF JPEG PDF For an in-depth look visit: PDF v. TIFF
  • 26. Understanding TIFF (Tagged Image File Format) • Well established format • Most often used for black and white documents • Supports multiple pages • Interpreted correctly by most applications, with a caution on certain color implementations • “Group 4” refers to the compression method used on black and white images, a “lossless” compression in which no original data is lost in compression/decompression.
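To illustrate what “lossless” means, the sketch below uses simple run-length encoding on one row of black and white pixels. This is a deliberately simplified stand-in and not the Group 4 algorithm itself, which uses a more elaborate two-dimensional coding scheme; the point is only that decoding returns exactly the original data.

```python
def rle_encode(row):
    """Encode a row of 0/1 pixels as (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([pixel, 1])  # start a new run
    return [(value, length) for value, length in runs]


def rle_decode(runs):
    """Rebuild the original row from (value, run_length) pairs."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


# Hypothetical scan line: white paper (0) with a short black stroke (1).
row = [0] * 20 + [1] * 5 + [0] * 15
encoded = rle_encode(row)
# encoded == [(0, 20), (1, 5), (0, 15)]

restored = rle_decode(encoded)
# restored == row, so nothing was lost in the round trip
```

Typical office documents are mostly long runs of white, which is why black and white pages compress so well without any loss.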
  • 27. Understanding PDF (Portable Document Format) • Well established format by Adobe • Supports color, grayscale, and black and white • Supports multiple pages • Images are generally stored using Group 4 or JPEG compression, although other compression methods are supported too • Used when more advanced features are needed within the file, such as embedded Optical Character Recognition (OCR) text, hyperlinking, digital signing and other security features.
  • 28. Understanding PDF Variations: Searchable PDF. Many scanning applications can create searchable PDF files. Here, the scanning application applies OCR technology to make the file text searchable. Your application may label this as “make searchable”, “apply OCR”, “text-under-image” or “searchable PDF.” If selected, your file will be text searchable or text selectable within the Acrobat viewer and many other programs that search PDF files.
  • 29. Understanding PDF Variations: PDF/A. PDF/A is an ISO standard for digital preservation or archiving of electronic documents. It differs from standard PDF by omitting features not suited to long-term archiving, such as font linking. Its use is growing in international government and industry segments, including legal systems, libraries, newspapers, and regulated industries.
  • 30. Understanding JPEG (Joint Photographic Experts Group) • Well established format • Most often used for photographs and graphics • Supports single page only • A “lossy” compression format, that is, some of the data is lost during compression; however, it provides good compression ratios for grayscale and color images.
  • 31. Compression and File Size *Comparison courtesy of Wikipedia OMG, right? JPEG
  • 32. Compression and File Size *Comparison courtesy of Wikipedia OMG, right? The bottom line: experiment with your images and file size. A middle quality scan may meet your needs and save tremendous file space.
  • 33. Lesson 5: Indexing For an in-depth look visit: What is Document Indexing?
  • 34. What is Indexing? Document indexing (sometimes referred to as adding metadata) enables users to quickly and efficiently locate their documents, whether through a folder structure, a database or an electronic document management system.
  • 36. Avoid a disaster: Great care should be taken to design an efficient indexing scheme. If the scheme is not designed correctly at the outset, trying to rectify it later can be both difficult and costly. Sometimes it makes sense to replicate the current manual method for document location to create a familiar, but faster, system.
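As a rough picture of what an indexing scheme looks like in practice, the sketch below attaches a few index fields to each scanned file and looks documents up by those fields. The field names (document type, customer, year) and file paths are hypothetical examples, not a recommended scheme; the right fields depend on how your documents are located today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """Index fields (metadata) stored alongside one scanned file."""
    file_path: str
    document_type: str
    customer: str
    year: int


def find(records, **criteria):
    """Return the records whose fields match every given criterion."""
    return [
        record
        for record in records
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


# Hypothetical index entries.
records = [
    IndexRecord("scans/0001.pdf", "invoice", "Acme", 2013),
    IndexRecord("scans/0002.pdf", "contract", "Acme", 2013),
    IndexRecord("scans/0003.pdf", "invoice", "Globex", 2014),
]

acme_invoices = find(records, document_type="invoice", customer="Acme")
# acme_invoices contains only scans/0001.pdf
```

Changing the set of fields later means revisiting every record already captured, which is why the design is worth getting right at the outset.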
  • 37. Don’t worry, there is automation Technologies such as • Barcode recognition • OCR • Batch processing • Data Mining, Text Mining can save time and money by automating indexing and more.
  • 38. Using Barcodes for Indexing Intelligent data capture software can extract data from barcodes to create and send index information to a document management system. For an in-depth look at barcodes in data capture visit: What Can Barcodes Do For Me?
  • 39. With OCR, make your image-based file fully text searchable or extract data from a zone for indexing.
  • 40. Using OCR for Indexing With zonal OCR, document areas are identified for automatic OCR capture. Additionally, drag-and-drop OCR allows an operator to highlight document text which is automatically OCR'd and dropped into index fields.
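The idea behind zonal OCR can be sketched without a real OCR engine. The example below assumes the engine has already returned recognized words with their bounding boxes in pixels; the `Word` and `Zone` types, the coordinates and the sample text are all hypothetical. The function keeps only the words whose centers fall inside the zone defined for an index field.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """One recognized word and its bounding box (hypothetical OCR output)."""
    text: str
    left: int
    top: int
    width: int
    height: int


class Zone(NamedTuple):
    """Rectangular page area mapped to an index field."""
    left: int
    top: int
    right: int
    bottom: int


def text_in_zone(words, zone):
    """Join the words whose center point lies inside the zone."""
    selected = []
    for word in words:
        center_x = word.left + word.width / 2
        center_y = word.top + word.height / 2
        if zone.left <= center_x <= zone.right and zone.top <= center_y <= zone.bottom:
            selected.append(word)
    selected.sort(key=lambda w: (w.top, w.left))  # reading order
    return " ".join(w.text for w in selected)


words = [
    Word("Invoice", 1500, 100, 200, 40),
    Word("No.", 1720, 100, 80, 40),
    Word("INV-1042", 1820, 100, 260, 40),
    Word("Total", 1500, 3000, 150, 40),
]
invoice_number_zone = Zone(left=1800, top=80, right=2200, bottom=160)

invoice_number = text_in_zone(words, invoice_number_zone)
# invoice_number == "INV-1042"
```

The extracted value would then populate the matching index field, which only works reliably when the field sits in the same place on every page of that document type.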
  • 41. TIPS for OCR • Scan at 300 dpi for greater accuracy and to ensure that small text is captured. • Limit the use of color on documents. • Pre-process the image with image enhancement software (available in many data capture products).
  • 42. What is Batch Processing? Intelligent data capture solutions often use batch processing, which lets you process a whole folder of documents at a time. Some products can “watch folders” and process files as they are scanned into the folder. For an in-depth look visit: What is Batch Document Processing?
  • 43. What is Batch Processing? Intelligent data capture solutions often use batch processing, which lets you process a whole folder of documents at a time. Some products can “watch folders” and process files as they are scanned into the folder. Processing can include indexing, file routing, file splitting, and cleaning/enhancing the scans.
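A watched folder can be approximated by repeatedly checking a directory for files that have not been handled yet. The sketch below shows one pass of that check using only the Python standard library; the function names, the `*.pdf` pattern and the `handler` callback are assumptions for illustration and do not describe how any particular data capture product works.

```python
from pathlib import Path


def find_new_files(folder, already_seen, pattern="*.pdf"):
    """Return matching files in the folder that have not been processed yet."""
    return sorted(
        path for path in Path(folder).glob(pattern) if path.name not in already_seen
    )


def process_batch(folder, already_seen, handler, pattern="*.pdf"):
    """Run the handler on each new file and remember it as processed.

    The handler stands in for indexing, routing, splitting or cleanup.
    Returns the names of the files handled in this pass.
    """
    processed = []
    for path in find_new_files(folder, already_seen, pattern):
        handler(path)
        already_seen.add(path.name)
        processed.append(path.name)
    return processed
```

A simple watcher would call `process_batch` in a loop with a short pause between passes, so that files dropped into the folder by the scanner are picked up shortly after they arrive.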
  • 45. Preparation, quality control and indexing are the most time-consuming elements of any scanning job, and usually the most costly.
  • 46. TIPS for OCR Typically a good operator can prepare 750-1000 documents per hour; however, a number of factors may drop throughput to between 300 and 500.
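The throughput figures above translate directly into a labor estimate. The sketch below is simple arithmetic under the assumption that the rates quoted on the slide (up to 1000 documents per hour at best, 300 at worst) bound the job; the function names and the 30,000-document example are hypothetical.

```python
def prep_hours(document_count, documents_per_hour):
    """Hours of preparation for a job at a given operator throughput."""
    if documents_per_hour <= 0:
        raise ValueError("documents_per_hour must be positive")
    return document_count / documents_per_hour


def prep_hours_range(document_count, best_rate=1000, worst_rate=300):
    """Return (best case, worst case) preparation hours for the job."""
    return (
        prep_hours(document_count, best_rate),
        prep_hours(document_count, worst_rate),
    )


best, worst = prep_hours_range(30_000)
# best == 30.0 hours, worst == 100.0 hours
```

The spread between the two figures shows why it pays to inspect a sample of the documents before quoting or scheduling a job.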
  • 47. Factors that Influence Document Prep • Odd size or document type: sales receipts, photos, plans/drawings • Bindings: three ring, spiral, glue, folder • Fasteners: staples, paper clips, binder clips, rubber bands • Attachments: Post-its, tabs
  • 48. Estimating Volumes and Storage (type: simplex avg #s / duplex avg #s) • Paper folders: 30 to 100 / 60 to 200 • Ring binder: 200 / 400 • Lever arch folder: 500 / 1000 • Transfer cases: 500 / 1000 • Bankers boxes: 500 / 1000 • Archive boxes: 2500 / 5000 • Filing cabinets: 3000 per drawer / 6000 per drawer
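The averages on the slide above can be turned into a rough volume estimate for a whole storage room. The sketch below is one reading of that table, with the duplex figures taken as twice the simplex ones (as the slide's numbers are); the container names, function name and sample inventory are hypothetical.

```python
# Simplex averages per container as (low, high), read from the slide.
SIMPLEX_AVG = {
    "paper folder": (30, 100),
    "ring binder": (200, 200),
    "lever arch folder": (500, 500),
    "transfer case": (500, 500),
    "bankers box": (500, 500),
    "archive box": (2500, 2500),
    "filing cabinet drawer": (3000, 3000),
}


def estimate_volume(inventory, duplex=False):
    """Return a (low, high) volume estimate for an inventory of containers.

    inventory maps a container type to how many of them you have.
    """
    factor = 2 if duplex else 1  # the duplex figures are double the simplex ones
    low = sum(SIMPLEX_AVG[kind][0] * count for kind, count in inventory.items())
    high = sum(SIMPLEX_AVG[kind][1] * count for kind, count in inventory.items())
    return low * factor, high * factor


# Hypothetical inventory.
inventory = {"archive box": 10, "paper folder": 20, "filing cabinet drawer": 4}

simplex_estimate = estimate_volume(inventory)
# simplex_estimate == (37600, 39000)
duplex_estimate = estimate_volume(inventory, duplex=True)
# duplex_estimate == (75200, 78000)
```

These are averages only, so a sample count from a few real containers is a sensible check before relying on the totals.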
  • 49. Homework: Learn More About Data Capture and Document Management More
  • 50. Document Management: Determine whether you require a full document management system or just need a simple search and retrieval system. Can I use it as a stepping stone while I evaluate my document management system?
  • 52. Call us for information on: How to digitize medical or dental records. The best way to scan medical or dental records. Scanning paper records. Document scanning for medical or dental records. Going paperless at the medical or dental office. How to capture medical or dental records efficiently. Scanning medical or dental records with Fujitsu ScanSnap. Touchscreen scanning of medical or dental records. How to improve your medical or dental workflow with document scanning. Scanning to EMR or scanning to EDR. How to maximize your Fujitsu ScanSnap. Using your ScanSnap for a basic document management system. Using barcodes and the Fujitsu ScanSnap. Scanning with the Fujitsu ScanSnap. Automating workflow with the Fujitsu ScanSnap. Automating document management capture. Scanning into Dentrix. Indexing into Dentrix. Understanding basic Document Scanning. Things your teacher never told you about Document Scanning. An introduction to Document Scanning. Scanning Fundamentals for the average Joe. By DocuFi, Makers of ImageRamp Data Capture Solutions. 30 years’ Experience in the Document Imaging Market. Proven Fujitsu ISV Partner. Find out more at ImageRamp and www.docufi.com
  • 53. Image Credits All images are owned or licensed by DocuFi with acknowledgement given to: • Pjohnkeane, Requirements, requirements, requirements, http://bit.ly/1fcULDf • Doug Waldron, “Files (85)”, http://bit.ly/1bfciII • UBC Learning Commons, “Scanner_icon-1024x671”, http://bit.ly/1eewI4P • Knile Lucy, you have some sorting to do!, http://bit.ly/19bSgjF • Michael 1952, SJSA Fifth Grade - I Fell in Love With The Teacher, http://bit.ly/1eevu9A • Ton Haex, “Einstein show....”, http://bit.ly/LVqeBi • Loco Steve, “Sunrise under scrutiny”, http://bit.ly/1eevSVv • Tax Credits, “Coins”, http://bit.ly/1mtQj5j • j_baer, “Ubuntu Color Wheel”, http://bit.ly/1jARikx • Marcin Wichary, Alphabetical, http://bit.ly/1aILOku • David Erickson e-strategyblog.com, “Hindenburg Disaster”, http://bit.ly/1jASeFF • William Warby wwarby, “Gears”, http://bit.ly/1dwtU1S • Alan Cleaver, “watching”, http://bit.ly/1h1k9k7 • Zoetnet, “overflowing”, http://bit.ly/KHW9Em • Seattle Municipal Archives, “Comptroller's Office employees, 1960”, http://bit.ly/1eBvLGE • Seattle Municipal Archives, “City Light worker with office machine, 1954”, http://bit.ly/1eBw3NM • Patrick Hoesly, “Thank you”, http://bit.ly/17xKErE