In this webinar, we’ll show you how to properly scope a PoC with a core framework, what is needed before the PoC, how to compare outcomes to success criteria & evaluate PoC results, and more.
Want to follow along with the webinar replay? Download it here for FREE: https://info.aiim.org/key-factors-for-poc-success
[Webinar Slides] 5 Key Factors for Document Automation Proof of Concept Success
1. Underwritten by:Presented by:
#AIIM
Your Digital Transformation Begins with Intelligent Information Management
5 Key Factors for Document Automation Proof of Concept Success
An AIIM Webinar Presented October 16, 2019
3. Tips for Participating in Today’s Webinar
Group Chat – to talk with each other, found in the icons along the bottom.
Q&A – for questions to the speakers, held for the end (and tech help).
Check out the Resources available to you today, in the box to the right of the slides.
A Survey will open at the conclusion of the webinar – we value your feedback on how we did today.
4. Today’s Speakers
Richard Medina, Co-Founder and Principal Consultant, Doculabs
Greg Council, VP, Marketing & Product Management, Parascript
Host: Theresa Resek, CIP, Director, AIIM
6. In This Webinar, We Will Show You:
• What automation using advanced capture technology really means and why it is different
• How to properly scope a PoC with a core framework (activities, deliverables, and considerations)
• What is needed prior to the PoC in terms of resource commitments, inputs, goals, scope and success criteria
• How to compare outcomes to success criteria and evaluate PoC results
• How to identify the right participants and what to require from participating vendors
8. Four Proof of Concept (PoC) Steps
1. General Preparation
2. Detailed Preparation
3. Testing
4. Results Evaluation & Next Steps
9. General Preparation
1. Assess your capture operation.
2. Clarify your general future state capture approach.
3. Clarify the problems to solve and the relevant solutions.
4. Clarify what you're trying to “prove.”
5. Select the vendors and tools to evaluate.
6. Select the use cases to evaluate.
10. Your Capture Operation May Be a Mess
[Diagram: current-state capture operation. Legend: document flow; exception handling.]
Components shown: Paper, Electronic Documents, Email Attachments, Printer, Scanner, Fax, RightFax, E-sign, Remote (MFP) Processing, Document Processing, Document Repositories, Business Processes, Operations Staff.
Annotations:
• Remote staff scan paper documents on MFPs.
• Remote staff print electronic documents and email attachments, and then scan the paper documents on MFPs.
• Some staff scan documents on personal scanners.
• Some staff submit paper documents via fax machines.
• Some external users submit documents via fax.
• Some documents are OCR'd.
• Today, Silanis e-signed documents are configured to route directly into ECM. E-signature is not configured to flow through Kofax or NSI.
• Some processes route images directly to ECM. Others route them to the Capture Platform.
• Document repositories receive documents from MFP systems or the enterprise capture platform.
• Capture Platform-processed documents and metadata are sent to enterprise document repositories.
• ECM can trigger business systems or workflows, which can access documents from document repositories.
• Most exception handling processes are not automated and require back-office operations to work with staff or customers to resolve them.
11. That Includes an Exception Handling Mess
[Diagram: same capture operation flow, components, and annotations as slide 10.]
Operations Staff: Most exception handling processes are not automated and require back-office operations to work with staff or customers to resolve them.
12. Look for Labor Reduction
Service | Unit | Ex. Peer | Industry | For-profit | In-house
Sort & Prep | Per hour | $35.50 | $35.28 | $34.15 | $36.40
Scanning | Per image | $0.0546 | $0.0398 | $0.0298 | $0.0497
Data Entry (onshore):
Tier 1 (<10 keystrokes) | Per index | $0.0650 | $0.0891 | $0.0665 | $0.1117
Tier 2 (11-30 keystrokes) | Per index | $0.0950 | $0.2264 | $0.1630 | $0.2898
Tier 3 (ave. = 50) | Per index | $0.3750 | $0.4589 | $0.3460 | $0.5717
Retrieval / Lookup / Research:
Research (Tier 1) | Per hour | $47.75 | $49.70 | $49.00 | $50.40
Retrieval (Tier 2) | Per hour | $63.75 | $58.50 | $59.40 | $57.60
Automated Recognition | Per image | $0.0225 | $0.0206 | $0.0132 | $0.0280
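One rough way to use unit costs like these is to compare an all-manual keying cost with a partly automated one. The sketch below is only an illustration: it takes the In-house Tier 2 keying rate and the In-house Automated Recognition rate from the table, and it assumes that "per index" means per keyed field. The volume, the number of fields per image, and the 70% automation rate are hypothetical inputs, not figures from the webinar.

```python
# In-house unit costs from the slide's table (USD).
TIER2_KEYING_PER_INDEX = 0.2898      # assumed here to mean "per keyed field"
AUTO_RECOGNITION_PER_IMAGE = 0.0280


def manual_cost(images, fields_per_image):
    """Cost of keying every field by hand."""
    return images * fields_per_image * TIER2_KEYING_PER_INDEX


def automated_cost(images, fields_per_image, automation_rate):
    """Cost when `automation_rate` of the fields need no human keying.

    Every image is still run through recognition; only the remaining
    fields are keyed by hand.
    """
    if not 0.0 <= automation_rate <= 1.0:
        raise ValueError("automation_rate must be between 0 and 1")
    recognition = images * AUTO_RECOGNITION_PER_IMAGE
    keyed_fields = images * fields_per_image * (1.0 - automation_rate)
    return recognition + keyed_fields * TIER2_KEYING_PER_INDEX


# Hypothetical monthly volume: 100,000 images with 10 fields each.
baseline = manual_cost(100_000, 10)
with_automation = automated_cost(100_000, 10, automation_rate=0.70)
savings = baseline - with_automation
```

The point of the sketch is that the savings depend almost entirely on the automation rate, which is exactly what a PoC has to measure.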
13. Clarify Your General Future State Capture Approach
Moderate Document Automation dramatically improves the efficiency of current processes, most of which involve paper. Advanced Document Automation redesigns the processes to leapfrog paper-based challenges.
[Figure: two panels, “Moderate Transformation for Capture” and “Advanced Transformation for Capture,” each rating the same capabilities against the legend H / M / L.]
Capabilities rated in each panel:
• Multi-channel Capture
• Multi-format Capture
• E-forms
• E-signature
• Classification
• Extraction / Enrichment
• Validation / QA
• Conversion / Export
• Content Analytics
• Process Management / Automation
• Task Automation (RPA/RDA)
• Administration, Monitoring, Reporting
14. Document Automation Technology Without Jargon or Magic
Document automation tools have improved in recognition and in learning:
How the systems recognize the text
• To extract data, classify docs, or paginate/separate pages and docs
• Movement from how young children read to how adults read – using more context
How they improve that recognition
• How they improve the accuracy of their “opinion” in light of evidence
• Whether the human supervisor must do everything (e.g., drawing bigger zones in the template), or must provide them with positive and negative feedback (“these are correct answers” or “these are wrong answers”), or whether it’s a mix of assisted and automated, or whether the system can improve its recognition automatically 100% by itself
15. Document Automation Should Do 4 Things
Any solution must do 4 things to be acceptable:
1. It must work well enough to adequately reduce or eliminate human intervention.
2. It must either 1) have enough functional breadth to be standalone, or 2) fit easily into a larger environment that provides the functional depth (your capture platform).
3. It must fit into the larger process environment beyond the recognition stages (your downstream BPM or case management environment).
4. It must provide an adequate user experience for humans who are assisting the automation.
We have recently empirically evaluated some solutions, and some fulfill these requirements more favorably than we expected.
16. Clarify the Problems You Want to Solve
Capabilities by complexity (Classification, Extraction, Validation/QA):
“Easy”
Classification Capabilities:
• Page classification
Extraction Capabilities:
• English machine text – structured layout
• Mark and signature detection
• Barcode recognition
Validation/QA Capabilities:
• Doc type, page number and order, signature presence, date validation, field completeness
• Data validation with source system, name/entity validation (against e.g. registry)
“Moderate”
Classification Capabilities:
• Document classification (single page type may belong to multiple documents)
Extraction Capabilities:
• English machine text – unstructured
• Multilingual machine text – structured
• Multilingual machine text – unstructured
Validation/QA Capabilities:
• Document field value consistency
“Difficult”
Classification Capabilities:
• Package classification
Extraction Capabilities:
• English hand – structured
• English hand – unstructured
• Multilingual hand – structured
• Multilingual hand – unstructured
• Metadata enrichment (e.g. sentiment analysis)
Validation/QA Capabilities:
• Signature correctness
• Package completeness
• Package field value consistency
17. Select the Vendors and Tools to Evaluate
Tools differ primarily on deployment readiness versus innovation:
Category #1 – Mature “platform” vendors
Category #2 – Mature “solution” vendors with innovative offerings
Category #3 – New “solution” vendors with innovative offerings
Category #4 – New “engine/tool” vendors with innovative offerings
[Chart legend: category leaders; category central tendency]
18. Clarify What You Want the “PoC” to Prove
• Determine the feasibility and applicability of the tested solutions as part of your inbound operation.
• Identify strengths, weaknesses, and best fit usage scenarios.
• Determine basic environment requirements and scalability.
• Determine initial view of skills requirements.
• Determine initial view of setup time and activity.
• For comparison, determine requirements and performance of your platform or baseline systems under comparable circumstances.
19. Select the Use Cases You Want to Evaluate
Capabilities by complexity (Classification, Extraction, Validation/QA):
“Easy”
Classification Capabilities:
• Page classification
Extraction Capabilities:
• English machine text – structured layout
• Mark and signature detection
• Barcode recognition
Validation/QA Capabilities:
• Doc type, page number and order, signature presence, date validation, field completeness
• Data validation with source system, name/entity validation (against e.g., registry)
“Moderate”
Classification Capabilities:
• Document classification (single page type may belong to multiple documents)
Extraction Capabilities:
• English machine text – unstructured
• Multilingual machine text – structured
• Multilingual machine text – unstructured
Validation/QA Capabilities:
• Document field value consistency
“Difficult”
Classification Capabilities:
• Package classification
Extraction Capabilities:
• English hand – structured
• English hand – unstructured
• Multilingual hand – structured
• Multilingual hand – unstructured
• Metadata enrichment (e.g., sentiment analysis)
Validation/QA Capabilities:
• Signature correctness
• Package completeness
• Package field value consistency
20. Detailed Preparation
1. Select the document sets to evaluate.
2. Define the roles and participants.
3. Install and configure software in test environment.
4. Prepare the document sets.
5. Define outputs and “Ground Truth.”
• Use auto and/or human.
• Define whether “raw” or “normalized.”
6. Create templates for each doc type (as needed).
7. Perform dry run with practice forms.
21. Testing
1. Run using auto software (no human intervention).
2. Compare extracted data to the “Ground Truth.”
3. Send all items (or items with low confidence) to humans.
4. Humans QA and/or key.
5. Compare corrected output to the “Ground Truth.”
6. Vendor – or you – can tweak or train (optional).
7. Run test again.
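The two "compare to the Ground Truth" steps in this testing flow can be sketched in a few lines. This is a minimal illustration, not any vendor's scoring method: the field names, the sample values, and the `normalize` rule (trim, collapse spaces, lowercase) are assumptions made for the example, standing in for whatever "raw" versus "normalized" definition you agreed on during detailed preparation.

```python
def normalize(value):
    """Hypothetical 'normalized' rule: trim, collapse spaces, lowercase."""
    return " ".join(value.split()).lower()


def field_accuracy(extracted, ground_truth, normalized=True):
    """Return the share of ground-truth fields whose extracted value matches.

    Both arguments map (document id, field name) -> value. A field that is
    missing from `extracted` counts as wrong.
    """
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    correct = 0
    for key, truth in ground_truth.items():
        got = extracted.get(key, "")
        if normalized:
            got, truth = normalize(got), normalize(truth)
        if got == truth:
            correct += 1
    return correct / len(ground_truth)


# Illustrative data only (not from the webinar).
ground_truth = {
    ("doc1", "name"): "Jane Doe",
    ("doc1", "date"): "2019-10-16",
    ("doc2", "name"): "John Smith",
    ("doc2", "date"): "2019-10-17",
}
auto_output = {
    ("doc1", "name"): "jane  doe",    # differs only in case and spacing
    ("doc1", "date"): "2019-10-16",
    ("doc2", "name"): "John Smyth",   # recognition error
    ("doc2", "date"): "2019-10-17",
}
corrected_output = dict(auto_output)
corrected_output[("doc2", "name")] = "John Smith"  # fixed by a human reviewer

raw_auto = field_accuracy(auto_output, ground_truth, normalized=False)
norm_auto = field_accuracy(auto_output, ground_truth)
norm_corrected = field_accuracy(corrected_output, ground_truth)
```

Scoring the same output both raw and normalized shows how much of the measured error is formatting rather than recognition, which is why the comparison rule should be fixed before the test is run.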
22. Evaluate the Results
1. Identify strengths, weaknesses, best fit usage scenarios.
2. Determine basic environment requirements and scalability.
3. Determine initial view of skills requirements.
4. Determine initial view of setup time and activity.
5. Compare with the requirements and performance of your platform or baseline systems under comparable circumstances.
6. Determine your next steps.
23. Example: Test Results Summary and Error Analysis
Result Type | Fields | Extraction Accuracy | Comments
Auto System | 9255 | 91.5% | No Human Intervention
Auto w/ Assistance | 9255 | 97.0% | Auto System Plus Keying
Baseline Platform | 9255 | 65.5% (Typed only) | No handwriting
24. How to Improve Your Accuracy Results
Opportunities to improve the results:
1. To maximize machine learning: requires significantly more volume to be run.
2. Inconsistent keying: can easily be improved with process instructions for consistency (e.g. rules for periods, spaces, interpretation).
3. Auto extraction mistakes: some of these misidentification mistakes can be addressed with constraints on the fields (e.g. alpha filters, simple lookups, complex lookups).
Doc automation may not effectively address:
4. Image quality issues: upstream issue with upstream recommendations; better image cleanup (deskewing, despeckling, optimized contrast and density for machine recognition) should reduce at least some errors (similar looking characters, characters overlapping lines, etc.).
5. Goofy writing: where an individual wrote outside the expected zones or wrote bizarre text; this will likely always be an issue, but can get overall incidence down with guidance to agents and customers.
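The keying-consistency rules and field constraints mentioned on this slide can be illustrated with a short sketch. Everything here is hypothetical: the specific keying rule (drop periods, collapse spaces, uppercase), the characters the alpha filter allows, and the tiny lookup table are assumptions chosen for the example, and "complex lookups" (for instance, checks against a system of record) are not shown.

```python
import re


def normalize_keyed(value):
    """Apply one consistent keying rule: drop periods, collapse spaces, uppercase."""
    without_periods = value.replace(".", "")
    return " ".join(without_periods.split()).upper()


def alpha_filter(value):
    """Accept only letters, spaces, hyphens, and apostrophes."""
    return re.fullmatch(r"[A-Za-z' -]+", value) is not None


def simple_lookup(value, allowed):
    """Accept only values that appear in a known list."""
    return value in allowed


# Hypothetical lookup table, truncated for the example.
STATE_CODES = {"CO", "IL", "NY"}

# Two operators key the same city differently; the rule makes them agree.
same_after_rule = normalize_keyed("St.  Louis ") == normalize_keyed("ST LOUIS")

# A digit misread inside a name field is caught by the alpha filter.
name_ok = alpha_filter("O'Brien-Smith")
name_bad = alpha_filter("Sm1th")

# A zero misread for the letter O is caught by the lookup.
state_ok = simple_lookup("CO", STATE_CODES)
state_bad = simple_lookup("C0", STATE_CODES)
```

Constraints like these do not fix the recognition itself; they flag answers that cannot be right so that only those fields are routed to a person.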
25. Thank You – For More Information
Richard Medina
CONTACT RICH: http://bit.ly/ask-rich-doc-auto
Download the Guide: http://bit.ly/doc-auto-poc
27. Advanced Capture Is Not CRM
The purpose and value of advanced capture is automating as many document tasks as possible at the highest attainable levels of accuracy.
Yet many evaluations and Proofs of Concept (PoCs) focus on features and functionality.
So, is understanding features really important if you cannot verify the true performance of the system?
28. Poll Question
What level of automation can be expected from 99% accuracy?
a. 99%
b. Greater than 90%, but less than 99%
c. Less than 90%
d. Don’t know
29. Accuracy Is Only Half of the Equation
If a solution can deliver 99% accuracy, how much automation can be expected?
1. You can’t tell unless you know the other important attribute – how many “predictions” (e.g., document type or field value answers) can be provided.
and
2. You need to understand the % of answers that can be identified by the software as accurate or inaccurate.
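The relationship between accuracy and automation can be made concrete with a small sketch. It assumes, as an illustration only, that the software returns a confidence score with each answer and that answers at or above a chosen threshold are accepted without review; the scores and correctness flags below are invented sample data.

```python
def automation_and_accuracy(predictions, threshold):
    """Return (automation_rate, accuracy_of_accepted_answers).

    `predictions` is a list of (confidence, is_correct) pairs. Answers at or
    above `threshold` are accepted automatically; the rest go to a human.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    accepted = [is_correct for confidence, is_correct in predictions
                if confidence >= threshold]
    automation_rate = len(accepted) / len(predictions)
    accuracy = sum(accepted) / len(accepted) if accepted else 0.0
    return automation_rate, accuracy


# Invented sample: ten answers with confidence scores and correctness flags.
predictions = [
    (0.99, True), (0.98, True), (0.97, True), (0.95, True), (0.93, True),
    (0.90, False), (0.85, True), (0.70, True), (0.60, False), (0.40, False),
]

strict = automation_and_accuracy(predictions, threshold=0.92)
loose = automation_and_accuracy(predictions, threshold=0.50)
```

With this sample, the strict threshold accepts half of the answers and all of those are correct, while the loose threshold accepts nine of ten but lets two errors through. An accuracy figure quoted without the matching automation rate describes only one of these two numbers.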
30. Know the Proper Measurements
Only document classification and single-field data extraction should be measured at the page or document level. Multi-field extraction must be measured at the field level.
Successful receipt-level extraction on | Outcome
1 field | 98+%
2 fields | 85+%
3 fields | 50%
4 fields | 25%
5 fields | <15%
Successful field-level extraction | Outcome
95% of all Fields | 80+%
85% of all Fields | 85+%
75% of all Fields | 90%
50% of all Fields | <15%
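One reason multi-field extraction looks poor when scored at the document level is that per-field errors compound. The sketch below shows that effect under a simplifying assumption that is not from the slide: every field has the same accuracy and errors are independent. The slide's own outcome figures are the presenters' numbers and are not derived from this formula.

```python
def all_fields_correct(per_field_accuracy, num_fields):
    """Probability that every field on a document is right.

    Assumes each field is correct independently with the same probability.
    """
    if not 0.0 <= per_field_accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if num_fields < 1:
        raise ValueError("need at least one field")
    return per_field_accuracy ** num_fields


# Document-level success for a hypothetical 95% per-field accuracy.
document_level = {n: all_fields_correct(0.95, n) for n in (1, 2, 5, 10)}
```

Under these assumptions, a system that gets 95% of fields right gets all ten fields right on fewer than 60% of ten-field documents, so a document-level score understates how much field-level work was actually automated.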
31. Solutions Expected to Deliver Precision Require Precise PoCs
Number of Samples | Amount of Automation
10 Samples | Not more than 5%
100 Samples | Not more than 50%
1000 Samples | Not more than 70%
2000 Samples | Not more than 85%
5000 Samples | >85%
Regardless of Machine Learning, you need reliable and representative sample sets.
Representative sample sets take into account both document types and document variance within these document types.
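A standard statistical tool helps show why small sample sets cannot verify precise accuracy claims. The sketch below uses the Wilson score interval, which is a general-purpose method swapped in here for illustration and is not the basis of the slide's table. It assumes the samples are representative and independent, and it uses z = 1.96 for an approximate 95% interval.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a measured accuracy of correct/total."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return center - half_width, center + half_width


# The same 90% measured accuracy on 10 samples versus 1,000 samples.
small_low, small_high = wilson_interval(9, 10)
large_low, large_high = wilson_interval(900, 1000)
small_width = small_high - small_low
large_width = large_high - large_low
```

With 10 samples, a measured 90% is consistent with a true accuracy anywhere from roughly 60% to 98%. With 1,000 samples the plausible range narrows to a few points, which is the kind of precision needed before committing to an automation target.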
32. Not all “machine learning” is real
Much of what is called machine learning is based upon “adaptive expert systems” (e.g., rules-based), not machine learning algorithms.
This is important because rules are brittle and the knowledge bases that store them are unwieldy.
33. The Parascript Approach – Smart Learning
1. CONFIGURATION & TUNING: Smart Learning provides a significant reduction in pre-production and post-production configuration and tuning.
2. LEVERAGING QUALITY DATA: Smart Learning allows you to focus on what you do with your quality data, not on how to get it.
3. DATA SCIENCE: Smart Learning turns advanced capture into a data science problem vs. features and technical skills.
34. Walking the Talk
1. We are interested in helping you with your advanced capture PoC and we’re willing to invest in it to show you the power of Smart Learning.
2. Bring your data and we will do the rest.