Looking to automate document processing in your healthcare organization? Learn from Provectus & AWS experts how to make data capture, conversion, and analytics more efficient. Process and manage documents faster and on a larger scale with AI & Machine Learning.
In this presentation, we offer management and engineering perspectives on document processing with AI, to help you explore available options. Whether you are looking for a ready-made solution or plan to build a custom solution of your own, this webinar will help you find the best fit for your healthcare use cases.
Choosing the Right Document Processing Solution for Healthcare Organizations
1. Choosing the Right Document Processing Solution for Healthcare Organizations
Presented by:
Iskandar Sitdikov, ML Solutions Architect @ Provectus
Stepan Pushkarev, CTO @ Provectus
Andy Schuetz, PhD, Sr. Solutions Architect @ AWS
2. Webinar Objectives
1. Provide an overview of the market for document processing solutions
2. Outline critical factors for choosing the right document processing solution
for your healthcare use case
3. Strategize on whether you should look for a ready-made solution to purchase
or build a custom solution of your own
4. Get qualified for the Provectus IDP Solution Discovery Program
3. Agenda
1. Introduction
2. Healthcare use cases
3. Document processing in 60 seconds
4. Solutions map, advantages and problems
5. Evaluation
4. Introductions
Iskandar Sitdikov
ML Solutions Architect,
Provectus
Andy Schuetz, PhD
Sr. Solutions Architect,
Healthcare and Life Science,
AWS
Stepan Pushkarev
Chief Technology
Officer, Provectus
5. AI-first Consultancy & Solutions Provider
500 employees and
growing
Established in 2010
HQ in Palo Alto
Offices in North
America, LATAM, and
Europe
Machine Learning DevOps
Big Data Analytics
We are obsessed with leveraging cloud, data, and AI to reimagine the way
businesses operate, compete, and deliver customer value
6. Our Clients
Innovative Tech Vendors (ISV & DNB)
Seeking niche expertise to differentiate
and win the market
Midsize to Large Enterprises
Seeking to accelerate innovation and achieve
operational excellence
8. Use cases: Clinical notes, medical
records, insurance medical claims,
clinical studies, medical imaging
reports, lab reports, and transfers.
Administrative overhead to process
data from these types of documents is
huge.
Main benefits: Operational speed and
cost reduction. In our practice, we see
2-8x+ reduction in costs compared to a
fully manual process and 30%+ savings
in comparison to legacy OCR solutions.
Healthcare Use Cases
9. The general goal is to locate the main entities in
the document (paragraphs, forms, tables,
etc.) and then recognize the text
in them (segmentation and
OCR).
The two problems can be solved
separately or with an end-to-end network.
IDP / CV
10. Context search on data from OCR + segmentation
Forms and tables greatly impact overall performance. Data extraction from forms is a solved problem, thanks to
their straightforward key-value structure. Tables are still a pain point for all data extractors. For unstructured text,
deep networks are the current solution. For example, BERT is good at finding key-value (question / answer) pairs
in context.
IDP / Data Extraction
11. Evaluating the document
processing model is an ongoing
task.
Results with a low-confidence
score and missing information
are forwarded to human experts.
Samples of successfully extracted
information are also forwarded to
human experts for evaluation.
IDP / Evaluation and Monitoring
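The routing described on this slide can be sketched in a few lines. This is a minimal illustration, not the Provectus implementation: the `ExtractedField` type, the 0.85 confidence threshold, and the 5% audit rate are all hypothetical values chosen for the example.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtractedField:
    name: str
    value: Optional[str]  # None means the field was not found in the document
    confidence: float     # model confidence in [0, 1]


def needs_review(fields: List[ExtractedField], threshold: float = 0.85) -> bool:
    """True if any field is missing or falls below the confidence threshold."""
    return any(f.value is None or f.confidence < threshold for f in fields)


def route(fields: List[ExtractedField], threshold: float = 0.85,
          audit_rate: float = 0.05, rng: Optional[random.Random] = None) -> str:
    """Return 'human_review', 'audit_sample', or 'auto_accept'.

    threshold and audit_rate are assumed values for illustration only.
    """
    rng = rng or random.Random()
    if needs_review(fields, threshold):
        return "human_review"   # low confidence or missing information
    if rng.random() < audit_rate:
        return "audit_sample"   # successful extraction, sampled for evaluation
    return "auto_accept"
```

Sampling a share of the confident results is what lets the evaluation continue after deployment: without it, human experts would only ever see the failures.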
12. Data lake + Ontology specifications
Fast Healthcare Interoperability Resources (FHIR)
is a standard describing data formats and
elements and an application programming
interface for exchanging electronic health
records. The standard was created by the Health
Level Seven International healthcare standards
organization.
IDP / Storage
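As an illustration of what storing extracted data in FHIR form might look like, the sketch below maps a flat dictionary of extracted fields to a minimal FHIR Patient resource. The input keys (`family_name`, `given_name`, `birth_date`) and the helper name are hypothetical; only `resourceType`, `name`, `family`, `given`, and `birthDate` come from the FHIR Patient resource itself.

```python
import json


def to_fhir_patient(extracted: dict) -> dict:
    """Map hypothetical extracted fields to a minimal FHIR Patient resource."""
    return {
        "resourceType": "Patient",
        "name": [
            {
                "family": extracted["family_name"],
                "given": [extracted["given_name"]],
            }
        ],
        "birthDate": extracted["birth_date"],  # FHIR dates use YYYY-MM-DD
    }


# Hypothetical output of the extraction step for one document
extracted = {"family_name": "Doe", "given_name": "Jane", "birth_date": "1980-04-12"}
print(json.dumps(to_fhir_patient(extracted), indent=2))
```

A real mapping would cover many more resource types and would validate the result against the FHIR specification before writing it to the data lake.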
13. Automation encapsulates all processes mentioned above
and unites them into one single product, featuring:
● Document capture
● Model lifecycle
○ Labeling
○ (Re)Training
○ Evaluation
○ Monitoring
● Human-in-the-loop
● Integrations
● System monitoring
IDP / Automation
14. IDP is more than just OCR. To resolve the problem in-house, you need
to take care of data capture, data ingestion, preprocessing, OCR, data
extraction, evaluation, and further integrations to destination
systems.
Bottleneck: Tables and unstructured text
IDP / Takeaways
16. Documents are everywhere... and solutions for document processing are everywhere, too!
Competitive Landscape
17. Major technology platforms offer general-
purpose technology components for
document processing, such as:
● Amazon Textract + Comprehend
● Google Document AI
● Microsoft Azure Form Recognizer
Solutions: Cloud Vendors
Pros:
● Cloud infrastructure and integration
● Long lifespan and support
● Constant development
Cons:
● General purpose, so they require
additional work to extract the necessary
information and integrate with
current workflows
18. This is a “younger” group of up-and-coming
vendors who have built solutions using AI-
native platforms to tackle the most demanding
automation challenges. Generally, they can
handle documents that are more complex or
have greater variation. As a result, they often
can deliver a greater business impact than
older technologies. Since they are free from
legacy technical debt, it is easier for them to
build next-gen, future-oriented solutions.
Solutions: Startups
Pros:
● Modern tech
● Constant development
● More focused applications
● Support — For a new independent player, support is
one of the highest priorities to gain customer
loyalty
Cons:
● Only a few startups in this market can survive
competition with big vendors
● Challenging to customize
● May not align with your cloud strategy
● Support — On the other hand, new startups might
struggle with support
19. Legacy vendors typically build IDP
solutions on top of legacy platforms.
Niche vendors focus on limited
types of documents and use cases. You
might find hidden gems here!
Other vendors restructure your document
workflow by introducing standard types of
documents that are easy to
process.
Solutions: Other Vendors
Pros:
● Wide variety of integrations
● Niche use cases
● Large portfolio of clients
Cons:
● In some cases, they rely on outdated, less
performant technologies
● May require restructuring your document flow
20. System Integrators may offer IDP
as part of their portfolio of
solutions. Their IDP offering may
be a solution from another IDP
vendor or developed in-house.
Solutions: System Integrators
32. What to Choose?
Now, you have all the information about
possible go-to solutions in your market
segment. What’s next?
You need to compare every solution fairly
to choose the one that best fits
your use case.
Deep evaluation is key to making the right
decision.
33. Data
● EDA (exploratory data analysis) — Knowing your
data is the key to success
● Sample data based on EDA
● Use this data as the evaluation dataset for
measuring performance of solutions on the
market / in the segment
Metrics
● F1, Accuracy, Recall, etc.
● Key, value extraction
● Table data
● Language, character recognition, spelling,
handwritten text
Provectus Evaluation Methodology
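For the key-value metric listed above, one simple way to score a provider on the evaluation dataset is exact-match precision, recall, and F1 over (key, value) pairs. The sketch below assumes string values and exact matching; the function name is hypothetical, and the slides do not specify the matching rules used in the methodology.

```python
def kv_scores(predicted: dict, expected: dict) -> tuple:
    """Precision, recall, and F1 over exact-match (key, value) pairs.

    predicted: fields returned by a provider for one document
    expected:  hand-labeled ground truth for the same document
    """
    pred = set(predicted.items())
    gold = set(expected.items())
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ground truth and provider output for a single form
expected = {"patient": "Jane Doe", "dob": "1980-04-12", "policy": "A-100"}
predicted = {"patient": "Jane Doe", "dob": "1980-04-21"}
print(kv_scores(predicted, expected))
```

Exact matching is strict: a single misread character counts as a miss. Handwritten or noisy fields may call for a softer comparison such as edit distance.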
34. Evaluation / Composite Index
Name        Score
Provider 1  0.64
Provider 2  0.81
Provider 3  0.78
Composite Index
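The slides do not say how the composite index is calculated. One plausible construction, shown here purely as an assumption, is a weighted average of per-metric scores, with weights reflecting what matters for the use case. The metric names, scores, and weights below are hypothetical and are not meant to reproduce the provider scores above.

```python
def composite_index(metrics: dict, weights: dict) -> float:
    """Weighted average of per-metric scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * metrics[name] for name in weights) / total_weight


# Hypothetical per-metric results for one provider
metrics = {"key_value_f1": 0.8, "table_f1": 0.6, "character_accuracy": 1.0}
# Hypothetical weights: key-value extraction matters most for this use case
weights = {"key_value_f1": 0.5, "table_f1": 0.25, "character_accuracy": 0.25}
print(round(composite_index(metrics, weights), 2))
```

Collapsing the metrics into one number makes providers easy to rank, but the per-metric scores should be kept alongside it, since a strong overall index can hide a weak result on tables.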
37. TCO and Case Study: Client Under NDA
General TCO structure:
● Infrastructure (data pipelines, storage, control panel)
● CV, NLP, Human-in-the-loop
● R&D costs (if building in house)
● Support
TCO targets for end-to-end solution: ~20-30 cents per
document for simple use cases and 50+ cents for specific
“complex” documents
Result: The cost of processing one document was reduced
from 24 to 11 cents. Selecting the right OCR/CV vendor
accounted for almost 10 cents of the savings per document,
and a serverless architecture reduced
infrastructure costs.
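The arithmetic behind this result is easy to check. The sketch below uses the 24 and 11 cent figures from the case study; the volume of one million documents is a hypothetical number added only to show how the savings scale.

```python
def savings(before_cents: float, after_cents: float, documents: int) -> tuple:
    """Return (savings per document in cents, relative reduction, total savings in dollars)."""
    per_document = before_cents - after_cents
    relative = per_document / before_cents
    total_dollars = per_document * documents / 100
    return per_document, relative, total_dollars


# 24 and 11 cents come from the case study; 1,000,000 documents is an assumed volume
per_document, relative, total = savings(24, 11, documents=1_000_000)
print(f"{per_document} cents per document, {relative:.0%} reduction, "
      f"${total:,.0f} saved per million documents")
```

The reduction works out to 13 cents per document, or roughly 54% of the original cost.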
OCR/CV solution performance vs. cost: For a
given use case, the most expensive solution
delivered the worst result. The second-best
result came from the vendor with
the second-cheapest solution.
38. Takeaways
1. Ecosystem matters: data integration with built-in industry-specific connectors, data pipelines,
OCR, NLP, security, storage, and a human-in-the-loop workflow should all be
integrated with each other for optimal performance.
2. Use an unbiased benchmarking framework to evaluate the real performance
of different providers, based on your use case and datasets.
3. Work with Provectus to reduce your document processing costs
a. By 2-8x compared to manual workflows
b. By 30%+ compared to legacy OCR solutions
c. By 10%+ compared to modern cloud solutions
39. 125 University Avenue
Suite 295, Palo Alto
California, 94301
provectus.com
Questions, details?
We would be happy to answer!