The document discusses improving data quality and accuracy. It begins with an introduction of the speakers, Seth Maislin and Greg Council. Maislin then discusses establishing governance over company data and linking data quality metrics to business objectives and key performance indicators. Council discusses measuring the functional and performance abilities of data extraction systems, particularly focusing on accuracy at the data element level rather than just character recognition rates. He outlines Parascript's programs for optimizing systems and processes to ensure high quality data extraction.
6. Underwritten by: Presented by:
Governing Data “Attractiveness”
§ Data quality is multidimensional.
§ Which features matter most? Why? (How do you set targets?)
§ It’s nearly impossible to optimize all dimensions at once.
• accurate
• available
• certain/precise
• clean
• complete
• consistent across sources
• integrity
• formatted (conformity)
• reasonable/logical
• relevant/valid
• reliable
• timely/current
• traceable/audited
Data Is an Asset
• something you can sell
• direct input for a business process
• contextual input for making a business decision
• workflow trigger
• macro-level self-awareness (governance)
To know if your data are good enough, you don’t measure the data.
You measure its effects.
Data governance requires an integrated system of metrics and KPIs.
Use Cases: How to Leverage Financial Data
• accurate and timely accounts management
• better spend alignment with business strategy; innovation recognition
• improved decision making from data-driven ROI; smarter R&D
• active asset management (human, technology, physical, digital)
• smart business culture design
• intelligent asset and partner acquisition, competitive procurement
• strong risk management
• improved compliance
• customer and marketplace intelligence (correlations, shocks, trends, gaps)
• streamlined operations
Building the linkage is the hardest part.
1. Establish a system of company-wide governance.
§ Identify and understand the company’s data domains
§ Document processes and assign accountabilities
§ Establish a Center of Data Excellence, and socialize
§ Implement metrics to measure governance itself
2. Decide what matters at your level.
§ Know the strategic KPIs for each part of the business
§ Identify gaps between existing and necessary metrics
§ Inventory and profile your data; calculate baselines
§ Set targets collaboratively and find synergies
3. Build your data projects roadmap.
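As a rough illustration of the "inventory and profile your data; calculate baselines" step, the sketch below computes three baseline metrics (completeness, conformity, uniqueness) drawn from the quality dimensions listed earlier. The records, field names, and the ISO date pattern are hypothetical examples, not data or rules from the presentation.

```python
import re

# Hypothetical sample records; field names are illustrative only.
records = [
    {"invoice_id": "INV-001", "amount": "1250.00", "paid_on": "2016-03-14"},
    {"invoice_id": "INV-002", "amount": "", "paid_on": "2016-03-15"},
    {"invoice_id": "INV-003", "amount": "99.5", "paid_on": "14/03/2016"},
    {"invoice_id": "INV-003", "amount": "410.00", "paid_on": "2016-03-16"},
]

# Assumed conformity rule: dates must look like YYYY-MM-DD.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def completeness(rows, field):
    """Share of rows where the field is present and non-blank."""
    return sum(1 for r in rows if r.get(field, "").strip()) / len(rows)


def conformity(rows, field, pattern):
    """Share of rows where the field matches the expected format."""
    return sum(1 for r in rows if pattern.match(r.get(field, ""))) / len(rows)


def uniqueness(rows, field):
    """Share of distinct values among all rows for the field."""
    return len({r.get(field) for r in rows}) / len(rows)


baseline = {
    "amount_completeness": completeness(records, "amount"),
    "paid_on_conformity": conformity(records, "paid_on", ISO_DATE),
    "invoice_id_uniqueness": uniqueness(records, "invoice_id"),
}

for metric, value in baseline.items():
    print(f"{metric}: {value:.0%}")
```

Baselines like these give a starting point for setting targets; which metrics matter depends on the KPIs they are linked to.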
Discussing “Accuracy”
§ Is OCR accuracy relevant? Yes, but…
• It doesn’t tell you the data-level error of a system; it typically reports only character-level accuracy.
• It doesn’t inform you of the cost savings or productivity gains.
o For example, “How many data elements do not require any intervention or review?”
§ Your data accuracy is not the same as the OCR accuracy.
§ Most (or all) OCR accuracy is measured at the character level.
§ Your form data is important at a word or field level.
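To make the distinction concrete, here is a minimal sketch that scores the same extraction result two ways: character-level accuracy (via edit distance) and field-level accuracy (exact match per data element). The form fields and values are hypothetical, and `accuracy_report` is an illustrative helper, not part of any OCR product.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def accuracy_report(truth: dict, extracted: dict) -> dict:
    """Compare extracted fields with ground truth at two levels."""
    total_chars = sum(len(v) for v in truth.values())
    char_errors = sum(levenshtein(truth[k], extracted.get(k, "")) for k in truth)
    correct_fields = sum(1 for k in truth if extracted.get(k, "") == truth[k])
    return {
        "char_accuracy": 1 - char_errors / total_chars,
        "field_accuracy": correct_fields / len(truth),
    }


# Hypothetical ground truth and OCR output; one digit of the account is wrong.
truth = {"name": "Jane Smith", "account": "0048213377",
         "amount": "1250.00", "date": "2016-03-14"}
extracted = {"name": "Jane Smith", "account": "0048218377",
             "amount": "1250.00", "date": "2016-03-14"}

report = accuracy_report(truth, extracted)
print(f"character-level accuracy: {report['char_accuracy']:.1%}")
print(f"field-level accuracy:     {report['field_accuracy']:.1%}")
```

In this made-up case a single wrong digit leaves character accuracy above 97% while only three of four fields are usable without review.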
If OCR output is 99% accurate at the character level, then a page of 500 words with an average
of 6 letters per word (3,000 characters) will contain about 30 incorrect characters. Those 30 errors can
be spread across as many as 30 different words, so the actual word-level error rate could be 6%
(30 of 500 words), not 1%. If part of that 6% includes an account number or SSN, that’s a big problem.
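The arithmetic above can be checked with a short sketch. It reports the worst case from the slide (every character error lands in a different word) and, as an added assumption not stated in the presentation, the word error rate if character errors were independent and uniformly likely. The function name is illustrative.

```python
def char_to_word_error(words: int, letters_per_word: int, char_accuracy: float) -> dict:
    """Translate a character-level accuracy into word-level error estimates."""
    total_chars = words * letters_per_word
    char_errors = round(total_chars * (1 - char_accuracy))
    # Worst case: each character error falls in a different word.
    worst_case = min(char_errors, words) / words
    # Assumption: errors are independent, so a word is wrong if any letter is.
    independent = 1 - char_accuracy ** letters_per_word
    return {
        "char_errors": char_errors,
        "worst_case_word_error": worst_case,
        "independent_word_error": independent,
    }


result = char_to_word_error(words=500, letters_per_word=6, char_accuracy=0.99)
print(f"incorrect characters:      {result['char_errors']}")
print(f"worst-case word error:     {result['worst_case_word_error']:.2%}")
print(f"independent-error estimate: {result['independent_word_error']:.2%}")
```

Both estimates land near 6%, roughly six times the 1% character error rate, because one bad character is enough to invalidate a whole word or field.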