These slides give an overview of advanced data quality management (ADQM): why data quality (DQ) is important and which steps DQ management involves.
2. Agenda
• Motivation / Introduction
• Data Quality Definitions
• Foundation of Data Quality
• Data Quality Assessments
• Measuring Data Quality
• DQ-Organisation
• Data Policies
• Data Governance
• DQ Policies
• Data Profiling
Kiel University of Applied Sciences
3. Introduction
Today's world is heterogeneous.
We have different technologies.
We operate on different platforms.
Large amounts of data are generated
every day in all sorts of organizations and
enterprises.
And we do have problems with data.
4. The previous slides covered
the introduction and the
data quality definitions.
If you missed them, please check
those slides.
5. Foundation of Data Quality
I. Data Production System
II. IQ-Dimensions
III. IQ-Categories / IQ-Pattern
6. Maintenance of data quality
• Data quality results from going through the data:
scrubbing it, standardizing it, de-duplicating records,
and enriching it.
Maintain complete data.
Clean up your data by standardizing it using rules.
Use matching algorithms to detect duplicates, e.g. "ICS"
and "Informatics Computer System".
Avoid entry of duplicate leads and contacts.
Merge existing duplicate records.
Use roles for security.
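The standardizing and de-duplication steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abbreviation table, the function names, and the 0.9 similarity threshold are hypothetical and not taken from the slides; the similarity measure is the standard library's `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation rules; a real rule set would come from the data owners.
ABBREVIATIONS = {"ics": "informatics computer system"}

def standardize(name: str) -> str:
    """Lower-case, drop punctuation, collapse whitespace, expand known abbreviations."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    cleaned = " ".join(cleaned.split())
    return ABBREVIATIONS.get(cleaned, cleaned)

def is_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two names as duplicates when their standardized forms are nearly identical."""
    ratio = SequenceMatcher(None, standardize(a), standardize(b)).ratio()
    return ratio >= threshold

def merge_duplicates(names):
    """Keep the first occurrence of each group of duplicate names."""
    kept = []
    for name in names:
        if not any(is_duplicate(name, existing) for existing in kept):
            kept.append(name)
    return kept
```

With these assumed rules, `merge_duplicates(["ICS", "Informatics Computer System", "Acme Ltd"])` keeps only "ICS" and "Acme Ltd". Running the same check at data entry time is one way to avoid creating duplicate leads and contacts in the first place.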
7. Data Production System
• Data collector
• Data custodian
• Data consumer
9. IQ Dimensions
• Relevance
• Accuracy
• Timeliness
• Completeness
• Coherence
• Format
• Accessibility
• Compatibility
• Security
• Validity
• Accessibility
• Appropriate Amount of Data
• Believability
• Concise Representation
• Consistent Representation
• Ease of Manipulation
• Free of Error
• Interpretability
• Objectivity
• Relevancy
• Understandability
• Value-Added
10. Information Quality Dimensions
Dimensions
• Accessibility
The extent to which data is available, or easily and quickly
retrievable
• Appropriate Amount of Data
The extent to which the volume of data is appropriate for
the task at hand
• Believability
The extent to which data is regarded as true and credible
• Completeness
The extent to which data is not missing and is of sufficient
breadth and depth for the task at hand
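As a rough illustration of how completeness could be quantified, the sketch below computes the share of required field values that are actually filled in. The function name, the record layout (a list of dictionaries), and the rule that `None` and the empty string count as missing are assumptions made for this example, not definitions from the slides.

```python
def completeness(records, required_fields):
    """Fraction of required field values that are present (not None or empty)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0  # nothing is required, so nothing is missing
    present = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return present / total
```

For two records with required fields `name` and `email`, where one email is empty, this yields 0.75. Whether that value is sufficient depends on the task at hand, as the definition states.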
11. Information Quality Dimensions (continued)
• Concise Representation
The extent to which data is compactly represented
• Consistent Representation
The extent to which data is presented in the same
format
• Ease of Manipulation
The extent to which data is easy to manipulate and
apply to different tasks
• Free of Error
The extent to which data is correct and reliable
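Consistent representation can be checked mechanically when the expected format is known. The sketch below flags date strings that do not round-trip through one agreed format; the function name and the ISO-style default format are assumptions for illustration only.

```python
from datetime import datetime

def inconsistent_values(values, date_format="%Y-%m-%d"):
    """Return the values that are not written exactly in date_format."""
    bad = []
    for value in values:
        try:
            parsed = datetime.strptime(value, date_format)
        except ValueError:
            bad.append(value)  # not parseable in the expected format at all
            continue
        # strptime tolerates missing zero padding, so compare with the
        # canonical rendering to enforce one exact representation.
        if parsed.strftime(date_format) != value:
            bad.append(value)
    return bad
```

Note that this only tests the representation. A value such as a well-formed but wrong date still passes, so a separate check is needed for the "Free of Error" dimension.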
12. Information Quality Dimensions (continued)
• Interpretability
The extent to which data is in appropriate languages,
symbols, and units, and the definitions are clear
• Objectivity
The extent to which data is unbiased, unprejudiced, and
impartial
• Relevancy
The extent to which data is applicable and helpful for the
task at hand
• Security
The extent to which access to data is restricted
appropriately to maintain its security
13. Information Quality Dimensions (continued)
• Timeliness
The extent to which data is sufficiently up-to-date
for the task at hand
• Understandability
The extent to which data is easily comprehended
• Value-Added
The extent to which data is beneficial and
provides advantages from its use
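Timeliness, as defined above, is relative to the task, so any measurement needs a task-specific maximum age. A minimal sketch, assuming each record carries a last-update timestamp and that the acceptable age is supplied by the caller (both are assumptions for this example):

```python
from datetime import datetime, timedelta

def timely_share(last_updated, max_age, now):
    """Fraction of records whose last update is no older than max_age."""
    if not last_updated:
        return 1.0  # no records, so none are out of date
    timely = sum(1 for ts in last_updated if now - ts <= max_age)
    return timely / len(last_updated)
```

For example, with a maximum age of seven days, one record updated yesterday and one updated nine days ago give a timely share of 0.5.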
14. Questions
• How do organizations define data quality?
• What data quality problems arise in
organizations?
• How do organizations identify, analyze, and
resolve data quality problems?
• How do organizations encourage employees to
manage DQ / IQ proactively?
• Are there common data quality patterns?
• Across organizations
• Across DQ-projects
15. IQ Categories / Patterns
Intrinsic IQ
• Information has quality in its own right
Contextual IQ
• Information quality must be considered within
the context of the task
Accessibility IQ / Representational IQ
• Emphasizes the importance of the role of
systems
16. Intrinsic IQ
• Mismatch between several sources of the
“same” data
• “consistency” vs. “accuracy”
• Believability issues
• Poor reputation of sources
• Poor reputation for quality
• Subjective production of data
• Human judgment / knowledge in coding
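The first pattern, a mismatch between several sources of the "same" data, can be detected by comparing the sources key by key. The sketch below assumes each source is a dictionary from a shared record key to a value; the function name and the data layout are hypothetical.

```python
def find_mismatches(source_a, source_b):
    """Return {key: (value_a, value_b)} for keys in both sources whose values differ."""
    return {
        key: (source_a[key], source_b[key])
        for key in source_a.keys() & source_b.keys()
        if source_a[key] != source_b[key]
    }
```

Such a comparison only reveals inconsistency. It cannot say which of the two values is accurate, which is the "consistency" vs. "accuracy" distinction on this slide.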
18. Contextual IQ
Mismatch between information available and what
information is relevant for information consumers
• Missing data – the easy case
• Data bundling and analyzability – the hard case
Issue is aggregation
• Across record (transaction) analysis of data
• e.g. Corporate Actions in banking
• Often across distributed systems
Incompatible, distributed systems (HMO)
20. Accessibility IQ / Representational IQ
Technical Accessibility
• Physical access
• Computing resources
Time to Access / Ease of Access:
• Amount of data
• Privacy, confidentiality
Interpretability and Understandability:
• Coding
Representation and its Analyzability:
• Image and text data
21. Accessibility IQ / Representational IQ