© 2019 IDERA, Inc. All rights reserved.
STRAIGHT TALK TO DEMYSTIFY DATA LINEAGE
DRIVING ENTERPRISE DATA GOVERNANCE
▪ Key drivers for instituting data governance:
• Improved information utilization
• Better data quality
• Improved interoperability
• Improved technical operationalization
• Reduced operational costs
• Streamlined design and development
• Improved business accountability
• Compliance with data use agreements
• Compliance with regulatory demands
• Improved business results
• Trustworthy analytics
• Trustworthy reporting
OBJECTIVES OF A DATA GOVERNANCE PROGRAM
▪ Understand and interpret business data dependencies
▪ Define and approve data policies
▪ Develop procedures for operationalization
▪ Continuously monitor compliance
DATA LINEAGE POWERS DATA GOVERNANCE
▪ Data lineage methods help to develop a map of the enterprise data landscape
▪ Data lineage provides a holistic description of each data object’s
• Sources
• Information pipelines
• Transformations
• Methods of access
• Controls
• All other fundamental aspects of information utility
ASPECTS OF DATA LINEAGE
Data lineage combines three different aspects of corporate metadata:
▪ Business lineage: the semantic aspects of tracing data meaning and usage semantics
▪ Technical lineage: the structural aspects of data element concepts and their use across the enterprise
▪ Procedural lineage: a trace of data's journey through different systems and data stores, providing an audit trail of the changes along the way
TECHNIQUES SUPPORTING LINEAGE
▪ Policy management
▪ Glossary
▪ Business process model
BUSINESS LINEAGE
▪ Inventory and description of business characteristics of data assets captured within a data catalog, accumulating information such as:
• Data asset description
• Business glossary
• Data asset location
• Data sensitivity
• Access rights
TECHNICAL/STRUCTURAL LINEAGE
▪ Catalogs which data element concepts are used
▪ Notes how data element concepts are manifested as data elements within specific data assets
▪ Not limited to static data sets
• Data in motion
• Manifestation of data element concepts in dynamic contexts such as reports and feature sets for analysis
PROCEDURAL LINEAGE
▪ Identify the original introduction of data elements
▪ Establish the process flow for data elements that are central to data policy compliance
▪ Draft a mapping of data element use to the business application touch points
▪ Determine where data instances are created, updated, or just read
▪ Document transformations applied
BENEFITS OF DATA LINEAGE
▪ Analyzing data dependencies
▪ Validating semantic consistency
▪ Impact analysis
▪ Data quality root cause analysis
▪ Integrating data controls
▪ Enforcing regulatory compliance
▪ Protecting sensitive data
Resulting in:
▪ Better data quality
▪ Better business decisions
ANALYZING DATA DEPENDENCIES
▪ Unexposed data dependencies introduce risks in ensuring high-quality usable data
• Reports, dashboards, and analyses may appear to be derived from data sets from isolated systems, but in many cases there is a chain of processing that ultimately originates with data taken from a shared data source
• Multiple data sets may be populated using data from distinct yet structurally and semantically equivalent sources
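One way to expose a hidden shared source is to walk each report's lineage back to its original sources and intersect the results. The sketch below is illustrative only: the asset names and the `UPSTREAM` map are hypothetical, and it assumes the lineage graph has no cycles.

```python
# Hypothetical upstream map: each asset lists the assets it is derived from.
UPSTREAM = {
    "report.churn": ["mart.customer"],
    "dashboard.sales": ["mart.orders"],
    "mart.customer": ["crm.accounts"],
    "mart.orders": ["erp.orders", "crm.accounts"],
}

def root_sources(asset, upstream=UPSTREAM):
    """Return the original sources (assets with no recorded parents) behind `asset`."""
    parents = upstream.get(asset, [])
    if not parents:
        return {asset}
    roots = set()
    for parent in parents:
        roots |= root_sources(parent, upstream)
    return roots

# Two outputs that look independent may still share an original source.
shared = root_sources("report.churn") & root_sources("dashboard.sales")
print(shared)
```

Here the two outputs appear to come from separate marts, yet the intersection shows that both depend on the same account data.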
VALIDATING SEMANTIC CONSISTENCY
▪ Social Security Number
• Identifier: unique number assigned by the Social Security Administration
• Authentication: last four digits of the number assigned by the Social Security Administration
▪ Customer ID
• Identifier: unique number assigned by the company
IMPACT ANALYSIS
▪ External drivers and directives may demand changes to organizational information systems
▪ Data lineage allows forward-dependency tracing to identify downstream systems impacted by changes to
• Business term definitions
• Data element specifications
• Augmentation of data element semantics
• Changes in business process flow
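Forward-dependency tracing can be sketched as a breadth-first walk over lineage edges. This is a minimal illustration, not a product feature: the `LINEAGE` map and its asset names are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed as its values.
LINEAGE = {
    "crm.customer": ["staging.customer"],
    "staging.customer": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.churn", "dashboard.sales"],
    "erp.orders": ["warehouse.fact_orders"],
    "warehouse.fact_orders": ["dashboard.sales"],
}

def downstream(asset, lineage=LINEAGE):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen, order, queue = {asset}, [], deque([asset])
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

# Which assets are affected if the definition of crm.customer changes?
impacted = downstream("crm.customer")
print(impacted)
```

The returned list is the set of downstream systems to review before the change is rolled out.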
ISSUE ROOT CAUSE ANALYSIS
▪ Data lineage maps the information production flow
▪ A data steward can use the lineage maps to reverse-trace back through the data production flow
▪ Enables identification of the point of introduction of a data error
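The reverse trace can be sketched as a walk upstream from a failing output that stops at assets that fail while all of their inputs pass. The `UPSTREAM` map, the asset names, and the `is_valid` check are assumptions for illustration; a real check would query data quality results.

```python
# Hypothetical upstream map: each asset lists the assets it is built from.
UPSTREAM = {
    "report.revenue": ["warehouse.fact_orders"],
    "warehouse.fact_orders": ["staging.orders", "staging.fx_rates"],
    "staging.orders": ["erp.orders"],
    "staging.fx_rates": ["vendor.fx_feed"],
    "erp.orders": [],
    "vendor.fx_feed": [],
}

def error_origins(asset, is_valid, upstream=UPSTREAM):
    """Walk upstream from a failing asset; return failing assets whose inputs are all valid."""
    origins, stack, seen = [], [asset], set()
    while stack:
        node = stack.pop()
        if node in seen or is_valid(node):
            continue
        seen.add(node)
        bad_parents = [p for p in upstream.get(node, []) if not is_valid(p)]
        if bad_parents:
            stack.extend(bad_parents)  # the error came from further upstream
        else:
            origins.append(node)       # inputs are clean, so the error starts here
    return origins

# Assumed quality results: these assets failed their checks.
failing = {"report.revenue", "warehouse.fact_orders", "staging.fx_rates"}
print(error_origins("report.revenue", lambda a: a not in failing))
```

In this example the trace ends at the exchange-rate staging step, since its vendor feed passes its checks.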
INTEGRATED DATA CONTROLS
▪ Identification of “problem spots” and key phases in business information flows highlights opportunities for integrated data controls
▪ Data controls validate data flowing through selected processing phases
▪ Alerts are generated when invalid data values are identified
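A data control at a processing phase can be sketched as a set of per-field rules that emit an alert for each invalid value. The field names, rules, and phase name below are hypothetical examples, not prescribed validations.

```python
import re

# Hypothetical rules: field name -> predicate that returns True for a valid value.
RULES = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str)
    and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "order_total": lambda v: isinstance(v, (int, float)) and v >= 0,
}

def run_control(phase, records, rules=RULES):
    """Check each record against the rules and return one alert per invalid value."""
    alerts = []
    for index, record in enumerate(records):
        for field, is_valid in rules.items():
            value = record.get(field)  # a missing field is checked as None
            if not is_valid(value):
                alerts.append(
                    {"phase": phase, "record": index, "field": field, "value": value}
                )
    return alerts

batch = [
    {"customer_id": 17, "email": "ana@example.com", "order_total": 42.5},
    {"customer_id": -3, "email": "not-an-email", "order_total": 10},
]
for alert in run_control("staging.load", batch):
    print(alert)
```

Each alert records the phase, so the steward can tell where in the flow the invalid value was caught.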
ENFORCING DATA POLICIES
▪ Data policies can be formulated to reflect externally imposed data compliance requirements
▪ Business lineage is used to
• Capture external policy definitions
• Standardize semantics across different application usage of shared data element concepts
▪ Technical lineage allows for
• Standardized specifications for data element validation
• Institution of audit controls for demonstrating compliance
▪ Examples:
• GDPR
• CCPA
• 12 CFR Part 11
• HIPAA Privacy Rule
PROTECTING SENSITIVE DATA
▪ Business lineage traces the origin and levels of data sensitivity
▪ Coupling it with procedural lineage allows for insertion of data protection techniques
• Encryption at rest
• Encryption in motion
• Data masking
• Access controls
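As an illustration of one of these techniques, data masking can be driven by sensitivity labels recorded in the catalog. The labels, field names, and masking rule below are assumptions for the sketch; unlabeled fields are treated as restricted so that nothing leaks by default.

```python
# Hypothetical sensitivity labels, as a data catalog might record them per field.
SENSITIVITY = {
    "name": "internal",
    "ssn": "restricted",
    "email": "confidential",
    "city": "public",
}

def mask_value(value, keep_last=4):
    """Replace all but the last `keep_last` characters with asterisks."""
    text = str(value)
    if len(text) <= keep_last:
        return "*" * len(text)
    return "*" * (len(text) - keep_last) + text[len(text) - keep_last:]

def protect(record, labels=SENSITIVITY, masked_levels=("restricted", "confidential")):
    """Return a copy of the record with sensitive fields masked."""
    return {
        field: mask_value(value)
        if labels.get(field, "restricted") in masked_levels
        else value
        for field, value in record.items()
    }

row = {"name": "Ana Silva", "ssn": "123-45-6789", "email": "ana@example.com", "city": "Austin"}
print(protect(row))
```

The same labels could decide where encryption or access controls apply at each step of the procedural lineage.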
SOME QUESTIONS DATA LINEAGE CAN ANSWER
▪ To understand organizational data
• What’s important?
• Where is it? (can be many places)
• Where did it come from?
• How is it used (business processes)?
• What is the chain of custody?
• What are the business rules?
▪ To support governance
• How do I identify private information?
• How long should I keep the information?
• Master Data Management classification
• Data quality
• Is it fit for purpose?
• What changed and why?
CONSIDERATIONS
▪ Data lineage augments the corporate toolkit for deploying data governance
▪ Look for products that simplify the data steward’s consumption of data lineage mappings, and have:
• The ability to enable users to see the flow of data through the data production lifecycle
• A mechanism for enumerating the data sources for the different data pipelines
• The ability to identify data elements and link them to data models and to metadata for data element concepts and business glossaries
• A method of documenting data transformations and allowing data professionals to review those transformations across a variety of data pipelines
• The capability of interoperating with existing ETL/data integration tools to import data pipelines along with their collected transformations
• A means for collaboration around data pipelines and associated metadata
• The ability to display a visual presentation allowing data stewards to review the data lineage
THANKS!
Any questions?
Learn more at:
www.idera.com