Without a complete and accurate understanding of how data flows through the organization, it is difficult to establish the processes and metrics that a successful data governance program needs. Data lineage that provides multi-layered views of the data (cross-system, end-to-end, and inner-lineage) plays a critical role in knowledge transfer, issue identification, insight into how sources and resources are used, impact analysis, and definition clarity, all of which strong data governance depends on.
In this presentation, Anilh Rameshwar, Data Architect at Zego, explains firsthand how automated data lineage gives his department visibility into its data, and why that visibility is critical to the organization's data governance efforts.
3. Challenges
o Trust the data:
§ For compliance
§ As part of the daily routine
§ Data quality
o Manage migration to a new database
o Data Governance:
§ Risk mitigation
§ Control data pipelines
4. Zego’s data engineering initiatives
• Build a data lake in Snowflake
§ Four core applications (one MySQL database, two PostgreSQL databases, one SQL Server database)
§ Data pipelines for external partners (Salesforce, payment processors, etc.)
• BI solutions include:
§ Looker
§ Tableau
§ Snowsight
5. Multiple areas of concern
• Major governance concern: inadequate data security controls; sensitive data was neither identified nor protected
• Unable to trace changes to sources, code, or transformations
• Derived columns with ambiguous names and unknown definitions
• No single source of truth
• No visibility into data consumption and usage patterns
6. Serving Data Governance
• Current system – serves only the financial department
• Future system – with data lineage and data catalog can:
– Control and manage data pipelines
– Get complete visibility into the entire data environment
– Mitigate risk
7. Usage Example – Impact Analysis
• A change in source data resulted in a dramatic decline of key aggregate metrics in certain reports.
• The investigation took a full week to trace the problem to a single stored procedure.
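
The kind of impact analysis described on this slide can be sketched as a walk over a lineage graph. The example below is a minimal illustration only, not Octopai's implementation: the object names and the edges in `LINEAGE` are hypothetical, and a real lineage tool would derive them by scanning the systems automatically.

```python
from collections import deque

# Hypothetical lineage edges (assumption): each key feeds the objects in its list.
LINEAGE = {
    "mysql.app.payments": ["snowflake.raw.payments"],
    "snowflake.raw.payments": ["proc.sp_aggregate_payments"],
    "proc.sp_aggregate_payments": ["snowflake.marts.payment_metrics"],
    "snowflake.marts.payment_metrics": [
        "looker.revenue_dashboard",
        "tableau.finance_report",
    ],
    "postgres.app.policies": ["snowflake.raw.policies"],
}


def downstream(graph, start):
    """Return every object reachable from `start`, in breadth-first order."""
    seen = {start}  # guards against revisiting nodes if the graph has cycles
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Which objects could be affected by a change to the payments source table?
impacted = downstream(LINEAGE, "mysql.app.payments")
for name in impacted:
    print(name)
```

With the graph available up front, the stored procedure and the reports that depend on it show up in one traversal instead of a manual search through each system.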
8. Usage Example – Data Governance
• Zego’s Data Governance program is relatively new
• Using Snowflake’s tagging feature in conjunction with Octopai’s automated data catalog
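
The slide does not show how the tags are applied, so the following is only a sketch of what Snowflake object tagging can look like when scripted. The helper `tag_statements`, the tag name `pii`, and the table and column names are all hypothetical; the generated statements use Snowflake's `CREATE TAG` and `ALTER TABLE ... MODIFY COLUMN ... SET TAG` syntax.

```python
def tag_statements(tag, columns):
    """Build Snowflake SQL that creates a tag and applies it to columns.

    `columns` is a list of (table, column, tag_value) tuples.
    Identifiers are interpolated without escaping, so this is for
    illustration with trusted input only.
    """
    stmts = [f"CREATE TAG IF NOT EXISTS {tag};"]
    for table, column, value in columns:
        stmts.append(
            f"ALTER TABLE {table} MODIFY COLUMN {column} SET TAG {tag} = '{value}';"
        )
    return stmts


# Hypothetical sensitive columns (assumption, not taken from the presentation).
statements = tag_statements(
    "pii",
    [
        ("customers", "email", "email"),
        ("customers", "date_of_birth", "birth_date"),
    ],
)
for stmt in statements:
    print(stmt)
```

Once columns carry tags like these, a data catalog can surface which downstream objects contain sensitive data.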
9. Why you need best-in-class lineage for best-in-class data governance
• Multiple source to target – cover as many systems as possible
• Visual data lineage – easy to analyze and comprehend data flows
• Automatic catalog with integrated lineage – access to data from various sources