Data quality and regulations are perpetual drivers for Data Governance solutions that systematically monitor the execution of data policy. And yet, there is along road ahead to achieve Data Governance: the term is still relatively unknown, there is no political forum in the form of a Data Governance Council, and software support is moderate. Time for change ! Data Governance requires automation on the one hand and a wide adoption of business to ICT on the other.
In this lecture, we set out the basic principles to successful develop Data Governance. By way of example, we show how to translate this in Collibra's Data Governance Center. We pay particular attention to identifying and modelling data policies and rules, and to empowering them on the basis of data stewardship and configurable workflows across silos and functions in the organization. The example is drawn from the Flanders Research Information Space, where data quality is critical to drive and boost pan-European Research policy.
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
Data Stewardship and Governance: how to reach global adoption and systematic monitoring of data policy through software ?
1. Data Governance and Data Stewardship
on how to reach global adoption and systematic monitoring of data policy through
software
Dr. Pieter De Leenheer
Co-founder
!
2. What we talk about when we talk about
no Data Governance
The Problem!
I wish these guys
spoke our language !
I can’t understand
this report !!
This doesn’t seem
right. Are we sure this
data is correct ?!
Who approved this?!
This is an exception !
to the rule !!
This rule is different
in our country !!
I’ve never seen this
code! Who introduced
this ?!
3. Data Management Challenges
• Data Service = data sharing agreement across organization
silos, policies, regulations, semantic assumptions
• No clear balance between data ownership and control:
•
•
responsibilities are not set
for each data point : increasing exposure to risk regarding quality
and policy compliance
! ask Alice, she knows
Regulatory+compliance+risks+con3nue+to+persist+and+remain+a+solid+driver+for+governance,+
risk+and+compliance+technologies.+However,+more+hype+is+being+generated+by+external+risks+
posed+by+third+par3es,+suppliers+and+customers.+
(Gartner!Hype!Cycle!on!Risk!and!Compliance!Tech,!2013)!!
4. Flanders Research Information Space
•
Providing Scientific Research Information and
Services
• Easy
• Transparent
• Open
• Timely
• Unambiguous
•
Supported by Data Governance
•
Qualitative meta data: e.g., definition for project,
funding codes, mappings, classifications, etc.
•
Roles and responsibilities for Information
Providers and Stiweto
•
Collaborative workflows between Information
Providers and Stiweto
By courtesy of G. Van Grootel, EWI
6. Context & Necessity
• Services are increasingly
• knowledge-intensive relying on millions of data points from
•
•
•
Partners
Third parties
Customers
• co-produced in federated, decentralised, multi-tier settings
• multi-disciplinary:
• Algorithm: e.g., Big Data Analytics
• Infrastructure: e.g., Internet of Things
• Service Innovation Methods: e.g, Living Labs
• Marketing: e.g., Service-dominant Logic
• sufficient…..? No
8. Data Stewardship & Governance
• Ownership + Responsibility => Power + Control
1. (global) data stewardship
!
!
Requirement 1: people who define data policy
E.g., multilingualism policy
2. (systematic) data governance
!
!
Requirement 2: processes that enforce data policy
E.g., every project abstract must be in English and Dutch
• Now let’s build software for it…
“New+Informa3on+infrastructure+technologies+must+enable+organiza3ons+to+
define,+organize,+share,+integrate+and+govern+data+and+content+to+create+business+
value”++
(Gartner!Hype!Cycle!on!Informa@on!Infrastructure!Tech!2013).!
10. ..and not all data points are create equal
Borrowed from Predrag Dizdarevic (Element 22 NYC)
Business Lines
Equities, Fixed Income,
Wealth Management ...
Critical Data
Elements
Corporate Functions
External
Auditors,
Clients,
Counterparties
...
Critical Data
Elements
Critical Data
Elements
Regulations
Critical Data
Elements
Dodd Frank Act, Basel III,
FATCA ...
Risk,
Compliance,
Finance ...
11. Can technology globalise and systematise data
policy scoping, definition and enforcement which is
by nature a human process?
13. Tools
Policy
Multilingualism
Business Rule
Business Rule
Abstract must be
in Dutch
Abstract must be
in English
?!
?!
?!
?!
Business Term
Researcher
Business Term
Publication
?!
?!
?!
Business Term
?!
Actie er Onder...
Business Term
Project
Code Value
4250
Code Value
G3
Code Value
4.3
Funding Community
Funding Sources Glossary
Generation 1 Funding Codes (Codelist)
code
Code Value
contains
4250
Business term
Actie ter
ondersteuning
van de
Strategische
prioriteiten van
de Federale
overheid
Code Set
Generation 1 Funding Codes
Business term
POD
wetenschapsb
eleid Federale
Impulsprogra
mma's
Generation 2 Funding Codes (Codelist)
Code Value
code
4.3
contains
Code Value
368
contains
Code Set
Generation 2 Funding Codes
Funding Stream Codes (Codelist)
Code Value
Code Set
G3
Funding Stream Codes
Accounting Codes (Codelist)
Code Value
Code Set
xxxx
Accounting Codes
14. Tools
Funding Community
Generation 1 Funding Codes (Codelist)
Funding Sources Glossary
code
Code Value
contains
4250
Business term
Policy
Multilingualism
Business Rule
Business Rule
Abstract must be
in Dutch
Abstract must be
in English
?!
?!
Business Term
Business Term
Researcher
Project
Business Term
Business Term
Publication
Actie er Onder...
?!
?!
Code Value
4250
Actie ter
ondersteuning
van de
Strategische
prioriteiten van
de Federale
overheid
Business term
POD
wetenschapsb
eleid Federale
Impulsprogra
mma's
Generation 2 Funding Codes (Codelist)
Code Value
code
4.3
contains
Code Value
368
Code Value
G3
Code Set
Generation 1 Funding Codes
contains
Code Set
Generation 2 Funding Codes
Funding Stream Codes (Codelist)
Code Set
G3
Code Value
4.3
Code Value
Funding Stream Codes
Accounting Codes (Codelist)
Code Value
Code Set
xxxx
Accounting Codes
15. Example for Funding Source Terms and Codes
Funding Community
Data Governance Council
Funding Sources Glossary
Generation 1 Funding Codes (Codelist)
code
Code Value
contains
4250
Business term
Actie ter
ondersteuning
van de
Strategische
prioriteiten van
de Federale
overheid
Code Set
Generation 1 Funding Codes
Business term
POD
wetenschapsb
eleid Federale
Impulsprogra
mma's
Generation 2 Funding Codes (Codelist)
Code Value
code
4.3
contains
Code Value
368
contains
Code Set
Generation 2 Funding Codes
Funding Stream Codes (Codelist)
Code Value
Code Set
G3
Funding Stream Codes
Accounting Codes (Codelist)
Code Value
Code Set
xxxx
Accounting Codes
16. Load, Define & Enforce
Data Governance Council: Governance Operating Model
Load…!
Data Governance
Organization
Roles &
Responsibilities
Processes &
Workflow
Establishes & drives
Collibra Platform
Asset Types &
Traceability
Collibra Data Stewardship
Manager (DSM)
Reports & Escalates
Collibra Business
Semantics Glossary (BSG)
Data Stewardship Activities
Collibra Reference Data
Accelerator (RDA)
Policy
Management
Business
Rules
Data Quality
Rules
Data Quality
Reporting
Issue
Management
Business &
Data Definitions
Business
Traceability
Hierarchy
Management
Semantic
Modeling
Mapping
Specifications
Reference Data
Authoring
Scope,!
select,!
define!
Reference Data
Crosswalks
Master Data
Stewardship
Data Quality Profiling
Aligns & Coordinates
Monitors & Remediates
IT / Operational Data Management Activities
enforce!
Metadata
Lineage
Metadata
Scanning
Data Quality
Development
DQ Defect
Resolution
Data
Modeling
Data
Integration
...
Other Data Management
Vendor products
17. 5 Modeling Concepts in DGC Operating Model
Domains logically group assets (according to their function, project, or knowledge area) and are
owned by exactly one community. It has a domain type that specifies which asset types can be
created in the domain.
E.g., Customer Domain groups all assets related to customer relationship management
E.g., Enterprise Rules and Policies Domain collects all valid policies and rules in the organisation
Communities are groups of people. They often correspond to
functional divisions in a company and should be aligned with
the company's governance organization. A community can
control/own various domains.
E.g., Finance Community includes relevant people in the finance
function, and controls the Customer Domain.
Assets are fundamental building blocks or resources
for which you want to capture information. An asset
belongs to exactly one domain. An asset has a unique
name within its domain..
E.g., Personal Privacy Policy, Customer, ISO 3166,
CRM, Customer Gender Disclosure Issue
Community Name
Domain
relation
Asset
Relations semantically relate 2 assets
E.g., between assets “Customer” and “CRM”: “Customer has system of
record / is system of record for CRM”
E.g., between assets “Customer” and “Gender”: “Customer has gender /
gender of Gender”
Attribute
Attributes are literal values such as strings or numbers that do
not form an asset on their own right.
E.g., the Description attribute for asset “Customer” is “Person that
placed at least one order for at least one product with Bank and
Insurance”
18. DGC Asset Types
Asset Types allow you to formally specify what type an asset is, as a
kind of template. They are assigned to one or more Domain Types.
E.g., Business Term is type for “Customer” and “Gender”
We distinguish between 4
E.g., Code Value is type for “CG_NA”;
main types of asset, and 1
E.g., System is type for “CRM”
special type called Issue
Asset
Governance
Asset
includes asset types such as
Policy and Rule
Business
Asset
Data Asset
Technology
Asset
Issue
includes asset types such as System and
Database
subsumes asset types such as Code Value
subsumes asset types such as Business Term, KPI, and Report
19. Traceability of Assets across Domains
Working Group on Rules and Policies
Enterprise Rules and Policies
Policy
Personal
Privacy Policy
Issue
Gender
Disclosure Issue
violates
governs / complies to
Finance
Customer Domain
"Person or […]
and Insurance"
description
Business Term
has gender
Customer
has system of record
Business Term
Gender
allowed value
Enterprise Architecture
System
CRM
Application Assets
Code Value
Code Value
Code Value
CG_NA
CG_FE
CG_MA
CRM Application Reference Data
Assigning types to assets, relations,
domains gives meaning; and brings
a better understanding of different
viewpoints on DG
!
29. FRIS Data Governance: Funding Sources Glossary Scenario
Funding Sources Glossary (FSG)
data governance officers in the Council
are delegated by each institute
Data Governance Council
…!
ECOOM
UGent
…!
VUB
30. FRIS Data Governance: Funding Sources Glossary Scenario (2)
•
5 (fictional) workflows for different phases in the lifecycle of a term:
candidate > proposed > draft > in-review > accepted
Funding Sources Glossary (FSG)
5
mapping accepted FSG terms
Ticket
Request
on-boarding candidate FSG term
1
Data Governance Council
delegating proposed FSG term
3
Create
2
draft term
Import
ECOOM
4
UGent
4
VUB
4
Discover
approving in-review term
approving in-review term
approving in-review term
31. on-boarding, delegating and drafting a Funding Source term
candidate > proposed > draft > in-review > accepted
33. Demonstration in the DGC Software Tool
•
5 workflows for different phases in the lifecycle of a term:
candidate > proposed > draft > in-review > accepted
Funding Sources Glossary (FSG)
5
1. Start-user who requests: Bob Brown
Ticket
Request
2. DGO Secretary motivates request:
Mike Jones
mapping accepted FSG terms
on-boarding candidate FSG term
3. Officers vote onboarding: John Fishe
1
Data Governance Council
delegating proposed FSG term
3
Create
4. DGO Secretary moves the
onboarded term: Mike Jones
2
draft term
Import
VUB
4
UGent
4
ECOOM
4
Discover
5. Steward drafts term: Pieter DL
approving in-review term
approving in-review term
6. Subject Matter Expert reviews: John West
7. Stakeholder comments: Judy Clarke
8. Co-Stewards vote : Mary Smith
approving in-review term
34. Conclusion
• FRIS Service = Qualitative Data Sharing
• Qualitative => Unambiguous, Timely, Accurate, Open, Complete,
Consistent, Valid, etc.
• Data Stewardship highlights Responsibility aspect of Data Ownership
• Data Governance programs enforces Data Quality Policy and
Regulations
• Data Governance Technologies are promising to handle these issues
that hamper service innovation
35. Conclusions
• Services are data-intensive
• Their coproduction requires data sharing across organisation policies /
modelling assumptions / regulations
• Data Stewardship highlights responsibility aspect of Data Power
• Data Governance programs enforces data policy and regulations
• Data Governance Technologies are promising to overcome these issues
that hamper service innovation
The problem emerges from too eager data management practice within silos. Silos increasingly have to serve each other’s data and they are confronted with a new range of data quality challenges that come from outside the silo.
Een goede cae om dit te illustreren is de Vlaamse departement van Economie, Wetenschap en Innovatie.
Creata awareness that power includes responsibilities
Data ownership implies Data Power and control
Responsibility is usually ignored yet necessary condition for this implication to be acceptable
historicall
Should we have a separate item for FACT TYPE (apart rom RELATION)?
Note to myself: for now 4 main asset types: but this should be better categories because assets could belong to different viewpoints
Question: Community, Individual, Resource, Event belong where ???
Customer reportability REPORT belangrijk te vernoemen
-sho the existing assets in Lima
-show responsibilities in demo
-show communities and domains in demo
-per step show statuses
-show tracebility aftwards