Every organization produces and consumes data. Because data is so important to day to day operations, data trends are hitting the mainstream and businesses are adopting buzzwords such as Big Data, NoSQL, data scientist, etc., to seek solutions for their fundamental issues. Few realize that the importance of any solution, regardless of platform or technology, relies on the data model supporting it. Data modeling is not an optional task for an organization’s data effort. It is a vital activity that supports the solutions driving your business.
This webinar will address fundamental data modeling methodologies, as well as trends around the practice of data modeling itself. We will discuss abstract models and entity frameworks, as well as the general shift from data modeling being segmented to becoming more integrated with business practices.
Learning Objectives:
How are anchor modeling, data vault, etc. different and when should I apply them?
Integrating data models to business models and the value this creates
Application development (Data first, code first, object first)
What Are The Drone Anti-jamming Systems Technology?
Data-Ed Webinar: Data Modeling Fundamentals
1. Peter Aiken, Ph.D.
Data Modeling Fundamentals
10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056
2. Data Modeling Fundamentals
2Copyright 2016 by Data Blueprint Slide #
This presentation provides you with an understanding of
the data modeling and data development components
of data management. Participants will understand how
the analysis, design, implementation, deployment, and
maintenance of data solutions should be approached in
order to maximize the full value of the enterprise data
resources and activities. Architecting in quality is
imperative at this level and complements a subset of
project activities within the system development
lifecycle (SDLC) focused on defining data requirements,
designing data solution components, and implementing
these components. Participants will understand the
difficulties organizations experience when interacting
with data development efforts and how to best
incorporate these efforts into specific data projects.
Date: June 14, 2016
Time: 2:00 PM ET/11:00 AM PT
Presenter: Peter Aiken, Ph.D. & John Sells
3. Executive Editor at DATAVERSITY.net
3Copyright 2016 by Data Blueprint Slide #
Shannon Kempe
4. Commonly Asked Questions
4Copyright 2016 by Data Blueprint Slide #
1) Will I get copies of the
slides after the event?
2) Is this being recorded?
5. Get Social With Us!
5Copyright 2016 by Data Blueprint Slide #
Like Us on Facebook
www.facebook.com/
datablueprint
Post questions and
comments
Find industry news, insightful
content
and event updates.
Join the Group
Data Management &
Business Intelligence
Ask questions, gain insights
and collaborate with fellow
data management
professionals
Live Twitter Feed
Join the conversation!
Follow us:
@datablueprint
@paiken
Ask questions and
submit your comments:
#dataed
6. • 30+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 9 books and dozens of articles
• Experienced w/ 500+ data
management practices
• Multi-year immersions:
– US DoD (DISA/Army/Marines/DLA)
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart
– …
Peter Aiken, Ph.D.
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with
Dr. E. F. "Ted" Codd
• DAMA International Community Award 2005
PETER AIKEN WITH JUANITA BILLINGS
FOREWORD BY JOHN BOTTEGA
MONETIZING
DATA MANAGEMENT
Unlocking the Value in Your Organization’s
Most Important Asset.
The Case for the
Chief Data Officer
Recasting the C-Suite to Leverage
Your MostValuable Asset
Peter Aiken and
Michael Gorman
6Copyright 2016 by Data Blueprint Slide #
7. John Sells
• Data consultant with a background in Project
Management, Data Management, Verification
and Validation, as well as Application
Development
• Certified Data Management Professional
• Experience working with large global clients
across many business functions
• Skill-set includes in-depth analysis of clients’
business processes, analysis of data and data
sources, and development and communication
of data-centric tailored solutions that add
business value
• Expertise focuses on eliciting business and
technical requirements and facilitating
communication between the business users and
technical experts, including all levels of
management
• Helped clients improve data flow logistics,
develop data quality programs, implement data
governance programs, and design and
implement data warehouses and BI platforms for
organizational divisions.
7Copyright 2016 by Data Blueprint Slide #
8. 8Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
9.
UsesUsesReuses
What is data management?
9Copyright 2016 by Data Blueprint Slide #
Sources
Data
Engineering
Data
Delivery
Data
Storage
Specialized Team Skills
Data Governance
Understanding the current
and future data needs of an
enterprise and making that
data effective and efficient in
supporting
business activities
Aiken, P, Allen, M. D., Parker, B., Mattia, A.,
"Measuring Data Management's Maturity:
A Community's Self-Assessment"
IEEE Computer (research feature April 2007)
Data management practices connect
data sources and uses in an
organized and efficient manner
• Engineering
• Storage
• Delivery
• Governance
When executed,
engineering, storage, and
delivery implement governance
Note: does not well-depict data reuse
10.
What is data management?
10Copyright 2016 by Data Blueprint Slide #
Sources
Data
Engineering
Data
Delivery
Data
Storage
Specialized Team Skills
Resources
(optimized for reuse)
Data Governance
AnalyticInsight
Specialized Team Skills
11. Data$Management$
Strategy
Data Management Goals
Corporate Culture
Data Management Funding
Data Requirements Lifecycle
Data
Governance
Governance Management
Business Glossary
Metadata Management
Data
Quality
Data Quality Framework
Data Quality Assurance
Data
Operations
Standards and Procedures
Data Sourcing
Platform$&$
Architecture
Architectural Framework
Platforms & Integration
Supporting$
Processes
Measurement & Analysis
Process Management
Process Quality Assurance
Risk Management
Configuration Management
Component Process$Areas
DMM℠ Structure of
5 Integrated
DM Practice Areas
Data architecture
implementation
Data
Governance
Data
Management
Strategy
Data
Operations
Platform
Architecture
Supporting
Processes
Maintain fit-for-purpose data,
efficiently and effectively
11Copyright 2016 by Data Blueprint Slide #
Manage data coherently
Manage data assets professionally
Data life cycle
management
Organizational support
Data
Quality
12. You can accomplish
Advanced Data Practices
without becoming proficient
in the Foundational Data
Practices however
this will:
• Take longer
• Cost more
• Deliver less
• Present
greater
risk
(with thanks to
Tom DeMarco)
Data Management Practices Hierarchy
Advanced
Data
Practices
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Foundational Data Practices
Data Platform/Architecture
Data Governance Data Quality
Data Operations
Data Management Strategy
Technologies
Capabilities
Copyright 2016 by Data Blueprint Slide # 12
15. 15Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
17. Why Modeling
17Copyright 2016 by Data Blueprint Slide #
• Would you build a house without an
architecture sketch?
• Model is the sketch of the system to be
built in a project.
• Would you like to have an estimate how
much your new house is going to cost?
• Your model gives you a very good idea of
how demanding the implementation work
is going to be!
• If you hired a set of constructors from all
over the world to build your house, would
you like them to have a common
language?
• Model is the common language for the
project team.
• Would you like to verify the proposals of
the construction team before the work gets
started?
• Models can be reviewed before thousands
of hours of implementation work will be
done.
• If it was a great house, would you like to
build something rather similar again, in
another place?
• It is possible to implement the system to
various platforms using the same model.
• Would you drill into a wall of your house
without a map of the plumbing and electric
lines?
• Models document the system built in a
project. This makes life easier for the
support and maintenance!
18. Use Models to
18Copyright 2016 by Data Blueprint Slide #
• Store and formalize information
• Filter out extraneous detail
• Define an essential set of
information
• Help understand complex system behavior
• Gain information from the process of developing and
interacting with the model
• Evaluate various scenarios or other outcomes indicated by
the model
• Monitor and predict system responses to changing
environmental conditions
19. • Goal must be shared IT/business understanding
– No disagreements = insufficient communication
• Data sharing/exchange is largely and highly automated and
thus dependent on successful engineering
– It is critical to engineer a sound foundation of data modeling basics
(the essence) on which to build advantageous data technologies
• Modeling characteristics change over the course of analysis
– Different model instances may be useful to different analytical problems
• Incorporate motivation (purpose statements) in all modeling
– Modeling is a problem defining as well as a problem solving activity - both are inherent to
architecture
• Use of modeling is much more important than selection of a specific modeling method
• Models are often living documents
– It easily adapts to change
• Models must have modern access/interface/search technologies
– Models need to be available in an easily searchable manner
• Utility is paramount
– Adding color and diagramming objects customizes models and allows for a more engaging and
enjoyable user review process
Data Modeling for Business Value
19Copyright 2016 by Data Blueprint Slide #
Inspired by: Karen Lopez http://www.information-management.com/newsletters/enterprise_architecture_data_model_ERP_BI-10020246-1.html?pg=2
20. Data Modeling Ensures Interoperability
• Who makes decisions about the range and scope of
common data usage?
20Copyright 2016 by Data Blueprint Slide #
Program F
Program E
Program D
Program G
Program H
Program I
Application
domain 2Application
domain 3
21. database
architecture
engineering
effort
DataData
DataData
Data
Data
Data
Focus of a
software
architecture
engineering
effort Program A
Program B
Program C
Program F
Program E
Program D
Program G
Program H
Program I
Application
domain 1
Application
domain 2Application
domain 3
Data
Focus of a
Data
Data
Data Architecture Focus has Greater Potential Business Value
• Broader focus
than either
software
architecture or
database
architecture
• Analysis scope is
on the system
wide use of data
• Problems caused
by data exchange
or interface
problems
• Architectural
goals more
strategic than
operational
21Copyright 2016 by Data Blueprint Slide #
24. Data Modeling and Data Architecture
• Data modeling is used to articulate data architecture
components
• Data architectures are comprised of components – usually
expressed as models
• Styles of data modeling exist – this is a challenge
– IE or information engineering
– IDEF1X used by DoD
– ORM or object role modeling
– UML or unified modeling language
• Data models are useful
– In stand-alone mode
– As components of a larger information architecture
24Copyright 2016 by Data Blueprint Slide #
25. 25Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
26. Standard definition reporting does not provide conceptual context
26Copyright 2016 by Data Blueprint Slide #
Bed
Something you sleep in
27. Entity: BED
Data Asset Type: Principal Data Entity
Purpose: This is a substructure within the room
substructure of the facility location. It contains
information about beds within rooms.
Source: Maintenance Manual for File and Table
Data (Software Version 3.0, Release 3.1)
Attributes: Bed.Description
Bed.Status
Bed.Sex.To.Be.Assigned
Bed.Reserve.Reason
Associations: >0-+ Room
Status: Validated
The Power of the Purpose Statement
27Copyright 2016 by Data Blueprint Slide #
• A purpose statement describing
why the organization is
maintaining information about
this business concept
• Sources of information about it
• A partial list of the attributes or
characteristics of the entity
• Associations with other data
items; this one is read as "One
room contains zero or many
beds"
29. Data map of DISPOSITION
• At least one but possibly more system USERS enter the DISPOSITION facts into the system.
• An ADMISSION is associated with one and only one DISCHARGE.
• An ADMISSION is associated with zero or more FACILITIES.
• An ADMISSION is associated with zero or more PROVIDERS.
• An ADMISSION is associated with one or more ENCOUNTERS.
• An ENCOUNTER may be recorded by a system USER.
• An ENCOUNTER may be associated with a PROVIDER.
• An ENCOUNTER may be associated with one or more DIAGNOSES.
29Copyright 2016 by Data Blueprint Slide #
ADMISSION Contains information about patient admission
history related to one or more inpatient episodes
DIAGNOSIS Contains the International Disease Classification
(IDC) of code representation and/or description of a
patient's health related to an inpatient code
DISCHARGEA table of codes describing disposition types
available for an inpatient at a FACILITY
ENCOUNTER Tracking information related to inpatient
episodes
FACILITY File containing a list of all facilities in regional health
care system
PROVIDER Full name of a member of the FACILITY team
providing services to the patient
USER Any user with access to create, read, update, and
delete DISPOSITION data
30. 30Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
31. • Models
• Are usually for the
purpose of
understanding
• Can be
– Equations
– Simulations
including video games
– Physical models
– Mental models
Models as an Aid to Understanding
31Copyright 2016 by Data Blueprint Slide #
32. What is a model?
32Copyright 2016 by Data Blueprint Slide #
draw
critique
test
dialog
select
decide
filter
summarize
design
rank
review cluster
generate evaluate
list
visible to
participants
Structure for
organizing things
Framework for
decision making
Requires tools for problem solving and
decision making
Easy to review and
validate
graphic
text
Prototype and mockup
Framework for understanding and design
Source: Ellen Gottesdiener www.ebgconsulting.com
33. Don’t Tell Them You Are Modeling!
33Copyright 2016 by Data Blueprint Slide #
• Just write some
stuff down
• Then arrange it
• Then make
some
appropriate
connections
between your
objects
34. Keep them focused on the purpose
34Copyright 2016 by Data Blueprint Slide #
• The reason we are locked in
this room is to:
– Mission: Review proposal from
voice over IP providers
• Outcome: Walk out the door with the
top two proposals selected and
scheduled personal presentations from
each.
– Mission: Discuss logo ideas for
the Bore No More movement
• Outcome: We will walk out the door
when we identify the top three traits
that represent the Bore No More brand.
– Mission: Update all employees
on the retirement plan options
• Outcomes: Confirm that all team
members took part in the meeting and
have access to review their plans
privately with a financial consultant.
35. 35Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
37. Entity Relationship View
37Copyright 2016 by Data Blueprint Slide #
(adapted from [Davis 1990])
entity thing about which we maintain
information
object entity encapsulated with attributes
and functions
C U S T O M E R soda
machine
coin
return
deposits
selects
given to
dispenses
coins
38. Modeling In Support of Requirements
Person Job Class
Employee Position
BR1) Zero, one, or more
EMPLOYEES can be associated
with one PERSON
BR2) Zero, one, or more EMPLOYEES
can be associated with one POSITION
38Copyright 2016 by Data Blueprint Slide #
Job Sharing
Moon Lighting
39. 39Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
42. ANSI-SPARK 3-Layer Schema
42Copyright 2016 by Data Blueprint Slide #
For example, a changeover to a new
DBMS technology. The database
administrator should be able to change
the conceptual or global structure of the
database without affecting the users.
1. Conceptual - Allows independent
customized user views:
– Each should be able to access the same
data, but have a different customized
view of the data.
2. Logical - This hides the physical
storage details from users:
– Users should not have to deal with
physical database storage details. They
should be allowed to work with the data
itself, without concern for how it is
physically stored.
3. Physical - The database administrator
should be able to change the
database storage structures without
affecting the users’ views:
– Changes to the structure of an
organization's data will be required. The
internal structure of the database should
be unaffected by changes to the physical
aspects of the storage.
43. Conceptual Models
• Business
focused
• Entity level
• Provides focus,
scope, and
guidance to
modeling effort
• Sometimes
thrown away -
rarely maintained
43Copyright 2016 by Data Blueprint Slide #
44. Logical Models
• Required to achieve the transition
from conceptual to physical
• Developed to the attribute level via
3rd normal form - to a define level
of understandability
• Logical models are developed to be
refined to until it becomes a
solution - sometimes purchased (as
in EDW) always requires tailoring
• Used to guarantee the rigor of the
data structures by formally describing the relationship between data
items in a strong fashion - more often maintained
44Copyright 2016 by Data Blueprint Slide #
45. Physical Models
• Becomes the blueprints for
physical construction of the
solution
• Blueprints are used for future
maintenance of the solution
45Copyright 2016 by Data Blueprint Slide #
46. Model Evolution (better explanation)
46Copyright 2016 by Data Blueprint Slide #
As-is To-be
Technology
Independent/
Logical
Technology
Dependent/
Physical
abstraction
Other logical
as-is data
architecture
components
47. As Is Information
Requirements
Assets
As Is Data Design Assets As Is Data Implementation
Assets
ExistingNew
Modeling in Various Contexts
O2 Recreate
Data Design
Reverse Engineering
Forward engineering
O5 Reconstitute
Requirements
O9
Reimplement
Data
To Be Data
Implementation
Assets
O8
Redesign
Data
O4
Recon-
stitute
Data
Design
O3 Recreate
Requirements
O6
Redesign
Data
To Be
Design
Assets
O7 Re-
develop
Require-
ments
To Be
Requirements
Assets
O1 Recreate Data
Implementation
Metadata
47Copyright 2016 by Data Blueprint Slide #
48. Model Evolution Framework
48Copyright 2016 by Data Blueprint Slide #
Conceptual Logical Physical
Goal
Validated
Not Validated
Every change can
be mapped to a
transformation in
this framework!
50. 50Copyright 2016 by Data Blueprint Slide #
Data Modeling Fundamentals
1. Data Management Overview
2. Why data modeling & what is it?
3. The power of the purpose statement
4. Understanding how to contribute to
organizational challenges beyond
traditional data modeling
5. Guiding problem analyses
using data analysis
6. Using data modeling in conjunction with
architecture/engineering techniques
7. How to utilize data modeling in support of
business strategy
8. Take Aways, References & Q&A
Tweeting now:
#dataed
51. How do Data Models Support Organizational Strategy?
• Consider the opposite question:
– Were your systems explicitly designed to
be integrated or otherwise work together?
– If not then what is the likelihood that they
will work well together?
– In all likelihood your organization is spending between 20-40% of its
IT budget compensating for poor data structure integration
– They cannot be helpful as long as their structure is unknown
• Two answers
– Achieving efficiency and effectiveness goals
– Providing organizational dexterity for rapid implementation
51Copyright 2016 by Data Blueprint Slide #
52. Design Styles – 3NF
• A mathematical data design technique founded in the early 70s by E.F.
Codd.
• Organizes data in simple
rows and columns - Entities
• Creates connections
between the entities called
relationships to show how
the data is inter-related
• 3NF removes data
redundancies – a piece of
data is stored only once
• 3NF is based on mathematics, give the same facts to different
modelers; the models they produce should be very similar.
• Creates a visual (Entity Relation Diagram - ERD) which may be
understood by less technical personnel
• 3NF is the modeling style most popularly used for operationally focused
data stores.
52Copyright 2016 by Data Blueprint Slide #
53. Design Styles – Dimensional
• Created and refined by Ralph
Kimball in the 80s.
• Organizes data in Facts
and Dimensions. Fact
tables record the events
(what) within the business domain
and the Dimension tables describe
who, when, how and where.
• The data design style was created to
exploit the capabilities of the relational database to retrieve
and report against large volumes of data.
• Dimensional modeling sacrifices storage efficiency for
analytical processing speed
• There are 2 variations to Dimensional Modeling: Star Schema
and Snowflake
53Copyright 2016 by Data Blueprint Slide #
54. Design Styles – Data Vault
• One of the newer relational database modeling techniques
• Data Vault modeling was conceived in the 1990s by Dan
Linstedt
• Data Vault models are designed for central data
warehouses that store non-volatile, time-variant, atomic
data
• Relationships are defined through Link structures which
promote flexibility and extensibility
54Copyright 2016 by Data Blueprint Slide #
55. Data Models Used to Support Strategy
• Flexible, adaptable data structures
• Cleaner, less complex code
• Ensure strategy effectiveness measurement
• Build in future capabilities
• Form/assess merger and acquisitions strategies
55Copyright 2016 by Data Blueprint Slide #
Employee
Type
Employee
Sales
Person
Manager
Manager
Type
Staff
Manager
Line
Manager
Adapted from Clive Finkelstein Information Engineering Strategic Systems Development 1992
56. Mission and Purpose
• Develop, deliver and support products and services which
satisfy the needs of customers in markets
where we can achieve
a return on investment
at least 20% annually
within two years of
market entry
56Copyright 2016 by Data Blueprint Slide #
60. Next Step
60Copyright 2016 by Data Blueprint Slide #
Market
Market
Customer
Product
Need
Need
Customer
Product
Market
Need
ProductCustomer
Customer
Need
Market
Product
61. Subsequent Step for Business Value
61Copyright 2016 by Data Blueprint Slide #
Market
Market
Performance
Product
Performance
Need
Customer
Performance
Need
Performance
ProductCustomer
Performance
62. Questions?
It’s your turn!
Use the chat feature or Twitter (#dataed) to submit
your questions to Peter & John now!
+ =
62Copyright 2016 by Data Blueprint Slide #
63. Upcoming Events
Governing the Business Vocabulary – aligning the requirements
of the business and IT to achieve a shared understanding of
data across an organization
June 27, 2016 @ 8:30 AM ET
San Diego, CA
http://www.debtechint.com
Data Quality Success Stories
July 12, 2016 @ 2:00 PM ET/11:00 AM PT
Sign up here:
www.datablueprint.com/webinar-schedule
or www.dataversity.net
63Copyright 2016 by Data Blueprint Slide #