It is quite possible to use Agile techniques to create and maintain a data architecture. Doing so can dramatically reduce the risk of failed data warehouse projects. This webinar gives a quick overview of the benefits and challenges of Agile Data Modeling, Evolutionary Database Design, Agile Modeling, Conformed Dimensions, the Bus Matrix, Database Refactoring, and an Agile framework for data projects.
3. After the webinar…
• We will send directions for collecting the PDU you earn
by attending this webinar
• We will also send links to the recorded webinar and
presentation slides once they are posted online
For more information, visit www.cprime.com
4. Your Instructor
• Tim Guay has over 25 years of IT experience and has
applied Agile methodologies since 2002.
• Enterprise Data Warehouse Specialist for 6 1/2 years
• Managed major DW projects
• PMP Certified since 2001, CSM since 2008, PMI-ACP
since 2012, and Lean Sensei since 2013.
• Clients have included government agencies, start-ups,
and Fortune 500 corporations.
• Agile trainer and coach.
5. Agenda
• Agile Data is Possible
• Why do it?
• Guiding Principles
• Evolutionary design
• Database Refactoring
• Hyper-normalization and Generalization
• Agile Modeling
• Q & A
6. Agile Data Is Possible
• Many say that creating an enterprise-level database or
data warehouse requires Big Design Up Front (BDUF)
• An Agile approach is possible, and it is actually the better
way to go, as both Kimball and Inmon attest
• Kimball’s architecture is the best suited, though, and is
the one that underlies this presentation
• It is best suited because of its:
• Bottom-up approach
• Conformed Dimensions and Bus Matrix
7. Agile Data is Possible
Goals of Agile Data Architecture
•To architect to support the delivery of working DW/BI
functionality early and continuously to our customers
•To architect for change
•Scott Ambler is a key thought leader in this space
8. Why Do It?
• Agile Myths - Too risky, no planning, no design, no
documentation, cowboy coding, only good for small
projects
• Waterfall realities - Overall failure rate 29% (Standish),
DW failure rate 50%+ (Gartner)
9. Why Do It?
• DW Failure Modes:
• Insufficient business involvement
• Underestimating the complexity and scope
• Not anticipating or allowing change
• Misunderstood expectations
• Overcomplicated architecture
• Poor understanding of the data
10. Guiding Principles
Agile Principles
1. Our highest priority is to satisfy the customer through early and continuous
delivery of valuable software
2. Welcome changing requirements, even late in development. Agile processes
harness change for the customer's competitive advantage
3. Deliver working software frequently, from a couple of weeks to a couple of
months, with a preference to the shorter timescale
4. Business people and developers must work together daily throughout the
project
5. Build projects around motivated individuals. Give them the environment and
support they need, and trust them to get the job done
6. The most efficient and effective method of conveying information to and within
a development team is face-to-face conversation
11. Guiding Principles
Agile Principles
7. Working software is the primary measure of progress
8. Agile processes promote sustainable development. The sponsors, developers,
and users should be able to maintain a constant pace indefinitely
9. Continuous attention to technical excellence and good design enhances agility
10. Simplicity--the art of maximizing the amount of work not done--is essential
11. The best architectures, requirements, and designs emerge from
self-organizing teams
12. At regular intervals, the team reflects on how to become more effective, then
tunes and adjusts its behavior accordingly.
12. Evolutionary Design
Key Practices
•Close collaboration between DBAs and developers
•Each developer gets their own DB instance and test data
•Continuous integration into the shared master
•Automate the refactoring
•Automatically update the developer instances whenever
the master is changed
•Have a clear DB access layer within the code
•Beware of delivering one-off solutions
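One way to read "automate the refactoring" and "automatically update the developer instances" is as a versioned migration script that every database instance runs. The following is a minimal sketch under that assumption, using Python's built-in sqlite3 module; the `MIGRATIONS` list, the `schema_version` table, and the `customer` table are hypothetical names chosen for illustration, not something the webinar prescribes.

```python
import sqlite3

# Hypothetical ordered list of schema changes; each entry is applied exactly once.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Return the highest migration number already applied (0 if none)."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every pending migration in order and record each one."""
    applied = []
    start = current_version(conn)
    for version, ddl in MIGRATIONS:
        if version > start:
            conn.execute(ddl)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied


# Each developer instance (here an in-memory database) runs the same script.
dev_db = sqlite3.connect(":memory:")
first_run = migrate(dev_db)
second_run = migrate(dev_db)  # nothing is pending the second time
print(first_run, second_run)
```

Because the script records what it has applied, it can be run on every change to the shared master without harming an instance that is already up to date.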
13. Evolutionary Design
Laying the Foundation - Conformed Dimensions
•Conformed dimensions are descriptive master reference data that
are referenced in multiple dimensional models
•Fundamental to the Kimball approach
•Enables Agile DW/BI by leveraging existing conformed dimensions
•Start by identifying a subset of attributes that have significance
across the enterprise and iteratively grow from there
•Failure to create conformed dimensions from the start will result
in significant technical debt and is one of the key reasons for Agile
DW project failure
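The idea of a conformed dimension referenced by multiple dimensional models can be sketched as follows. This is an illustrative example only: the tables `dim_customer`, `fact_sales`, and `fact_support_calls` and their data are invented, and sqlite3 stands in for a real warehouse platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One conformed dimension, shared by both fact tables (all names are hypothetical).
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT, region TEXT);
CREATE TABLE fact_sales (customer_key INTEGER REFERENCES dim_customer, amount REAL);
CREATE TABLE fact_support_calls (customer_key INTEGER REFERENCES dim_customer, minutes INTEGER);

INSERT INTO dim_customer VALUES (1, 'Acme', 'West'), (2, 'Globex', 'East');
INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 75.0);
INSERT INTO fact_support_calls VALUES (1, 30), (2, 10), (2, 20);
""")

# Each business process is summarized by the same dimension attribute (region).
sales = dict(conn.execute("""
    SELECT d.region, SUM(f.amount)
    FROM fact_sales f JOIN dim_customer d ON d.customer_key = f.customer_key
    GROUP BY d.region"""))
calls = dict(conn.execute("""
    SELECT d.region, SUM(f.minutes)
    FROM fact_support_calls f JOIN dim_customer d ON d.customer_key = f.customer_key
    GROUP BY d.region"""))

# Because both results use the same region values, they line up without any mapping.
drill_across = {
    region: (sales.get(region, 0), calls.get(region, 0))
    for region in sorted(set(sales) | set(calls))
}
print(drill_across)
```

A second team adding `fact_support_calls` in a later iteration reuses `dim_customer` as it stands, which is the reuse the slide describes.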
14. Evolutionary Design
Laying the Foundation - Bus Matrix
•Each column is a conformed dimension
•Separate columns describe other information associated with
each business process, e.g. its owner
•Each row is a business process
•Each dimension is associated with a process by an X in the
intersecting cell
•Meets the Agile principle of just enough documentation
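The matrix layout described above is simple enough to hold in a small data structure. The sketch below is an assumption about how one might do that; the processes, owners, and dimensions are made up, and `render` and `reuse_count` are hypothetical helper names.

```python
# Hypothetical conformed dimensions (columns) and business processes (rows).
DIMENSIONS = ["Date", "Customer", "Product", "Employee"]

BUS_MATRIX = {
    "Sales orders": {"owner": "Sales", "dimensions": {"Date", "Customer", "Product"}},
    "Shipments": {"owner": "Logistics", "dimensions": {"Date", "Customer", "Product", "Employee"}},
    "Support calls": {"owner": "Service", "dimensions": {"Date", "Customer", "Employee"}},
}


def render(matrix, dimensions):
    """Return text lines: one row per process, an X under each dimension it uses."""
    header = ["Process", "Owner"] + dimensions
    rows = [header]
    for process, info in matrix.items():
        marks = ["X" if d in info["dimensions"] else "" for d in dimensions]
        rows.append([process, info["owner"]] + marks)
    widths = [max(len(r[i]) for r in rows) for i in range(len(header))]
    return [" | ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip() for r in rows]


def reuse_count(matrix, dimension):
    """Number of business processes that share a given conformed dimension."""
    return sum(dimension in info["dimensions"] for info in matrix.values())


lines = render(BUS_MATRIX, DIMENSIONS)
print("\n".join(lines))
```

Counting the X marks per column, as `reuse_count` does, gives a rough indication of which dimensions are worth conforming first.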
15. Evolutionary Design
Laying the Foundation - Bus Matrix
•Can be done in a matter of days with the right people at the
table and a skilled facilitator
•Solid understanding of data and processes is required
•Collaboration is key
•Provides the Agile master plan and list of reusable common
dimensions
•Focusing on one row at a time reduces risks from overly
ambitious plans and supports the Agile principle of rapid
development of valuable software
16. Evolutionary Design
Database Encapsulation Layer
•Software architecture should include a database
encapsulation layer, also known as a persistence layer or data layer
•It hides the physical details of the DB from the business code
•If the DB changes, only this layer needs to change
•It consolidates all DB access code in ‘one’ place
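A database encapsulation layer can be as small as one class that owns all the SQL. The following is a minimal sketch, assuming a hypothetical `CustomerStore` class and `customer` table; it shows one possible shape of such a layer rather than a recommended framework.

```python
import sqlite3


class CustomerStore:
    """Hypothetical data access layer: the only code that knows table and column names."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)"
        )

    def add(self, name):
        """Insert a customer and return the generated id."""
        cur = self._conn.execute("INSERT INTO customer (full_name) VALUES (?)", (name,))
        self._conn.commit()
        return cur.lastrowid

    def get_name(self, customer_id):
        """Return the customer's name, or None if the id is unknown."""
        row = self._conn.execute(
            "SELECT full_name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def greeting(store, customer_id):
    """Business code: depends on CustomerStore methods, never on SQL."""
    name = store.get_name(customer_id)
    return f"Hello, {name}!" if name else "Unknown customer"


store = CustomerStore(sqlite3.connect(":memory:"))
new_id = store.add("Ada Lovelace")
message = greeting(store, new_id)
print(message)
```

If the `full_name` column were later renamed or split, only the two SQL strings inside `CustomerStore` would change; `greeting` would stay as it is.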
17. Evolutionary Design
Database Encapsulation Layer - Variations
•Single application, single DB - pretty straightforward
•Multiple applications, single DB - common when there is a
legacy DB
•Multiple applications, multiple DBs
•Implement via direct SQL access, DAOs, Persistence
Frameworks, or services
18. Database Refactoring
• Essentially normalization after the fact
• Refactorings are small design improvements to the schema that
preserve its behavioral and informational semantics
• They cover both structural and functional aspects
• A refactoring can involve making three changes together:
1. Changing the schema
2. Migrating the data to the new schema
3. Changing the DB access code
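The three changes listed above can be sketched with a Split Column refactoring. This is an illustrative example using sqlite3; the `person` table, its data, and the `display_name` function are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person (name) VALUES (?)", [("Grace Hopper",), ("Alan Turing",)])

# 1. Change the schema: add the new columns alongside the old one.
conn.execute("ALTER TABLE person ADD COLUMN first_name TEXT")
conn.execute("ALTER TABLE person ADD COLUMN last_name TEXT")

# 2. Migrate the data into the new columns (naive split on the first space).
for person_id, name in conn.execute("SELECT id, name FROM person").fetchall():
    first, _, last = name.partition(" ")
    conn.execute(
        "UPDATE person SET first_name = ?, last_name = ? WHERE id = ?",
        (first, last, person_id),
    )
conn.commit()


# 3. Change the DB access code so it reads the new columns.
def display_name(conn, person_id):
    first, last = conn.execute(
        "SELECT first_name, last_name FROM person WHERE id = ?", (person_id,)
    ).fetchone()
    return f"{last}, {first}"


print(display_name(conn, 1))
```

The old `name` column is left in place here on purpose. A common approach is to keep it for a transition period so that other applications can switch over before it is dropped.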
19. Hyper Normalization & Generalization
• Hyper-normalization: beyond 3NF
• Data Vault, with attributes in satellite tables and foreign
keys moved to link tables
• Allows changes to data relationships without changing the data
(hub) tables
• Hyper-generalization: all hub data is moved to a single table,
with a table of tables to identify which rows belong to which
data category. Only one link table is then needed
• Reduces complexity and collateral damage from changes
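The hub, satellite, and link split can be sketched as a tiny schema. This is a heavily simplified illustration, not a full Data Vault design (it omits details such as record sources), and every table and column name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs hold only business keys.
CREATE TABLE hub_customer (customer_id INTEGER PRIMARY KEY, customer_number TEXT UNIQUE);
CREATE TABLE hub_product  (product_id  INTEGER PRIMARY KEY, product_code TEXT UNIQUE);

-- Satellites hold descriptive attributes, keyed back to a hub.
CREATE TABLE sat_customer (customer_id INTEGER REFERENCES hub_customer, name TEXT, loaded_at TEXT);

-- Links hold the relationships (the foreign keys), so hubs keep their shape.
CREATE TABLE link_purchase (customer_id INTEGER REFERENCES hub_customer,
                            product_id  INTEGER REFERENCES hub_product);

INSERT INTO hub_customer VALUES (1, 'C-001');
INSERT INTO hub_product  VALUES (10, 'P-010');
INSERT INTO sat_customer VALUES (1, 'Acme', '2024-01-01');
INSERT INTO link_purchase VALUES (1, 10);
""")


def hub_columns(conn):
    """Column names of hub_customer, read from SQLite's table_info pragma."""
    return [row[1] for row in conn.execute("PRAGMA table_info(hub_customer)")]


hub_columns_before = hub_columns(conn)

# A new relationship (one customer refers another) is just a new link table.
conn.execute("""
    CREATE TABLE link_referral (referrer_id INTEGER REFERENCES hub_customer,
                                referred_id INTEGER REFERENCES hub_customer)""")

hub_columns_after = hub_columns(conn)

purchases = conn.execute("""
    SELECT s.name, p.product_code
    FROM link_purchase l
    JOIN sat_customer s ON s.customer_id = l.customer_id
    JOIN hub_product p ON p.product_id = l.product_id""").fetchall()
print(hub_columns_before == hub_columns_after, purchases)
```

Adding `link_referral` leaves `hub_customer` untouched, which is the property the slide points to when it says relationships can change without changing the hub tables.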
20. Database Refactoring
Examples include:
•Apply Standard Types to Similar Data
•Consolidate Key Strategy for Entity
•Encapsulate Common Structure With View
•Introduce Column Constraint
•Introduce Common Format
•Introduce Lookup Table
•Migrate Database Method to Application
•Rename Column
•Replace One-To-Many With Associative Table
•Replace View With Method(s)
•Split Column
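Two of the refactorings named above, Introduce Lookup Table and Encapsulate Common Structure With View, can be sketched together. The `orders` and `order_status` tables, the status codes, and the `order_detail` view are invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Starting point: status codes whose meaning lives only in application code.
CREATE TABLE orders (id INTEGER PRIMARY KEY, status_code TEXT);
INSERT INTO orders (status_code) VALUES ('N'), ('S'), ('S');

-- Introduce Lookup Table: give the codes a single, documented home.
CREATE TABLE order_status (status_code TEXT PRIMARY KEY, description TEXT NOT NULL);
INSERT INTO order_status VALUES ('N', 'New'), ('S', 'Shipped');

-- Encapsulate Common Structure With View: readers no longer repeat the join.
CREATE VIEW order_detail AS
    SELECT o.id, o.status_code, s.description
    FROM orders o JOIN order_status s ON s.status_code = o.status_code;
""")

rows = conn.execute("SELECT id, description FROM order_detail ORDER BY id").fetchall()
print(rows)
```

Neither change alters the existing `orders` rows, so code that reads `status_code` directly keeps working while new code moves to the view.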
22. Agile Modeling
• Scott Ambler developed the concept of Agile Modeling
• Agile models are just barely good enough
• Agile models are developed iteratively
• Modeling starts with a lightweight envisioning session to create a
domain model. To that I would add developing a Bus
Matrix and defining a core set of conformed dimensions
• With each iteration, develop just barely enough of the
data model to support development of the sprint backlog