1. Seven Deadly Sins
Seven Deadly Sins
of Database Design
Speaker: Solomon Waters
Embarcadero Technologies
San Francisco SQL Server User Group
June 2010
Mark Ginnebaugh, User Group Leader
mark@designmind.com
2. Seven Deadly Sins of
Database Design
Solomon Waters
Manager, Software Consulting
Embarcadero Technologies
solomon.waters@embarcadero.com
3. Common Mistakes
Seven Deadly Sins of
Designing Databases
4. Agenda
• Topic
– Seven Deadly Sins Common Mistakes of Designing Databases
• What we’ll learn
– Pitfalls of a poor database design
– Basics of normalization
– How to communicate a database design effectively
– How to avoid some of the most common mistakes made when designing databases
• Q&A
5. 7 Deadly Sins Common Mistakes
1. Poor or no documentation for database(s) in production
2. Little or no normalization
3. Not treating the data model like a living, breathing
organism
4. Improper storage of reference data
5. Not using foreign keys, check constraints, and/or
defaults in the database
6. Not using domains and naming standards
7. Not choosing and/or indexing keys properly
6. 7 Deadly Sins Common Mistakes
1. Poor or no documentation for database(s) in production
• Problems
– No central documentation of database structure(s)
– Inaccurate documentation of database structure(s)
– No documentation at all of database structure(s)
• Ramifications
– Developers, DBAs, architects, etc. are not on the same page
– Inability to respond to change
– No communication between developers, DBAs and architects
• Solution
– Start from the bottom-up, i.e. reverse engineer database(s) to build
documentation
– Validate models prior to publishing them
– Use HTML reporting and Portal to communicate to users
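The bottom-up approach above can be sketched in a few lines. This is only an illustration, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for SQL Server; the `document_schema` helper and the `customer` table are hypothetical and not part of the talk, which relies on a modeling tool for this step.

```python
import sqlite3


def document_schema(conn):
    """Return a plain-text listing of every user table and its columns."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"Table: {table}")
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            flags = []
            if pk:
                flags.append("PK")
            if notnull:
                flags.append("NOT NULL")
            suffix = f" [{', '.join(flags)}]" if flags else ""
            lines.append(f"  {name} {col_type}{suffix}")
    return "\n".join(lines)


# Hypothetical table standing in for an undocumented production schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
)
schema_doc = document_schema(conn)
print(schema_doc)
```

Because the listing is generated from the live catalog, it cannot disagree with the database at the moment it is produced, which is the point of starting from the bottom up.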
7. 7 Deadly Sins Common Mistakes
2. Little or no normalization
• Problems
– Database denormalized unnecessarily (i.e. too much)
– One large table has been built to store “everything”
– Multiple values in one column or repeating values in a table
• Ramifications
– Performance may be better, but maintenance can become a nightmare and
expensive
– Lots of NULLs if specific columns don’t have values for specific rows
– Extra application code is needed to parse out specific values
• Solution
– Understand the basics of database normalization
– Know when and how to normalize when needed
– Industry models can help as a reference or templates
8. 7 Deadly Sins Common Mistakes
2. Little or no normalization (cont’d)
• First Normal Form:
– Eliminate repeating groups of columns and multiple values stored in a single column
• Second Normal Form:
– Remove data that depends on only part of a composite primary key
• Third Normal Form:
– Each non-key column should depend only on the primary key, not on other non-key columns
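As a small illustration of the first step, the sketch below splits a column that holds several values into a child table with one value per row. It assumes SQLite via Python's sqlite3 module rather than SQL Server, and the `customer_flat`, `customer`, and `customer_phone` tables are hypothetical examples, not taken from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Denormalized input: several phone numbers packed into one column.
    CREATE TABLE customer_flat (name TEXT, phones TEXT);
    INSERT INTO customer_flat VALUES
        ('Ada', '555-0100,555-0101'),
        ('Bob', '555-0199');

    -- Normalized target: one row per customer, one row per phone number.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_phone (
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        phone       TEXT NOT NULL,
        PRIMARY KEY (customer_id, phone)
    );
    """
)

# Read everything first, then split each packed value into its own row.
flat_rows = conn.execute("SELECT name, phones FROM customer_flat").fetchall()
for name, phones in flat_rows:
    cur = conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
    for phone in phones.split(","):
        conn.execute(
            "INSERT INTO customer_phone (customer_id, phone) VALUES (?, ?)",
            (cur.lastrowid, phone.strip()),
        )

phone_rows = conn.execute(
    "SELECT c.name, p.phone FROM customer c "
    "JOIN customer_phone p ON p.customer_id = c.customer_id "
    "ORDER BY c.name, p.phone"
).fetchall()
print(phone_rows)
```

After the split, finding a customer by phone number is a plain equality lookup, with no string parsing in application code.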
10. 7 Deadly Sins Common Mistakes
3. Not treating the data model like a living, breathing
organism
• Problems
– Modeling is done up front, then never updated once the database changes
– Design is not completed/reviewed for flaws before moving to production
– Changes made in production/database without updating data model
• Ramifications
– Implementing changes becomes problematic and expensive
– Undocumented data can lead to security and regulatory issues
– Design is missing functionality that the business needs
• Solution
– Plan out the design of the database conceptually, logically and physically
– Review the design with both technical AND non-technical stakeholders
– Update the models as changes occur or, better yet, update the model first!
11. 7 Deadly Sins Common Mistakes
3. Not treating the data model like a living, breathing organism
(cont’d)
• Uncontrolled Changes
– Models become out-of-date and no one uses
them
– Reports from out-of-date models are useless
– No understanding of what has changed
12. 7 Deadly Sins Common Mistakes
3. Not treating the data model like a living, breathing
organism (cont’d)
• Controlled Change
– Define a means of communicating
changes
– Don't let models get out of date
– Build a process to update models
– Automate the process
– Ultimately drive changes from the
model
– Define a means of archiving/tracking
changes
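One way to automate part of this process is to compare the model against the live database and report any drift. The sketch below is only an illustration under stated assumptions: SQLite via Python's sqlite3 module stands in for SQL Server, the model is reduced to a plain dictionary of table and column types, and `live_schema` and `find_drift` are hypothetical helper names.

```python
import sqlite3


def live_schema(conn):
    """Read {table: {column: declared_type}} from the live database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Row layout: (cid, name, type, notnull, dflt_value, pk)
        schema[table] = {row[1]: row[2].upper() for row in cols}
    return schema


def find_drift(model, live):
    """List every column where the model and the database disagree."""
    drift = []
    for table in sorted(set(model) | set(live)):
        model_cols = model.get(table, {})
        live_cols = live.get(table, {})
        for col in sorted(set(model_cols) | set(live_cols)):
            if col not in live_cols:
                drift.append(f"{table}.{col}: in model, missing from database")
            elif col not in model_cols:
                drift.append(f"{table}.{col}: in database, missing from model")
            elif model_cols[col] != live_cols[col]:
                drift.append(
                    f"{table}.{col}: model says {model_cols[col]}, "
                    f"database says {live_cols[col]}"
                )
    return drift


# Hypothetical model, and a database where someone added a column directly.
model = {"customer": {"customer_id": "INTEGER", "name": "TEXT"}}
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, fax TEXT)"
)
drift = find_drift(model, live_schema(conn))
print(drift)
```

Run on a schedule, a check like this turns an uncontrolled change into a visible one, so the model can be updated before it goes stale.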
13. 7 Deadly Sins Common Mistakes
4. Improper storage of reference data
• Problems
– Reference data (codes, lists, valid values) stored in more than one place
– Reference data stored in application, not in the database
– Constraints not placed in the database
• Ramifications
– More work is needed when code values change
– Database can’t enforce consistency and accuracy of values
– Problems when data is sourced from another place
• Solution
– Leverage data models to store data values
– Keep them up to date with the database
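A common way to keep reference data in one place is a lookup table that other tables reference through a foreign key. The sketch below is one possible reading of the advice above, not the speaker's stated method. It assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server; the `order_status` and `orders` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- The single home for the list of valid status codes.
    CREATE TABLE order_status (
        status_code TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    INSERT INTO order_status VALUES
        ('N', 'New'), ('S', 'Shipped'), ('C', 'Cancelled');

    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        status_code TEXT NOT NULL REFERENCES order_status (status_code)
    );
    """
)

# A code that exists in the lookup table is accepted.
conn.execute("INSERT INTO orders (order_id, status_code) VALUES (1, 'N')")

# A code that does not exist is rejected by the database itself.
try:
    conn.execute("INSERT INTO orders (order_id, status_code) VALUES (2, 'X')")
    invalid_code_rejected = False
except sqlite3.IntegrityError:
    invalid_code_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(invalid_code_rejected, order_count)
```

When a code value changes, only the lookup table needs to be updated, and data loaded from another source is checked against the same list.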
14. 7 Deadly Sins Common Mistakes
5. Not using foreign keys, check constraints, and/or
defaults in the database
• Problems
– Legacy system with check constraints and foreign keys enforced by the application
– Inconsistent data because of a lack of constraints
– Using NULLs instead of defaults
• Ramifications
– Incredibly difficult to document system for other users
– Special code becomes the norm, not the exception
– Poor data quality can result if standards are not followed
• Solution
– If it can be enforced in the DDL at creation time, do it
– Use tools to infer relationships
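To make "enforce it in the DDL" concrete, the sketch below declares defaults and check constraints when the table is created. It assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server, and the `account` table with its rules is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        -- A default instead of NULL, plus a rule the database enforces.
        balance    NUMERIC NOT NULL DEFAULT 0 CHECK (balance >= 0),
        status     TEXT NOT NULL DEFAULT 'active'
                   CHECK (status IN ('active', 'closed'))
    );
    """
)

# Columns that are omitted pick up their defaults rather than NULL.
conn.execute("INSERT INTO account (account_id) VALUES (1)")
defaults = conn.execute(
    "SELECT balance, status FROM account WHERE account_id = 1"
).fetchone()

# A row that breaks a rule is rejected, whichever application sent it.
try:
    conn.execute("INSERT INTO account (account_id, balance) VALUES (2, -50)")
    negative_balance_rejected = False
except sqlite3.IntegrityError:
    negative_balance_rejected = True

print(defaults, negative_balance_rejected)
```

Because the rules live in the table definition, they are also visible to anyone who reverse engineers the schema, which helps with the documentation problem noted above.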
15. 7 Deadly Sins Common Mistakes
6. Not using domains and naming standards
• Problems
– The “same” columns defined with different data types in different tables
– The “same” column named differently in different tables
– Cryptic or non-descriptive names that don’t identify the use of a column
• Ramifications
– Inconsistent and/or poor data quality
– Confusion and wasted time for future developers, DBAs, architects, etc.
– Inaccurate use of columns
• Solution
– Define a common list of domains users can leverage
– Have a common naming standard dictionary to abbreviate logical to/from
physical names
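The first problem above, the "same" column defined with different data types, can be detected mechanically. The following is a rough sketch assuming SQLite via Python's sqlite3 module as a stand-in for SQL Server; `inconsistent_columns` is a hypothetical helper, and the two tables are invented for the example.

```python
import sqlite3
from collections import defaultdict


def inconsistent_columns(conn):
    """Map each column name declared with more than one type to its uses."""
    uses = defaultdict(set)
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # Row layout: (cid, name, type, notnull, dflt_value, pk)
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            uses[row[1]].add((table, row[2].upper()))
    return {
        column: sorted(found)
        for column, found in uses.items()
        if len({col_type for _table, col_type in found}) > 1
    }


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, email VARCHAR(100));
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, email VARCHAR(255));
    """
)
mismatches = inconsistent_columns(conn)
print(mismatches)
```

A shared domain (for example, one definition of "email address") would give both columns the same type and length, so a report like this should come back empty.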
16. 7 Deadly Sins Common Mistakes
7. Not choosing and/or indexing keys properly
• Problems
– Using surrogate keys that don’t uniquely identify the data
– Poorly choosing a primary key (too many columns, column is updated frequently)
– Not indexing foreign keys
• Ramifications
– Each row is unique but the data is not, which leads to redundant data
– Updating or changing primary keys is not trivial
– Performance issues when updating data or accessing related data often
• Solution
– Use a combination of natural and surrogate keys where applicable
– Follow the SUM rules when choosing PKs: 1. Static 2. Unique 3. Minimal
Columns
– Use model validation wizard to enforce rules
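The combination of natural and surrogate keys, together with an indexed foreign key, might look like the sketch below. It assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server; the tables, the choice of `email` as the natural key, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- surrogate key: static and minimal
        email       TEXT NOT NULL UNIQUE   -- natural key: keeps the data unique
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
    );
    -- Index the foreign key so lookups of related rows avoid a full scan.
    CREATE INDEX ix_orders_customer_id ON orders (customer_id);
    """
)

conn.execute("INSERT INTO customer (email) VALUES ('ada@example.com')")

# Without the UNIQUE constraint this would succeed and create a second,
# redundant customer row with a different surrogate key.
try:
    conn.execute("INSERT INTO customer (email) VALUES ('ada@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask the planner how it would find one customer's orders.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT order_id FROM orders WHERE customer_id = 1"
).fetchall()
uses_index = any("ix_orders_customer_id" in row[-1] for row in plan)
print(duplicate_rejected, uses_index)
```

The surrogate key gives other tables a short, stable value to reference, while the unique constraint on the natural key stops the same real-world customer from being stored twice.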