1. Open Source Data Warehousing:
MySQL and Beyond
Alex Meadows
Twitter: @DBA_Alex
Percona MySQL University
Raleigh, NC
1/29/2013
2. What Is Data Warehousing?
● Central repository
● Oriented toward reporting and analysis
● Integrates multiple sources
● Core to Business Intelligence and Advanced
Analytics
● Helps keep source systems clean and lean
3. Warehouse Methodologies
● Inmon’s 3NF/Hub and Spoke Model
● Kimball’s Conformed Dimension Model
● Linstedt’s Data Vault Model
● Rönnbäck’s Anchor Model/6NF
9. Kimball’s Conformed Dimensions
● Normalized database modeling does not meet the needs of
reporting and analysis
● Denormalize data
● Dimensions
● How does data need to be filtered?
● Facts
● What do we want to analyze or measure?
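A minimal sketch of the dimension/fact split, using Python's built-in sqlite3 module. The table names (dim_date, dim_product, fact_sales), the columns, and the sample rows are illustrative assumptions and do not come from the talk.

```python
import sqlite3

# In-memory database; every table and row here is a made-up example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions: how the data is filtered and grouped
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

    -- Fact: what is measured, keyed by the dimensions
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        amount REAL
    );

    INSERT INTO dim_date VALUES (20130101, 2013, 1), (20130201, 2013, 2);
    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
    INSERT INTO fact_sales VALUES
        (20130101, 1, 2, 20.0),
        (20130101, 2, 1, 5.0),
        (20130201, 1, 3, 30.0);
""")


def sales_by_category(connection, year):
    """Filter on the date dimension, group by the product dimension, sum the fact."""
    return connection.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        WHERE d.year = ?
        GROUP BY p.category
        ORDER BY p.category
        """,
        (year,),
    ).fetchall()


print(sales_by_category(conn, 2013))
```

The query touches each dimension with a single join, which is the access pattern the denormalized model is meant to keep simple.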
11. Open Source Software
● Greenplum (PostgreSQL derivative)
● InfiniDB (MySQL derivative)
● Infobright (MySQL derivative)
● Other columnar data stores
12. Columnar Data Stores
● Designed for conformed dimensions
● High Performance
● Self-indexing based on usage
● High compression of data
13. Row vs Columnar Databases
Source: http://dbbest.com/blog/column-oriented-database-technologies/
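As a rough illustration of the row versus columnar layout, the sketch below pivots a few rows into per-column lists and run-length encodes one column. It is a toy model under stated assumptions, not how any particular columnar engine is implemented; the sample data and the helper names to_columns and run_length_encode are hypothetical.

```python
from itertools import groupby

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "East", "amount": 10},
    {"id": 2, "region": "East", "amount": 20},
    {"id": 3, "region": "West", "amount": 5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [record[name] for record in records] for name in records[0]}


def run_length_encode(values):
    """Collapse runs of repeated adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Column-oriented layout: each column's values are stored together.
columns = to_columns(rows)

# An aggregate only needs to read the one column it uses.
total = sum(columns["amount"])

# Repeated values sit next to each other, so a column compresses well.
encoded_region = run_length_encode(columns["region"])

print(total, encoded_region)
```

Summing amount reads a single list instead of every field of every row, and the repeated region values collapse into a few pairs, which hints at why columnar stores pair well with analytic queries and compression.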
14. Cautions
● Traditional RDBMS
● Not built for conformed dimensions!
● Performance will become an issue
15. Inmon’s Hub and Spoke
● Combines
● 3NF central data warehouse
● Conformed dimensions
● Becomes foundation for further variants
16. Linstedt’s Data Vault Model
● Mixes 3NF and Conformed Dimensions
● Model data per business entities and their
relationships
● Hubs
● Store unique business entity identifiers (keys)
● Links
● Relate hubs and other links to form relationships
● Satellites
● Store unique information regarding entity or
relationship
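The hub, link, and satellite roles can be sketched with Python's built-in sqlite3 module. This is a simplified illustration: the table and column names are assumptions, and only the satellite carries a load date here, whereas a production Data Vault typically tracks load metadata on every table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hubs: unique business entity identifiers (keys)
    CREATE TABLE hub_customer (customer_id INTEGER PRIMARY KEY, customer_number TEXT UNIQUE);
    CREATE TABLE hub_product (product_id INTEGER PRIMARY KEY, sku TEXT UNIQUE);

    -- Link: relates hubs to form a relationship
    CREATE TABLE link_purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES hub_customer(customer_id),
        product_id INTEGER REFERENCES hub_product(product_id)
    );

    -- Satellite: descriptive information about an entity, kept per load date
    CREATE TABLE sat_customer (
        customer_id INTEGER REFERENCES hub_customer(customer_id),
        load_date TEXT,
        name TEXT,
        city TEXT,
        PRIMARY KEY (customer_id, load_date)
    );

    INSERT INTO hub_customer VALUES (1, 'C-100');
    INSERT INTO hub_product VALUES (1, 'SKU-A'), (2, 'SKU-B');
    INSERT INTO link_purchase VALUES (1, 1, 1), (2, 1, 2);
    INSERT INTO sat_customer VALUES
        (1, '2013-01-01', 'Ann', 'Durham'),
        (1, '2013-01-15', 'Ann', 'Raleigh');
""")


def current_customer_purchases(connection):
    """Join link -> hubs -> latest satellite row to rebuild a reporting view."""
    return connection.execute(
        """
        SELECT s.name, s.city, p.sku
        FROM link_purchase l
        JOIN hub_customer c ON c.customer_id = l.customer_id
        JOIN hub_product p ON p.product_id = l.product_id
        JOIN sat_customer s ON s.customer_id = c.customer_id
        WHERE s.load_date = (
            SELECT MAX(load_date) FROM sat_customer WHERE customer_id = c.customer_id
        )
        ORDER BY p.sku
        """
    ).fetchall()


print(current_customer_purchases(conn))
```

The satellite keeps both city values for the customer, so history is preserved while the query picks only the most recent row.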
18. Cautions
● While you get the best mix between 3NF and
conformed dimensions, data marts are still needed
● Issues seen with both 3NF and conformed
dimensions can be found here
19. Open Source Software
● MySQL
● PostgreSQL
● Greenplum
● Other Traditional RDBMS
● NoSQL
● Hadoop
20. Rönnbäck’s Anchor Model/6NF
● Focus is on the data and its relationships.
● Anchors
● Model entities and events
● Attributes
● Model properties of anchors
● Ties
● Model relationships between anchors
● Knots
● Model shared properties (for example, states or categories)
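A small sketch of the four building blocks, again with Python's built-in sqlite3 module. The naming scheme (anchor_, attr_, knot_, tie_) and the sample data are assumptions made for illustration; they are not the naming convention of any anchor modeling tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Anchors: entities, holding nothing but an identity
    CREATE TABLE anchor_person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE anchor_company (company_id INTEGER PRIMARY KEY);

    -- Attributes: one table per property of an anchor
    CREATE TABLE attr_person_name (
        person_id INTEGER PRIMARY KEY REFERENCES anchor_person(person_id), name TEXT);
    CREATE TABLE attr_company_name (
        company_id INTEGER PRIMARY KEY REFERENCES anchor_company(company_id), name TEXT);

    -- Knot: a small shared set of values, referenced by a knotted attribute
    CREATE TABLE knot_status (status_id INTEGER PRIMARY KEY, status TEXT UNIQUE);
    CREATE TABLE attr_person_status (
        person_id INTEGER PRIMARY KEY REFERENCES anchor_person(person_id),
        status_id INTEGER REFERENCES knot_status(status_id));

    -- Tie: a relationship between anchors
    CREATE TABLE tie_person_company (
        person_id INTEGER REFERENCES anchor_person(person_id),
        company_id INTEGER REFERENCES anchor_company(company_id),
        PRIMARY KEY (person_id, company_id));

    INSERT INTO anchor_person VALUES (1);
    INSERT INTO anchor_company VALUES (1);
    INSERT INTO attr_person_name VALUES (1, 'Ann');
    INSERT INTO attr_company_name VALUES (1, 'Acme');
    INSERT INTO knot_status VALUES (1, 'Active'), (2, 'Inactive');
    INSERT INTO attr_person_status VALUES (1, 1);
    INSERT INTO tie_person_company VALUES (1, 1);
""")


def person_overview(connection):
    """Reassemble one 'wide' row; each property costs at least one join."""
    return connection.execute(
        """
        SELECT n.name, k.status, cn.name
        FROM anchor_person a
        JOIN attr_person_name n ON n.person_id = a.person_id
        JOIN attr_person_status s ON s.person_id = a.person_id
        JOIN knot_status k ON k.status_id = s.status_id
        JOIN tie_person_company t ON t.person_id = a.person_id
        JOIN attr_company_name cn ON cn.company_id = t.company_id
        ORDER BY n.name
        """
    ).fetchall()


print(person_overview(conn))
```

Three output columns already take five joins, which gives a feel for how quickly join counts grow in a 6NF-style model.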
22. Cautions
● Number of joins will be an issue for some databases
● Queries will become complex
● Joins
● Finding properties/valuable information
● Every column in a traditional table becomes its own
table