5. When you think of “modular” design, what do you think of?
   - Snap-ins
   - Components
   - Assembly
   - Who remembers “units” knitwear?
6. Let’s Explore the Fast Food Example
   - Basic ingredients: Hamburger Patty, Hamburger Bun, Grilled Chicken Breast, Fish Filet, Wrap
   - Basic toppings: Cheese, Ranch Dressing
   - Menu choices: Hamburger, Cheeseburger, Double Cheeseburger, Chicken Sandwich, Ranch Chicken Sandwich, Chicken Sandwich w/Cheese, Ranch Chicken Sandwich w/Cheese, Fish Sandwich, Fish Wrap
7. Benefits of Modular Design
   - Combinability = meets the demands of more consumers
   - Adaptability = earliest responder to changes in market conditions
   - Efficiency = lower cost of product
   - Consistency in (good) quality = lowers perceived risk and lowers “barriers to entry”
8. Drawbacks of Modular Design
   - Change one/change all is a double-edged sword
   - Module management
     - Storing
     - Organizing
     - Versioning
12. Abstraction in computer science
   - The process by which data and programs are defined with a representation similar to their meaning (semantics)
   - Applies to
     - Data
     - Actions
17. Modularity in ETL
   - Which components of ETL apps can and should be modularized?
18. Data Abstraction in ETL
   - Data type wrangling… ever felt like a cowboy/cowgirl trying to herd the unherdable?
   - Git along, li’l nvarchar
     - SQL Server: nvarchar
     - ODBC: SQL_VARCHAR
     - MySQL: varchar
     - PostgreSQL: character varying
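One way to picture data type abstraction is a lookup from a single abstract type to each platform's name for it. The following is a minimal sketch in Python, not a feature of any particular ETL tool: the `ABSTRACT_TYPES` table, the `"text"` key, the platform keys, and the `native_type` helper are all hypothetical names. Only the four type names come from the slide.

```python
# Hypothetical mapping from one abstract type to the platform-specific
# type names listed on the slide. The key "text" and the platform keys
# are assumed names chosen for illustration.
ABSTRACT_TYPES = {
    "text": {
        "sqlserver": "nvarchar",
        "odbc": "SQL_VARCHAR",
        "mysql": "varchar",
        "postgresql": "character varying",
    },
}


def native_type(abstract_name: str, platform: str) -> str:
    """Return the platform-specific type name for an abstract type."""
    try:
        return ABSTRACT_TYPES[abstract_name][platform]
    except KeyError:
        # Unknown abstract type or unknown platform.
        raise ValueError(
            f"no mapping for {abstract_name!r} on {platform!r}"
        ) from None


if __name__ == "__main__":
    for platform in ("sqlserver", "odbc", "mysql", "postgresql"):
        print(platform, "->", native_type("text", platform))
```

With a table like this, the transformation logic can refer only to the abstract type, and the platform-specific name is resolved at the edge where the data is read or written.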
19. Function Abstraction in ETL
   - Think about using abstract functions. Does your transformation language support abstraction?
   - Examples
     - Lua anonymous functions and closures

           function makeaddfunc(x)
             return function(y)
               return x + y
             end
           end
           plustwo = makeaddfunc(2)

     - C# anonymous function

           Func<int,int> foo = x => x*x;
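For comparison, the same two ideas can be sketched in Python: a closure equivalent to the Lua `makeaddfunc`, and an anonymous function equivalent to the C# lambda. The `apply_step` helper is a hypothetical addition that shows why this matters for ETL: a transformation step becomes a value that can be swapped in and out.

```python
from typing import Callable, List


def make_add_func(x: int) -> Callable[[int], int]:
    """Return a closure that adds the captured x to its argument."""
    def add(y: int) -> int:
        return x + y
    return add


# Same usage as the Lua example: plus_two "remembers" x = 2.
plus_two = make_add_func(2)

# Anonymous function, analogous to the C# lambda x => x*x.
square = lambda x: x * x


def apply_step(step: Callable[[int], int], values: List[int]) -> List[int]:
    """Hypothetical helper: apply any one-argument step to every value."""
    return [step(v) for v in values]


if __name__ == "__main__":
    print(apply_step(plus_two, [1, 2, 3]))
    print(apply_step(square, [1, 2, 3]))
```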
20. Put it all together
   - Modular, abstracted ETL programming environments:
     - Modular connectivity objects
     - Modular schema objects
     - Modular data types that can be abstracted
     - Modular functions that can be abstracted
   - Build your applications like a fast food hamburger
   - Mix n’ match applications are built efficiently
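The mix-and-match idea can be sketched as a pipeline assembled from small, reusable pieces. This is an illustration under stated assumptions rather than a real ETL framework: `read_rows`, `rename`, `cast`, `build_pipeline`, and the `qty`/`quantity` field names are all hypothetical, and the "connectivity object" is just an in-memory iterable.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Transform = Callable[[Row], Row]


def read_rows(source: Iterable[Row]) -> Iterator[Row]:
    """Connectivity module: a stand-in that yields rows from any iterable."""
    for row in source:
        yield dict(row)  # copy so later steps never mutate the source


def rename(old: str, new: str) -> Transform:
    """Schema module: build a step that renames one field."""
    def step(row: Row) -> Row:
        out = dict(row)
        if old in out:
            out[new] = out.pop(old)
        return out
    return step


def cast(field: str, to_type: Callable[[object], object]) -> Transform:
    """Data type module: build a step that converts one field."""
    def step(row: Row) -> Row:
        out = dict(row)
        out[field] = to_type(out[field])
        return out
    return step


def build_pipeline(*steps: Transform) -> Callable[[Iterable[Row]], List[Row]]:
    """Assemble the chosen steps, in order, into one callable."""
    def run(source: Iterable[Row]) -> List[Row]:
        result = []
        for row in read_rows(source):
            for step in steps:
                row = step(row)
            result.append(row)
        return result
    return run


# One "menu item" built from the shared modules.
load_orders = build_pipeline(rename("qty", "quantity"), cast("quantity", int))

if __name__ == "__main__":
    print(load_orders([{"qty": "3"}, {"qty": "10"}]))
```

A second application would reuse the same `rename` and `cast` modules with different arguments, much as the cheeseburger and the double cheeseburger share a patty and a bun.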
21. What else?
   - Think about how you are designing your ETL applications
   - How much of your code is dealing with data quality issues that have to be fixed manually?
     - Null values
     - Invalid values
     - Invalid formats
     - Missing values
     - Data out of range or not in the prescribed range
   - Find those errors first (and make fixing them someone else’s problem)
   - It’s hard to predict the future, so adaptable and modular designs have long-term benefits.
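"Find those errors first" can be sketched as a validation module that runs before any transformation and routes bad rows aside with a reason attached. The example below is a minimal sketch: the `age` field, the 0 to 120 range, and the `check_row` and `split_rows` helpers are hypothetical and stand in for whatever rules a real feed would need.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def check_row(row: Row) -> List[str]:
    """Return the data quality problems found in one row (empty if none)."""
    # Missing value: the field is absent from the row entirely.
    if "age" not in row:
        return ["age: missing"]
    value = row["age"]
    # Null value: the field is present but holds nothing.
    if value is None:
        return ["age: null"]
    # Invalid format: the text cannot be read as an integer.
    try:
        number = int(value)
    except ValueError:
        return ["age: invalid format"]
    # Out of range: assumed limits, chosen only for illustration.
    if not 0 <= number <= 120:
        return ["age: out of range"]
    return []


def split_rows(rows: List[Row]) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Separate clean rows from rejected rows before any transformation."""
    clean, rejected = [], []
    for row in rows:
        problems = check_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            clean.append(row)
    return clean, rejected


if __name__ == "__main__":
    sample = [{"age": "34"}, {"age": None}, {"age": "abc"}, {"age": "200"}, {}]
    clean, rejected = split_rows(sample)
    print("clean:", clean)
    for row, problems in rejected:
        print("rejected:", row, problems)
```

The rejected list, with its reasons, is what gets handed back to the data owner, so the rest of the pipeline only ever sees rows that passed.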
22. Conclusions and Deep Thoughts
   - You’re doing something wonderful when you’re doing ETL!
   - ETL is applied fundamental computer science.