4. Bad, Bad, Bad Data Quality www.Test2008.in Erroneous mailings cost US businesses $611 billion in 2002
5. DQ is not my problem? Think again!
6. DQ Hot Candidates: Data Movement, Migrations, Backups, Restore, Import, Export, Data Warehousing, Business Intelligence, OLTP, OLAP, CRM, ERP
7. DQ Ishikawa Diagram. Effect: Bad Decisions (Loss of $ & Customers). Causes: DQ requirements not documented; lack of white-box testing; data is dynamic; CRM & ERP implementations; mergers / takeovers
24. How do we test DQ?
DQ Rule Engine (input: Metadata; output: Results)

Row Count Logic (pseudocode):
    Create Procedure RowCount(SrcTbl, TgtTbl)
    Begin
        Declare SRC, TGT Integer
        Select SRC = Count(*) From SrcTbl
        Select TGT = Count(*) From TgtTbl
        If SRC = TGT Then
            Return "PASS"
        Else
            Return SRC - TGT
        End If
    End

Duplicate Logic (pseudocode):
    Create Procedure Duplicate(Tbl)
    Begin
        Declare Dup Integer
        Select Dup = Count(*) From
            (Select <<ColumnList>> From Tbl
             Group By <<ColumnList>> Having Count(*) > 1)
        If Dup = 0 Then
            Return "PASS"
        Else
            Return Dup
        End If
    End

Metadata:
    Rule  Tbl1  Tbl2
    RC    Emp   Emp
    RI    Emp   Dept
    DC    HR    HR

Results:
    Rule  Result  Comment
    RC    Pass    -
    RI    Fail    10
    DC    Pass    -
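The row-count and duplicate procedures on this slide can be sketched as runnable code. The following is a minimal illustration in Python using the standard sqlite3 module, not the presenter's actual rule engine. The table names (emp_src, emp_tgt), the key_columns mapping, and the run_rules helper are hypothetical, and the sketch assumes that RC denotes the row-count rule and DC the duplicate rule. RI is skipped because the slide gives no logic for it.

```python
import sqlite3

def row_count_check(conn, src_tbl, tgt_tbl):
    """Return "PASS" if both tables hold the same number of rows, else SRC - TGT."""
    # Table names cannot be bound as SQL parameters, so they are interpolated;
    # they must come from trusted metadata, never from user input.
    src = conn.execute(f'SELECT COUNT(*) FROM "{src_tbl}"').fetchone()[0]
    tgt = conn.execute(f'SELECT COUNT(*) FROM "{tgt_tbl}"').fetchone()[0]
    return "PASS" if src == tgt else src - tgt

def duplicate_check(conn, tbl, columns):
    """Return "PASS" if no combination of `columns` repeats, else the number of repeated groups."""
    cols = ", ".join(f'"{c}"' for c in columns)
    sql = (f'SELECT COUNT(*) FROM (SELECT {cols} FROM "{tbl}" '
           f'GROUP BY {cols} HAVING COUNT(*) > 1)')
    dup = conn.execute(sql).fetchone()[0]
    return "PASS" if dup == 0 else dup

def run_rules(conn, metadata, key_columns):
    """Hypothetical rule engine: read (rule, tbl1, tbl2) rows, return (rule, result, comment) rows."""
    results = []
    for rule, tbl1, tbl2 in metadata:
        if rule == "RC":
            outcome = row_count_check(conn, tbl1, tbl2)
        elif rule == "DC":
            outcome = duplicate_check(conn, tbl1, key_columns[tbl1])
        else:
            continue  # no logic defined for this rule in the sketch
        if outcome == "PASS":
            results.append((rule, "Pass", "-"))
        else:
            results.append((rule, "Fail", outcome))
    return results

# Demo data (assumed): the target has the same row count as the source,
# but one row was loaded twice and another was lost.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp_src (id INTEGER, name TEXT);
    CREATE TABLE emp_tgt (id INTEGER, name TEXT);
    INSERT INTO emp_src VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO emp_tgt VALUES (1, 'Ann'), (2, 'Bob'), (2, 'Bob');
""")

metadata = [("RC", "emp_src", "emp_tgt"), ("DC", "emp_tgt", "emp_tgt")]
key_columns = {"emp_tgt": ["id", "name"]}
results = run_rules(conn, metadata, key_columns)
for row in results:
    print(row)
```

In this demo the row-count rule passes while the duplicate rule fails, which suggests why a rule engine would run several checks rather than relying on counts alone.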
25. You can’t improve what you can’t measure. Chart: Data Quality over Time against a Threshold (marks at 5 %, 10 %, 100 %). Red: Bad DQ; Yellow: Watch it; Green: Good DQ
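One way to turn the red/yellow/green idea into a measurement is sketched below in Python. The slide does not say how the 5 % and 10 % marks map to the colours, so this sketch assumes they are upper bounds on the percentage of failing rows: up to 5 % is Green, up to 10 % is Yellow, and anything above is Red. The function names and default thresholds are hypothetical.

```python
def failure_rate_pct(failed_rows, total_rows):
    """Percentage of rows that failed a DQ rule."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return 100.0 * failed_rows / total_rows

def dq_status(error_pct, green_max=5.0, yellow_max=10.0):
    """Classify a failure percentage; the thresholds are assumptions, not taken from the slide."""
    if not 0.0 <= error_pct <= 100.0:
        raise ValueError("error_pct must be between 0 and 100")
    if error_pct <= green_max:
        return "Green"   # Good DQ
    if error_pct <= yellow_max:
        return "Yellow"  # Watch it
    return "Red"         # Bad DQ

# Example: track one rule's status over successive loads (made-up counts).
loads = [(3, 100), (8, 100), (10, 20)]
statuses = [dq_status(failure_rate_pct(f, t)) for f, t in loads]
print(statuses)
```

Recording the status of each load over time gives the trend that the chart on this slide plots.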