Data Quality Services in SQL Server 2012
(An Introduction)
Stéphane Fréchette
Friday April 26, 2013
Matching
Cleansing
DQS
Who am I?
My name is Stéphane Fréchette
I’m a Database & Business Intelligence Professional and CEO | Founder of
I have a passion for architecting, designing and building solutions that matter.
A self-proclaimed Open Data Hacker/Advocate, I founded Gatineau Ouverte, a citizen-led
initiative that aims to promote open access to the civic data of the city of Gatineau.
Twitter: @sfrechette
Email: stephanefrechette@ukubu.com
Blog: stephanefrechette.com
Session Outline
• Microsoft Business Intelligence (The Stack)
• Dirty Data…
• SQL Server Data Quality Services (DQS)
• Data Steward
• Knowledge Base and Domains
• Data Quality Projects
• Data Cleansing Transform – SSIS
• DQS (Install & Architecture)
• Enterprise Information Management (EIM)
• Resources
Microsoft Business Intelligence
[Diagram: the Microsoft BI stack, showing Analysis Services, Reporting Services, Integration Services, Master Data Services, Data Quality Services, SharePoint Collaboration, Excel Workbooks, PowerPivot Applications, SharePoint Dashboards & Scorecards, OData Feeds, Line of Business Applications, and Hadoop Big Data]
Dirty Data…
Do you have dirty data?
(All projects have it! It's inevitable.)
Dirty Data…
Causes?
Bad data entry
Poor Data Governance
Duplicate entities in different LOB systems
Sample Data Representation
• Prospect in CRM System:
Mark Smith | 613.111-1234 | Ottawa | ON | K1P 1K1
• Prospect buys goods now entered in POS System:
Markus Smith | 1234 Stilton Ave | Kanata |ON | K1P 1K1
• Record also entered into Accounting System:
Markus Smith | 1234 Stilton Avenue | Kanata | ON | K1P 1K1
ETL process imports these records into the Data Warehouse / Data Mart
FirstName | LastName | Phone        | Address             | City   | Province | PostalCode
Mark      | Smith    | 613.111-1234 |                     | Ottawa | ON       | K1P 1K1
Markus    | Smith    |              | 1234 Stilton Ave    | Kanata | ON       | K1P 1K1
Markus    | Smith    |              | 1234 Stilton Avenue | Kanata | ON       | K1P 1K1
Sample Data Representation
• Duplicate records and inaccurate, incomplete data
• What we want is a golden record (one version of the truth)
Records as loaded:
FirstName | LastName | Phone        | Address             | City   | Province | PostalCode
Mark      | Smith    | 613.111-1234 |                     | Ottawa | ON       | K1P 1K1
Markus    | Smith    |              | 1234 Stilton Ave    | Kanata | ON       | K1P 1K1
Markus    | Smith    |              | 1234 Stilton Avenue | Kanata | ON       | K1P 1K1
Golden record:
FirstName | LastName | Phone        | Address          | City   | Province | PostalCode
Markus    | Smith    | 613-111-1234 | 1234 Stilton Ave | Kanata | ON       | K1P 1K1
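To make the golden record idea concrete, here is a minimal Python sketch that collapses the three sample rows into one. It assumes the rows have already been identified as the same person, and it uses a deliberately naive survivorship rule (most frequent non-empty value per column, first-seen wins on ties) plus a hypothetical `normalize_phone` helper. This is an illustration only, not how DQS itself matches or merges records.

```python
import re
from collections import Counter

# The three sample rows from the slide; empty strings mark missing values.
records = [
    {"FirstName": "Mark", "LastName": "Smith", "Phone": "613.111-1234",
     "Address": "", "City": "Ottawa", "Province": "ON", "PostalCode": "K1P 1K1"},
    {"FirstName": "Markus", "LastName": "Smith", "Phone": "",
     "Address": "1234 Stilton Ave", "City": "Kanata", "Province": "ON", "PostalCode": "K1P 1K1"},
    {"FirstName": "Markus", "LastName": "Smith", "Phone": "",
     "Address": "1234 Stilton Avenue", "City": "Kanata", "Province": "ON", "PostalCode": "K1P 1K1"},
]

def normalize_phone(phone):
    """Hypothetical helper: format any 10-digit phone as NNN-NNN-NNNN."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 10:
        return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return phone  # leave anything unexpected untouched

def golden_record(rows):
    """Naive survivorship: most common non-empty value per column.

    Counter.most_common keeps first-seen order on ties, so the first
    record's value wins when two candidates are equally frequent.
    """
    golden = {}
    for field in rows[0]:
        values = [row[field] for row in rows if row[field]]
        golden[field] = Counter(values).most_common(1)[0][0] if values else ""
    golden["Phone"] = normalize_phone(golden["Phone"])
    return golden

print(golden_record(records))
```

In a real project the survivorship rule would be a business decision made by the data steward, for example preferring the accounting system's address over the POS system's.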
SQL Server Data Quality Services (DQS)
• New in SQL Server 2012
• Enables cleansing, matching, standardizing and enriching data
• Delivers trusted information for business intelligence, data warehouse, and transaction
processing workloads
• Knowledge-Driven Solution (create/edit)
• A knowledge management process that builds the knowledge base
• A data quality project that proposes changes to source data based on the knowledge in the knowledge
base (cleansing and matching)
• A key component to an Enterprise Information Management (EIM) solution
Answering the Need with DQS
• DQS helps resolve issues involving incompleteness, lack of conformity, inconsistency,
inaccuracy, invalidity, and data duplication
• Provides the following features to resolve data quality issues:
• Data Cleansing
• Matching
• Reference Data Services
• Profiling
• Monitoring
• Knowledge Base
Data Steward
• Key role: usually a business user rather than someone from the Information Technology side
• In a nutshell: responsible for maintaining data elements in a metadata registry…
• Data Steward -> DQS Client
• Create and edit Knowledge Bases
• Run and process data through them, continually and iteratively improving the Knowledge Bases
• Knowledge Bases can be consumed and used by other Data Stewards and IT (SSIS / ETL Developers)
[Diagram: the DQS Data Steward, MDS Data Steward, and SSIS Developer roles around Matching and Cleansing]
Knowledge Bases and Domains
The knowledge base is a repository of knowledge about your data that enables you to understand
your data and maintain its integrity.
• Processes:
• Computer-assisted
• Interactive
• Components:
• Knowledge Discovery
• Domain Management
• Reference Data Services
• Matching Policy
Demo
Knowledge Base Management
(Creating a Knowledge Base)
Data Quality Projects
Improve quality of source data by performing data cleansing and data matching activities
using defined knowledge bases
• Cleansing Activity (2 step process)
• Computer-assisted: data is categorized (suggested, new, invalid, corrected, or correct)
• Interactive: the data steward approves, rejects, or modifies the results proposed by the computer-assisted
cleansing process
• Matching Activity
• Using existing knowledge base matching policy
• Prevent and remove data duplication
• Data Profiling and Notifications
• Profiling provides data quality statistics and information, such as completeness and accuracy
• Notifications suggest actions that can be taken to enhance operations
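The computer-assisted cleansing step can be pictured with a small Python sketch. It assumes a hypothetical, hand-built domain for street types (valid values, known synonyms, and known invalid values) and uses `difflib` from the standard library as a stand-in for similarity scoring. The five status names come from the slide; the lookup logic and the `cleanse` function are illustrative assumptions, not the DQS engine.

```python
import difflib

# Hypothetical domain knowledge for a "street type" domain.
DOMAIN = {
    "valid": {"Ave", "St", "Rd"},
    "synonyms": {"Avenue": "Ave", "Street": "St", "Road": "Rd"},  # known corrections
    "invalid": {"???"},
}

def cleanse(value, domain=DOMAIN, cutoff=0.8):
    """Return (output_value, status) for one source value.

    Status is one of: correct, corrected, invalid, suggested, new.
    """
    if value in domain["valid"]:
        return value, "correct"
    if value in domain["synonyms"]:
        return domain["synonyms"][value], "corrected"
    if value in domain["invalid"]:
        return value, "invalid"
    # No exact knowledge: look for a near match among everything we know.
    known = sorted(domain["valid"] | set(domain["synonyms"]))
    close = difflib.get_close_matches(value, known, n=1, cutoff=cutoff)
    if close:
        match = close[0]
        # A suggestion still needs a data steward's approval.
        return domain["synonyms"].get(match, match), "suggested"
    return value, "new"

for raw in ["Ave", "Avenue", "???", "Avenu", "Blvd"]:
    print(raw, "->", cleanse(raw))
```

The interactive step then corresponds to a person reviewing the "suggested" and "new" rows and feeding approved values back into the domain.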
Demo
Data Quality Project
(Cleansing and Matching)
DQS Cleansing Transform in SSIS
• When you want to automate the cleansing and matching process
and not use the DQS Client
• Use SSIS for batch data cleansing
• Matching can be done with Master Data Services (MDS)
• SSIS can be leveraged to bring DQS and MDS together
*DQS does not expose matching functionality for SSIS, but you can use Fuzzy Grouping Transform to
identify duplicate data
*Cleansing Transform is single-threaded – use multiple transforms for parallelism
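To give a feel for what fuzzy grouping of duplicates means, here is a small Python sketch built on `difflib.SequenceMatcher`. It is not the algorithm used by the SSIS Fuzzy Grouping Transform; the `fuzzy_groups` function, the 0.8 threshold, and the "99 Elgin St" row are assumptions made for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_groups(values, threshold=0.8):
    """Greedy grouping: each value joins the first group whose
    canonical (first) member is similar enough, else starts a new group."""
    groups = []
    for value in values:
        for group in groups:
            if similarity(value, group[0]) >= threshold:
                group.append(value)
                break
        else:
            groups.append([value])
    return groups

addresses = ["1234 Stilton Ave", "1234 Stilton Avenue", "99 Elgin St"]
print(fuzzy_groups(addresses))
```

The greedy approach depends on input order and compares against only one member per group, which is acceptable for a sketch but too crude for production matching.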
Demo
Data Cleansing Transform
(Automating the Cleansing and Matching using SSIS)
Installing DQS
• Requires the Business Intelligence, Enterprise, or Developer edition of SQL Server 2012
• During SQL Server setup:
• Instance Features -> Data Quality Services
• Shared Features -> Data Quality Client
• Execute the Data Quality Server Installer:
• C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn\DQSInstaller.exe
• Data Quality Service – Data Quality Server Installer
(Apps - Microsoft SQL Server 2012)
DQS Architecture
DQS Server
DQS Catalog (3 databases)
• DQS_MAIN (Knowledge Bases)
• DQS_PROJECTS (Projects)
• DQS_STAGING_DATA (Sandbox, scratch pad area)
Security – Database Roles
• dqs_administrator
• dqs_kb_editor
• dqs_kb_operator
Windows Azure Marketplace
Reference Data Services -> validating, cleansing and enriching your data
Performance considerations - FYI
• Major performance improvements from RTM to CU1 release of SQL Server 2012 (strongly
recommend patching and upgrading) http://bit.ly/11eEhHC
• Must read -> DQS Performance Best Practice Guide http://bit.ly/16Gwenl
• Understand data volumes and hardware requirements… plan wisely!
Enterprise Information Management (EIM)
The EIM stack as a whole is Microsoft's ‘Master Data Management’ solution and
consists of the following:
• SQL Server Data Quality Services (DQS) - Capture and record knowledge, rules, and actions
• SQL Server Master Data Services (MDS) - Master Data Management repository, Dimension data
• SQL Server Integration Services (SSIS) – Moves data, integration
Enterprise Information Management (EIM)
‘Master Data Management’
Resources
• Data Quality Services Team Blog (MSDN) http://bit.ly/WCI2nO
• SQL Server Data Quality Services (TechNet) http://bit.ly/ZaUO8k
• DQS Performance Best Practices Guide http://bit.ly/16Gwenl
• Enterprise Information Management (EIM) Bringing Together SSIS, DQS, and
MDS (Video – Channel 9) http://bit.ly/NJXvKr
• Matt Masson – Getting Started with DQS and MDS http://bit.ly/149Ga9n
• Paras Doshi’s – Blog (DQS) http://bit.ly/YoLthh
What Questions Do You Have?
Thank You
For attending this session

More Related Content

What's hot

Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsDATAVERSITY
 
The data quality challenge
The data quality challengeThe data quality challenge
The data quality challengeLenia Miltiadous
 
MDM Strategy & Roadmap
MDM Strategy & RoadmapMDM Strategy & Roadmap
MDM Strategy & Roadmapvictorlbrown
 
Data-Ed Online: Approaching Data Quality
Data-Ed Online: Approaching Data QualityData-Ed Online: Approaching Data Quality
Data-Ed Online: Approaching Data QualityDATAVERSITY
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business EnablerSrinivasan Sankar
 
Data Modeling Techniques
Data Modeling TechniquesData Modeling Techniques
Data Modeling TechniquesDATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
Chapter 7: Data Security Management
Chapter 7: Data Security ManagementChapter 7: Data Security Management
Chapter 7: Data Security ManagementAhmed Alorage
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...DATAVERSITY
 
Master Data Management - Aligning Data, Process and Governance
Master Data Management - Aligning Data, Process and Governance Master Data Management - Aligning Data, Process and Governance
Master Data Management - Aligning Data, Process and Governance Precisely
 
Building a Data Governance Strategy
Building a Data Governance StrategyBuilding a Data Governance Strategy
Building a Data Governance StrategyAnalytics8
 
Master Data Management methodology
Master Data Management methodologyMaster Data Management methodology
Master Data Management methodologyDatabase Architechs
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best PracticesDATAVERSITY
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)DATAVERSITY
 
Preparing For a Master Data Management Implemenation
Preparing For a Master Data Management ImplemenationPreparing For a Master Data Management Implemenation
Preparing For a Master Data Management ImplemenationInnovative_Systems
 
‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development
‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development
‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional DevelopmentAhmed Alorage
 
How to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that LastsHow to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that LastsDATAVERSITY
 

What's hot (20)

Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
DQS & MDS in SQL Server 2016
DQS & MDS in SQL Server 2016DQS & MDS in SQL Server 2016
DQS & MDS in SQL Server 2016
 
The data quality challenge
The data quality challengeThe data quality challenge
The data quality challenge
 
MDM Strategy & Roadmap
MDM Strategy & RoadmapMDM Strategy & Roadmap
MDM Strategy & Roadmap
 
Data-Ed Online: Approaching Data Quality
Data-Ed Online: Approaching Data QualityData-Ed Online: Approaching Data Quality
Data-Ed Online: Approaching Data Quality
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business Enabler
 
Mdm: why, when, how
Mdm: why, when, howMdm: why, when, how
Mdm: why, when, how
 
Data Modeling Techniques
Data Modeling TechniquesData Modeling Techniques
Data Modeling Techniques
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
Chapter 7: Data Security Management
Chapter 7: Data Security ManagementChapter 7: Data Security Management
Chapter 7: Data Security Management
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
 
Master Data Management - Aligning Data, Process and Governance
Master Data Management - Aligning Data, Process and Governance Master Data Management - Aligning Data, Process and Governance
Master Data Management - Aligning Data, Process and Governance
 
Building a Data Governance Strategy
Building a Data Governance StrategyBuilding a Data Governance Strategy
Building a Data Governance Strategy
 
Master Data Management methodology
Master Data Management methodologyMaster Data Management methodology
Master Data Management methodology
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best Practices
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)
 
Preparing For a Master Data Management Implemenation
Preparing For a Master Data Management ImplemenationPreparing For a Master Data Management Implemenation
Preparing For a Master Data Management Implemenation
 
‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development
‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development
‏‏‏‏‏‏‏‏‏‏‏‏Chapter 13: Professional Development
 
How to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that LastsHow to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that Lasts
 

Viewers also liked

Data quality architecture
Data quality architectureData quality architecture
Data quality architectureanicewick
 
Mobile Loyalty that works: a successful case study by Warply and Eurobank
Mobile Loyalty that works: a successful case study by Warply and Eurobank Mobile Loyalty that works: a successful case study by Warply and Eurobank
Mobile Loyalty that works: a successful case study by Warply and Eurobank Warply
 
SQL Server 2012 Certifications
SQL Server 2012 CertificationsSQL Server 2012 Certifications
SQL Server 2012 CertificationsMarcos Freccia
 
Sql server-dba
Sql server-dbaSql server-dba
Sql server-dbaNaviSoft
 
Sql Server Interview Question
Sql Server Interview QuestionSql Server Interview Question
Sql Server Interview Questionpukal rani
 
Webinar On-Demand: The Power of Analytics to Drive Loyalty
Webinar On-Demand: The Power of Analytics to Drive LoyaltyWebinar On-Demand: The Power of Analytics to Drive Loyalty
Webinar On-Demand: The Power of Analytics to Drive LoyaltyTIBCO Loyalty Lab
 
Sql server 2008 interview questions answers
Sql server 2008 interview questions answersSql server 2008 interview questions answers
Sql server 2008 interview questions answersJitendra Gangwar
 
The AMB Data Warehouse: A Case Study
The AMB Data Warehouse: A Case StudyThe AMB Data Warehouse: A Case Study
The AMB Data Warehouse: A Case StudyMark Gschwind
 
Top 5 TSQL Improvements in SQL Server 2014
Top 5 TSQL Improvements in SQL Server 2014Top 5 TSQL Improvements in SQL Server 2014
Top 5 TSQL Improvements in SQL Server 2014Boris Hristov
 
Customer Segmentation and Predictive Modeling
Customer Segmentation and Predictive ModelingCustomer Segmentation and Predictive Modeling
Customer Segmentation and Predictive ModelingAngie Wang
 
Sql server 2012 dba online training
Sql server 2012 dba online trainingSql server 2012 dba online training
Sql server 2012 dba online trainingsqlmasters
 
New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 Richie Rump
 
70-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 201270-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 2012siphocha
 
Introduction to Master Data Services in SQL Server 2012
Introduction to Master Data Services in SQL Server 2012Introduction to Master Data Services in SQL Server 2012
Introduction to Master Data Services in SQL Server 2012Stéphane Fréchette
 
Best MCSA - SQL SERVER 2012 Training Institute in Delhi
Best MCSA - SQL SERVER 2012 Training Institute in DelhiBest MCSA - SQL SERVER 2012 Training Institute in Delhi
Best MCSA - SQL SERVER 2012 Training Institute in DelhiInformation Technology
 

Viewers also liked (17)

Data quality architecture
Data quality architectureData quality architecture
Data quality architecture
 
Mobile Loyalty that works: a successful case study by Warply and Eurobank
Mobile Loyalty that works: a successful case study by Warply and Eurobank Mobile Loyalty that works: a successful case study by Warply and Eurobank
Mobile Loyalty that works: a successful case study by Warply and Eurobank
 
Data Quality
Data QualityData Quality
Data Quality
 
SQL Server 2012 Certifications
SQL Server 2012 CertificationsSQL Server 2012 Certifications
SQL Server 2012 Certifications
 
Sql server-dba
Sql server-dbaSql server-dba
Sql server-dba
 
Sql Server Interview Question
Sql Server Interview QuestionSql Server Interview Question
Sql Server Interview Question
 
Webinar On-Demand: The Power of Analytics to Drive Loyalty
Webinar On-Demand: The Power of Analytics to Drive LoyaltyWebinar On-Demand: The Power of Analytics to Drive Loyalty
Webinar On-Demand: The Power of Analytics to Drive Loyalty
 
Sql server 2008 interview questions answers
Sql server 2008 interview questions answersSql server 2008 interview questions answers
Sql server 2008 interview questions answers
 
The AMB Data Warehouse: A Case Study
The AMB Data Warehouse: A Case StudyThe AMB Data Warehouse: A Case Study
The AMB Data Warehouse: A Case Study
 
Top 5 TSQL Improvements in SQL Server 2014
Top 5 TSQL Improvements in SQL Server 2014Top 5 TSQL Improvements in SQL Server 2014
Top 5 TSQL Improvements in SQL Server 2014
 
Customer Segmentation and Predictive Modeling
Customer Segmentation and Predictive ModelingCustomer Segmentation and Predictive Modeling
Customer Segmentation and Predictive Modeling
 
Sql server 2012 dba online training
Sql server 2012 dba online trainingSql server 2012 dba online training
Sql server 2012 dba online training
 
New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012
 
70-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 201270-461 Querying Microsoft SQL Server 2012
70-461 Querying Microsoft SQL Server 2012
 
Good sql server interview_questions
Good sql server interview_questionsGood sql server interview_questions
Good sql server interview_questions
 
Introduction to Master Data Services in SQL Server 2012
Introduction to Master Data Services in SQL Server 2012Introduction to Master Data Services in SQL Server 2012
Introduction to Master Data Services in SQL Server 2012
 
Best MCSA - SQL SERVER 2012 Training Institute in Delhi
Best MCSA - SQL SERVER 2012 Training Institute in DelhiBest MCSA - SQL SERVER 2012 Training Institute in Delhi
Best MCSA - SQL SERVER 2012 Training Institute in Delhi
 

Similar to Data Quality Services in SQL Server 2012

SQL Server Integration Services – Enterprise Manageability
SQL Server Integration Services – Enterprise ManageabilitySQL Server Integration Services – Enterprise Manageability
SQL Server Integration Services – Enterprise ManageabilityDan English
 
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...Perficient, Inc.
 
Data Quality from Precisely
Data Quality from PreciselyData Quality from Precisely
Data Quality from PreciselyPrecisely
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...DataWorks Summit
 
SQLSaturday #188 - Enterprise Information Management
SQLSaturday #188  - Enterprise Information ManagementSQLSaturday #188  - Enterprise Information Management
SQLSaturday #188 - Enterprise Information ManagementTillmann Eitelberg
 
Krishna_IBM_Infosphere_Certified_Datastage_Consultant
Krishna_IBM_Infosphere_Certified_Datastage_Consultant Krishna_IBM_Infosphere_Certified_Datastage_Consultant
Krishna_IBM_Infosphere_Certified_Datastage_Consultant Krishna Kishore
 
Marlabs Capabilities Overview: Microsoft SharePoint Services
Marlabs Capabilities Overview: Microsoft SharePoint Services Marlabs Capabilities Overview: Microsoft SharePoint Services
Marlabs Capabilities Overview: Microsoft SharePoint Services Marlabs
 
Bi Resume Ejd
Bi Resume EjdBi Resume Ejd
Bi Resume EjdEJDonavan
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 

Similar to Data Quality Services in SQL Server 2012 (20)

SQL Server Integration Services – Enterprise Manageability
SQL Server Integration Services – Enterprise ManageabilitySQL Server Integration Services – Enterprise Manageability
SQL Server Integration Services – Enterprise Manageability
 
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
 
Data Quality from Precisely
Data Quality from PreciselyData Quality from Precisely
Data Quality from Precisely
 
Sravya(1)
Sravya(1)Sravya(1)
Sravya(1)
 
Ds04 data quality
Ds04   data qualityDs04   data quality
Ds04 data quality
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
 
SQLSaturday #188 - Enterprise Information Management
SQLSaturday #188  - Enterprise Information ManagementSQLSaturday #188  - Enterprise Information Management
SQLSaturday #188 - Enterprise Information Management
 
Krishna_IBM_Infosphere_Certified_Datastage_Consultant
Krishna_IBM_Infosphere_Certified_Datastage_Consultant Krishna_IBM_Infosphere_Certified_Datastage_Consultant
Krishna_IBM_Infosphere_Certified_Datastage_Consultant
 
Sarfaraz cv
Sarfaraz cvSarfaraz cv
Sarfaraz cv
 
DA_01_Intro.pptx
DA_01_Intro.pptxDA_01_Intro.pptx
DA_01_Intro.pptx
 
SQL_DBA USA_M&T Bank
SQL_DBA USA_M&T BankSQL_DBA USA_M&T Bank
SQL_DBA USA_M&T Bank
 
Exploring sql server 2016
Exploring sql server 2016Exploring sql server 2016
Exploring sql server 2016
 
Suvajitbasu
SuvajitbasuSuvajitbasu
Suvajitbasu
 
Marlabs Capabilities Overview: Microsoft SharePoint Services
Marlabs Capabilities Overview: Microsoft SharePoint Services Marlabs Capabilities Overview: Microsoft SharePoint Services
Marlabs Capabilities Overview: Microsoft SharePoint Services
 
Bi Resume Ejd
Bi Resume EjdBi Resume Ejd
Bi Resume Ejd
 
Padmini parmar
Padmini parmarPadmini parmar
Padmini parmar
 
Padmini Parmar
Padmini ParmarPadmini Parmar
Padmini Parmar
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Alphonso_Triplett.Sr_Prometheus_Phoenix
Alphonso_Triplett.Sr_Prometheus_PhoenixAlphonso_Triplett.Sr_Prometheus_Phoenix
Alphonso_Triplett.Sr_Prometheus_Phoenix
 
Ripon Datta. SQL DBA N
Ripon Datta. SQL DBA NRipon Datta. SQL DBA N
Ripon Datta. SQL DBA N
 

More from Stéphane Fréchette

Back to the future - Temporal Table in SQL Server 2016
Back to the future - Temporal Table in SQL Server 2016Back to the future - Temporal Table in SQL Server 2016
Back to the future - Temporal Table in SQL Server 2016Stéphane Fréchette
 
Self-Service Data Integration with Power Query - SQLSaturday #364 Boston
Self-Service Data Integration with Power Query - SQLSaturday #364 Boston  Self-Service Data Integration with Power Query - SQLSaturday #364 Boston
Self-Service Data Integration with Power Query - SQLSaturday #364 Boston Stéphane Fréchette
 
Power BI - Bring your data together
Power BI - Bring your data togetherPower BI - Bring your data together
Power BI - Bring your data togetherStéphane Fréchette
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL ServerStéphane Fréchette
 
Self-Service Data Integration with Power Query
Self-Service Data Integration with Power QuerySelf-Service Data Integration with Power Query
Self-Service Data Integration with Power QueryStéphane Fréchette
 
Le journalisme de données... par où commencer?
Le journalisme de données... par où commencer?Le journalisme de données... par où commencer?
Le journalisme de données... par où commencer?Stéphane Fréchette
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSStéphane Fréchette
 
Graph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg
Graph Databases for SQL Server Professionals - SQLSaturday #350 WinnipegGraph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg
Graph Databases for SQL Server Professionals - SQLSaturday #350 WinnipegStéphane Fréchette
 
Graph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsGraph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsStéphane Fréchette
 
SQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any DataSQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any DataStéphane Fréchette
 
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)Stéphane Fréchette
 
Business Intelligence in Excel 2013
Business Intelligence in Excel 2013Business Intelligence in Excel 2013
Business Intelligence in Excel 2013Stéphane Fréchette
 
Gatineau Ouverte troisième rencontre publique
Gatineau Ouverte troisième rencontre publiqueGatineau Ouverte troisième rencontre publique
Gatineau Ouverte troisième rencontre publiqueStéphane Fréchette
 
Gatineau Ouverte première rencontre publique
Gatineau Ouverte première rencontre publiqueGatineau Ouverte première rencontre publique
Gatineau Ouverte première rencontre publiqueStéphane Fréchette
 

More from Stéphane Fréchette (17)

Back to the future - Temporal Table in SQL Server 2016
Back to the future - Temporal Table in SQL Server 2016Back to the future - Temporal Table in SQL Server 2016
Back to the future - Temporal Table in SQL Server 2016
 
Self-Service Data Integration with Power Query - SQLSaturday #364 Boston
Self-Service Data Integration with Power Query - SQLSaturday #364 Boston  Self-Service Data Integration with Power Query - SQLSaturday #364 Boston
Self-Service Data Integration with Power Query - SQLSaturday #364 Boston
 
Power BI - Bring your data together
Power BI - Bring your data togetherPower BI - Bring your data together
Power BI - Bring your data together
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL Server
 
Self-Service Data Integration with Power Query
Self-Service Data Integration with Power QuerySelf-Service Data Integration with Power Query
Self-Service Data Integration with Power Query
 
Introduction to Azure HDInsight
Introduction to Azure HDInsightIntroduction to Azure HDInsight
Introduction to Azure HDInsight
 
Le journalisme de données... par où commencer?
Le journalisme de données... par où commencer?Le journalisme de données... par où commencer?
Le journalisme de données... par où commencer?
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
 
Graph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg
Graph Databases for SQL Server Professionals - SQLSaturday #350 WinnipegGraph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg
Graph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg
 
Graph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsGraph Databases for SQL Server Professionals
Graph Databases for SQL Server Professionals
 
SQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any DataSQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any Data
 
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
 
TEDxGatineau
TEDxGatineau TEDxGatineau
TEDxGatineau
 
Power BI
Power BIPower BI
Power BI
 
Business Intelligence in Excel 2013
Business Intelligence in Excel 2013Business Intelligence in Excel 2013
Business Intelligence in Excel 2013
 
Gatineau Ouverte troisième rencontre publique
Gatineau Ouverte troisième rencontre publiqueGatineau Ouverte troisième rencontre publique
Gatineau Ouverte troisième rencontre publique
 
Gatineau Ouverte première rencontre publique
Gatineau Ouverte première rencontre publiqueGatineau Ouverte première rencontre publique
Gatineau Ouverte première rencontre publique
 

Data Quality Services in SQL Server 2012

  • 1. Data Quality Services in SQL Server 2012 (An Introduction) Stéphane Fréchette Friday April 26, 2013 Matching Cleansing DQS
  • 2. Who am I? My name is Stéphane Fréchette. I’m a Database & Business Intelligence Professional and CEO | Founder of I have a passion for architecting, designing and building solutions that matter. Self-proclaimed Open Data Hacker/Advocate. I founded Gatineau Ouverte, a citizen-led initiative that aims to promote open access to the civic data of the city of Gatineau. Twitter: @sfrechette Email: stephanefrechette@ukubu.com Blog: stephanefrechette.com
  • 3. Session Outline • Microsoft Business Intelligence (The Stack) • Dirty Data… • SQL Server Data Quality Services (DQS) • Data Steward • Knowledge Base and Domains • Data Quality Projects • Data Cleansing Transform – SSIS • DQS (Install & Architecture) • Enterprise Information Management (EIM) • Resources
  • 4. Analysis Services Reporting Services Integration Services Master Data Services SharePoint Collaboration Excel Workbooks PowerPivot Applications SharePoint Dashboards & Scorecards Data Quality Services OData Feeds Line of Business Applications Hadoop Big Data Microsoft Business Intelligence
  • 5. Dirty Data… Do you have dirty data? (All projects have it! It's inevitable.)
  • 6. Dirty Data… Causes? Bad data entry Poor Data Governance Duplicate entities in different LOB systems
  • 7. Sample Data Representation
      • Prospect in CRM System: Mark Smith | 613.111-1234 | Ottawa | ON | K1P 1K1
      • Prospect buys goods, now entered in POS System: Markus Smith | 1234 Stilton Ave | Kanata | ON | K1P 1K1
      • Record also entered into Accounting System: Markus Smith | 1234 Stilton Avenue | Kanata | ON | K1P 1K1
      ETL process imports these records into the Data Warehouse / Data Mart:
      FirstName | LastName | Phone | Address | City | Province | PostalCode
      Mark | Smith | 613.111-1234 | | Ottawa | ON | K1P 1K1
      Markus | Smith | | 1234 Stilton Ave | Kanata | ON | K1P 1K1
      Markus | Smith | | 1234 Stilton Avenue | Kanata | ON | K1P 1K1
  • 8. Sample Data Representation
      • Duplicate records and inaccurate, incomplete data
      • What we want is a golden record (one version of the truth)
      Source records:
      FirstName | LastName | Phone | Address | City | Province | PostalCode
      Mark | Smith | 613.111-1234 | | Ottawa | ON | K1P 1K1
      Markus | Smith | | 1234 Stilton Ave | Kanata | ON | K1P 1K1
      Markus | Smith | | 1234 Stilton Avenue | Kanata | ON | K1P 1K1
      Golden record:
      FirstName | LastName | Phone | Address | City | Province | PostalCode
      Markus | Smith | 613-111-1234 | 1234 Stilton Ave | Kanata | ON | K1P 1K1
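
To make the golden-record idea concrete, here is a minimal Python sketch that collapses the three sample rows into the single row shown on the slide. It is an illustration only, not how DQS works internally. It assumes the rows have already been matched as the same person, and the synonym table, the helper names and the "most frequent non-empty value wins" survivorship rule are all hypothetical choices made for this example.

```python
import re

# The three sample rows from the slide; empty strings mark missing values.
RECORDS = [
    {"FirstName": "Mark", "LastName": "Smith", "Phone": "613.111-1234",
     "Address": "", "City": "Ottawa", "Province": "ON", "PostalCode": "K1P 1K1"},
    {"FirstName": "Markus", "LastName": "Smith", "Phone": "",
     "Address": "1234 Stilton Ave", "City": "Kanata", "Province": "ON",
     "PostalCode": "K1P 1K1"},
    {"FirstName": "Markus", "LastName": "Smith", "Phone": "",
     "Address": "1234 Stilton Avenue", "City": "Kanata", "Province": "ON",
     "PostalCode": "K1P 1K1"},
]

# Hypothetical synonym table standing in for domain knowledge.
ADDRESS_SYNONYMS = {"Avenue": "Ave"}


def normalize_phone(phone):
    """Rewrite a 10-digit phone number as NNN-NNN-NNNN; leave anything else alone."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) != 10:
        return phone
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def normalize_address(address):
    """Replace known synonyms word by word (for example Avenue -> Ave)."""
    return " ".join(ADDRESS_SYNONYMS.get(word, word) for word in address.split())


def standardize(record):
    """Return a copy of the record with phone and address standardized."""
    clean = dict(record)
    clean["Phone"] = normalize_phone(record["Phone"])
    clean["Address"] = normalize_address(record["Address"])
    return clean


def golden_record(records):
    """Assumed survivorship rule: the most frequent non-empty value per column
    wins, and ties go to the value seen first."""
    cleaned = [standardize(r) for r in records]
    golden = {}
    for column in cleaned[0]:
        values = [r[column] for r in cleaned if r[column]]
        golden[column] = max(values, key=values.count) if values else ""
    return golden


if __name__ == "__main__":
    print(" | ".join(golden_record(RECORDS).values()))
```

Standardizing before voting matters here: without the synonym step, "1234 Stilton Ave" and "1234 Stilton Avenue" would count as two different addresses.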
  • 9. SQL Server Data Quality Services (DQS) • New in SQL Server 2012 • Enables cleansing, matching, standardizing and enriching data • Delivers trusted information for business intelligence, data warehouse, transaction processing workloads • Knowledge-Driven Solution (create/edit) • A knowledge management process that builds the knowledge base • A data quality project that proposes changes to source data based on the knowledge in the knowledge base (cleansing and matching) • A key component to an Enterprise Information Management (EIM) solution
  • 10. Answering the Need with DQS • DQS enables you to resolve issues involving incompleteness, lack of conformity, inconsistency, inaccuracy, invalidity, and data duplication • Provides the following features to resolve data quality issues: • Data Cleansing • Matching • Reference Data Services • Profiling • Monitoring • Knowledge Base
  • 11. Data Steward • Key role: usually a business user, not someone from the Information Technology side • In a nutshell: responsible for maintaining data elements in a metadata registry… • Data Steward -> DQS Client • Create and edit Knowledge Bases • Run and process data, continually and iteratively improving the Knowledge Bases • Knowledge Bases can be consumed and used by other Data Stewards and IT (SSIS / ETL Developers) (Diagram labels: DQS Data Steward, MDS Data Steward, SSIS Developer, Matching, Cleansing)
  • 12. Knowledge Bases and Domains The knowledge base is a repository of knowledge about your data that enables you to understand your data and maintain its integrity. • Processes: • Computer-assisted • Interactive • Components: • Knowledge Discovery • Domain Management • Reference Data Services • Matching Policy
  • 14. Data Quality Projects Improve quality of source data by performing data cleansing and data matching activities using defined knowledge bases • Cleansing Activity (2 step process) • Computer-assisted : data is categorized (suggested, new, invalid, corrected, and correct) • Interactive: data steward to approve, reject, or modify the proposed results from the computer-assisted cleansing process • Matching Activity • Using existing knowledge base matching policy • Prevent and remove data duplication • Data Profiling and Notifications • Profiling provides data quality stats and info: completeness and accuracy • Notification on actions that can be taken to enhance operations
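
The computer-assisted step can be pictured as a function that tags each incoming value with one of the five categories named above. The Python sketch below is a toy categorizer under stated assumptions, not the DQS algorithm: the "City" knowledge base, its synonym and invalid lists, the `cleanse` function and the 0.8 similarity cutoff are all hypothetical, and the standard-library `difflib` stands in for whatever similarity scoring DQS actually uses.

```python
import difflib

# Hypothetical knowledge base for a "City" domain.
VALID_VALUES = {"Ottawa", "Kanata", "Gatineau"}
SYNONYMS = {"Bytown": "Ottawa"}        # known alternate value -> leading value
INVALID_VALUES = {"N/A", "Unknown"}    # values explicitly marked as invalid


def cleanse(value, suggest_cutoff=0.8):
    """Return (proposed_value, category) for one source value.

    Categories mirror the slide: correct, corrected, invalid, suggested, new.
    """
    if value in VALID_VALUES:
        return value, "correct"
    if value in SYNONYMS:
        return SYNONYMS[value], "corrected"
    if value in INVALID_VALUES:
        return value, "invalid"
    # A close but unknown spelling becomes a suggestion for the data steward.
    close = difflib.get_close_matches(
        value, sorted(VALID_VALUES), n=1, cutoff=suggest_cutoff
    )
    if close:
        return close[0], "suggested"
    # Nothing in the knowledge base resembles this value.
    return value, "new"


if __name__ == "__main__":
    for city in ["Ottawa", "Bytown", "N/A", "Otawa", "Toronto"]:
        print(city, "->", cleanse(city))
```

In the interactive step, the data steward would then review the "suggested" and "new" rows and approve, reject or modify each proposal; approved values can be fed back into the knowledge base.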
  • 16. DQS Cleansing Transform in SSIS • When you want to automate the cleansing and matching process and not use the DQS Client • Use SSIS for batch data cleansing • Matching can be done with Master Data Services (MDS) • SSIS can be leveraged to bring DQS and MDS together * DQS does not expose matching functionality for SSIS, but you can use the Fuzzy Grouping Transform to identify duplicate data * The Cleansing Transform is single-threaded; use multiple transforms for parallelism
  • 17. Demo Data Cleansing Transform (Automating the Cleansing and Matching using SSIS)
  • 18. Installing DQS • Requires the Business Intelligence or Enterprise/Developer edition of SQL Server 2012 • During SQL Server setup: • Instance Features -> Data Quality Services • Shared Features -> Data Quality Client • Execute the Data Quality Server Installer: • C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn\DQSInstaller.exe • Data Quality Service – Data Quality Server Installer (Apps - Microsoft SQL Server 2012)
  • 19. DQS Architecture DQS Server DQS Catalog (3 databases) • DQS_MAIN (Knowledge Bases) • DQS_PROJECTS (Projects) • DQS_STAGING_DATA (Sandbox, scratch pad area) Security – Database Roles • dqs_administrator • dqs_kb_editor • dqs_kb_operator
  • 20. Windows Azure Marketplace Reference Data Services -> validating, cleansing and enriching your data
  • 21. Performance considerations - FYI • Major performance improvements from RTM to CU1 release of SQL Server 2012 (strongly recommend patching and upgrading) http://bit.ly/11eEhHC • Must read -> DQS Performance Best Practice Guide http://bit.ly/16Gwenl • Understand data volumes and hardware requirements… plan wisely!
  • 22. Enterprise Information Management (EIM) The EIM stack as a whole is the ‘Master Data Management’ solution from Microsoft and consists of the following: • SQL Server Data Quality Services (DQS) - Capture and record knowledge, rules, and actions • SQL Server Master Data Services (MDS) - Master Data Management repository, Dimension data • SQL Server Integration Services (SSIS) – Moves data, integration Enterprise Information Management (EIM) ‘Master Data Management’
  • 23. Resources • Data Quality Services Team Blog (MSDN) http://bit.ly/WCI2nO • SQL Server Data Quality Services (TechNet) http://bit.ly/ZaUO8k • DQS Performance Best Practices Guide http://bit.ly/16Gwenl • Enterprise Information Management (EIM) Bringing Together SSIS, DQS, and MDS (Video – Channel 9) http://bit.ly/NJXvKr • Matt Masson – Getting Started with DQS and MDS http://bit.ly/149Ga9n • Paras Doshi’s – Blog (DQS) http://bit.ly/YoLthh
  • 24. What Questions Do You Have?
  • 25. Thank You For attending this session