Rapid Data Integration and Curation
Delivering Business Value in the First 24 Hours

SPEAKER:
Thomas Kelly, Practice Director
Semantic Technology Center of Excellence
Enterprise Information Management
Cognizant Technology Solutions, Inc.
| ©2013, Cognizant
Agenda

• Barriers to Rapid Data Integration
• Delivering Business Value
• Rapid Data Integration and Curation Method

We are at an Inflection Point at which Value is Created or Destroyed

Source: The Motley Fool

Delivering Information Faster Produces Direct, Measurable Business Value

What Difference Does One Day Make?

A blockbuster drug generates $3M+ in revenue per day; a one-day delay in completing clinical trials can generate up to $500K in additional costs

Banking

A moderate-sized brokerage firm can generate up to $1M in financial services revenue per day

Barriers to Rapid Data Integration

• Rework is expensive: must “get it right” from the start
• Fit with the existing data; avoid data silos
• Reconciling differences (data formats, coding, identifiers, etc.)
• Managing data quality (accuracy, precision, context)
• Knowledge acquisition takes time; new insights come from experimentation
• Overcoming process inertia

Evolutionary Method to Data Integration and Curation

[Chart labels: Data Approach; Information Management Approach; Time. Stages over time: Tactical (first 1-5 days), Progressive (first 1-3 months), Managed (after 3 months).]

Responsive
• As new information flows into the enterprise, people and processes are dynamic in nature.
• Questions arising during this phase are “what to do” and “how to make the best sense of the new data source”. Rapid integration tools aid in quick prototyping and in building solutions of value.

Rapid Integration and Curation Method
• The data is profiled and explored for value and quality issues.
• A rapid pruning exercise is undertaken by prototyping and integrating with in-house data to evaluate whether the data is fit for purpose. The outcome helps formulate an effective approach for further phases.

Managed
• As we progress, issues with the new data are identified and managed. The main focus is on establishing data quality and adhering to enterprise standards and frameworks while building optimal integration approaches.
• The integration process is evolutionary, as further discoveries are made for optimal design.

Evolutionary
• Progressive build based on the new data.
• Building awareness of the new platform and fine-tuning the capabilities around the data source are the primary activities.

Proactive
• Data management evolves to a more refined state. A feedback loop is built to enable proactive decisions around data organization and access.
• Data integration is efficient and stable. Verifiable compliance and security.
• Integrated with the enterprise information management framework.

Predictable
• The services built around the new data sources are now managed.
• The focus is on evolution of business processes, based on managed models.

Leverage Insights and Expertise, Rapidly and Sustainably

• Reuse Expertise: Identify and leverage existing, relevant data assets and expertise
• Analyze: Ingest new data sources (light integration and curation)
• Realize Benefits: Monitor and measure use and benefits achieved; identify next set of priorities
• Extend: Create and extend data relationships, leveraging insights from previous study cycles
• Govern: Elevate proven data, relationships, and expertise to organization-wide definition
• Refine: Capture insights from new data analysis cycles, refining relationships to support new analytics

Can You Help Me With Some Data?

Rapid Data Integration and Curation Method

1. Define Preliminary Objectives
2. Profile the New Data
3. Generate Initial Ontology for the New Data
4. Generate Initial Ontology for the Existing Data (if necessary)
5. Integrate Entities over Common URIs
6. Create URI Links
7. Add Initial Data Quality Filters
8. Analyze Data and Generate Feedback

1. Define Preliminary Objectives

1. Discuss Functional and Timing Objectives, and Priorities
2. Clarify Immediate, Short-Term, and Long-Term Business Value (SMART *)
   a. Cost Reduction/Avoidance
   b. Meet Critical Customer Need
3. Is This the Right Solution?
4. Set Expectations
   a. Evolutionary Process
   b. Initial Results Quickly
   c. Frequent, Active Participation
   d. Feedback Critical to Making Refinements
5. Brainstorm Deliverables that Produce Business Benefits; Define a Few Sample Queries
6. Ask for Commitment to Benefits Realization
7. Start the Clock!

* SMART: Specific, Measurable, Attainable, Realistic, and Traceable

2. Profile the New Data

• Light Profiling, focusing on Understanding Key Data Elements Needed to Meet the First Deliverable
• Identify Initial Data Filtering Candidates
• Capture Insights about Key Data Relationships

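Light profiling of the kind described in step 2 can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the talk: the `profile` function, the sample rows, and the column names are all hypothetical. It counts missing and distinct values for a handful of key columns, which is often enough to spot filtering candidates such as blank ZIP codes or duplicated identifiers.

```python
from collections import Counter

def profile(rows, key_columns):
    """Return per-column counts of rows, missing values, distinct values, and top values."""
    report = {}
    for col in key_columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "top": counts.most_common(2),
        }
    return report

# Hypothetical sample of a new customer data source.
rows = [
    {"customer_id": "C1", "zip_code": "07666", "segment": "Retail"},
    {"customer_id": "C2", "zip_code": "", "segment": "Retail"},
    {"customer_id": "C3", "zip_code": "10001", "segment": "Broker"},
    {"customer_id": "C3", "zip_code": "10001", "segment": None},
]

report = profile(rows, ["customer_id", "zip_code", "segment"])
for col, stats in report.items():
    print(col, stats)
```

Here four rows yield only three distinct `customer_id` values, so the duplicated identifier is a candidate for an initial data filter.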
3. Generate Initial Ontology for the New Data

• Reverse-engineer Ontology from New Data
• Load New Data into the RDF Store (or Create Link to the Data)
• Create Business-relevant Synonyms for High-Importance Attributes
• Refinements will be made in Future Iterations

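One way to picture reverse-engineering an initial ontology is to derive one class per table and one property per column, then attach business-friendly synonyms as extra labels. The sketch below is an assumption-laden illustration using only the Python standard library: the `generate_ontology` helper, the `http://example.org/` namespace, and the choice of `rdfs:label` for synonyms are all hypothetical and not taken from the slides.

```python
import re

def to_local_name(text):
    """Turn a column or table name into a CamelCase local name."""
    parts = re.split(r"[^0-9A-Za-z]+", text)
    return "".join(p[:1].upper() + p[1:] for p in parts if p)

def generate_ontology(table, columns, synonyms=None, prefix="new"):
    """Emit Turtle text declaring one class per table and one property per column."""
    synonyms = synonyms or {}
    cls = f"{prefix}:{to_local_name(table)}"
    lines = [
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        f"@prefix {prefix}: <http://example.org/{prefix}#> .",  # hypothetical namespace
        "",
        f'{cls} a owl:Class ; rdfs:label "{table}" .',
    ]
    for col in columns:
        prop = f"{prefix}:{to_local_name(col)}"
        # The raw column name comes first, followed by any business synonyms.
        labels = [col] + synonyms.get(col, [])
        label_list = ", ".join(f'"{label}"' for label in labels)
        lines.append(
            f"{prop} a owl:DatatypeProperty ; rdfs:domain {cls} ; rdfs:label {label_list} ."
        )
    return "\n".join(lines)

turtle = generate_ontology(
    "customer",
    ["cust_id", "zip_code"],
    synonyms={"zip_code": ["Postal Code"]},
)
print(turtle)
```

The generated text is deliberately rough; as the slide notes, refinements are expected in later iterations.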
4. Generate Initial Ontology for the Existing Data (if necessary)

[Diagram labels: Existing Data; New Data]

• Map Selected Entities and Critical Attributes for Existing Data Source(s) to the Source-specific Ontology
• Add Reference to the Source-specific Ontology to the New Data Ontology
• Refinements will be made in Future Iterations
• New Data Ontology manages integration with Existing Data until the ontology is sufficiently mature to be promoted into an enterprise ontology

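Mapping only selected entities and critical attributes of an existing source can be illustrated with a small declarative mapping. The sketch below is hypothetical throughout (the `MAPPING` structure, the `crm:` prefix, the column names, and the URI template are invented for illustration); a real project would more likely use a relational-to-RDF mapping tool. Columns that are not listed in the mapping are simply ignored.

```python
# Hypothetical source-specific mapping: only selected, critical columns are mapped.
MAPPING = {
    "class": "crm:Customer",
    "subject": "http://example.org/crm/customer/{CUST_NO}",
    "properties": {"CUST_NM": "crm:name", "POSTAL_CD": "crm:zipCode"},
}

def rows_to_triples(rows, mapping):
    """Yield (subject, predicate, object) triples for the mapped columns only."""
    for row in rows:
        subject = mapping["subject"].format(**row)
        yield (subject, "rdf:type", mapping["class"])
        for column, predicate in mapping["properties"].items():
            value = row.get(column)
            if value not in (None, ""):  # skip blank values rather than emit empty facts
                yield (subject, predicate, value)

existing_rows = [
    {"CUST_NO": "1001", "CUST_NM": "Acme", "POSTAL_CD": "07666", "FAX": "n/a"},
    {"CUST_NO": "1002", "CUST_NM": "Globex", "POSTAL_CD": "", "FAX": "n/a"},
]

triples = list(rows_to_triples(existing_rows, MAPPING))
for triple in triples:
    print(triple)
```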
5. Integrate Entities over Common URIs

• Different URIs, Separately Maintained
• Focus on Key Entities
• Equivalence Functions Logically Integrate the Federated Data
• Reduces Query Complexity and Can Improve Query Performance

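The idea of logically integrating separately maintained URIs through equivalence can be sketched as follows. The slides do not say how the equivalence functions are implemented; this example substitutes a simple union-find over declared equivalent pairs (similar in spirit to `owl:sameAs` statements), and every URI and function name in it is hypothetical. The sources keep their own URIs, while a query about either URI returns the facts recorded under both.

```python
def find(parent, uri):
    """Return the canonical representative for a URI, compressing the path."""
    parent.setdefault(uri, uri)
    while parent[uri] != uri:
        parent[uri] = parent[parent[uri]]
        uri = parent[uri]
    return uri

def build_equivalence(pairs):
    """Merge URIs declared equivalent into groups with one representative each."""
    parent = {}
    for a, b in pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            # Pick the lexicographically smaller root so the result is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return parent

def facts_about(uri, triples, parent):
    """Collect facts for every URI in the same equivalence group as `uri`."""
    target = find(parent, uri)
    return sorted((p, o) for s, p, o in triples if find(parent, s) == target)

# Hypothetical facts held in two separately maintained sources.
triples = [
    ("crm:customer/1001", "crm:name", "Acme"),
    ("new:account/A-77", "new:segment", "Retail"),
    ("new:account/B-12", "new:segment", "Broker"),
]
parent = build_equivalence([("crm:customer/1001", "new:account/A-77")])

facts = facts_about("new:account/A-77", triples, parent)
print(facts)
```

Neither source is rewritten; only the equivalence table is maintained, which is why the focus can stay on a few key entities.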
6. Create URI Links

[Diagram: before, Customer (cust:ZipCode) JOIN Geography (geo:ZipCode); after, Customer (cust:ZipCodeURI) LINK Geography]

• The Data has Common Values that can be used in Join Operations, but Doesn’t have Links
• Links Reduce Query Complexity and Can Improve Query Performance
• Focus on Key Queries, Identify Complex or Time-Sensitive Joins
• Add Linking URI Attribute to Dependent Entity
• Amend Selected Queries to Leverage the New Link

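The difference between joining on a shared value and following a link can be shown with plain dictionaries standing in for an RDF store. This is a minimal sketch under stated assumptions: the property names `cust:ZipCode`, `geo:ZipCode`, and `cust:ZipCodeURI` come from the slide, while the entity URIs, the `geo:State` attribute, and the function names are hypothetical.

```python
# Hypothetical data: customers and geography share ZIP code values but no link.
customers = {
    "cust:C1": {"cust:ZipCode": "07666"},
    "cust:C2": {"cust:ZipCode": "10001"},
}
geography = {
    "geo:zip/07666": {"geo:ZipCode": "07666", "geo:State": "NJ"},
    "geo:zip/10001": {"geo:ZipCode": "10001", "geo:State": "NY"},
}

def state_by_join(customer_uri):
    """Value join: scan geography for a matching ZIP code value."""
    zip_code = customers[customer_uri]["cust:ZipCode"]
    for attrs in geography.values():
        if attrs["geo:ZipCode"] == zip_code:
            return attrs["geo:State"]
    return None

def add_zip_links():
    """Add the linking URI attribute to the dependent entity (Customer)."""
    by_value = {attrs["geo:ZipCode"]: uri for uri, attrs in geography.items()}
    for attrs in customers.values():
        target = by_value.get(attrs["cust:ZipCode"])
        if target is not None:
            attrs["cust:ZipCodeURI"] = target

def state_by_link(customer_uri):
    """Link traversal: follow cust:ZipCodeURI directly to the geography entity."""
    target = customers[customer_uri].get("cust:ZipCodeURI")
    return geography[target]["geo:State"] if target else None

add_zip_links()
print(state_by_join("cust:C2"), state_by_link("cust:C2"))
```

Both lookups return the same answer; the linked version replaces a scan over geography with a direct lookup, which is the kind of saving the slide points to for complex or time-sensitive joins.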
7. Add Initial Data Quality Filters and Transformations

[Diagram labels: Traditional Data Warehouse; Data Source A, Data Source B, Data Source C; ETL; Data Warehouse; Existing Data; New Data; callouts “Data Quality Happens Here” and “And Here”]

• JIT Data Quality Management, Everywhere that it is Needed
• Data Filtering and Transformation Rules are Encoded in the Ontology
• Focus is on Critical Data Quality Rules
• Rule Updates are Automatically in Effect, without Reloading All of the Data

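Just-in-time data quality can be pictured as rules that are applied when data is read rather than when it is loaded. The sketch below is an illustration only: the slides say the rules are encoded in the ontology, whereas here a hypothetical Python dictionary named `rules` stands in for that rule store, and the records and field names are invented.

```python
# Hypothetical rule set; in the approach described, such rules would live in the ontology.
rules = {
    "zip_code": {
        "transform": lambda v: v.strip().zfill(5),
        "keep": lambda v: v.isdigit() and len(v) == 5,
    },
}

# Stored data is loaded as-is, without cleansing.
stored = [
    {"id": "C1", "zip_code": " 7666"},
    {"id": "C2", "zip_code": "00000"},
    {"id": "C3", "zip_code": "unknown"},
]

def query(records, rule_set):
    """Apply transformations and filters at read time; stored data is untouched."""
    for record in records:
        cleaned = dict(record)
        ok = True
        for field, rule in rule_set.items():
            value = rule["transform"](cleaned[field])
            if not rule["keep"](value):
                ok = False
                break
            cleaned[field] = value
        if ok:
            yield cleaned

before = list(query(stored, rules))

# Updating a rule takes effect on the next query, with no reload of `stored`.
rules["zip_code"]["keep"] = lambda v: v.isdigit() and len(v) == 5 and v != "00000"
after = list(query(stored, rules))
print(before)
print(after)
```

After the rule change, the placeholder ZIP code `00000` is filtered out on the very next query, while the stored records remain exactly as they were loaded.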
8. Analyze Data and Generate Feedback

• Demonstrate Visualization using Sample Queries
• Walk Through Available Data Sets and Data Organization
• Experiment with Data Access and New Visualizations
• Provide Next Steps Recommendations to Refine the Data Integration and Curation

Architectural Foundation for Rapid Data Integration and Curation

• SPARQL-based Visualization
• Relational-to-RDF Mapping
• Data Profiling
• Ontology Editor
• Automated Ontology Generation
• RDF Store Data Import
• RDF Store

Capabilities That We Have Introduced

• Rapid Response to New Data Onboarding Needs
• Process for Evolutionary Data Integration and Curation
• Flexible Design that is Responsive to Business Changes
• Foundation for Refinement and Expansion of Ontology Models from Fit-for-Purpose to Department, to Business Unit, to Enterprise

Questions?

Thank you!

Speaker
Thomas (Tom) Kelly
Practice Director, Enterprise Information Management, Cognizant

Thomas Kelly is a Director in Cognizant’s Enterprise Information Management (EIM) Practice and heads its Semantic Technology Center of Excellence, a technology specialty of Cognizant Business Consulting (CBC). He has 20-plus years of technology consulting experience in leading data warehousing, business intelligence and big data projects, focused primarily on the life sciences and healthcare industries. Tom can be reached at Thomas.Kelly@cognizant.com.

Understanding the Laravel MVC Architecture
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 

Rapid Data Integration and Curation in 24 Hours

  • 1. Rapid Data Integration and Curation: Delivering Business Value in the First 24 Hours. SPEAKER: Thomas Kelly, Practice Director, Semantic Technology Center of Excellence, Enterprise Information Management, Cognizant Technology Solutions, Inc. | ©2013, Cognizant
  • 2. Agenda: (1) Delivering Business Value; (2) Barriers to Rapid Data Integration; (3) Rapid Data Integration and Curation Method
  • 3. We are at an Inflection Point at which Value is Created or Destroyed (Source: The Motley Fool)
  • 4. Delivering Information Faster Produces Direct, Measurable Business Value. What Difference Does One Day Make? A blockbuster drug generates $3M+ in revenue per day; a one-day delay in completing clinical trials can generate up to $500K in additional costs. Banking: A moderate-sized brokerage firm can generate up to $1M in financial services revenue per day.
  • 5. Barriers to Rapid Data Integration: Rework is expensive: must “get it right” from the start. Fit with the existing data; avoid data silos. Reconciling differences (data formats, coding, identifiers, etc.). Managing data quality (accuracy, precision, context). Knowledge acquisition takes time; new insights come from experimentation. Overcoming process inertia.
  • 6. Evolutionary Method to Data Integration and Curation. Responsive Data Approach: As new information flows into the enterprise, people and processes are dynamic in nature. Questions arising during this phase are “what to do” and “how to make the best sense of the new data source”. Rapid integration tools aid in quick prototyping and building solutions of value. Rapid Integration and Curation Method: The data is profiled and explored for value and quality issues. A rapid pruning exercise is undertaken by prototyping and integrating with in-house data to evaluate whether the data is fit for purpose; this helps formulate an effective approach for further phases. Managed: As we progress, issues with the new data are identified and managed. The main focus is on establishing data quality and adhering to enterprise standards and frameworks while building optimal integration approaches. The integration process is evolutionary, as further discoveries are made for optimal design. Evolutionary: Progressive build based on the new data. Building awareness of the new platform and fine-tuning the capabilities around the data source are the primary activities. Proactive: Data management evolves to a more refined state. A feedback loop is built to enable proactive decisions around data organization and access. Data integration is efficient and stable, with verifiable compliance and security. Integrated with the enterprise information management framework. Predictable: The services built around the new data sources are now managed. The focus is on evolution of business processes, based on managed models. Diagram labels: Information Management Approach; Time. Phases: Tactical (First 1-5 Days), Progressive (First 1-3 Months), Managed (After 3 Months).
  • 7. Leverage Insights and Expertise, Rapidly and Sustainably. Reuse Expertise: Identify and leverage existing, relevant data assets and expertise. Analyze: Ingest new data sources (light integration and curation). Realize Benefits: Monitor and measure use and benefits achieved; identify next set of priorities. Extend: Create and extend data relationships, leveraging insights from previous study cycles. Govern: Elevate proven data, relationships, and expertise to organization-wide definition. Refine: Capture insights from new data analysis cycles, refining relationships to support new analytics.
  • 8. Can You Help Me With Some Data?
  • 9. Rapid Data Integration and Curation Method: (1) Define Preliminary Objectives; (2) Profile the New Data; (3) Generate Initial Ontology for the New Data; (4) Generate Initial Ontology for the Existing Data (if necessary); (5) Integrate Entities over Common URIs; (6) Create URI Links; (7) Add Initial Data Quality Filters; (8) Analyze Data and Generate Feedback
  • 10. 1. Define Preliminary Objectives: (1) Discuss Functional and Timing Objectives, and Priorities. (2) Clarify Immediate, Short-Term, and Long-Term Business Value (SMART *): (a) Cost Reduction/Avoidance; (b) Meet Critical Customer Need. (3) Is This the Right Solution? (4) Set Expectations: (a) Evolutionary Process; (b) Initial Results Quickly; (c) Frequent, Active Participation; (d) Feedback Critical to Making Refinements. (5) Brainstorm Deliverables that Produce Business Benefits; Define a Few Sample Queries. (6) Ask for Commitment to Benefits Realization. (7) Start the Clock! * SMART: Specific, Measurable, Attainable, Realistic, and Traceable
  • 11. 2. Profile the New Data: Light Profiling, focusing on Understanding Key Data Elements Needed to Meet the First Deliverable. Identify Initial Data Filtering Candidates. Capture Insights about Key Data Relationships.
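The light profiling described on slide 11 might look something like the following minimal sketch. It is not taken from the deck: the `profile_column` helper, the sample rows, and the choice of statistics (missing count, distinct count, most common values) are all assumptions made for illustration.

```python
from collections import Counter

def profile_column(rows, column):
    """Return light profiling statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }

# Hypothetical sample of a newly received customer extract.
customers = [
    {"id": "C1", "zip": "07666", "segment": "retail"},
    {"id": "C2", "zip": "", "segment": "retail"},
    {"id": "C3", "zip": "10001", "segment": "broker"},
    {"id": "C4", "zip": "07666", "segment": None},
]

# Profile only the key elements needed for the first deliverable.
zip_profile = profile_column(customers, "zip")
segment_profile = profile_column(customers, "segment")
```

A column with many missing values, such as `zip` here, would be one of the "initial data filtering candidates" the slide mentions.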
  • 12. 3. Generate Initial Ontology for the New Data: Reverse-engineer Ontology from New Data. Load New Data into the RDF Store (or Create Link to the Data). Create Business-relevant Synonyms for High-Importance Attributes. Refinements will be made in Future Iterations.
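One way to picture the reverse-engineering step on slide 12 is to derive a class from a table name and a datatype property from each column, then attach business-relevant synonyms as labels. The sketch below is an assumption-laden illustration, not the tooling used in the deck: the `ex:` prefix, the `reverse_engineer_ontology` function, and the table and column names are hypothetical, and triples are kept as plain tuples of prefixed names rather than loaded into an RDF store.

```python
def reverse_engineer_ontology(table, columns, synonyms=None):
    """Derive a first-cut ontology (as triples) from a table's structure."""
    synonyms = synonyms or {}
    cls = f"ex:{table}"
    triples = [(cls, "rdf:type", "owl:Class")]
    for col in columns:
        prop = f"ex:{table}_{col}"
        # Each column becomes a datatype property whose domain is the table's class.
        triples.append((prop, "rdf:type", "owl:DatatypeProperty"))
        triples.append((prop, "rdfs:domain", cls))
        # High-importance attributes get a business-friendly synonym as a label.
        if col in synonyms:
            triples.append((prop, "rdfs:label", f'"{synonyms[col]}"'))
    return triples

# Hypothetical new data source: a Customer table with three columns.
ontology = reverse_engineer_ontology(
    "Customer",
    ["id", "zip", "segment"],
    synonyms={"zip": "Postal Code"},
)
```

The result is deliberately rough; as the slide notes, refinements are expected in later iterations.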
  • 13. 4. Generate Initial Ontology for the Existing Data (if necessary): Map Selected Entities and Critical Attributes for Existing Data Source(s) to the Source-specific Ontology. Add Reference to the Source-specific Ontology to the New Data Ontology. Refinements will be made in Future Iterations. New Data Ontology manages integration with Existing Data until the ontology is sufficiently mature to be promoted into an enterprise ontology. (Diagram labels: Existing Data; New Data.)
  • 14. 5. Integrate Entities over Common URIs: Different URIs, Separately Maintained. Focus on Key Entities. Equivalence Functions Logically Integrate the Federated Data. Reduces Query Complexity and Can Improve Query Performance.
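The deck does not say how its equivalence functions are implemented. As one possible reading, equivalence statements between separately maintained URIs can be collapsed into a single canonical URI per entity with a union-find structure, as in this sketch. The `canonical_uris` function, the `crm:`, `erp:`, and `web:` URIs, and the rule of picking the lexicographically smallest URI as canonical are all assumptions for illustration.

```python
def canonical_uris(same_as_pairs):
    """Map every URI to one canonical URI per equivalence group (union-find)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root so results are deterministic.
            low, high = sorted((root_a, root_b))
            parent[high] = low
    return {uri: find(uri) for uri in list(parent)}

# Hypothetical equivalences between three separately maintained sources.
pairs = [
    ("crm:cust/42", "erp:customer/0042"),
    ("erp:customer/0042", "web:user/jdoe"),
]
mapping = canonical_uris(pairs)

# Rewriting subjects to the canonical URI lets one query span all sources.
triples = [("web:user/jdoe", "ex:lastOrder", "2013-04-01")]
integrated = [(mapping.get(s, s), p, o) for s, p, o in triples]
```

Queries can then refer to one URI per entity instead of enumerating every source-specific identifier, which is the reduction in query complexity the slide points to.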
  • 15. 6. Create URI Links: The Data has Common Values that can be used in Join Operations, but Doesn’t have Links. Links Reduce Query Complexity and Can Improve Query Performance. Focus on Key Queries, Identify Complex or Time-Sensitive Joins. Add Linking URI Attribute to Dependent Entity. Amend Selected Queries to Leverage the New Link. (Diagram: Customer cust:ZipCode JOIN Geography geo:ZipCode, versus Customer cust:ZipCodeURI LINK Geography.)
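The join-versus-link contrast on slide 15 can be sketched with plain dictionaries standing in for an RDF store. The property names `cust:ZipCode`, `geo:ZipCode`, and `cust:ZipCodeURI` come from the slide; everything else (the entity URIs, the `geo:State` attribute, and the three helper functions) is hypothetical.

```python
# Hypothetical entities keyed by URI; dicts stand in for an RDF store.
customers = {
    "cust:C1": {"cust:ZipCode": "07666"},
    "cust:C2": {"cust:ZipCode": "10001"},
}
geography = {
    "geo:Z07666": {"geo:ZipCode": "07666", "geo:State": "NJ"},
    "geo:Z10001": {"geo:ZipCode": "10001", "geo:State": "NY"},
}

def state_by_join(customer_uri):
    """Value-based join: scan Geography for a matching zip code on every query."""
    zip_code = customers[customer_uri]["cust:ZipCode"]
    for attrs in geography.values():
        if attrs["geo:ZipCode"] == zip_code:
            return attrs["geo:State"]
    return None

def add_zip_links():
    """One-time step: add a linking URI attribute to the dependent entity."""
    index = {attrs["geo:ZipCode"]: uri for uri, attrs in geography.items()}
    for attrs in customers.values():
        attrs["cust:ZipCodeURI"] = index.get(attrs["cust:ZipCode"])

def state_by_link(customer_uri):
    """Amended query: follow the link directly instead of joining on values."""
    geo_uri = customers[customer_uri].get("cust:ZipCodeURI")
    return geography[geo_uri]["geo:State"] if geo_uri else None

add_zip_links()
joined = state_by_join("cust:C1")
linked = state_by_link("cust:C1")
```

Both queries return the same answer; the linked version replaces a scan with a direct lookup, which is why the slide suggests targeting complex or time-sensitive joins first.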
  • 16. 7. Add Initial Data Quality Filters and Transformations. Traditional Data Warehouse (Data Source A, Data Source B, Data Source C, ETL, Data Warehouse): Data Quality Happens Here, Data Quality Happens Here, And Here. JIT Data Quality Management, Everywhere that it is Needed (Existing Data, New Data): Data Filtering and Transformation Rules are Encoded in the Ontology. Focus is on Critical Data Quality Rules. Rule Updates are Automatically in Effect, without Reloading All of the Data.
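The idea that rule updates take effect without reloading data can be illustrated by applying filters and transformations at query time rather than at load time. In this sketch a Python dictionary of functions stands in for rules encoded in an ontology; that substitution, the `query` function, and the sample rows are assumptions, not the deck's implementation.

```python
# Raw data is stored once, exactly as received (hypothetical rows).
raw_rows = [
    {"id": "C1", "zip": " 07666 ", "revenue": "1200"},
    {"id": "C2", "zip": "10001", "revenue": "-50"},
    {"id": "C3", "zip": "", "revenue": "300"},
]

# Stand-in for rules encoded in the ontology: transformations, then filters.
rules = {
    "transforms": [lambda row: {**row, "zip": row["zip"].strip()}],
    "filters": [lambda row: row["zip"] != ""],
}

def query(rows, rules):
    """Apply the current rules just in time, leaving stored rows untouched."""
    for row in rows:
        for transform in rules["transforms"]:
            row = transform(row)
        if all(keep(row) for keep in rules["filters"]):
            yield row

before = [row["id"] for row in query(raw_rows, rules)]

# A new critical rule is added; no reload of raw_rows is needed.
rules["filters"].append(lambda row: float(row["revenue"]) >= 0)
after = [row["id"] for row in query(raw_rows, rules)]
```

Because the rules run on read, the second query reflects the new revenue rule immediately while `raw_rows` stays exactly as loaded.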
  • 17. 8. Analyze Data and Generate Feedback: Demonstrate Visualization using Sample Queries. Walk Through Available Data Sets and Data Organization. Experiment with Data Access and New Visualizations. Provide Next Steps Recommendations to Refine the Data Integration and Curation.
  • 18. Architectural Foundation for Rapid Data Integration and Curation: SPARQL-based Visualization; Relational-to-RDF Mapping; Data Profiling; Ontology Editor; Automated Ontology Generation; RDF Store Data Import; RDF Store
  • 19. Capabilities That We Have Introduced: Rapid Response to New Data Onboarding Needs. Process for Evolutionary Data Integration and Curation. Flexible Design that is Responsive to Business Changes. Foundation for Refinement and Expansion of Ontology Models from Fit-for-Purpose to Department, to Business Unit, to Enterprise.
  • 21. Thank you!
  • 22. Speaker: Thomas (Tom) Kelly, Practice Director, Enterprise Information Management, Cognizant. Thomas Kelly is a Director in Cognizant’s Enterprise Information Management (EIM) Practice and heads its Semantic Technology Center of Excellence, a technology specialty of Cognizant Business Consulting (CBC). He has 20-plus years of technology consulting experience in leading data warehousing, business intelligence and big data projects, focused primarily on the life sciences and healthcare industries. Tom can be reached at Thomas.Kelly@cognizant.com.