Watch full webinar here: https://goo.gl/2wNBhg
To grow or compete in today's fast-paced business environment, you need a robust, agile and cost-effective data-driven decision strategy.
However, many companies are struggling with the growing complexity of data integration projects as they try to manage the increasing volumes and types of data from traditional enterprise sources as well as new sources such as big data, machine data, social media or cloud sources.
Data virtualization is a technology that can simplify your data integration projects and reduce their cost.
Watch this webinar in which we explore:
• How data virtualization lets you provide the business with the information it needs to make better decisions faster.
• How you can connect and combine all your data in real-time, without compromising on scalability, security or governance.
2. What is Data Virtualization?
Data Virtualization is a technology solution that can combine data from disparate sources to deliver integrated information in real time to various consumers and applications in an agile and cost-effective way, without compromising on scalability, security or governance.
3. Data Analysis - Evolution
“By 2020, 35% of enterprise organizations will implement Data Virtualization in some form as a more forward-thinking option for Data Integration.”
- Gartner 2017
A unified view of data from diverse sources is integral to data insights.
Data Warehouses – Enabled a central repository of data, with ETL technologies paving the way for a unified view of business entities. Data replication, consistent business logic.
Data Federation – Capability to integrate data from disparate sources at runtime. No data replication, but strictly within the control of an admin. Limited by performance and to structured data.
Data Virtualization – Unified view of structured and unstructured data delivered at runtime without replication, with optimum performance and consistent with data governance policies.
Data Silos – Unified view of data created outside core systems. Data replication, diverse business logic.
4. Current State of BI Enterprise
[Architecture diagram: current-state enterprise BI landscape]
Cross-cutting: Metadata Repository, Data Lineage; Governance & Security; Data Reconciliation; Enterprise Logical Data Model
Data Access & Consumers: Reporting, Data Labs, ESB, Data Services, Enterprise Integration, LDAP, Web / Mobile Portal
Data Sources: Internal Data Sources, External Data Sources
Master Data Management
Data Quality: Enrichment, Standardization, Cleansing
Data Acquisition (Open Source or COTS): Streaming, CDC, ETL, Real-Time Quality Control
Data Platform: Operational Data Store (ODS), Enterprise Data Warehouse (EDW), Data Marts (DM), Storage and Archival
Data Propagation: Replication, File Transfer (FTP/SFTP), API (REST, SOAP), JDBC, ODBC, SQL, ETL
Data Lakes
5. KEY CHALLENGES
Data Integration Challenges
Lack of Flexibility
• Too many point-to-point integrations are implemented
• Information chain is too long
• Lack of clear strategy about how to integrate new types of data
• Lack of tools to integrate new types of data
Increase in Overall Cost
• Cost of implementation goes up due to environment, storage & implementation cost
• Change requests are equally costly and have a large cascading impact
Delayed Time to Market
• Slows down delivery timeline due to tight coupling of solutions
• Quality suffers due to application dependencies, which take time to stabilize
Data Replication
• Data is stored and duplicated too many times along the way, and every copy requires security, governance and regulatory controls
6. How Does Data Virtualization Fit In?
[Architecture diagram: enterprise BI landscape with a Data Virtualisation layer added]
Cross-cutting: Metadata Repository, Data Lineage; Governance and Security; Data Reconciliation; Enterprise Logical Data Model
Data Access & Consumers: Reporting, Data Labs, ESB, Data Services, Enterprise Integration, LDAP, Web / Mobile Portal
Data Sources: Internal Data Sources, External Data Sources
Master Data Management
Data Quality: Enrichment, Standardization, Cleansing
Data Acquisition & Integration (Open Source or COTS): Streaming, CDC, ETL, Real-Time Quality Control
Data Platform: Operational Data Store (ODS), EDW, Data Marts (DM), Storage and Archival
Data Virtualisation:
• Connect: native connectivity to disparate sources; normalized views
• Combine: transform & integrate views
• Publish: business / subject area specific views; Data as a Service
• Supporting functions: Monitoring and Role Based Access, Metadata, Scheduler, Optimizer, Dynamic Data Discovery
• Controls: Confidentiality, Security, Governance, Audit
Data Propagation: Replication, File Transfer (FTP/SFTP), API (REST, SOAP), JDBC, ODBC, SQL, ETL
Data Lakes
7. Key Capabilities:
• Logical abstraction provides flexibility and an agile solution
• Improved data federation
• Integration of structured and unstructured data
• Expose data as a service
• Embedded data management practices like governance & security
Data Virtualization: Capability Offering
“By 2019, organizations with data virtualization capabilities will spend 40% less on building and managing data integration processes, for connecting distributed data assets.”
- Gartner 2017
8. Industry Use Cases
BI & BIG DATA
• Semantic layer for analytics; self service analytics
• Logical data warehouse architecture (Bi-modal)
• Virtual or agile data marts
• Real-time dashboards
• Simplify data lake; data warehouse offloading
OPERATIONAL
• Virtual ODS
• Master data management
• Legacy system migration
• Application data access
EMERGING
• Cloud data sharing
• Data hub enablement
• Multi-structured data services (NoSQL)
9. ...but is this really true?
Data Virtualization: Myths and Pitfalls
“Data virtualization will be much slower than a persisted approach via ETL:
• There is a large amount of data moved through the network for each query
• Network transfer is slow”
- Charlie Assumption, Acme LTD
10. Data Virtualization: Myths and Pitfalls
Not as much data is moved as you might think!
Query Description | Returned Rows | Time Netezza | Time Denodo (Federated Oracle, Netezza & SQL Server) | Optimization Technique (automatically selected)
Total sales by customer | 1.99 M | 20.9 sec. | 21.4 sec. | Full aggregation push-down
Total sales by customer and year between 2000 and 2004 | 5.51 M | 52.3 sec. | 59.0 sec. | Full aggregation push-down
Total sales by item brand | 31.35 K | 4.7 sec. | 5.0 sec. | Partial aggregation push-down
Total sales by item where sale price less than current list price | 17.05 K | 3.5 sec. | 5.2 sec. | On the fly data movement
System | Execution Time | Data Transferred | Optimization Technique (Auto Selected)
Denodo | 9 sec. | 4 M | Aggregation push-down
Tableau | 125 sec. | 292 M | None: Full Scan
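The idea behind aggregation push-down can be illustrated with a minimal Python sketch. This is a generic illustration, not Denodo's optimizer: two in-memory SQLite databases stand in for remote sources, and the table name `sales`, the sample rows and the function names are all hypothetical. It counts how many rows cross the "network" when the virtualization layer pulls raw rows, compared with when each source computes its own partial totals first.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database standing in for one remote source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def totals_without_pushdown(sources):
    """Pull every raw row to the virtualization layer, then aggregate there."""
    totals, rows_moved = {}, 0
    for conn in sources:
        for customer, amount in conn.execute("SELECT customer, amount FROM sales"):
            rows_moved += 1
            totals[customer] = totals.get(customer, 0.0) + amount
    return totals, rows_moved


def totals_with_pushdown(sources):
    """Let each source run the GROUP BY, then merge the small partial results."""
    totals, rows_moved = {}, 0
    query = "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
    for conn in sources:
        for customer, partial_sum in conn.execute(query):
            rows_moved += 1
            totals[customer] = totals.get(customer, 0.0) + partial_sum
    return totals, rows_moved


# Hypothetical sample data spread over two sources.
source_a = make_source([("acme", 10.0), ("acme", 5.0), ("globex", 7.0), ("acme", 1.0)])
source_b = make_source([("globex", 3.0), ("globex", 2.0), ("initech", 4.0)])
sources = [source_a, source_b]

naive_totals, naive_rows = totals_without_pushdown(sources)
pushed_totals, pushed_rows = totals_with_pushdown(sources)

print("rows moved without push-down:", naive_rows)
print("rows moved with push-down:", pushed_rows)
print("same result:", naive_totals == pushed_totals)
```

Both strategies return the same totals, but the pushed-down version transfers one row per customer per source instead of one row per sale, which is why the gap widens as the fact tables grow.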
11. Real-time Data Integration
Denodo – Data Virtualization Technology Architecture
“Data virtualization integrates disparate data sources in real time or near-real time to meet demands for analytics and transactional data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
[Architecture diagram]
1. Connect to disparate data sources
2. Combine related data into views
3. Consume in business applications
DATA CONSUMERS (analytical and operational): Enterprise Applications, Reporting, BI, Portals, ESB, Mobile, Web, Users
Publishing interfaces: Multiple Protocols, Formats; Query, Search, Browse; Request/Reply, Event Driven; Secure Delivery
CONSUME: Share, Deliver, Publish, Govern, Collaborate
COMBINE: Discover, Transform, Prepare, Improve Quality, Integrate
CONNECT: Normalized views of disparate data
Source interfaces: SQL, MDX; Web Services; Big Data APIs; Web Automation and Indexing
DISPARATE DATA SOURCES (more structured to less structured): Databases & Warehouses, Cloud/SaaS Applications, Big Data, NoSQL, Web, XML, Excel, PDF, Word...
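The connect, combine and consume steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Denodo is implemented: the two sources (an in-memory SQLite database and a CSV export), the sample records and names such as `customer_orders_view` are hypothetical.

```python
import csv
import io
import sqlite3

# CONNECT: two hypothetical sources, a relational database and a CSV export.
crm_db = sqlite3.connect(":memory:")
crm_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm_db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

ORDERS_CSV = "customer_id,total\n1,120.50\n2,80.00\n1,30.25\n"


def read_customers():
    """Fetch customers from the relational source at query time."""
    return {cid: name for cid, name in crm_db.execute("SELECT id, name FROM customers")}


def read_orders():
    """Parse orders from the CSV source at query time."""
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        yield int(row["customer_id"]), float(row["total"])


# COMBINE: a virtual view that joins both sources on demand.
def customer_orders_view():
    """Yield joined records; nothing is copied into a separate store."""
    names = read_customers()
    for customer_id, total in read_orders():
        yield {"customer": names[customer_id], "total": total}


# CONSUME: a report reads the view like any other data set.
report = list(customer_orders_view())
print(report)

# Because the view holds no copy, a change in a source shows up on the next read.
crm_db.execute("UPDATE customers SET name = 'Acme Ltd' WHERE id = 1")
refreshed = list(customer_orders_view())
print(refreshed)
```

The view is evaluated each time it is read, so consumers see current source data without a replication step. A real platform would add query optimization, caching, security and governance on top of this basic pattern.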
12. Why Customers Choose Denodo
Data Virtualization Success Factors
Ease of use, Performance, Flexibility, Enterprise Governance
13. The Benefits of Data Virtualization
“Get it Real-time and Get it Fast!”
• Complete enterprise information, combining Web, cloud, streaming, and structured data
• ROI realization within 6 months, with the flexibility to adjust to unforeseen changes
• 80% reduction in integration costs, in terms of resources and technology
• Real-time integration and data access, enabling faster business decisions
Deliver data-centricity across the business
• Innovate through big data, adding new sources for enterprise use, advanced analytics
Increase operational efficiencies & reduce costs
• Reduce costs and complexity, minimize data replication, foster data reusability and collaboration
Enable business agility
• Provide an enterprise data marketplace and enable self-service
Tame the “data mess”
• Unify the diverse universe of data assets and help enforce enterprise data policies
14. Case Study – UK Building Society
“We were very pleased with the speed of delivery and proactivity of the Mastek team in collaborating with us to complete the build.”
- UK building society
Business Objective:
Simplify and optimise regulatory reporting process
Challenges:
• Lack of information agility, hampering response time
• Multiple, difficult-to-integrate data sources
• Limited data governance and lineage
• Legacy information architecture
• Limited internal resources
Benefits:
• 40% faster development
• 25% faster processing of data
• Simplified integrated view of data
• Secure data access
• Simplified integration with enterprise systems
15. Data Virtualization - Services and Assets
SERVICES
• Identifying potential DV use cases with enterprise capability
• Tools/technology selection matrix for DV implementation
• Blueprint and roadmap delivery with optimised ROI
• Delivering the transition & target architecture
• End to end project implementation
• Production support
• Vendor management
ASSETS
Data Virtualization (DV) Jumpstart Kit
• DV reference architecture & fitment to use case
• Best practices and data standard guidelines
• Checklist and recommendations
• Tools comparison and evaluation framework