Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise of Data Lakes"

Dr. Christian Kurze, Principal Sales Engineer DACH at Denodo Technologies GmbH

  1. Data Virtualization: Fulfilling the Promise of Data Lakes. Dr. Christian Kurze, Principal Sales Engineer – DACH. ckurze@denodo.com, heiko.klarl@xdi360.com
  2. Agenda: key questions I want to answer today
     - What is Data Virtualization?
     - How to leverage Hadoop Data Lakes to support Internet of Things / Operational Data Store / Offloading / … use cases?
     - How to query Hadoop Data Lakes combined with any other structured, semi-structured and unstructured data sources using a single logical data lake? What about Cloud?
     - How to avoid Data Swamps via a lightweight data governance approach that helps enterprises maximize the value of their Data Lake?
     - How to use a logical data lake/data warehouse to prevent a physical data lake from becoming a silo?
  3. Status Quo: Data Integration
     - The requirement: access to all information for Marketing, Sales, Executive and Support
       - Access to complete information
       - … in an economically meaningful way
       - … in real time and in high quality, incl. monitoring, security and audit
       - Diagram labels: Cross-sell / Up-sell, Channel, Warranty, Product, Customer
     - … versus the current architecture
       - Sources shown: Database, Apps, Warehouse, Cloud, Big Data, Documents, Apps, NoSQL
       - Manual access to legacy systems and constantly new technologies (IoT, Big Data, Cloud)
       - Point-to-point connections
       - Projects too slow for new initiatives, from disparate silos and technologies
  4. Status Quo: Data Integration (same requirement and architecture as slide 3)
     - "My architecture works fine, but I am not able to access all my silos." (Enterprise Data Architect)
       - Different locations
       - Different technologies
       - Different data structures
       - Datasets too large to move
       - Different APIs and access methods
       - Excessive use of ETL to copy data
       - Synchronization issues
  5. The Solution: Data Virtualization as a Data Abstraction Layer
     - Central repository to access all data
     - Abstracts the underlying technology of the data sources
     - Enables the definition of a semantic data model
     - Offers a metadata-rich catalog
     - Multiple access methods:
       - SQL based
       - Keyword-based search (via index)
       - RESTful navigation (hyperlinks)
     - Native support for nested document structures (XML, JSON, …)
  6. Modelling in a Data Virtualization Solution
     - Layers: Sources; Base View (Source Abstraction); Combine, Transform & Integrate; Publish
     - Sources: Hadoop, Web Site, REST Web Service, Multidimensional, Salesforce, SQL Server, Oracle
     - View labels in the diagram: Client Address Client Type Company Invoicing Service Usage Product Logs Web Incidents Customer Invoice Product Customer Invoicing Service Usage Incident
     - Publish via SQL, SOAP, REST, ODATA, Message Queues (JMS), etc., and Denodo's Information Self Service
     - Independent of the access method, all views use the same metadata and access privileges
  7. Common Data Virtualization Use Cases
     - Big Data, Cloud Integration: Advanced Analytics; Data Warehouse Offloading; Big Data for Enterprise; Cloud / SaaS Integration
     - Agile Business Intelligence: Logical Data Warehouse; Virtual Data Marts; Self-Service BI; Operational BI / Analytics
     - Single View Applications: Single Customer View (Call Centers, Portals); Single Product View (Catalogs); Single Inventory View (Inventory Reconciliation); Vertical Specific (Single View of Wells)
     - Data Services: Unified Data Services Layer; Logical Data Abstraction; Agile Application Development; Linked Data Services
  8. Multiple platforms optimized for different workloads
     - Additionally, in a hybrid environment: on-premises vs. cloud
     - Platform categories: Streams; Advanced Analytics (multiple structures); Advanced Analytics (structured); DWH & Marts; MDM
     - Systems shown: NoSQL / Graph DB; Data Lake (Hadoop / Spark / Hive / …); DW Appliance; EDW; Mart; C R U D; Cust; Prod
     - Workloads shown: real-time stream processing & decision management; graph analysis; investigative analysis, data refinery; data mining, model development; traditional query, reporting & analysis; governed context information
  9. Business requires a combination of data
     - Sources shown: MDM (C R U D; Cust, Prod), Hadoop, EDW
     - Who are our customers?
     - What products do we sell?
     - What are the most popular navigational paths through our web site that led to high-fee products?
     - Who are our most loyal, low-risk customers that generate low fees?
     - What is the online behavior of our loyal, low-risk, low-fee customers so that we can offer them higher-fee products?
     - Where do I find this data? How to combine this data? How to share it with my colleagues? What about their access privileges?
  10. Big Data Connectivity: Big Data and Cloud Databases Connectivity
      - Hadoop ecosystem:
        - SQL on Hadoop: Hive, Impala, Presto, …
        - HDFS, Parquet, Avro, CSV, …
        - Execution of map/reduce jobs
        - Certified with major Hadoop distributions
      - In-memory platforms: Apache Spark SQL, Presto DB, HANA, …
      - Parallel DWs and appliances: Vertica, Impala, Teradata, Greenplum, …
      - Cloud RDBMS: Redshift, Snowflake, DynamoDB, …
      - NoSQL (MongoDB, CouchDB, Neo4J, Redis, Oracle NoSQL, Cassandra, etc.)
      - Streaming data (Spark streams, Splunk, IBM Streams, Kafka, …)
      - Enhanced adapters for the Big Data ecosystem: delimited text files, sequence files, map files, Avro files
  11. How to provide access by multiple tools and technologies?
      - Sources: DWH, MDM, Hadoop, Appliances, NoSQL, External Services
      - Consumers: Excel / MS BI, Tableau, Power BI, Composite Desktop 360 Views Cockpit, Other Applications
      - Complex security policies? RBAC?
      - Single Sign On (Kerberos)
      - Governance / Audit
      - Fast prototyping?
      - Automated processes?
      - Manual development of a service layer?
      - Source changes
      - New attributes and requirements
      - Accounting of source usage (cloud migration pending)
      - Refactoring of sources
      - New sources
  12. A Single Governed Logical Data Lake
      - Data Virtualization combines one or more physical data lakes with other enterprise data to create a "virtual" or "logical" data lake.
      - Diagram labels: Marketing, Research, Finance; Data Lakes; Other Data Sources; MDM; Cloud Apps; Logical Data Lake; Self-Service Analytics; Operational Apps; BI/Analytical Tools; Excel Reports
      - Data Virtualization layer: Semantic Model, Data Discovery, Metadata Catalog, Security, Governance
      - Denodo Platform bridges distinct data architectures:
        - Simplified architecture
        - Single point of access
        - Lower TCO
        - Lower operational costs
        - Improved agility
        - Improved flexibility
        - Consistency and integrity for multiple tools
  13. Information Self Service: E/R diagram
      1. Click on a view to navigate to the details
      2. Hover on the arrows to show the details of the PK-FK relationships
  14. Information Self Service: Browse Metadata Catalog
      1. Browse and search virtual databases
      2. Browse and search available views
      3. Review metadata and descriptions
      4. Query the view
  15. Information Self Service: Search Metadata Catalog
      1. Full-text search within view metadata (name, column names, descriptions)
      2. Show additional view information and query data
  16. Information Self Service: Querying Data
      1. Access to the Denodo catalog
      2. Query and filter for data
      3. Click on the green arrows to drill down into related information
  17. Information Self Service: Data Lineage
      1. Select Data Lineage for the view
      2. Select a column to see its lineage
      3. Hover and click the icons to see details
  18. Telematics & Predictive Maintenance: Leading Construction Manufacturer
      - Diagram labels: Dealer Maintenance Parts Inventory; OSI PI; Hadoop Cluster
      - Tableau: Dealer / Customer Dashboard
  19. Business Benefits
      - Improved asset performance and proactive maintenance.
      - Reduced warranty costs due to proactive maintenance of parts, preventing parts failure.
      - Optimized pricing for services and parts among global service providers.
      - New business model opportunities based on real-time analysis of detailed sensor data.
  20. How can I get started?
      - Read the new whitepaper by Rick F. Van der Lans, "Developing a Bimodal Logical Data Warehouse Architecture Using Data Virtualization". Register at: http://bit.ly/2frs782
      - Get started today! Download Denodo Express: www.denodoexpress.com
      - Access Denodo on AWS: www.denodo.com/en/denodo-platform/denodo-platform-for-aws
  21. www.denodo.com, info@denodo.com. © Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
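
Sketch for slide 6 (modelling): the layering of sources, base views and a combined, published view can be illustrated with plain SQL views. This is a minimal sketch using Python's built-in `sqlite3`, not Denodo's API. The two in-memory databases stand in for separate source systems, and all table, view and column names (`crm_client`, `bv_client`, `customer_invoicing`, and so on) are hypothetical.

```python
import sqlite3

# Two in-memory databases stand in for two separate source systems (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

# Source layer: physical tables in each system.
conn.execute("CREATE TABLE main.crm_client (client_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE billing.invoice (client_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO main.crm_client VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO billing.invoice VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 75.0)])

# Base views: one per source object, hiding where the data physically lives.
# TEMP views are used because SQLite only lets temporary views span databases.
conn.execute("CREATE TEMP VIEW bv_client AS "
             "SELECT client_id, name FROM main.crm_client")
conn.execute("CREATE TEMP VIEW bv_invoice AS "
             "SELECT client_id, amount FROM billing.invoice")

# Combine/transform layer: a derived view defined only in terms of base views.
conn.execute("""
    CREATE TEMP VIEW customer_invoicing AS
    SELECT c.name AS customer, SUM(i.amount) AS total_invoiced
    FROM bv_client AS c
    JOIN bv_invoice AS i ON i.client_id = c.client_id
    GROUP BY c.name
""")

def query_customer_invoicing():
    """Publish layer: consumers query the derived view, never the sources."""
    sql = ("SELECT customer, total_invoiced "
           "FROM customer_invoicing ORDER BY customer")
    return conn.execute(sql).fetchall()
```

No data is copied in this sketch: each call to `query_customer_invoicing()` reads the underlying tables at query time, which is the property the slide attributes to the virtual layer.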
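
Sketch for slide 11 (security policies, RBAC): one way to read the slide is that access rules are attached to the published view once, instead of being re-implemented per tool. The following is a minimal, hypothetical illustration in plain Python. `ViewPolicy`, `read_view` and the role names are assumptions made for this example and do not describe Denodo's security model.

```python
from dataclasses import dataclass, field

@dataclass
class ViewPolicy:
    """Hypothetical access policy attached to one published view."""
    allowed_roles: set
    # Maps a role to the columns that role may not see in clear text.
    masked_columns: dict = field(default_factory=dict)

def read_view(rows, policy, role):
    """Return the view's rows as seen by `role`, or raise if access is denied."""
    if role not in policy.allowed_roles:
        raise PermissionError(f"role {role!r} may not read this view")
    hidden = policy.masked_columns.get(role, set())
    return [
        {col: ("***" if col in hidden else value) for col, value in row.items()}
        for row in rows
    ]

# Example data and policy (illustrative only).
customer_rows = [{"customer": "Acme", "credit_limit": 5000}]
customer_policy = ViewPolicy(
    allowed_roles={"sales", "support"},
    masked_columns={"support": {"credit_limit"}},
)
```

Because every consumer goes through `read_view`, a BI tool and a REST client acting under the same role would see the same masked result.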
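
Sketch for slide 15 (search metadata catalog): the slide describes full-text search over view names, column names and descriptions. The toy version below uses a case-insensitive substring match rather than a real full-text index, and `ViewMetadata` and `search_catalog` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ViewMetadata:
    """Hypothetical catalog entry for one view."""
    name: str
    columns: tuple
    description: str

def search_catalog(catalog, term):
    """Return names of views whose name, columns or description contain `term`."""
    needle = term.lower()
    hits = []
    for view in catalog:
        # Concatenate all searchable metadata fields into one lowercase string.
        haystack = " ".join((view.name, *view.columns, view.description)).lower()
        if needle in haystack:
            hits.append(view.name)
    return hits

# Illustrative catalog content.
catalog = [
    ViewMetadata("customer", ("customer_id", "name"), "Master data for customers"),
    ViewMetadata("invoice", ("invoice_id", "customer_id", "amount"), "Issued invoices"),
]
```

Searching for a column name such as `amount` finds the `invoice` view even though the term appears in neither the view name nor its description.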
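
Sketch for slide 17 (data lineage): column-level lineage can be thought of as a graph from each view column to the columns it is derived from. The following minimal example walks such a graph back to the physical source columns. The `LINEAGE` mapping and every name in it are assumptions for illustration, not metadata exported from any real catalog.

```python
# Maps (view, column) to the (view, column) pairs it is derived from.
# Entries that do not appear as keys are treated as physical source columns.
LINEAGE = {
    ("customer_invoicing", "total_invoiced"): [("bv_invoice", "amount")],
    ("customer_invoicing", "customer"): [("bv_client", "name")],
    ("bv_invoice", "amount"): [("billing.invoice", "amount")],
    ("bv_client", "name"): [("crm.client", "name")],
}

def trace_lineage(view, column):
    """Return the sorted source columns that ultimately feed (view, column)."""
    stack = [(view, column)]
    seen = set()
    sources = []
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # Guard against revisiting shared or cyclic dependencies.
        seen.add(node)
        parents = LINEAGE.get(node, [])
        if not parents:
            sources.append(node)  # No upstream entry: this is a source column.
        stack.extend(parents)
    return sorted(sources)
```

A column that has no upstream entry traces to itself, so calling `trace_lineage` on a physical column simply returns that column.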
