
The Transformation of your Data in modern IT (Presented by DellEMC)


Organizations have a wealth of data contained within their existing infrastructures. At DellEMC we're helping customers remove the barriers of legacy datastores and transform the customer experience in the modern datacentre. Learn how to unshackle the valuable data inside your existing data warehouse and leverage new techniques, applications and technology to enhance the financial impact of all your data sources.

Published in: Software

  1. The Transformation of your Data in modern IT. Jeff Wiggins, Technical Manager, Emerging Technology Division
  2. © Copyright 2016 Dell Inc. ALL ORGANISATIONS ARE ON A JOURNEY TO… 1000X MORE DATA; REAL TIME OPERATION; ANALYTIC INSIGHTS; PERSONALISATION & ENHANCED SERVICES
  3. THE JOURNEY TO DIGITAL BREAKS TRADITIONAL IT INFRASTRUCTURE. Figure labels: Gartner IT Budget Growth; TRADITIONAL DATA; NEW DATA SOURCES (Clickstream, Geolocation, Web Data, Internet of Things, Docs and emails, Server logs)
  4. Challenges with Enterprise Data Warehouses: (1) Expensive storage: 70% of data in a typical EDW is unused. (2) Expensive processing: on average, 55% of EDW CPU utilisation is low-value ETL. (3) Expensive licensing… (4) New data sources: traditional systems are unable to capture and use new data sources, such as unstructured or semi-structured data.
  5. COST DRIVERS. Workload: OPERATIONS 50%, ANALYTICS 20%, ETL/ELT 30%. Data: COLD DATA 70%, HOT DATA 30%. Cost comparison: ENTERPRISE DATA WAREHOUSE (> $16 K per TB) vs. HADOOP WITH ENTERPRISE GRADE STORAGE SOLUTION (< $1 K per TB), used for ETL/ELT OFFLOAD and ACTIVE ARCHIVE.
  6. CHALLENGES WITH EXISTING EDW INFRASTRUCTURE: (1) Throw data away. (2) Waste capacity on low value workloads. (3) Unable to leverage new data sources.
  7. DATA ARCHITECTURE OPTIMISATION WITH HADOOP: (1) Don't throw data away. (2) Reclaim Enterprise Data Warehouse for high value BI. (3) Leverage new data sources.
  8. Enterprise Data Hub. Enterprise-Grade Hadoop: Must-Haves. (1) Open Architecture: open source platform; APIs & engines for multiple workloads; extensible for 3rd parties. (2) Secure & Compliant: robust access controls; data encryption options; shared security policies. (3) Enterprise Data Governance: metadata management; data lineage/tethering; audit histories. (4) Unified & manageable: common storage & resource management; on-prem, cloud & managed service; highly available (including DR). Diagram labels: Resource Management; Online NoSQL DBMS; Analytic MPP DBMS; Search Engine; Batch Processing; Stream Processing; Machine Learning; SQL; Streaming; File System; System Management; Data Management; Metadata, Security, Audit, Lineage.
  9. ENTERPRISE DATAHUB: A PROGRESSION. Sources: EDWs, Marts, Storage, Search, Servers, Documents, Archives; ERP, CRM, RDBMS, Machines; Files, Images, Video, Logs, Clickstreams; External Data Sources. (1) Active archive: full fidelity original data; indefinite time, any source; lowest cost storage. (2) Data management, transformations: one source of data for all analytics; persisted state of transformed data; significantly faster & cheaper. (3) Self-service exploratory BI: simple search + BI tools; "schema on read" agility; reduce BI user backlog requests. (4) Multi-workload analytic platform: bring applications to data; combine different workloads on common data (e.g. SQL + Search); true BI agility.
  10. SAMPLE PROBLEM SCENARIO. Albert has an existing large Enterprise Data Warehouse infrastructure. With rapid growth in data volume, he needs to add 500 TB of capacity to it. At an average cost of $13,000 per TB of EDW storage, the expansion is estimated to cost $6.5 million. ALBERT wants to: optimise the existing data infrastructure spend; enable analytics on all data, structured and unstructured; lay a solid foundation for Self-Service BI. Chart labels: EDW Cost; 2013, 2014, 2015, 2016; 6.5M.
  11. DATA SOLUTIONS FOR EDW MODERNISATION. Diagram labels: EXISTING SOURCES (ERP, CRM); NEW SOURCES (Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs); HADOOP CORE; DATA SERVICES; OPERATIONAL SERVICES; Data Management; Advanced Application; ETL; OFFLOAD; Business Analytics; Visualization & Dashboards; IT Applications. Five numbered callouts: ETL/ELT OFFLOAD; ACTIVE ARCHIVE; ENRICH WITH NEW DATA TYPES; MULTI-PROTOCOL ACCESS (NFS, SMB, HTTP, Swift); ENTERPRISE-GRADE DATA MANAGEMENT. Legend: New Data Flow; Current Data Flow.
  12. ENTERPRISE EVOLUTION PROCESS. Typical evolution process (every customer journey is different). COST DRIVERS and REVENUE DRIVERS: Enterprise Data Warehouse is processing limited; Enterprise Data Warehouse is capacity limited; need to add new data source types. HADOOP WITH ENTERPRISE GRADE STORAGE SOLUTION: ETL/ELT OFFLOAD; ACTIVE ARCHIVE; ENRICH WITH NEW DATA TYPES.
  13. DATA SILO CONSOLIDATION
  14. DATA SILO CONSOLIDATION. Workloads: Home Directories & File Shares; Surveillance; Next-Gen Application; Hadoop & Analytics; Transaction Logs; BLOBS; EDW; Content Shares; Marketing; M&E; Social & Next-Gen; Archive & Backup Target; Data Monetization; Design, Test & Manufacture; Application Test.
  15. DATA SILO CONSOLIDATION (same workloads as slide 14)
  16. DATA SILO CONSOLIDATION: DATA LAKE (same workloads as slide 14)
  17. DATA LAKE: SCALE-OUT; SINGLE REPOSITORY; IN-PLACE ANALYTICS; MULTI-PROTOCOL / WORKLOAD; TIERS; ENTERPRISE FEATURES; MANAGE PBs
  18. LOADING DATA WITH SQOOP… Diagram labels: MySQL; Sqoop (Batch); HDFS; Hive. Sqoop can do bidirectional transfers between JDBC-compliant stores and Isilon HDFS. Command shown: sqoop import --verbose --connect 'jdbc:mysql://localhost/people' --table persons --username root --hcatalog-table persons --hcatalog-storage-stanza "stored as orc" -m 1 --create-hcatalog-table --driver com.mysql.jdbc.Driver
  19. HIVE – ONE TOOL FOR MANY SQL USE CASES… Sources: OLTP, ERP, CRM Systems; Unstructured documents, emails; Clickstream; Server logs; Social Media/Web Data; Sensor, Machine Data; Geolocation. Hive (SQL) use cases: Interactive Analytics; Batch Reports / Deep Analytics; ETL / ELT. Compute and Isilon HDFS storage scale independently as needed. Diagram labels: Processed; HiveQL; Interactive; Hive Server.
  20. DELL EMC AT SCALE HIVE ARCHITECTURE. Client: beeline, Hive View, Zeppelin, BI of Choice. Interpreter: Hive Server 2 (compile, optimize, execute). Schema definitions: Hive MetaStore (database; Table 1, Partition 1; Table 2, Partition 2). Distribution Engine: TEZ / MR. Data Storage: data in Isilon HDFS (structured, unstructured, semi-structured). Query flow: Hive parses and plans the query; the query is converted to MR/TEZ; MR or TEZ is run by Hadoop.
  21. (1) Active Archive: optimise EDW storage by archiving cold data while still analysing it as needed. (2) ETL Offload: improve EDW performance by offloading ETL processing to Hadoop. (3) Semi/Unstructured Data Analytics: increase confidence in business decisions with new data sources. (4) Multi-protocol Access: enable seamless in-place access using NFS, SMB, HTTP, Swift, FTP, … (5) Scale storage & compute independently: virtualise Hadoop. (6) Data Management: enterprise-grade data management at Hadoop economics.
  22. Dell EMC SOLUTION ACCELERATORS: PROVIDING DELIVERY CERTAINTY AND IMPROVING TIME TO VALUE. Stages: INGEST; STORE; ANALYZE; SURFACE; ACT. CAPTURE AND STORE (Source Systems, Data Lake Storage): documented procedures to use Open Source tools. MODEL AND REFINE (Develop & Refine Analytical Models): library of analytical models and algorithms; industry focused; use case focused. VISUALIZE (COTS and Custom App Integration): rapid implementation of applications; knowledge exchange of custom integration projects; documented best practices.
  23. UNDECIDED? BIG DATA VISION WORKSHOP: IDENTIFY YOUR OPPORTUNITY. Align Business & IT Around Big Data; Identify Opportunities for Big Data Analytics; Demonstrate Data Science Possibilities; Prioritize Use Cases by Feasibility and Value; Recommendation & Roadmap.
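
The arithmetic behind the sample problem scenario (slide 10) can be checked with a short sketch. The $13,000 per TB and 500 TB figures come from slide 10; the 70% cold-data share and the "< $1 K per TB" Hadoop price come from slide 5. Applying the cold-data share to the expansion, and using $1,000 as the Hadoop price, are illustrative assumptions and not claims made in the deck. The function names are hypothetical.

```python
# Figures quoted on the slides.
EDW_COST_PER_TB = 13_000     # slide 10: average EDW storage cost, USD per TB
HADOOP_COST_PER_TB = 1_000   # slide 5 says "< $1 K per TB"; upper bound assumed here
COLD_DATA_PCT = 70           # slide 5: share of EDW data that is cold


def expansion_cost(capacity_tb, cost_per_tb):
    """Cost of adding capacity_tb terabytes at a flat per-TB price."""
    return capacity_tb * cost_per_tb


def tiered_expansion_cost(capacity_tb, cold_pct=COLD_DATA_PCT):
    """Hypothetical split: cold data goes to Hadoop, hot data stays on the EDW."""
    cold_tb = capacity_tb * cold_pct / 100
    hot_tb = capacity_tb - cold_tb
    return (expansion_cost(hot_tb, EDW_COST_PER_TB)
            + expansion_cost(cold_tb, HADOOP_COST_PER_TB))


edw_only = expansion_cost(500, EDW_COST_PER_TB)
tiered = tiered_expansion_cost(500)
print(f"EDW only: ${edw_only:,.0f}")   # matches the $6.5 million on slide 10
print(f"Tiered:   ${tiered:,.0f}")     # illustrative only, under the assumptions above
```

The first figure reproduces the slide's $6.5 million estimate. The second shows how the per-TB gap on slide 5 would affect the same 500 TB if the cold-data share held for new capacity, which the deck does not state.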
