Big Data Management: What's New, What's Different, and What You Need To Know

This presentation is from a recorded webinar in which 451 Research analyst Matt Aslett discusses the growing importance of sound data management practices and techniques for delivering on the promise of big data in the enterprise. Matt reviews the big data landscape, how the data lake complements and competes with the data warehouse, and key takeaways for moving from big data test and development environments to production. You can watch the webinar here: http://bit.ly/25ShiQu

Published in: Data & Analytics

  1. Big Data Management: What’s New, What’s Different and What You Need to Know
  2. Today’s Featured Presenter: Matt Aslett, Research Director, Data Platforms and Analytics, 451 Research • As Research Director, Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality, data management, analytics, and advanced analytics. Matt's own primary area of focus includes data management, reporting and analytics, and exploring how the various data platform and analytics technology sectors are converging in the form of next-generation data platforms.
  3. Agenda • Big Data Management – Matt Aslett, 451 Research • SnapLogic Overview • SnapLogic Demonstration – Ravi Dharnikota, Head of SnapLogic Enterprise Architecture • Q&A
  4. Copyright (C) 2016 451 Research LLC • Big Data Management • Matt Aslett, Research Director
  5. 451 Research is a leading IT research & advisory company • Founded in 2000 • 250+ employees, including over 100 analysts • 1,000+ clients: Technology & Service providers, corporate advisory, finance, professional services, and IT decision makers • 50,000+ IT professionals, business users and consumers in our research community • Over 52 million data points published each quarter and 4,500+ reports published each year • 2,000+ technology & service providers under coverage • 451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group • Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia • Research & Data | Advisory | Events | Go 2 Market
  8. Big data and beyond • V is for various things… but does not define big data • To understand the trends driving ‘big data’, 451 Research looked beyond the nature of the data to what enterprises wanted to do with it • Totality – storing and processing all data (or as much as is economically viable) • Exploration – schema-free approaches to analyzing data to identify new patterns • Frequency – more frequent analysis of data to enable real-time decision making
  11. ‘Big data’ is primarily driven by economics, not data • “Big data is what happened when the cost of keeping information became less than the cost of throwing it away.” (George Dyson) • ‘Big Data’ is the realization of competitive advantage based on the fact that it is now more economically feasible to store and process data that was previously ignored due to the cost and functional limitations of traditional data management technologies in handling its volume, velocity and variety • Moved from storing 1% of data for 60 days in an enterprise data warehouse (EDW) @ $100,000/TB • To 100% of data for a year in Hadoop @ $900/TB
  13. The evolution of enterprise analytics (Source: 451 Research, Total Data Analytics 2016) • REPORTING - What happened? • ANALYSIS - Why did it happen? • PRESCRIPTIVE - Influence what happens • DESCRIPTIVE - What is happening? • PREDICTIVE - What will happen? • Techniques: VISUALIZATION | STATISTICAL MODELING | MACHINE LEARNING • Complexity: IT-driven | User-driven | Automated • Data sources: Structured, RDBMS, historical • Data sources: Multi-structured, RDBMS, Hadoop, NoSQL, stream processing, historical and real-time
  14. EDW vs Hadoop (Schema-on-write vs schema-on-read) • Source: https://www.flickr.com/photos/wbaiv/16510090506/ • Source: https://www.flickr.com/photos/notbrucelee/5696238930/
  15. Schema-on-write • Pre-prepared • Single-purpose • Some assembly required • Inflexible • Source: https://www.flickr.com/photos/wbaiv/16510090506/
  16. Schema-on-read • Flexible • Reusable • Some imagination required* • Multi-purpose • *Instructions available if desired • Source: https://www.flickr.com/photos/notbrucelee/5696238930/
  17. Hadoop-based data lakes • The concept of the data lake has taken off in recent years, with the Apache Hadoop data-processing framework serving as the unified repository into which raw data is landed from multiple sources and made available to multiple users for multiple purposes. • Photo: Myrabella / Wikimedia Commons, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=11263585
  18. Hadoop-based data lakes • The concept of the data lake has taken off in recent years, with the Apache Hadoop data-processing framework serving as the unified repository into which raw data is landed from multiple sources and made available to multiple users for multiple purposes. • Beware the data swamp • Photo: https://www.flickr.com/photos/lofink/4501610335/
  19. Data governance, data preparation and the data lake • Data needs to be filtered, processed, treated and managed to make it suitable for multiple analytics use cases. • Data governance: Data catalog, Data security, Data lineage, Data inventory, Data quality, Data pipelines • Data preparation: Data discovery, Data cleansing, Data harmonization, Data enrichment, Data matching, Collaboration
  20. Data governance, data preparation and the data lake (diagram labels) • DATA-AS-A-SERVICE | PARTNERS | SUPPLIERS | APPLICATIONS • IT | DATA LAKE • SELF-SERVICE DATA PREPARATION • DATA GOVERNANCE: Data lineage, Data inventory, Data catalog, Data security, Data quality, Data pipelines • DATA STEWARDS: Data cleansing, Data harmonization, Data discovery, Collaboration, Data matching, Data enrichment • ADVANCED ANALYTICS: DATA SCIENTISTS • SELF-SERVICE ANALYTICS: SENIOR EXECUTIVES, BUSINESS ANALYSTS, DATA ANALYSTS
  21. Hadoop and other animals
  22. Recommendations • Enterprises should seriously consider the data governance and management requirements before embarking on data lake projects to ensure that the functionality is available to turn the concept into reality. • For flexibility and agility, employ data management approaches and technologies that abstract data processing pipelines from the execution environment. • Look for data integration and transformation technologies that execute natively, taking advantage of the underlying engine (e.g. Spark, YARN). • Seek out data management and integration technologies that enable consumption and transformation of large volumes of structured and unstructured data.
  23. Thank You! • matthew.aslett@451research.com • @maslett • www.451research.com
  24. SnapLogic Elastic Integration • Accelerate Your Integration. Accelerate Your Business • “We can do more in two hours with SnapLogic than we could in two days with traditional solutions.”
  25. Big Data and hybrid cloud environments are making yesterday’s approaches to integration obsolete
  26. SnapLogic: Unified Platform for Data and Application Integration • Anything: apps | data | APIs | things • Anytime: batch | streaming | real-time • Anywhere: on prem | cloud | hybrid
  27. SnapLogic in the Modern Data Fabric: Ingest, Transform, Deliver (diagram labels) • Stages: Source | Store & Process | Consume • INGEST | Data Integration and Transformation | DELIVER • On Prem Applications | Relational Databases | Cloud Applications | NoSQL Databases | Web Logs | Internet of Things • HANA | Data Warehouses & Data Marts | Big Data and Data Lakes
  28. Modern Architecture: Hybrid and Elastic Execution • Streams: No data is stored/cached • Secure: 100% standards-based • Elastic: Scales out & handles data and app integration use cases • Diagram labels: Cloud-Based Designer, Manager, Dashboard | Metadata | Data | Firewall | Execution | Databases | On Prem Apps | Big Data | Cloud Apps and Data • SnapLogic “respects data’s gravity.”
  29. SnapLogic Demonstration
  30. Discussion • Matt Aslett, Research Director, Data Platforms and Analytics, 451 Research • Ravi Dharnikota, Head of Enterprise Architecture, SnapLogic
  31. Integrate at the speed of modern business • +1 888-494-1570 • sales@snaplogic.com • @SnapLogic • www.snaplogic.com
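
Slide 11 contrasts keeping 1% of data for 60 days in an EDW at $100,000/TB with keeping 100% of data for a year in Hadoop at $900/TB. The arithmetic behind that comparison can be sketched roughly as follows. This is an illustration under stated assumptions, not a calculation from the deck: it assumes a steady daily data volume, treats the quoted prices as a cost per stored terabyte, and uses a hypothetical `RetentionPolicy` class and a made-up 1 TB/day volume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical model of a storage policy; the fields mirror slide 11's figures."""
    name: str
    fraction_kept: float   # share of generated data that is stored (0.0 to 1.0)
    retention_days: int    # how long stored data is kept
    cost_per_tb: float     # assumed to be a cost per stored terabyte, in dollars

    def stored_tb(self, daily_tb: float) -> float:
        # Steady state: fraction kept, times daily volume, times days retained.
        return self.fraction_kept * daily_tb * self.retention_days

    def total_cost(self, daily_tb: float) -> float:
        return self.stored_tb(daily_tb) * self.cost_per_tb


# Figures quoted on slide 11; "a year" is taken as 365 days.
EDW = RetentionPolicy("EDW", fraction_kept=0.01, retention_days=60, cost_per_tb=100_000)
HADOOP = RetentionPolicy("Hadoop", fraction_kept=1.0, retention_days=365, cost_per_tb=900)


def compare(daily_tb: float) -> dict:
    """Return how much more data and spend the Hadoop policy implies than the EDW one."""
    return {
        "data_ratio": HADOOP.stored_tb(daily_tb) / EDW.stored_tb(daily_tb),
        "cost_ratio": HADOOP.total_cost(daily_tb) / EDW.total_cost(daily_tb),
    }


if __name__ == "__main__":
    daily_tb = 1.0  # illustrative volume, not from the presentation
    for policy in (EDW, HADOOP):
        print(f"{policy.name}: {policy.stored_tb(daily_tb):.1f} TB stored, "
              f"${policy.total_cost(daily_tb):,.0f}")
    print(compare(daily_tb))
```

Worked by hand for 1 TB/day, the EDW policy holds 0.6 TB for $60,000, while the Hadoop policy holds 365 TB for $328,500: roughly 600 times more data for about 5.5 times the spend. Both ratios are independent of the daily volume chosen, because it cancels out.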

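Slides 15 and 16 contrast schema-on-write with schema-on-read. A minimal sketch of the difference, using only the Python standard library, might look like the following. The sample events, the `ORDER_SCHEMA` mapping, and both function names are hypothetical illustrations; they do not come from the presentation or from any Hadoop or data warehouse API.

```python
import json

# Raw events as they might land in a data lake: one JSON document per line,
# with fields that vary from record to record (illustrative data only).
RAW_EVENTS = [
    '{"user": "ana", "amount": "19.90", "channel": "web"}',
    '{"user": "bo", "amount": "5.00"}',
    '{"user": "cy", "clicks": 3, "channel": "mobile"}',
]

# Schema-on-write: one fixed schema, decided before any data is loaded.
ORDER_SCHEMA = {"user": str, "amount": float}


def load_schema_on_write(lines):
    """Validate and coerce every record at load time; reject what does not fit."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        try:
            row = {col: cast(record[col]) for col, cast in ORDER_SCHEMA.items()}
        except (KeyError, ValueError, TypeError):
            rejected.append(line)  # the record never reaches the table
            continue
        table.append(row)
    return table, rejected


def read_with_schema(lines, schema):
    """Schema-on-read: keep the raw lines and apply a schema only when querying."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for col, cast in schema.items():
            value = record.get(col)
            row[col] = cast(value) if value is not None else None
        yield row


if __name__ == "__main__":
    table, rejected = load_schema_on_write(RAW_EVENTS)
    print("schema-on-write table:", table)
    print("rejected at load time:", len(rejected))

    # The same raw data can serve a second purpose with a different schema.
    print(list(read_with_schema(RAW_EVENTS, {"user": str, "channel": str})))
    print(list(read_with_schema(RAW_EVENTS, {"user": str, "clicks": int})))
```

In this sketch the fixed schema drops the third event at load time, so a later question about clicks cannot be answered from the table. The raw lines still answer it, at the price of each reader having to decide how to interpret missing fields.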