
Business Data Lake Best Practices


Arne Rossmann outlines why the Business Data Lake works and which Services the Business Data Lake should provide. Organizations can use the Business Data Lake concept best when they standardize, industrialize and innovate.

Presented by Arne Rossmann, Capgemini Germany, at the OOP Conference, 31 January 2017


Business Data Lake Best Practices

  1. Business Data Lake best practices. OOP Munich, 2017-01-31
  2. The speaker – Arne Roßmann
     Copyright © Capgemini 2015. All Rights Reserved. OOP MUC 2017 - Business Data Lake best practices
     • Part of Insights & Data team: global team delivering around BI, DWH, Information Strategy & Big Data
     • Working in Business Intelligence since 2008
     • Delivering as Big Data architect & Project Manager at our clients: defining processes, creating architectures, leading projects
     • Worked in many industries: Retail, Chemical, Financial, Logistics, Automotive, ...
  3. Capgemini’s Insights & Data Global Practice
     With 15,000 experts globally, we are a recognized leader in information-led transformation.
     Expertise in Big Data & Analytics / Capgemini Solutions
     • Over 15,000 consultants globally
     • Industrialized delivery framework
     Next Gen Business Insights Service Centre
     • CUBE lab on the cloud with various demonstrations for BI environments
     • Built-in tools for interactive agile BI and DevOps
     Partner Ecosystem
     • 800+ Big Data & 400+ Data Science global consultants
     Customer Analytics
     • Segmentation & Behavior Profiling
     • Behavior Propensity Scoring
     • Pricing Analytics
     Marketing & Campaign Analytics
     • Campaign Recommendation
     • Cross Sell/Up Sell
     • Campaign Measurement
     • Campaign Execution Management
     Operations Analytics
     • Sales/Demand Forecasting
     • Activity Based Costing
     • Call Center Analytics
     Asset/Equipment Analytics
     • Warranty Analytics
     • Asset Performance Monitoring
     • Predictive Asset Maintenance
     • Insights from Connected Equipment
     Fraud Analytics
     • Fraud Scoring
     • Collusion Fraud Identification
     • Fraud Framework for Public Sector (Trouve)
     Content Analytics
     • Text Mining Accelerators
     • Key Opinion Leader
     • Content Analytics for Fraud Detection
     Further offerings and strengths
     • Business Data Lake offering
     • Data Warehouse Optimization Solution
     • Strategic alliances and partnerships with major vendors
     • Enabling co-innovation with the CUBE lab
     • Experience in designing and deploying big data analytics solutions in varied ecosystems
  4. Table of Contents
     • Why the Business Data Lake works
     • Services your Business Data Lake should provide
     • Standardize, Industrialize and Innovate!
  5. Why the Business Data Lake works
  6. Big Data creates opportunities but poses challenges as well
     Where do I start?
     • “We know that Big Data can be helpful, but how do we quantify the benefits and develop a Business Case?”
     • “How do we know which Big Data technology/platform(s) suits our architecture and business requirements?”
     • “How do I get all the unstructured data (mainly images) out of my operational processes, into an analytical environment that allows me to experiment with data?”
     • “Can we easily combine data from multiple source systems into our Big Data environment and vice versa?”
     • “Can I do it myself? What skills do I need for Big Data?”
     • “How do I measure the effectiveness or performance of my Big Data initiative? How do I measure ROI?”
  7. Businesses are looking to close the gap towards ‘insight driven’
     • Scattered data lying in silos across the organization: 79% have not completely integrated their data sources across the organization
     • Absence of clear business case for funding and implementation: 67% do not have well-defined criteria to measure the success of their own Big Data initiatives
     • Dependence on legacy systems for data processing and management: 36% use cloud based Big Data and analytics platforms
     • Ineffective co-ordination of Big Data and analytics teams: 47% have either scattered pockets of resources or follow a decentralized model for analytics initiatives
  8. The Business Data Lake delivers what we need for the new data landscape.
     Govern where it matters
     • Focus on MDM
     • Enforce only when sharing
     • Treat Corporate as aggregation of Local
     Encourage local requirements
     • Let the business decide what they need
     • Build from the bottom
     • Enable traceability to source disposable data views
     Store securely
     • Store everything ‘as is’
     • Include structured and unstructured data
     • Store it cheaply where possible
     Distill on demand
     • Select only what you want
     • Business friendly tooling
     • Re-usable information maps
     • Rapid change cycle
  9. Business Challenges driving the need for BDL services
     Business Enablement
     • Achieve real-time optimization of business processes through predictive insights and performance analytics
     • Enhance new services and stay competitive in the market
     • Be agile, get insights fast
     Control
     • Ensure data security and compliance with EU data regulations
     • Enable up- and downscaling according to business needs
     Control
     • Reduce costs associated with the governance and secure storage of data
     • Control the costs of running flexible data services
     • Reduce Capex
  10. Services your Business Data Lake should provide
  11. Capgemini can help accelerate clients’ journey to Insights
      A cloud powered, big data & insights service; bring all your data in one place, deliver insights at the point of action and generate differentiated business value.
      • ‘Software-Defined’, full stack cloud infrastructure
      • Flexible ‘Pay-as-you-go’ Commercial Model
      • Secure as a Vault
      • ‘Ready to Harvest’ Sector & Domain Insights
      • Modular Hybrid & Elastic powered by ‘Intelligent Automation’
      • Your ‘Lab in the Cloud’: Experiment, Hypothesize, Simulate
      Get started quickly: with our platform, tools and expertise we can support you at any level to manage your data and harvest insights.
  12. The BDL architecture we built for our clients
      Layers: User Access Layer, Application Layer, Infra Layer, Software & Services
      Platform as a Service (Insights Platform)
      • UX Portal: HTML 5, CSS, Angular JS
      • Big Data Lab, Dataset Library
      • Data Science Lab, Models Library
      • Insights Lab, Ready Insights
      • Common Services: Ingest, Algorithm Library, Sector Insight Labs, Smart Insights 360
      • Catalog & Provision, Meter & Bill, Resource Monitor, Provision, Service Catalog, IoT Framework, Access Mgmt, Knowledge Base, Helpdesk
      • RESTful Web Services
      Infrastructure as a Service
      • Hybrid Cloud Extensibility: (Bosh, CF) CG-CSB, Virtustream
      • Storage and Parallelization: EMC Isilon
      • Compute & Memory: EMC VCE
      • Big Data Suite: Pivotal, Cloudera, Hortonworks
      • VMware, Cortex
      • Data Management: Informatica, Talend, HDF, Apache NiFi
      • Analytics tools: SAS, Madlib, RStudio, Spark
      • Security & Governance: RSA, AD, Knox, Ranger, Kerberos, Atlas, TDE, W2W, Metron, Falcon
      • ITSM: BMC Remedy
      • MD&LM Environment
      • Hadoop Distribution: Hortonworks, Cloudera
      • RE&D, Dev Ops: Cloud Foundry, Jira, Jit
      • Visualisation: Qlik, Tableau, SAS VA, D3, High Charts
      • Self Service Insights
      Deployment options: Capgemini Private Cloud, On Premise, Cloud
      Characteristics
      • Common Web UI and UX architecture
      • Fully virtualized compute, storage & network
      • Intelligent automation of provisioning, process, service and support orchestration
      • Modular component architecture
      • Multiple points of presence
      • Seamless integration between on-premise, private & public cloud
      • Proven reference and component architecture for on-premise builds
      • Professional Services teams to build full stack
      • Demo of full stack
      • Accelerated partner enablement
  13. BDLaaS – illustrative example service dashboard
  14. Standardize, Industrialize and Innovate!
  15. Big data processing is done in three different stages and we have to cater to each stage differently
      Stage 1: Load “as-is”
      • Actors: data providers and IT provide and store data
      • Paradigms: store everything (internal and external, structured and unstructured); store granular data; minimal effort on IT
      Stage 2: Distill on demand
      • Actors: data scientists and engineers explore and analyze data
      • Paradigms: agile and explorative way of work; self service; fail fast
      Stage 3: Operationalize
      • Actors: IT implements data integration process for production
      • Paradigms: continuously running analytics processes; trust in data quality; service levels secured; managed by IT
      Industrialize!
      • Allow creativity
      • Encourage collaboration
      • Ensure Business Meta Data & Data Catalogue
      • Enable Data Masking
      Examples of technical metadata
      • Path (folder location)
      • Filename
      • File type
      • File size
      • Date of ingestion
      • Technical Owner / Group
      • For HIVE: number of records / lines, column number, column names if available, column data types, value distribution, min/max
      Examples of business metadata
      • Project (possibly automatic)
      • Data set name
      • Logical description of dataset
      • Data owner / data steward
      • Confidentiality classification
      • Line of business
  16. Start using ELT tools now!
      Drivers
      • Need for more platform updates
      • Need for more denormalization
      • Need for more specialized know-how
      ELT tools offer:
      • Abstraction layer to Hadoop processing engines
      • Abstraction layer to NoSQL & SQL databases
      • Standardized control flows
      • Availability of developers
  17. Insights as a Service – Analytics Cloud for Oil & Gas major
      Use cases
      • Well Health Dashboards
      • Equipment Performance
      • Disaster Management
      • Supply Chain Analytics
      • Predictive Maintenance
      Data sources
      • Device Data: driving behavior, GPS, diagnostics, etc.
      • Real Time Data
      • System Data
      • Environment Data
      • Project Data
      Data volumes
      • 10 data points per sec
      • 40 GB per field
      • 5-6 GB per day per well
      • 80 TB well data per year
      Capabilities
      • 24x7x365 monitoring usage
      • Real time charts of streaming data
      • Real time alerts
      • Thermal visualizations
  18. We helped customers get to real value within 12 weeks, from idea to production.
      Start: Business Problem Identified
      • Business Insights Need
      • Integrate DataSet
      • Model Build and Training
      • Iterate and Tune
      • Data Exploration
      • Test Data Science Model
      • Apply Data Science
      • Business Validation
      • Publish Insights
      End: Business Value Delivered
      Week markers on the timeline: 1, 3, 5, 6, 7, 9, 11, 12
  19. About Capgemini
      With more than 145,000 people in over 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2014 global revenues of EUR 10.573 billion. Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.
      Learn more about us at www.capgemini.com.
      The information contained in this presentation is proprietary. Copyright © 2015 Capgemini. All rights reserved. Rightshore® is a trademark belonging to Capgemini.
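
Slide 15 lists examples of technical and business metadata for the data catalogue. As a minimal sketch of how such a catalogue record might be captured when a delimited text file is loaded "as-is", the Python below derives the technical fields from the file itself and leaves the business fields to be supplied by a data owner or steward. The names `TechnicalMetadata`, `BusinessMetadata` and `collect_technical_metadata` are hypothetical and do not come from the presentation, and the sketch assumes a UTF-8 file whose first line is a header row.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class TechnicalMetadata:
    """Fields that can be derived from the file itself (hypothetical schema)."""
    path: str           # folder location
    filename: str
    file_type: str      # taken from the file extension
    file_size: int      # size in bytes
    ingested_at: str    # date of ingestion, ISO 8601, UTC
    line_count: int     # number of lines, including the header row
    column_names: list  # header fields, if a header is available


@dataclass
class BusinessMetadata:
    """Fields a data owner or steward has to supply (hypothetical schema)."""
    dataset_name: str
    description: str
    data_owner: str
    confidentiality: str
    line_of_business: str


def collect_technical_metadata(file_path, delimiter=","):
    """Scan a delimited text file once and return its technical metadata."""
    p = Path(file_path)
    header = None
    line_count = 0
    with p.open("r", encoding="utf-8") as fh:
        for line in fh:
            if line_count == 0:
                header = line.rstrip("\r\n")  # assume the first line is the header
            line_count += 1
    return TechnicalMetadata(
        path=str(p.parent),
        filename=p.name,
        file_type=p.suffix.lstrip("."),
        file_size=p.stat().st_size,
        ingested_at=datetime.now(timezone.utc).isoformat(),
        line_count=line_count,
        column_names=header.split(delimiter) if header else [],
    )
```

Calling `collect_technical_metadata("wells.csv")` on a small CSV file would return a record with the filename, extension, byte size, line count and header columns filled in. Column data types, value distribution and min/max, which the slide lists for Hive tables, are left out here because they require profiling the data itself.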
