mycervello.com
Modern Data Warehouses In The Cloud
Use Cases + Live Demo
19th February 2019
Slim Baltagi
Director, Big Data & ML
DFW Area Advanced
Analytics Meetup
© 2019 Cervello Inc.
Agenda
1. Quick Intro: Talk Format, Speaker Background, Q&A session
2. Key Terms: Cloud, Data Warehouse, Modern, …
3. Data Era: Challenges & Opportunities
4. Traditional Data Warehouses: Key Challenges
5. Modern Data Warehouses: Examples, Solutions, Key Opportunities
6. Use Cases + Live Demo
7. Key Takeaways & Where To Go From Here?
8. Interactive Session: Q&A, discussion and more networking
1. Quick Intro: Speaker, Talk, Q&A session
§ Format of this talk: a 40-minute talk, a 15-minute live demo and a 30-minute Q&A session
§ Currently
• Advocating Modern Data Architecture (MDA) and Modern Data Warehouses in the cloud to organizations as a
director at Cervello Inc., a fast-growing A.T. Kearney professional services company
• Working with our team of creative problem solvers to deliver projects that help our clients win with data
• Enjoying emerging technologies (Kubernetes, Kubeflow, …), speaking at conferences and organizing meetups
§ In the past
• Created a mathematical formula for provisioning Apache Hadoop, now an obsolete technology!
• Evangelized Apache Spark and was the first to fully explain how it relates to Apache Hadoop without any
marketing fluff!
• Initiated a discussion at Hortonworks (recently merged with Cloudera) on its official stance on Apache Spark at a
time when Hortonworks itself was happier with Tez and touting it as the future compute engine of Hadoop!
• Publicly shared the limitations of Spark Streaming. As an alternative, evangelized Apache Flink and introduced it
to the western hemisphere, in particular at Capital One, a first adopter ahead of the likes of Uber, Netflix, Lyft, etc.
Gave Apache Flink talks on 4 continents
• Evangelized Apache Kafka as a platform for building end-to-end streaming data applications using Kafka Core,
Kafka Connect and Kafka Streams/KSQL
• Delivered many end-to-end data projects as director, architect and engineer
2. Key Terms: Cloud
§ What exactly is the cloud? According to Gartner: ‘A style of computing in which scalable and
elastic IT-enabled capabilities are delivered as a service using Internet technologies’
§ What are the models within the cloud and how do they apply to data warehousing?
Model | Hardware: Datacenter | Software: Data warehouse | Management: Optimizing, tuning, maintenance | Analogy: Cook, Eat, Clean
On-premises | You | You | You | At home
Infrastructure, IaaS (Example: EC2) | Vendor | You | You | At a vacation home
Platform, PaaS (Example: Redshift, EMR) | Vendor | Vendor | You | At a fast food restaurant
Software, SaaS (Example: Snowflake) | Vendor | Vendor | Vendor | At a full service restaurant
2. Key Terms: Data Warehouse, Modern
§ What is a data warehouse? ‘A subject-oriented, integrated, time-variant, non-volatile collection
of data in support of management’s decision-making process.’ Source: Building the Data
Warehouse, a book by Bill Inmon, who is considered to be the father of data warehousing
§ From the beginning data warehousing was about the business making better and quicker
decisions
§ A key factor driving the evolution of data warehousing is the cloud. An updated definition of a
data warehouse in this modern era of cloud, social networks and mobile would add attributes
such as elastic, scalable (all data, all users), shareable, agile and conducive to exploratory
analytics. What do you think?
§ Be aware that a traditional data warehouse hosted in the cloud is not necessarily a data
warehouse ‘built for the cloud’; it may simply be a ‘cloud-washed’ one! Amazon Redshift would be a
good example. Hold on, more to come on this!
§ Often, modernization in the context of data warehousing is confined to just a ‘technology refresh’.
How about improving the underlying business processes as well? That would require
a ‘mindset refresh’!
3. Data Era: Challenges
Regardless of using databases, data lakes, data warehouses or data swamps,
businesses all share common challenges related to data in this new era!
• Data is a game-changing asset, but most organizations struggle to derive business value
from it
• Business decisions are hindered by slow and incomplete data analytics
• Legacy technologies, skills, methods and mindsets are stalling innovation
• Changing business requirements are outpacing the capability of IT teams to deliver
• Silos of diverse data from diverse sources (internal enterprise apps, public agencies,
external events, 3rd party services, web, …) are more and more segregated, so data cannot be
combined and business value cannot be derived from it
• Costly and complex data infrastructure consuming significant resources to build and
maintain data platforms
• Technology selection is becoming more and more challenging due to the abundance of data
tools, platforms and services
3. Data Era: Opportunities
What are some of today’s opportunities in relation to data in this new era?
• Cloud computing and the increase in computer-processing power, storage and data-
gathering abilities enable businesses to rent much-higher-capacity computer infrastructure
from third-party vendors, avoiding the hefty costs and additional resources needed to run
their own data centers
• External data sources such as social media data, weather data, demographic data and other
third-party data can enrich and augment internal data to help businesses with better
decision making
• Real-time or near-real-time insights that match fast-paced businesses and markets, thanks
to the maturity of stream processing technologies
• Data-driven decisions thanks to growing adoption of Machine Learning and AI technologies
and services and also availability of more powerful compute and cheaper storage services
• Being a data-driven company is not confined to tech giants anymore
4. Traditional Data Warehouses: Key Challenges
1. Scalability: growth in data volume, number of users and applications
2. Concurrency: as the number of users increases, they cannot all operate simultaneously
3. Performance: slow-running queries, …
4. Resilience: data backup/retention and node failure protection
5. Complexity: from initial implementation to ongoing maintenance
6. Lack of native support for semi-structured data: JSON and XML formats are popular for
data exchange. In traditional data warehouses, data needs to be transformed first and
schema needs to be defined before loading
7. High maintenance overhead in the form of constant indexing, tuning, sorting
8. Handling workload fluctuation: sizing servers for workload peaks and valleys
9. Waste of resources: expensive computing resources are wasted during off-peak usage
10. Lack of support for processing streaming data
11. Upfront costs: hardware, software licenses, staffing, …
12. Project delays due to provisioning infrastructure
You are surely familiar with real-world scenarios such as End of Month Reporting, Intensive Load
Process, Demanding Executive Dashboard Users, …
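Challenges 8 and 9 can be made concrete with a little arithmetic. The Python sketch below uses an entirely hypothetical hourly load profile (the numbers are illustrative, not measurements) and a hypothetical helper, peak_provisioning_waste, to estimate how much of a system sized for its peak hour sits idle over a day.

```python
# Hypothetical hourly demand over one day, in abstract "compute units".
# These numbers are made up for illustration only.
hourly_demand = [2, 2, 2, 2, 2, 3, 5, 8, 10, 10, 9, 8,
                 8, 9, 10, 10, 8, 6, 4, 3, 2, 2, 2, 2]


def peak_provisioning_waste(demand):
    """Return (capacity, used, wasted_fraction) for a fixed-size system
    that must be sized for the highest demand in the period."""
    capacity = max(demand)                # fixed size = peak demand
    provisioned = capacity * len(demand)  # unit-hours paid for
    used = sum(demand)                    # unit-hours actually needed
    wasted_fraction = 1 - used / provisioned
    return capacity, used, wasted_fraction


capacity, used, wasted = peak_provisioning_waste(hourly_demand)
print(f"capacity={capacity}, used={used} unit-hours, wasted={wasted:.0%}")
```

Swapping in a month of real utilization data would give a rough first estimate of what sizing for the peak costs in unused capacity.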
5. Modern Data Warehouses: Examples
Reference: The Forrester Wave™: Cloud Data Warehouse, Q4 2018, October 29, 2018
The 14 Providers That Matter Most And How They Stack Up
5. Modern Data Warehouses: Solutions
§ Major modern data warehouses are cloud based. Most of them do overcome the key
challenges of the traditional data warehouses. Unfortunately, they overcome these challenges
to different degrees, and it is not always a simple yes-or-no check!
1. Scalability: Scale up, down, or off quickly without delay. Yes for Snowflake, Low for Redshift,
Yes for Azure Data Warehouse, Yes for BigQuery (but with limits)
2. Concurrency: Lots of users can operate simultaneously
3. Performance: processing bottlenecks and delays, slow-running queries, …
4. Resilience: data backup/retention and node failure protection
5. Complexity: Cloud data warehouses are easier to use than on-premises data warehouses
6. Lack of native support of semi-structured data: Although cloud data warehouses do support
semi-structured data such as JSON, this support varies from one data warehouse to
another.
−Concurrent throughput for JSON: Yes for Snowflake, No for Redshift, No for Azure Data
Warehouse, Moderate for BigQuery.
−High performance for JSON scans: Yes for Snowflake, No for Redshift unless you use the
additional Amazon Redshift Spectrum tool, No for Azure Data Warehouse, Moderate for BigQuery.
7. High maintenance overhead is not a major issue with managed infrastructure for cloud data
warehouses, but the level of maintenance varies from one data warehouse to another. It
ranges from creating indices and distribution keys to near-zero management
8. Handling workload fluctuation: With elasticity, cloud data warehouses adapt to workload peaks
and valleys
9. Waste of resources: No more wasted resources during off-peak usage! Auto suspend, auto
resume? Does anybody know about these features of the Snowflake data warehouse?
10. Lack of support for streaming data: For example, loading and processing are possible with
Snowpipe, a serverless compute service from the Snowflake data warehouse
11. Upfront costs: No upfront costs, as the model of a cloud data warehouse is pay-as-you-go
12. Project Delays due to provisioning infrastructure: Not Applicable for cloud data warehouse.
With instant provisioning, projects don’t need to wait for infrastructure
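The effect of auto suspend and auto resume (items 8 and 9) can be illustrated with a toy simulation. This is a sketch under stated assumptions, not Snowflake's actual billing logic: the function name billed_seconds, the 60-second idle timeout and the query timeline are all hypothetical, and real-world details such as minimum billing increments are not modeled.

```python
def billed_seconds(queries, auto_suspend_after):
    """Total seconds a warehouse is running, given (start, duration) query
    intervals in seconds and an idle timeout after which it suspends.

    Assumptions (hypothetical model): the warehouse resumes instantly when
    a query arrives, and stays up `auto_suspend_after` seconds after the
    last running query finishes before suspending.
    """
    total = 0
    run_start = None  # when the current running period began
    run_end = None    # when the warehouse suspends if nothing else arrives
    for start, duration in sorted(queries):
        finish = start + duration
        if run_start is None:
            run_start, run_end = start, finish + auto_suspend_after
        elif start <= run_end:
            # Query arrives before suspension: extend the running period.
            run_end = max(run_end, finish + auto_suspend_after)
        else:
            # Warehouse had already suspended: close the period, resume anew.
            total += run_end - run_start
            run_start, run_end = start, finish + auto_suspend_after
    if run_start is not None:
        total += run_end - run_start
    return total


# Hypothetical day: a short morning burst, then one query much later.
queries = [(0, 120), (100, 60), (150, 30), (20_000, 300)]
always_on = 24 * 3600
elastic = billed_seconds(queries, auto_suspend_after=60)
print(f"running seconds with auto suspend: {elastic}, always on: {always_on}")
```

The gap between the two numbers is the off-peak time that an always-on, fixed-size system would still pay for.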
5. Modern Data Warehouses: Snowflake key innovations
§ Snowflake is based on three key innovations:
1. Unique architecture: designed for the cloud and able to provide
complete elasticity for all your concurrent users and applications
2. Database engine that natively handles all your data, both semi-structured and structured,
without sacrificing performance or flexibility
3. Technology that eliminates the need for manual data warehouse management and tuning.
No indexing, tuning, partitioning or vacuuming after loading data => effortless management
§ Snowflake architecture consists of three layers, each physically decoupled from the
others and scaling independently:
1. Data Storage layer uses cloud storage to store all data loaded into Snowflake in a scalable
and inexpensive way
2. Compute layer comprises virtual warehouses: compute resources that execute the data
processing tasks required for queries. The virtual warehouses have access to all of the
data in the storage layer
3. Cloud Services layer coordinates the entire system, managing security, optimization and
metadata
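The decoupling of the three layers can be sketched as a toy object model. This is only an illustration of the idea of shared storage with independently sized compute; the class names, the warehouse names and the "nodes" attribute are hypothetical and say nothing about how Snowflake is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Shared storage layer: one copy of the data, independent of compute."""
    tables: dict = field(default_factory=dict)


@dataclass
class VirtualWarehouse:
    """Compute layer: a named cluster with its own size, reading shared storage."""
    name: str
    nodes: int
    storage: Storage

    def resize(self, nodes):
        # Scaling one warehouse touches neither storage nor other warehouses.
        self.nodes = nodes

    def count_rows(self, table):
        return len(self.storage.tables[table])


@dataclass
class CloudServices:
    """Services layer: keeps metadata about which warehouses exist."""
    warehouses: dict = field(default_factory=dict)

    def create_warehouse(self, name, nodes, storage):
        wh = VirtualWarehouse(name, nodes, storage)
        self.warehouses[name] = wh
        return wh


storage = Storage({"bus_positions": [{"vid": "1"}, {"vid": "2"}]})
services = CloudServices()
etl = services.create_warehouse("etl_wh", nodes=1, storage=storage)
bi = services.create_warehouse("bi_wh", nodes=2, storage=storage)

bi.resize(8)  # scale the BI warehouse only
print(etl.nodes, bi.nodes,
      etl.count_rows("bus_positions"), bi.count_rows("bus_positions"))
```

Both warehouses see the same table, and resizing one leaves the other unchanged, which is the property the three-layer description is pointing at.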
5. Modern Data Warehouses: Snowflake’s multi-cluster, shared data architecture
6. Use Cases + Live Demo
New Use Cases
§ Data as a Service
§ Self-Service Analytics
§ Advanced Analytics Lab
§ Real-Time Insight
§ Single View of Customer
Data Consumers
§ Partners, Suppliers, and Vendors
§ Executives, Managers, and Analysts
§ Data Scientists
§ Embedded Applications
§ Many Business Users
6. Use Cases + Live Demo: Snowflake reference architecture
Data Sources/Data Providers: On-premise or Cloud, Structured or Unstructured Data
Dataflow Tools: Streaming Data Platforms, ETL and ELT platforms. E: Extract, L: Load, T: Transform
Data Storage: S3/Azure Storage (missing in the architecture diagram!)
Analytics Services: BI, UI, Data Science, …
6. Use Case + Live Demo: Live Tracking of CTA Buses
§ The purpose of this live demo is to showcase a few aspects of Snowflake such as ease of
use, fast provisioning, native support of semi-structured data, continuous data loading, …
Due to time constraints, we won’t cover other unique features of Snowflake such as
multi-warehouse concurrency against the same data, data sharing, time travel …
§ This live demo itself is about an end-to-end application for live tracking of CTA buses that:
1. Sends a request every minute to the CTA Bus Tracker web service for live bus locations
and gets an immediate response as a JSON payload
2. Stores the JSON data in Amazon S3 and triggers a notification event to an SQS queue
3. Consumes the event from the SQS queue using Snowpipe, a serverless compute service from
Snowflake. Snowpipe automatically loads the JSON data, as is with no ETL, from S3 into a
Snowflake SQL database
4. Queries the JSON data using Snowflake’s FLATTEN function, an extension to standard ANSI
SQL
5. Visualizes the bus locations on a map
§ This is not a toy use case! Since JSON is a popular format for data exchange, the same data
pipeline can be leveraged in many other use cases with different data sources: internal
applications, external enterprise data services, machine-generated data such as that from IoT devices …
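To make step 4 more tangible, here is a minimal Python sketch of what flattening does conceptually: it turns one nested JSON document into one flat row per array element. In Snowflake itself this is done in SQL over the loaded JSON column, not in Python. The sample payload and its field names ("bustime-response", "vehicle", "vid", "rt", "lat", "lon", "tmstmp") are assumptions about the shape of a bus-tracker response, and the helper name flatten_vehicles is hypothetical.

```python
import json

# Hypothetical sample shaped like a bus-tracker response; the field names
# are assumptions and the values are made up.
raw = json.dumps({
    "bustime-response": {
        "vehicle": [
            {"vid": "1234", "rt": "22", "lat": "41.88", "lon": "-87.63",
             "tmstmp": "20190219 18:05"},
            {"vid": "5678", "rt": "36", "lat": "41.93", "lon": "-87.64",
             "tmstmp": "20190219 18:05"},
        ]
    }
})


def flatten_vehicles(document):
    """Turn one nested JSON document into one flat row per vehicle."""
    payload = json.loads(document)
    vehicles = payload.get("bustime-response", {}).get("vehicle", [])
    return [
        {
            "vehicle_id": v["vid"],
            "route": v["rt"],
            "latitude": float(v["lat"]),
            "longitude": float(v["lon"]),
            "reported_at": v["tmstmp"],
        }
        for v in vehicles
    ]


rows = flatten_vehicles(raw)
for row in rows:
    print(row)
```

Each resulting row carries a latitude and longitude, which is the shape a mapping tool needs for step 5.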
§ What did I use to cook this application?
• AWS Free tier: a free account from https://aws.amazon.com/free/
• Snowflake Test Drive: 30-day free trial through our partnership with Snowflake. Who wants
one?
https://trial.snowflake.com/?utm_source=cervello&utm_medium=referral&utm_campaign=self-service-partner-referral-cervello
• CTA (Chicago Transit Authority) Bus Tracker API to get free up-to-the-minute JSON data
feeds: https://www.transitchicago.com/developers/bustracker/ You just need to request a
free API key from CTA
• Anaconda: Free and open-source distribution of the Python and R programming languages.
Comes with a Python IDE (Spyder 3.3.2) and other goodies!
• Free trial of Tableau Online: https://www.tableau.com/products/cloud-bi
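With these ingredients, the polling side of the application could look roughly like the Python sketch below. It is not the code used in the demo. The endpoint URL and the query parameters (key, rt, format) are assumptions based on CTA's public Bus Tracker documentation and should be checked against the developer page; the S3 key layout, the function names and the store callback are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Assumed endpoint, based on CTA's public Bus Tracker documentation;
# verify it against the developer page before use.
BASE_URL = "http://www.ctabustracker.com/bustime/api/v2/getvehicles"


def build_request_url(api_key, routes):
    """Build the request URL for a list of route ids, asking for JSON."""
    query = urllib.parse.urlencode(
        {"key": api_key, "rt": ",".join(routes), "format": "json"}
    )
    return f"{BASE_URL}?{query}"


def object_key(now, prefix="cta/raw"):
    """Hypothetical S3 key layout: one JSON file per poll, grouped by date."""
    return f"{prefix}/{now:%Y/%m/%d}/vehicles_{now:%H%M%S}.json"


def poll_once(api_key, routes, store):
    """Fetch one snapshot and hand it to `store(key, body)`.

    `store` is a placeholder for the upload step, for example a thin
    wrapper around an S3 client's upload call.
    """
    url = build_request_url(api_key, routes)
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = resp.read()
    json.loads(body)  # fail early if the response is not valid JSON
    store(object_key(datetime.now(timezone.utc)), body)


def run(api_key, routes, store, interval_seconds=60):
    """Poll forever, once per interval (one minute in the demo)."""
    while True:
        poll_once(api_key, routes, store)
        time.sleep(interval_seconds)
```

Passing the upload step in as a callback keeps the sketch free of cloud SDK dependencies; with boto3, the callback would typically wrap a put_object call against the bucket that Snowpipe watches.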
7. Key Takeaways & Where To Go From Here?
§ Try it yourself
−Snowflake is worth a free trial!
https://trial.snowflake.com/?utm_source=cervello&utm_medium=referral&utm_campaign=self-service-partner-referral-cervello
−You can build a short-term POC on either AWS or Azure using a US $400 credit, valid for 30
days, for both compute and storage
§ Engage the business
−It is critical to identify business sponsorship and use cases
−Cervello prefers a short, high-impact workshop approach
§ Get end-to-end help
−Pick the right data platform … For example, the most ‘popular’ data warehouse is not
necessarily the one that fulfills your data warehouse needs! We helped many of our clients
move from Amazon Redshift and other traditional data warehouses to Snowflake.
−Avoid having a pure ‘technology refresh’ and missing the opportunities to add business
value (refer to use cases)
§ Get advice
8. Interactive Session: Q&A, discussion and more networking
§ The goal here is for the audience to chime in now, and later in the comments section of the
event or the meetup discussions section.
§ Here are a few samples of food for thought!
• Can somebody briefly share his/her real-world experience migrating from a legacy data
warehouse to a modern data warehouse?
• Has anybody worked on a fit-for-purpose analysis of modern data warehouses?
• What would be the best practices of using a modern data warehouse such as Snowflake?
• What advice would you give someone transitioning his/her career to work on a modern data
warehouse?
• Is there anyone who would like to give a follow up talk on modern data warehousing?
Thank You!
Get in touch with contributors:
Slim Baltagi
sbaltagi@gmail.com
https://www.linkedin.com/in/slimbaltagi/
@SlimBaltagi
Jim Leavitt
jleavitt@mycervello.com
https://www.linkedin.com/in/jimleavitt/
Learn more about Cervello at mycervello.com
Boston | New York | Dallas | London
Appendix
Cervello Profile
We are a professional services firm focused on helping organizations win with data. We have a global presence with an ability to service customers across
industries. We are organized around three practices that are underpinned by our expertise in modern data architectures.
Practices: PERFORMANCE MANAGEMENT | SUPPLIER + CUSTOMER RELATIONSHIPS | DATA MONETIZATION + PRODUCTS
WHAT WE DELIVER
• DATA MANAGEMENT & MACHINE LEARNING: We embed data management processes and use machine learning to improve data quality
• MODERN DATA ARCHITECTURE & PLATFORMS: Using leading cloud technology platforms we develop and design analytics-ready connected data solutions
• BUSINESS DRIVEN INSIGHTS & ACTIONS (ANALYTICS & DATA SCIENCE, MODELING & PLANNING, MOBILE EXPERIENCE): We provide different methods to consume data to facilitate insights and actions inside and outside the organization
HOW WE DELIVER
STRATEGY | BUILD SERVICES | RUN & COE SERVICES
Our teams are located in Boston, New York, Dallas, London and Bangalore
Cervello Value Accelerator
TRAIN (2 DAYS): Getting around Snowflake, Analyzing & Reporting Data, Data Management & Security
PLAN (< 1 WEEK): Use case inventory, Readiness Assessment, Architecture Design
ACTION (1-2 WEEKS): Use case & MVP defined, POC Implementation, Stakeholder Presentation, TCO Analysis
9 Years Working In The Cloud
30 Snowflake Trained Consultants
100+ Big Data & Analytics Engagements
End to End Enablement

More Related Content

What's hot

Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMatillion
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWKent Graziano
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookJames Serra
 
Snowflake Data Loading.pptx
Snowflake Data Loading.pptxSnowflake Data Loading.pptx
Snowflake Data Loading.pptxParag860410
 
Azure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsAzure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsThomas Sykes
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflakeSivakumar Ramar
 
A 30 day plan to start ending your data struggle with Snowflake
A 30 day plan to start ending your data struggle with SnowflakeA 30 day plan to start ending your data struggle with Snowflake
A 30 day plan to start ending your data struggle with SnowflakeSnowflake Computing
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleAdam Doyle
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxCalvinSim10
 
Modern data warehouse presentation
Modern data warehouse presentationModern data warehouse presentation
Modern data warehouse presentationDavid Rice
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionJames Serra
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 

What's hot (20)

Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Snowflake Data Loading.pptx
Snowflake Data Loading.pptxSnowflake Data Loading.pptx
Snowflake Data Loading.pptx
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Azure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data FlowsAzure Data Factory V2; The Data Flows
Azure Data Factory V2; The Data Flows
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflake
 
IBM Cloud pak for data brochure
IBM Cloud pak for data   brochureIBM Cloud pak for data   brochure
IBM Cloud pak for data brochure
 
A 30 day plan to start ending your data struggle with Snowflake
A 30 day plan to start ending your data struggle with SnowflakeA 30 day plan to start ending your data struggle with Snowflake
A 30 day plan to start ending your data struggle with Snowflake
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
 
How to Streamline DataOps on AWS
How to Streamline DataOps on AWSHow to Streamline DataOps on AWS
How to Streamline DataOps on AWS
 
Modern data warehouse presentation
Modern data warehouse presentationModern data warehouse presentation
Modern data warehouse presentation
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 
Data Vault Overview
Data Vault OverviewData Vault Overview
Data Vault Overview
 
Lakehouse in Azure
Lakehouse in AzureLakehouse in Azure
Lakehouse in Azure
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 

Similar to Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi

How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?Slim Baltagi
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data CentersGina Buck
 
2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole
2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole
2020 Cloud Data Lake Platforms Buyers Guide - White paper | QuboleVasu S
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
Does it only have to be ML + AI?
Does it only have to be ML + AI?Does it only have to be ML + AI?
Does it only have to be ML + AI?Harald Erb
 
Accenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdfAccenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdfRajvir Kaushal
 
Unblocking Innovation for Digital Transformation
Unblocking Innovation for Digital TransformationUnblocking Innovation for Digital Transformation
Unblocking Innovation for Digital TransformationAmazon Web Services
 
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...Precisely
 
Take the Bias out of Big Data Insights With Augmented Analytics
Take the Bias out of Big Data Insights With Augmented AnalyticsTake the Bias out of Big Data Insights With Augmented Analytics
Take the Bias out of Big Data Insights With Augmented AnalyticsTyler Wishnoff
 
Cloud Computing Overview
Cloud Computing OverviewCloud Computing Overview
Cloud Computing OverviewDoug Allen
 
Calculating the true value of industry specific clouds linthicum
Calculating the true value of industry specific clouds linthicumCalculating the true value of industry specific clouds linthicum
Calculating the true value of industry specific clouds linthicumDavid Linthicum
 
NoOps in a Serverless World
NoOps in a Serverless WorldNoOps in a Serverless World
NoOps in a Serverless WorldGary Arora
 
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower CostHow Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower CostDana Gardner
 
Future of Making Things
Future of Making ThingsFuture of Making Things
Future of Making ThingsJC Davis
 
GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017Jeremy Maranitch
 
Data warehouse-optimization-with-hadoop-informatica-cloudera
Data warehouse-optimization-with-hadoop-informatica-clouderaData warehouse-optimization-with-hadoop-informatica-cloudera
Data warehouse-optimization-with-hadoop-informatica-clouderaJyrki Määttä
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesDataWorks Summit
 

Similar to Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi (20)

How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data Centers
 
2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole
2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole
2020 Cloud Data Lake Platforms Buyers Guide - White paper | Qubole
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
Does it only have to be ML + AI?
Does it only have to be ML + AI?Does it only have to be ML + AI?
Does it only have to be ML + AI?
 
Accenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdfAccenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdf
 
Unblocking Innovation for Digital Transformation
Unblocking Innovation for Digital TransformationUnblocking Innovation for Digital Transformation
Unblocking Innovation for Digital Transformation
 
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
Cloudera + Syncsort: Fuel Business Insights, Analytics, and Next Generation T...
 
CF-Chapitre1.pdf
CF-Chapitre1.pdfCF-Chapitre1.pdf
CF-Chapitre1.pdf
 
Take the Bias out of Big Data Insights With Augmented Analytics
Take the Bias out of Big Data Insights With Augmented AnalyticsTake the Bias out of Big Data Insights With Augmented Analytics
Take the Bias out of Big Data Insights With Augmented Analytics
 
Cloud Storage Benefits
Cloud Storage BenefitsCloud Storage Benefits
Cloud Storage Benefits
 
Big data rmoug
Big data rmougBig data rmoug
Big data rmoug
 
Cloud Computing Overview
Cloud Computing OverviewCloud Computing Overview
Cloud Computing Overview
 
Calculating the true value of industry specific clouds linthicum
Calculating the true value of industry specific clouds linthicumCalculating the true value of industry specific clouds linthicum
Calculating the true value of industry specific clouds linthicum
 
NoOps in a Serverless World
NoOps in a Serverless WorldNoOps in a Serverless World
NoOps in a Serverless World
 
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower CostHow Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
 
Future of Making Things
Future of Making ThingsFuture of Making Things
Future of Making Things
 
GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017
 
Data warehouse-optimization-with-hadoop-informatica-cloudera
Data warehouse-optimization-with-hadoop-informatica-clouderaData warehouse-optimization-with-hadoop-informatica-cloudera
Data warehouse-optimization-with-hadoop-informatica-cloudera
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management Challenges
 

Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi

  • 1. mycervello.com Modern Data Warehouses In The Cloud Use Cases + Live Demo 19th February 2019 Slim Baltagi Director, Big Data & ML DFW Area Advanced Analytics Meetup
  • 2. © 2019 Cervello Inc. Agenda
      1. Quick Intro: Talk Format, Speaker Background, Q&A session
      2. Key Terms: Cloud, Data Warehouse, Modern, …
      3. Data Era: Challenges & Opportunities
      4. Traditional Data Warehouses: Key Challenges
      5. Modern Data Warehouses: Examples, Solutions, Key Opportunities
      6. Use Cases + Live Demo
      7. Key Takeaways & Where To Go From Here?
      8. Interactive Session: Q&A, discussion and more networking
  • 3. 1. Quick Intro: Speaker, Talk, Q&A session
      § Format of this talk: a 40-minute talk, a 15-minute live demo, and a 30-minute Q&A session
      § Currently
        • Advocating Modern Data Architecture (MDA) and Modern Data Warehouses in the cloud to organizations as director at Cervello Inc., an A.T. Kearney fast-growing professional services company
        • Working with our team of creative problem solvers to deliver projects that help our clients win with data
        • Enjoying emerging technologies (Kubernetes, Kubeflow …), speaking at conferences and organizing meetups
      § In the past
        • Created a mathematical formula for provisioning Apache Hadoop, now an obsolete technology!
        • Evangelized Apache Spark and was the first to fully explain how it relates to Apache Hadoop without any marketing fluff!
        • Initiated a discussion at Hortonworks (recently merged with Cloudera) on its official stance on Apache Spark at a time when Hortonworks itself was happier with Tez and touting it as the future compute engine of Hadoop!
        • Publicly shared the limitations of Spark Streaming. As an alternative, evangelized Apache Flink and introduced it to the western hemisphere, in particular at Capital One as first adopter before the likes of Uber, Netflix, Lyft, etc. Gave Apache Flink talks on 4 continents
        • Evangelized Apache Kafka as a platform for building end-to-end streaming data applications using Kafka Core, Kafka Connect and Kafka Streams/KSQL
        • Delivered many end-to-end data projects as director, architect and engineer
  • 4. 2. Key Terms: Cloud
      § What exactly is the cloud? According to Gartner, ‘A style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service using Internet technologies’
      § What are the models within the cloud and how do they apply to data warehousing?
        Model                                        | Hardware: Datacenter | Software: Data warehouse | Management: Optimizing, tuning, maintenance | Analogy (Cook, Eat, Clean)
        On-premises                                  | You                  | You                      | You                                         | At home
        Infrastructure (IaaS), example: EC2          | Vendor               | You                      | You                                         | At a vacation home
        Platform (PaaS), example: Redshift, EMR      | Vendor               | Vendor                   | You                                         | At a fast food restaurant
        Software (SaaS), example: Snowflake          | Vendor               | Vendor                   | Vendor                                      | At a full service restaurant
  • 5. 2. Key Terms: Data Warehouse, Modern
      § What is a data warehouse? ‘A subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process.’ Source: Building the Data Warehouse, a book by Bill Inmon, who is considered to be the father of data warehousing
      § From the beginning, data warehousing was about the business making better and quicker decisions
      § A key factor driving the evolution of data warehousing is the cloud. An updated definition of a data warehouse in this modern era of cloud, social networks and mobile would add attributes such as elastic, scalable (all data, all users), shareable, agile, and conducive to exploratory analytics. What do you think?
      § Be aware that a traditional data warehouse hosted in the cloud is not necessarily a data warehouse ‘built for the cloud’; it may be a ‘cloud-washed’ one! Amazon Redshift would be a good example. Hold on, more to come on this!
      § Often, modernization in the context of data warehousing is confined to just a ‘technology refresh’. How about improving the underlying business processes? That would require a ‘mindset refresh’!
  • 6. 3. Data Era: Challenges
      Regardless of using databases, data lakes, data warehouses or data swamps, businesses all share common challenges related to data in this new era!
        • Data is a game-changing asset, but most organizations struggle to derive business value from it
        • Business decisions are hindered by slow and incomplete data analytics
        • Legacy technologies, skills, methods and mindsets are stalling innovation
        • Changing business requirements are outpacing the capability of IT teams to deliver
        • Silos of diverse data from diverse sources (internal enterprise apps, public agencies, external events, 3rd party services, web, …) are more and more segregated; the data cannot be combined, so business value cannot be derived from it
        • Costly and complex data infrastructure consumes significant resources to build and maintain data platforms
        • Technology selection is becoming more and more challenging due to the abundance of data tools, platforms and services
  • 7. 3. Data Era: Opportunities
      What are some of today’s opportunities in relation to data in this new era?
        • Cloud computing and the increase in computer-processing power, storage and data-gathering abilities enable businesses to rent much-higher-capacity computer infrastructure from third-party vendors, avoiding the hefty costs and additional resources needed to run their own data centers
        • External data sources such as social media data, weather data, demographic data and other third-party data can enrich and augment internal data to help businesses with better decision making
        • Real-Time or Near-Real-Time insights that match fast-paced business and markets, thanks to the maturity of stream processing technologies
        • Data-driven decisions, thanks to growing adoption of Machine Learning and AI technologies and services and to the availability of more powerful compute and cheaper storage services
        • Being a data-driven company is not confined to tech giants anymore
  • 8. 4. Traditional Data Warehouses: Key Challenges
      1. Scalability: growth in data volume, number of users and applications
      2. Concurrency: as the number of users increases, they cannot all operate simultaneously
      3. Performance: slow running queries, …
      4. Resilience: data backup/retention and node failure protection
      5. Complexity: from initial implementation to ongoing maintenance
      6. Lack of native support for semi-structured data: JSON and XML formats are popular for data exchange. In traditional data warehouses, data needs to be transformed first and a schema needs to be defined before loading
      7. High maintenance overhead in the form of constant indexing, tuning, sorting
      8. Handling workload fluctuation: sizing servers for workload peaks and valleys
      9. Waste of resources: expensive computing resources are wasted during off-peak usage
      10. Lack of support for processing streaming data
      11. Upfront costs: hardware, software licenses, staffing, …
      12. Project delays due to provisioning infrastructure
      Surely you are familiar with real-world scenarios such as End of Month Reporting, Intensive Load Process, Demanding Executive Dashboard Users, …
  • 9. 5. Modern Data Warehouses: Examples
      Reference: The Forrester Wave™: Cloud Data Warehouse, Q4 2018, October 29, 2018. The 14 Providers That Matter Most And How They Stack Up
  • 10. 5. Modern Data Warehouses: Examples
  • 11. 5. Modern Data Warehouses: Solutions
      § Major modern data warehouses are cloud based. Most of them do overcome the key challenges of traditional data warehouses. Unfortunately, they overcome these challenges to different degrees, and it is not always a simple yes-or-no check!
      1. Scalability: Scale up, down, or off quickly without delay. Yes for Snowflake, Low for Redshift, Yes for Azure Data Warehouse, Yes for BigQuery (but with limits)
      2. Concurrency: Lots of users can operate simultaneously
      3. Performance: processing bottlenecks and delays, slow running queries, …
      4. Resilience: data backup/retention and node failure protection
      5. Complexity: Cloud data warehouses are easier to use than on-premises data warehouses
      6. Lack of native support for semi-structured data: Although cloud data warehouses do support semi-structured data such as JSON, this support varies from one data warehouse to another.
        − Concurrent throughput for JSON: Yes for Snowflake, No for Redshift, No for Azure Data Warehouse, Moderate for BigQuery.
        − High performance for JSON scans: Yes for Snowflake, No for Redshift unless you use the additional Amazon Redshift Spectrum tool, No for Azure Data Warehouse, Moderate for BigQuery.
  • 12. 5. Modern Data Warehouses: Solutions
      7. High maintenance overhead is not a major issue with managed infrastructure for cloud data warehouses, but the level of maintenance varies from one data warehouse to another. It ranges from creating indices and distribution keys to near-zero management
      8. Handling workload fluctuation: With elasticity, cloud data warehouses adapt to workload peaks and valleys
      9. Waste of resources: No more wasted resources during off-peak usage! Auto suspend, auto resume? Does anybody know about these features of the Snowflake data warehouse?
      10. Lack of support for streaming data: For example, loading and processing is possible with Snowpipe, a serverless compute service from the Snowflake data warehouse
      11. Upfront costs: No upfront costs, as the cloud data warehouse model is pay as you go
      12. Project delays due to provisioning infrastructure: Not applicable for cloud data warehouses. With instant provisioning, projects don’t need to wait for infrastructure
  • 13. 5. Modern Data Warehouses: Snowflake key innovations
      § Snowflake is based on three key innovations:
        1. A unique architecture designed for the cloud and able to provide complete elasticity for all your concurrent users and applications
        2. A database engine that natively handles all your data, both semi-structured and structured, without sacrificing performance or flexibility
        3. Technology that eliminates the need for manual data warehouse management and tuning. No indexing, tuning, partitioning or vacuuming after loading data => effortless management
      § The Snowflake architecture consists of three layers. Each one is physically decoupled from the others and scales independently:
        1. The Data Storage layer uses cloud storage to store all data loaded into Snowflake in a scalable and inexpensive way
        2. The Compute layer consists of virtual warehouses: compute resources that execute the data processing tasks required for queries. The virtual warehouses have access to all of the data in the storage layer
        3. The Cloud Services layer coordinates the entire system, managing security, optimization and metadata
  • 14. 5. Modern Data Warehouses: Snowflake’s multi-cluster, shared data architecture
  • 15. 6. Use Cases + Live Demo
      New Use Cases
        § Data as a Service
        § Self-Service Analytics
        § Advanced Analytics Lab
        § Real-Time Insight
        § Single View of Customer
      Data Consumers
        § Partners, Suppliers, and Vendors
        § Executives, Managers, and Analysts
        § Data Scientists
        § Embedded Applications
        § Many Business Users
  • 16. 6. Use Cases + Live Demo: Snowflake reference architecture
      Data Sources/Data Providers: On-premises or Cloud, Structured or Unstructured Data
      Dataflow Tools: Streaming Data Platforms, ETL and ELT platforms. E: Extract, L: Load, T: Transform
      Data Storage: S3/Azure Storage (missing in the architecture diagram!)
      Analytics Services: BI, UI, Data Science, …
  • 17. 6. Use Case + Live Demo: Live Tracking of CTA Buses
      § The purpose of this live demo is to showcase a few aspects of Snowflake such as ease of use, fast provisioning, native support of semi-structured data, continuous data loading, … Due to time constraints, we won’t cover other unique features of Snowflake such as multi-warehouse concurrency against the same data, data sharing, time travel …
      § The live demo itself is an end-to-end application for live tracking of CTA buses that:
        1. Sends a request every minute to the CTA Bus Tracker web service about live bus locations and gets an immediate response as JSON data
        2. Stores the JSON data in Amazon S3 and triggers a notification event into an SQS queue
        3. Consumes the event from the SQS queue using Snowpipe, a serverless compute service from Snowflake. Snowpipe automatically loads the JSON data, as is with no ETL, from S3 into a Snowflake SQL database
        4. Queries the JSON data using Snowflake’s flatten method, an extension to standard ANSI SQL
        5. Visualizes the bus locations on a map
      § This is not a toy use case! Since JSON is a popular format for data exchange, the same data pipeline can be leveraged in many other use cases with different data sources: internal applications, external enterprise data services, machine-generated data such as IoT devices …
  • 18. 6. Use Case + Live Demo: Live Tracking of CTA Buses
      § What did I use to cook this application?
        • AWS Free Tier: a free account from https://aws.amazon.com/free/
        • Snowflake Test Drive: 30-day free trial through our partnership with Snowflake. Who wants one? https://trial.snowflake.com/?utm_source=cervello&utm_medium=referral&utm_campaign=self-service-partner-referral-cervello
        • CTA (Chicago Transit Authority) Bus Tracker API to get free up-to-the-minute JSON data feeds: https://www.transitchicago.com/developers/bustracker/ You just need to request a free API key from CTA
        • Anaconda: Free and open source distribution of the Python and R programming languages. Comes with a Python IDE (Spyder 3.3.2) and other goodies!
        • Free trial of Tableau Online: https://www.tableau.com/products/cloud-bi
  • 19. 6. Use Cases + Live Demo: Live Tracking of CTA buses
  • 20. 7. Key Takeaways & Where To Go From Here?
      § Try it yourself
        − Snowflake is worth a free trial! https://trial.snowflake.com/?utm_source=cervello&utm_medium=referral&utm_campaign=self-service-partner-referral-cervello
        − You can build a short-term POC in either AWS or Azure while using a US $400 credit for 30 days for both compute and storage
      § Engage the business
        − It is critical to identify business sponsorship and use cases
        − Cervello prefers a short, high-impact workshop approach
      § Get end-to-end help
        − Pick the right data platform … For example, the most ‘popular’ data warehouse is not necessarily the one that fulfills your data warehouse needs! We helped many of our clients move from Amazon Redshift and other traditional data warehouses to Snowflake.
        − Avoid having a pure ‘technology refresh’ and missing the opportunities to add business value (refer to use cases)
      § Get advice
  • 21. 8. Interactive Session: Q&A, discussion and more networking
    § The goal here is for the audience to chime in, now and later, in the comments section of the event or the meetup discussions section.
    § Here are a few samples of food for thought!
      • Can somebody briefly share his/her real-world experience migrating from a legacy data warehouse to a modern data warehouse?
      • Has anybody worked on a fit-for-purpose analysis of modern data warehouses?
      • What are the best practices for using a modern data warehouse such as Snowflake?
      • What advice would you give someone transitioning his/her career to work on a modern data warehouse?
      • Is there anyone who would like to give a follow-up talk on modern data warehousing?
  • 22. Thank You!
    Boston | New York | Dallas | London
    Learn more about Cervello at mycervello.com
    Get in touch with contributors:
      • Slim Baltagi, sbaltagi@gmail.com, https://www.linkedin.com/in/slimbaltagi/, @SlimBaltagi
      • Jim Leavitt, jleavitt@mycervello.com, https://www.linkedin.com/in/jimleavitt/
  • 23. Appendix
  • 24. Cervello Profile
    We are a professional services firm focused on helping organizations win with data. We have a global presence and the ability to serve customers across industries. We are organized around three practices that are underpinned by our expertise in modern data architectures.
    § Practices: PERFORMANCE MANAGEMENT, SUPPLIER + CUSTOMER RELATIONSHIPS, DATA MONETIZATION + PRODUCTS
    § WHAT WE DELIVER
      • DATA MANAGEMENT & MACHINE LEARNING: We embed data management processes and use machine learning to improve data quality
      • MODERN DATA ARCHITECTURE & PLATFORMS: Using leading cloud technology platforms, we develop and design analytics-ready connected data solutions
      • BUSINESS DRIVEN INSIGHTS & ACTIONS (ANALYTICS & DATA SCIENCE, MODELING & PLANNING, MOBILE EXPERIENCE): We provide different methods to consume data to facilitate insights and actions inside and outside the organization
    § HOW WE DELIVER: STRATEGY, BUILD SERVICES, RUN & COE SERVICES
    Our teams are located in Boston, New York, Dallas, London and Bangalore
  • 25. Cervello Value Accelerator: End to End Enablement
    § TRAIN (2 DAYS): Getting around Snowflake, Analyzing & Reporting Data, Data Management & Security
    § PLAN (< 1 WEEK): Use case inventory, Readiness Assessment, Architecture Design
    § ACTION (1-2 WEEKS): Use case & MVP defined, POC Implementation, Stakeholder Presentation, TCO Analysis
    § 9 Years Working In The Cloud
    § 30 Snowflake Trained Consultants
    § 100+ Big Data & Analytics Engagements