The document provides an overview of the changing analytic environment and the evolution of the data warehouse. It discusses how new requirements like performance, usability, optimization, and ecosystem integration are driving the adoption of a real-time data warehouse approach. A real-time data warehouse is described as having low latency ingestion, in-memory and disk-optimized storage, and the ability to power both operational and machine learning applications. Examples are given of companies using a real-time data warehouse to enable real-time analytics and improve business processes.
The State of the Data Warehouse in 2017 and Beyond
1. The State of the Data Warehouse in 2017 and Beyond
Presented by
2. Copyright (C) 2017 451 Research LLC
The Changing Analytic Environment
James Curtis, Senior Analyst, Data Platforms & Analytics
3. 451 Research is a leading IT research & advisory company
Founded in 2000
300+ employees, including over 120 analysts
2,000+ clients: Technology & Service providers, corporate advisory, finance, professional services, and IT decision makers
50,000+ IT professionals, business users and consumers in our research community
Over 52 million data points published each quarter and 4,500+ reports published each year
3,000+ technology & service providers under coverage
451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group
Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia
Research & Data
Advisory
Events
Go 2 Market
4. A combination of research & data is delivered across fifteen channels aligned to the prevailing topics and technologies of digital infrastructure… from the datacenter core to the mobile edge.
5. Agenda
• Data Platforms & Analytics
• Some Trends
• The Evolving Data Warehouse
• The Evolution of Analytics
• Key Takeaways
6. Data Platforms & Analytics
§ Technologies to store, process and analyze data
§ Collect/analyze data to identify potential opportunities for improvement
§ Includes:
• Operational, analytic databases, Hadoop, data grid/cache, event/stream processing
• Data management/integration technologies to prepare data for analysis, and analytics tools
7. Some Trends
TREND: The lines will continue to blur between operational and analytical databases.
RECOMMENDATION:
• Avoid seeing transactional and analytic databases as two totally different systems
• Think about existing database admin and BI skills
TREND: Machine learning and deep learning will enter a new phase of strategic adoption for predictive analytics.
RECOMMENDATION:
• Don’t ignore the demand for predictive analytics using machine learning
• Balance user-friendliness against complexity
TREND: Stream processing adoption will accelerate as companies grapple with fast data.
RECOMMENDATION:
• Does your existing data processing and analytics infrastructure handle streaming data?
• Understand the business use case for streaming data
Source: 451 Research. 2016/2017 Trends in Data Platforms and Analytics. Oct 2015/2016.
8. A Growing Market
Source: 451 Research Market Monitor. Total Data: Platforms & Analytics. May 2017.
9. “He that will not apply new remedies must expect new evils.”
−Francis Bacon
10. Enterprise Data Warehouse: Common characteristics
Diagram labels: Enterprise Applications; Data Warehouse; Decision Makers; Data Analysts; IT Pros
11. What’s driving the change?
• Compute options
• Storage choices
• Organizational expectations
• Open source software
• Data, data, and more data
12. Adapt and Expand Our Field of Vision
Diagram labels: Enterprise Applications; Data Warehouse; Decision Makers; Data Analysts; IT Pros
13. Expanded Processing Choices
Diagram labels: Enterprise Applications; Cloud Storage; Hadoop; Spark; AI+ML; Data Warehouse; Decision Makers; Data Analysts; IT Pros
14. Leads to Expansion of Data Sources
Diagram labels: Enterprise Applications; Cloud Storage; Mobile Apps; Bots; IoT Devices and Sensors; Social Media; Log and Clickstream Data; Hadoop; Spark; AI+ML; Data Warehouse; Decision Makers; Data Analysts; IT Pros
15. Which leads to More Advanced Decision-Making Processes
Diagram labels: Enterprise Applications; Cloud Storage; Mobile Apps; Bots; IoT Devices and Sensors; Social Media; Log and Clickstream Data; Hadoop; Spark; AI+ML; Data Warehouse; Decision Makers; Data Analysts; IT Pros; Business Users; Data-Driven Applications; Data Scientists; OT Users
23. New Data Warehouse Requirements
Performance: Intraday results on fast-growing data
Usability: Easier to set up, tune, and scale
Optimization: Address scale and performance at a lower cost
Ecosystem: Drive operational and machine learning applications
25. Real-Time Data Warehouse Explained
§ Low latency between data generation and analysis
§ Micro batching or stream ingestion
§ Sub-second views on operational data and applications
§ Transaction processing for accelerated transformation
§ Extensibility functions for ML applications
§ Durable for operational readiness
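As a rough illustration of the "micro batching" ingestion style listed above, the following Python sketch groups an incoming event stream into small batches and loads each batch as soon as it fills, which keeps the delay between data generation and analysis short. The names micro_batches, ingest, and max_batch_size are hypothetical and chosen only for this example; they are not part of any product described in this deck, and a real system would typically also flush batches on a timer.

```python
from typing import Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def micro_batches(events: Iterable[T], max_batch_size: int) -> Iterator[List[T]]:
    """Group a stream of events into small batches (hypothetical helper)."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batch: List[T] = []
    for event in events:
        batch.append(event)
        if len(batch) == max_batch_size:
            yield batch          # hand off a full batch immediately
            batch = []
    if batch:
        yield batch              # flush the final, partially filled batch


def ingest(events: Iterable[T], max_batch_size: int = 3) -> Tuple[List[T], int]:
    """Load events into an in-memory 'table' one micro-batch at a time.

    Returns the loaded rows and the number of batch loads performed.
    The list stands in for a warehouse table; this is an assumption
    made for illustration only.
    """
    table: List[T] = []
    loads = 0
    for batch in micro_batches(events, max_batch_size):
        table.extend(batch)      # one bulk load per micro-batch
        loads += 1
    return table, loads
```

A smaller max_batch_size lowers latency at the cost of more load operations; a larger one does the opposite. Pure stream ingestion corresponds to the limiting case of one event per batch.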
26. MemSQL: A Real-Time Data Warehouse
Streaming Data Ingest: Easy-to-set-up real-time data pipelines with exactly-once semantics
Live Data: Memory optimized tables for analyzing real-time events
Historical Data: Disk optimized tables with up to 10x compression and vectorized queries for fast analytics
27. Real-Time Data Warehouse Ecosystem
Streaming Ingest: Real-Time Data Pipelines
Live Data: Memory Optimized Tables
Historical Data: Disk Optimized Tables
Sources: Real-Time Data Messaging and Transforms (Kafka, Spark); Historical Data (Relational, Hadoop, Amazon S3)
Consumers: Real-Time Application Analytics; Business Intelligence Dashboards
Deployment: Bare Metal, Virtual Machines, Containers; On-Premises, Cloud, As a Service
37. Real-time data with high concurrency tracking millions of cars, drivers, and riders to optimize fleet operations
39. BUSINESS BENEFITS
• Real-time data at massive concurrency, with millions of drivers, riders, and employees accessing the database at once
• Enables real-time indicators to understand operating performance
• Geospatial indexing for live location-based analysis
• Company-wide dashboard for global trends
TECHNICAL BENEFITS
• Analyze millions of rows/second
• Analyze historical and live data simultaneously
• Massive concurrency: Hundreds of users query reporting databases
40. Real-time analytics transformed profitability analysis of customer logistics from weekly to daily, and reduced latency from days to minutes
41. BUSINESS BENEFITS
• Real-time analytics transformed profitability analysis of customer logistics data
• Reduced data latency from hours to minutes, giving business users access to the most recent data
TECHNICAL BENEFITS
• Reduced a 22-hour ETL to minutes
• Made query responses 80x faster than MySQL