Powering real-time big data analytics
with a next-gen GPU database
November 1, 2017
Matt Aslett
Research Director, Data Platforms
and Analytics Channel
451 Research
Dipti Borkar
Vice President, Product Marketing
Kinetica
Housekeeping Items
2
Questions?
To ask a question, click on the
question button
Presentation Slides
A copy of the presentation will be
provided to all attendees
Feedback
Don’t forget to leave feedback
at the end of the webinar
Today’s speakers
3
Matt Aslett
Research Director, Data Platforms and Analytics Channel, 451 Research
Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and
analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality,
data management, analytics, machine learning and advanced analytics. Matt's own primary area of focus includes data
management, reporting and analytics, and exploring how the various data platforms and analytics technology sectors
are converging in the form of next-generation data platforms.
Dipti Borkar
Vice President, Product Marketing, Kinetica
Dipti has over 15 years of experience in database technology across relational and non-relational databases. Prior to
Kinetica, Dipti was Vice President of Product Marketing at Couchbase and held several leadership positions there
including Head of Global Technical Sales and Head of Product Management.
Earlier in her career Dipti was a part of the product team at MarkLogic and managed development teams at IBM DB2
where she started her career as a database software engineer. Dipti holds a Master's degree in Computer Science from
the University of California, San Diego with a specialization in databases, and an MBA from the Haas School of
Business at University of California, Berkeley.
Powering real-time big data analytics
with a next-gen GPU database
Matt Aslett
Research Director, Data Platforms & Analytics
451 Research is a leading IT research & advisory company
5
Founded in 2000
300+ employees, including over 120 analysts
2,000+ clients: Technology & Service providers, corporate
advisory, finance, professional services, and IT decision makers
70,000+ IT professionals, business users and consumers in our research
community
Over 52 million data points published each quarter and 4,500+ reports
published each year
3,000+ technology & service providers under coverage
451 Research and its sister company, Uptime Institute, are the two divisions
of The 451 Group
Headquartered in New York City, with offices in London, Boston, San
Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia,
Taiwan, Singapore and Malaysia
Research & Data
Advisory
Events
Go 2 Market
6
Big data and beyond
7
• V is for various things…
but does not define big data
• To understand the trends driving ‘big
data’, 451 Research focused beyond the
nature of the data on what enterprises
wanted to do with it
• Totality – storing and processing all data (or as much as is economically viable)
• Exploration – schema-free approaches to analyzing data to identify new
patterns
• Frequency – more frequent analysis of data to enable real-time decision
making
Traditional systems of engagement and analysis
9
New systems of analysis
10
New systems of engagement
11
New systems of intelligence
12
New systems of intelligence
13
Emergence of GPU databases
▪ Potential customers that are doing deep
learning and more advanced analytics on
HPC systems that leverage GPU
processors
▪ Data scientists or other specialists need
to pull data from a system of record and
load it into an HPC system to perform the
analytics leveraging certain algorithms.
14
15
Emergence of GPU databases
• While HPC systems are well equipped to
handle advanced analytics because they
leverage GPUs, there is also a price to be
paid as it requires moving the data from
one system to the other.
• GPU databases open up the door for
machine learning, deep learning and
other advanced analytical workloads to
be run alongside BI workloads, within the
same environment.
CPUs and GPUs
• A CPU is a very good general processor,
handling a variety of complex tasks well.
• A GPU is more specialized and can do
certain tasks extremely well.
• CPUs consist of multiple cores
• GPUs consist of thousands of cores
• CPUs are geared for serial operations
• GPUs are geared for parallel operations
▪ Can be paired together for the greatest overall optimization
16
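The serial-versus-parallel contrast on this slide can be sketched in plain Python. This is an illustration only: it uses CPU threads from the standard library rather than a GPU, and the function names and worker count are hypothetical. The point is that when every element is processed independently, the work can be cut into chunks and handed to many workers at once.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_serial(values, factor):
    """Serial style: one worker walks the data element by element."""
    out = []
    for v in values:
        out.append(v * factor)
    return out


def scale_chunked(values, factor, workers=4):
    """Data-parallel style: split into chunks, process each independently."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so the parts can be concatenated
        parts = pool.map(lambda c: [v * factor for v in c], chunks)
        return [v for part in parts for v in part]


data = list(range(10))
assert scale_serial(data, 3) == scale_chunked(data, 3)
```

Both functions return the same result; only the way the work is divided differs. On a GPU the same idea is applied across thousands of cores instead of a handful of threads.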
What’s required for analytics?
17
Methods
Data
Processing
CPUs for standard SQL-based BI
18
Methods
Data
Processing
SQL
CPU
GPUs extend analytical benefits
19
Methods
Data
Processing
SQL
CPU
ML/DL
GPU
Some benefits of GPUs
▪ Performance,
acceleration
▪ Data sets, large/scale
▪ Analytics, machine
learning, deep learning
▪ Querying, real-time
dashboards, reports
▪ Visualization,
interactive, drill down
Key takeaways
20
Thank You!
matthew.aslett@451research.com
@maslett
www.451research.com
21
Powering real-time big data analytics with a
next-gen GPU database
Dipti Borkar | VP, Product Marketing | dborkar@kinetica.com
Company
80+, enterprise and startup expertise
Awards Customers and Partners
Investors
$50m Series A June 2017
Ray Lane
Company | Summary
2014
2016
23
Advances in Big Data Processing
1990 - 2000’s: DATA WAREHOUSE
RDBMS & Data Warehouse
technologies enable
organizations to store and
analyze growing volumes of data
on high performance machines,
but at high cost.
2005…: DISTRIBUTED STORAGE
Hadoop and MapReduce
enable distributed storage and
processing across multiple
machines.
Storing massive volumes of data
becomes more affordable, but
performance is slow.
2010…: AFFORDABLE MEMORY
Affordable memory allows for
faster data read and write.
HANA, MemSQL, & Exadata
provide faster analytics.
AT SCALE PROCESSING
BECOMES THE
BOTTLENECK
2017…: GPU ACCELERATED COMPUTE
GPU cores bulk process tasks in
parallel, which is far more efficient for many
data-intensive tasks than CPUs,
which process those tasks sequentially.
24
GPU | Tale of Numbers
Performance (100x Performance Increase)
>100x gains over traditional
RDBMS / NoSQL / In-Mem
Databases
Cores (4000 vs. 32)
Modern GPUs can consist of
up to 3000+ cores compared
to 32 in a CPU
Costs (75% Infrastructure Cost Savings)
75% reduction in
infrastructure costs, licensing,
staff, etc.
More with Less
Increase performance,
throughput, capability while
minimizing the costs to
support the business
GPUs are designed around thousands of small, efficient cores that
are well suited to performing repeated similar instructions in
parallel – making them ideal for the compute-intensive workloads
required of large data sets.
25
Kinetica: Core
26
ANALYTICS DATABASE ACCELERATED BY GPUs
KINETICA
Commodity Hardware
w/ GPUs
Disk
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
GPU Accelerated
Columnar In-memory Database
HTTP Head Node
Columnar in-memory database
Data available much like a traditional RDBMS… rows,
columns
Data held in-memory; persisted to disk
Interact with Kinetica through its native REST API,
Java, Python, JavaScript, NodeJS, C++, SQL, etc… as
well as with various connectors
Native GIS & IP address object support
VERY FAST: Ideal for OLAP workloads
Typical hardware setup: 256GB - 1TB
memory with 2-4 GPUs per node.
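To make "columnar" concrete, here is a minimal Python sketch of the difference between a row layout and a column layout. It does not show Kinetica's storage format; the table, field names, and the `to_columns` helper are invented for illustration.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "west", "amount": 120.0},
    {"id": 2, "region": "east", "amount": 75.5},
    {"id": 3, "region": "west", "amount": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one attribute reads a single list of values
# instead of visiting every field of every row.
total_amount = sum(columns["amount"])
```

Analytic (OLAP) queries usually read a few attributes across many rows, which is why a column layout suits them.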
Kinetica Architecture
27
ETL / STREAM
PROCESSING
ON DEMAND SCALE OUT +
1TB MEM / 2 GPU CARDS
SQL
Native
APIs
PARALLEL INGEST
Geospatial
WMS
Custom
Connectors
In-Database Processing
CUSTOM LOGIC
BIDMach
ML Libs
BI DASHBOARDS
BI / GIS / APPS
CUSTOM APPS
& GEOSPATIAL
KINETICA ‘REVEAL’
STREAMING DATA
ERP/CRM/TRANSACTIONAL DATA
UDFs
The Kinetica cluster architecture
VISUALIZATION via ODBC/JDBC
APIs
Java API
JavaScript API
REST API
C++ API
Node.js API
Python API
OPEN SOURCE
INTEGRATION
Apache NiFi
Apache Kafka
Apache Spark
Apache Storm
GEOSPATIAL CAPABILITIES
Geometric
Objects
Tracks
Geospatial
Endpoints
WMS
WKT
KINETICA CLUSTER
On-Demand Scale
Commodity Hardware
w/ GPUs
Disk
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
Columnar
In-memory
HTTP Server
(The cluster diagram repeats this node four times.)
OTHER
INTEGRATION
Message Queues
ETL Tools
Streaming Tools
28
Parallel Ingest Provides High Performance Streaming
29
1 NODE (1TB/2GPU)
PARALLEL
INGEST
1 NODE (1TB/2GPU)
1 NODE (1TB/2GPU)
Each node of the system can share the task of data
ingest, which provides higher throughput. Ingest can
be made faster by adding more nodes.
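One common way to spread ingest across nodes is to route each record by a hash of its key. The slides do not say how Kinetica assigns records to nodes, so the sketch below is an assumption made for illustration, with hypothetical function names and a made-up `id` key field.

```python
import zlib


def shard_for(key, num_nodes):
    """Pick a node for a record from a stable hash of its key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(records, num_nodes, key_field="id"):
    """Group incoming records into one batch per node."""
    batches = [[] for _ in range(num_nodes)]
    for record in records:
        node = shard_for(str(record[key_field]), num_nodes)
        batches[node].append(record)
    return batches


records = [{"id": i, "value": i * 10} for i in range(1000)]
batches = distribute(records, num_nodes=3)
```

Each batch can then be sent to its node independently, so adding nodes adds ingest paths rather than lengthening a single queue.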
50-100x Faster on Queries with Large Datasets
• Large retailer tested complex SQL queries
on 3 years of retail data (150bn rows)
• 10-node Kinetica cluster compared against a 30TB+
cluster from the next-best alternative
• The GPU is able to perform many instructions in
parallel, giving large performance gains on
aggregations, group bys, joins, etc.
• Kinetica sustained ingest of 1.3bn
objects/minute with 70 attributes per row
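Aggregations and group bys parallelize well because each partition of the data can be summarized on its own and the partial results merged afterwards. The following Python sketch shows that pattern on toy data; the store names, amounts, and function names are invented, and it is not a description of Kinetica's query engine.

```python
from collections import defaultdict


def partial_sums(rows):
    """Aggregate one partition: sum of amount per store."""
    sums = defaultdict(float)
    for store, amount in rows:
        sums[store] += amount
    return dict(sums)


def merge(partials):
    """Combine per-partition results into the final GROUP BY answer."""
    total = defaultdict(float)
    for partial in partials:
        for store, amount in partial.items():
            total[store] += amount
    return dict(total)


partitions = [
    [("A", 10.0), ("B", 5.0)],
    [("A", 2.5), ("C", 1.0)],
]
result = merge(partial_sums(p) for p in partitions)
```

Because `partial_sums` never looks outside its own partition, the partitions can be processed at the same time on separate cores or nodes.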
30
WHEN COMPARED TO LEADING IN-MEMORY ALTERNATIVES
Kinetica | Combined Strengths and Capabilities
Fast, Distributed
Database Engine
Taking advantage of the parallel nature
of the GPU, Kinetica delivers
low-latency, high-performance analytics on
large and streaming data sets.
Supercharge
BI
Simultaneously ingest,
explore, analyze, and
visualize data within
milliseconds to make critical
decisions.
In-Database
Analytics
User-defined functions (UDFs) allow
for distributed custom compute
directly from within the database.
Native
Geospatial &
Visualization
Pipeline
Easier to work with large
geospatial data sets.
32
Copyright (C) 2017 451 Research LLC
New systems of intelligence
33
Use Cases
FASTER BI WITH A GPU DATABASE
35
Tableau + Kinetica
Kinetica combines the GPU’s brute-force compute with the
simplicity of a relational database for millisecond query
response on massive data sets without extensive
tuning.
• Incredibly fast query performance.
• Distributed design - ideal for large and streaming datasets.
• SQL-92 compliant relational database – without limits.
• More power means less need for tuning, indexing, and
administration of the database.
• No need to do pre-aggregation or build out cubes.
• Reduce reliance on specialized skills to prep and set up
data.
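The "no pre-aggregation or cubes" point can be illustrated with an ad-hoc GROUP BY that runs directly on raw rows. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in SQL engine; it is not Kinetica, and the `sales` table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", 2016, 10.0), ("A", 2017, 20.0), ("B", 2017, 5.0)],
)


def totals_by(column):
    """Ad-hoc GROUP BY over raw rows; no summary table is built first."""
    # Column names cannot be bound as parameters, so use an allow-list.
    if column not in ("store", "sale_year"):
        raise ValueError(column)
    query = (
        f"SELECT {column}, SUM(amount) FROM sales "
        f"GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(query).fetchall()
```

A dashboard can call `totals_by("store")` or `totals_by("sale_year")` on demand. With a cube, each of those groupings would have to be computed and stored ahead of time.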
36
Rethink interaction between business analyst & data scientist
SPECIALIZED AI/ DATA
SCIENCE TOOLS
SUBSET
DATA SCIENTISTS
BUSINESS USERS
EXTRACT
EXTRACTING DATA FOR AI IS
EXPENSIVE AND SLOW
ENTERPRISES
STRUGGLE TO MAKE
AI MODELS AVAILABLE
TO BUSINESS
???
• MapReduce
• Spark
• NoSQL DBs
• SQL Databases
• DFS
• CPU Compute Nodes
• GPU Compute Nodes
Proliferation of Hardware &
Software Components
Kinetica | The Ideal Process – Consolidate the BI / AI stack
37
Monte Carlo Risk
Custom Function 2
Custom Function 3
API EXPOSES CUSTOM
FUNCTIONS WHICH CAN BE
MADE AVAILABLE TO BUSINESS
USERS
BUSINESS USERS
DATA SCIENTISTS
UDFs
• Analytics
• AI/ML/Deep Learning
• Power of in-memory SQL
• Integrated CPU/GPU
• Bomb with Streams
Single Database Platform for
AI + BI
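The slide names "Monte Carlo Risk" as an example of a custom function exposed to business users. As a rough idea of what such a function might compute, here is a small value-at-risk estimate in plain Python. The normal-returns model, the function name, and every parameter are assumptions made for illustration; this is not Kinetica's UDF API.

```python
import random


def monte_carlo_var(mean, stdev, confidence=0.95, trials=10_000, seed=0):
    """Estimate value-at-risk from simulated normally distributed returns."""
    rng = random.Random(seed)  # seeded so repeated calls give the same answer
    returns = sorted(rng.gauss(mean, stdev) for _ in range(trials))
    # The return that only (1 - confidence) of simulated trials fall below
    cutoff = returns[int((1 - confidence) * trials)]
    return -cutoff  # report the loss as a positive number


daily_var = monte_carlo_var(mean=0.0, stdev=0.02)
```

Each trial is independent of the others, which is what makes this kind of simulation a natural fit for parallel hardware.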
AI & BI on One GPU-Accelerated Database
HIGH PERFORMANCE ANALYTICS
DATABASE
UDF UDF UDF
ODBC / JDBC
Native REST API
WMS
BUSINESS INTELLIGENCE
CUSTOM APPLICATIONS
HIGH FIDELITY
GEOSPATIAL PIPELINE
MACHINE LEARNING
& DEEP LEARNING GPU-ACCELERATED
DATA SCIENCE
PREDICTIVE MODELS
e.g. Risk Management,
Sales Volume, Fraud.
BIDMach
SQL
DATA SCIENTISTS
/ DEVELOPERS
BUSINESS
USERS
38
Distributed Geospatial Pipeline
39
NATIVE VISUALIZATION IS DESIGNED FOR FAST MOVING, LOCATION-BASED DATA
Native Geospatial Object Types
• Points, Shapes, Tracks, Labels
Native Geospatial Functions
• Filters (by area, by series, by geometry, etc.)
• Aggregation (histograms)
• Geofencing - triggers
• Video generation (based on dates/times)
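A geofence trigger comes down to testing whether a moving point is inside a region and reacting when that answer changes. The sketch below uses the standard ray-casting test in plain Python. The function names, fence, and track are invented, and nothing here reflects how Kinetica implements geofencing.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: toggle on each polygon edge crossed by a ray to the right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def geofence_events(track, fence):
    """Return (index, 'enter' or 'exit') each time a track crosses the fence."""
    events = []
    was_inside = False
    for i, (x, y) in enumerate(track):
        now_inside = point_in_polygon(x, y, fence)
        if now_inside != was_inside:
            events.append((i, "enter" if now_inside else "exit"))
        was_inside = now_inside
    return events


fence = [(0, 0), (4, 0), (4, 4), (0, 4)]
track = [(-1, 2), (1, 2), (3, 2), (5, 2)]
events = geofence_events(track, fence)
```

Here the track enters the square at its second point and leaves at its fourth, so two events are produced.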
Generate Map Overlay Imagery (via WMS)
• Rasterize points
• Style based on attributes (class-break)
• Heat maps
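Rasterizing points into a heat map amounts to counting how many points fall into each cell of a grid laid over the map extent. A minimal Python sketch follows, with a hypothetical `heat_grid` function and toy coordinates; a real WMS response would go on to turn these counts into a colored image.

```python
def heat_grid(points, bounds, width, height):
    """Count points per cell of a width x height grid laid over bounds."""
    min_x, min_y, max_x, max_y = bounds
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # skip points outside the requested map extent
        col = min(int((x - min_x) / (max_x - min_x) * width), width - 1)
        row = min(int((y - min_y) / (max_y - min_y) * height), height - 1)
        grid[row][col] += 1
    return grid


points = [(0.5, 0.5), (0.6, 0.4), (3.5, 3.5), (9.0, 9.0)]
grid = heat_grid(points, bounds=(0, 0, 4, 4), width=2, height=2)
```

Each point's cell is computed independently, so the counting can be divided across many workers and the per-worker grids added together.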
Customer Case-studies
ENTERTAINMENT | Customer 360
41
CASE STUDY : BI ACCELERATION
BUSINESS OBJECTIVE
• Accelerate Tableau dashboards for faster customer 360 analytics
NEW CAPABILITIES DELIVERED
• 24X faster dashboard loads
• 3.5X faster slice and dice, drilldowns, filters
SOLUTION OVERVIEW
• Tableau Server and Kinetica running on Google Cloud Platform
• Kinetica accelerates EDW workload
• Simply point to Kinetica using Tableau’s replace data source feature
42
AD TECH | Real-time reporting & ad delivery
CASE STUDY : REAL-TIME DATA AND ANALYTICS
BUSINESS OBJECTIVE
• Be first to market with game changing technologies that put publishers’
needs first
• Support PubMatic’s real-time campaign reporting
NEW CAPABILITIES DELIVERED
• High-speed ingest, store, and persist data processing capabilities
• Ad-hoc analytics on ad impression and bid data
SOLUTION OVERVIEW
• Kinetica considered as a functional replacement for a 40-node Apache
Apex cluster -> smaller HW footprint
• High-speed data ingestion via native Kafka integration
• Python access to Kinetica data store for simplified data science discovery
• Contributed fast data capabilities to long term retention and archive
Hadoop Data Lake
“At PubMatic, we are consistently focused on being early to
market with leading technologies that put publishers’ needs
first. Processing over one trillion ad impressions
monthly, PubMatic provides omni-channel revenue automation
technology for publishers and programmatic tools for media
buyers. Leveraging leading edge data and technology
innovation, Kinetica contributes high-speed ingest, store,
and persist data processing capabilities in support
of PubMatic’s real-time reporting and ad pacing engine.”
- Vasu Cherlopalle, Vice President of Big Data and Analytics
“One of the things I like about
Kinetica is it gives us more of a
general-purpose use of the
technology. There has been a lot
of software created to answer
certain questions [but] highly
specialized tools have limited
functionality and are tuned to do
a certain workload.”
- Mark Ramsey, Chief Data Officer at GSK
BUSINESS OBJECTIVE
• Faster processing of transcriptomics to run simulations of
chemical reactions for drug discovery, research, and
development
NEW CAPABILITIES DELIVERED
• In-database processing to develop models, leveraging GPU
acceleration for performance, and direct access to CUDA APIs
via UDFs deployed within Kinetica
• Seek out signals from a massive collection of drug targets
combined from external data, historical data from
experiments, and clinical trials
SOLUTION OVERVIEW
• Kinetica running on-premises on a cluster of 7 HPE DL 380
servers
• Familiar relational database with GPU acceleration
LIFE SCIENCES : GENOMICS RESEARCH
CASE STUDY : ADVANCED IN-DATABASE ANALYTICS
43
PIPELINE & WELL ANALYTICS
44
CASE STUDY : LOCATION BASED ANALYTICS
BUSINESS OBJECTIVE
• Augment SaaS offering to provide research data and
analytics on oil and gas to energy investors and operators
with geospatial query, visualization, and analytics
NEW CAPABILITIES DELIVERED
• Geospatial visualization and analytics of a massive number of
wells and pipelines by land ownership, region, etc.
• Custom visualizations and charts for data-driven insights
• Embedded solution with seamless Node.js integration, GPU
acceleration
SOLUTION OVERVIEW
• Kinetica running in RSEG’s Amazon Web Services VPC
deployment
LOGISTICS | Workforce optimization
BUSINESS OBJECTIVE
• Deliver better business services, optimize operations, and save
costs across 600,000 employees, 215,000 delivery vehicles, and
500 million pieces of mail delivered daily
NEW CAPABILITIES DELIVERED
• Real-time delivery and pickup notifications, shipment routing,
just-in-time supplies
• Real-time route optimization - route planning, rerouting
• Geospatial analytics to uncover overlapping coverage areas,
uncovered areas, and distribution bottlenecks
SOLUTION OVERVIEW
• USPS runs Kinetica as a 70 TB in-memory database on a 200-node HPE DL
380 system. Each node consists of a single x86 blade
server with 1TB RAM and 2 NVIDIA K80 GPUs
• Kinetica collects, processes, and analyzes 200,000 messages
per minute for real-time streaming analytics. 15,000 daily
sessions with five-nines uptime
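The figures on this slide can be combined into a few derived numbers. The inputs below are taken from the slide; the per-node data figure assumes the database is spread evenly across nodes, which the slide does not state.

```python
nodes = 200                    # from the slide
ram_per_node_tb = 1            # from the slide
gpus_per_node = 2              # K80 GPUs per node, from the slide
database_tb = 70               # from the slide
messages_per_minute = 200_000  # from the slide

total_ram_tb = nodes * ram_per_node_tb           # RAM across the cluster
total_gpus = nodes * gpus_per_node               # GPUs across the cluster
data_per_node_tb = database_tb / nodes           # assumes an even spread
messages_per_second = messages_per_minute / 60   # average ingest rate
```

This gives 200 TB of RAM and 400 GPUs in total, about 0.35 TB of data per node, and roughly 3,333 messages per second on average.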
45
PERFORMANCE SCALABLE CONVERGED AI AND BI
INDUSTRY-STANDARD
CONNECTIVITY
• Distributed
• Columnar
• In-Memory
• Relational
• GPU Accelerated
• Ingest, Query, Compute
• Commodity Hardware
• On-premises or Cloud
• Scales to 100s of TB
• Less Infrastructure
• More Compute
• Predictable, Linear
• Machine Learning
• Artificial Intelligence
• In-Database
• Self-Service
• Open Source
• Kafka, Storm, NiFi, Spark
• ODBC, JDBC
• ANSI SQL/92
• APIs for Java, JS, C++,
Python, Node.js, REST
Summary | Kinetica GPU Accelerated Analytics
46
Thank you!
Dipti Borkar | VP, Product Marketing | dborkar@kinetica.com

Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & Bénéfices
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
 
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
 

Último

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Último (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

Powering Real-Time Big Data Analytics with a Next-Gen GPU Database

  • 1. Powering real-time big data analytics with a next-gen GPU database November 1, 2017 Matt Aslett Research Director, Data Platforms and Analytics Channel 451 Research Dipti Borkar Vice President, Product Marketing Kinetica
  • 2. Housekeeping Items. Questions? To ask a question, click on the question button. Presentation Slides: A copy of the presentation will be provided to all attendees. Feedback: Don’t forget to leave feedback at the end of the webinar.
  • 3. Today’s speakers 3 Matt Aslett Research Director, Data Platforms and Analytics Channel, 451 Research Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality, data management, analytics, machine learning and advanced analytics. Matt's own primary area of focus includes data management, reporting and analytics, and exploring how the various data platforms and analytics technology sectors are converging in the form of next-generation data platforms. Dipti Borkar Vice President, Product Marketing, Kinetica Dipti has over 15 years experience in database technology across relational and non-relational databases. Prior to Kinetica, Dipti was Vice President of Product Marketing at Couchbase and held several leadership positions there including Head of Global Technical Sales and Head of Product Management. Earlier in her career Dipti was a part of the product team at MarkLogic and managed development teams at IBM DB2 where she started her career as a database software engineer. Dipti holds a Masters degree in Computer Science from the University of California, San Diego with a specialization in databases, and an MBA from the Haas School of Business at University of California, Berkeley.
  • 4. Powering real-time big data analytics with a next-gen GPU database Matt Aslett Research Director, Data Platforms & Analytics
  • 5. 451 Research is a leading IT research & advisory company 5 Founded in 2000 300+ employees, including over 120 analysts 2,000+ clients: Technology & Service providers, corporate advisory, finance, professional services, and IT decision makers 70,000+ IT professionals, business users and consumers in our research community Over 52 million data points published each quarter and 4,500+ reports published each year 3,000+ technology & service providers under coverage 451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia Research & Data Advisory Events Go 2 Market
  • 6. 6
  • 7. Big data and beyond 7 • V is for various things… but does not define big data • To understand the trends driving ‘big data’, 451 Research focused beyond the nature of the data on what enterprises wanted to do with it
  • 8. Big data and beyond 8 • V is for various things… but does not define big data • To understand the trends driving ‘big data’, 451 Research focused beyond the nature of the data on what enterprises wanted to do with it • Totality – storing and processing all data (or as much as is economically viable) • Exploration – schema-free approaches to analyzing data to identify new patterns • Frequency – more frequent analysis of data to enable real-time decision making
  • 9. Traditional systems of engagement and analysis 9
  • 10. New systems of analysis 10
  • 11. New systems of engagement 11
  • 12. New systems of intelligence 12
  • 13. New systems of intelligence 13
  • 14. Emergence of GPU databases ▪ Potential customers that are doing deep learning and more advanced analytics on HPC systems that leverage GPU processors ▪ Data scientists or other specialists need to pull data from a system of record and load it into an HPC system to perform the analytics leveraging certain algorithms. 14
  • 15. 15 Emergence of GPU databases • While HPC systems are well equipped to handle advanced analytics because they leverage GPUs, there is also a price to be paid as it requires moving the data from one system to the other. • GPU databases open up the door for machine learning, deep learning and other advanced analytical workloads to be run alongside BI workloads, within the same environment.
  • 16. CPUs and GPUs • A CPU is a very good general processor, handling a variety of complex tasks well. • A GPU is more specialized and can do certain tasks extremely well. • CPUs consist of multiple cores • GPUs consist of thousands of cores • CPUs are geared for serial operations • GPUs are geared for parallel operations ▪ Can be paired together for the greatest overall optimization 16
  • 17. What’s required for analytics? 17 Methods Data Processing
  • 18. CPUs for standard SQL-based BI 18 Methods Data Processing SQL CPU
  • 19. GPUs extend analytical benefits 19 Methods Data Processing SQL CPU ML/DL GPU Some benefits of GPUs ▪ Performance, acceleration ▪ Data sets, large/scale ▪ Analytics, machine learning, deep learning ▪ Querying, real-time dashboards, reports ▪ Visualization, interactive, drill down
  • 22. Powering real-time big data analytics with a next-gen GPU database Dipti Borkar | VP, Product Marketing | dborkar@kinetica.com
  • 23. Company 80+, enterprise and startup expertise Awards Customers and Partners Investors $50m Series A June 2017 Ray Lane Company| Summary 2014 2016 23
  • 24. Advances in Big Data Processing DATA WAREHOUSE RDBMS & Data Warehouse technologies enable organizations to store and analyze growing volumes of data on high performance machines, but at high cost. DISTRIBUTED STORAGE Hadoop and MapReduce enable distributed storage and processing across multiple machines. Storing massive volumes of data becomes more affordable, but performance is slow. AFFORDABLE MEMORY Affordable memory allows for faster data read and write. HANA, MemSQL, & Exadata provide faster analytics. 1990 - 2000’s 2005… 2010… 2017… AT SCALE PROCESSING BECOMES THE BOTTLENECK GPU ACCELERATED COMPUTE GPU cores bulk process tasks in parallel - far more efficient for many data-intensive tasks than CPUs, which process those tasks linearly. 24
  • 25. GPU | Tale of Numbers 100x 75% Performance >100x gains over traditional RDBMS / NoSQL / In-Mem Databases Cores Modern GPUs can consist of up to 3000+ cores compared to 32 in a CPU Costs 75% reduction in infrastructure costs, licensing, staff, etc. More with Less Increase performance, throughput, capability while minimizing the costs to support the business GPUs are designed around thousands of small, efficient cores that are well suited to performing repeated similar instructions in parallel – making them ideal for the compute-intensive workloads required of large data sets. Performance Increase Infrastructure Cost Savings 4000 vs. 32 25
  • 26. Kinetica: Core 26 ANALYTICS DATABASE ACCELERATED BY GPUs KINETICA Commodity Hardware w/ GPUs Disk A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4 GPU Accelerated Columnar In-memory Database HTTP Head Node Columnar in-memory database Data available much like a traditional RDBMS… rows, columns Data held in-memory; persisted to disk Interact with Kinetica through its native REST API, Java, Python, JavaScript, NodeJS, C++, SQL, etc… as well as with various connectors Native GIS & IP address object support VERY FAST: Ideal for OLAP workloads Typical hardware setup: 256GB - 1TB memory with 2-4 GPUs per node.
  • 27. Kinetica Architecture 27 ETL / STREAM PROCESSING ON DEMAND SCALE OUT + 1TB MEM / 2 GPU CARDS SQL Native APIs PARALLEL INGEST Geospatial WMS Custom Connectors In-Database Processing CUSTOM LOGIC BIDMach ML Libs BI DASHBOARDS BI / GIS / APPS CUSTOM APPS & GEOSPATIAL KINETICA ‘REVEAL’ STREAMING DATA ERP/CRM/TRANSACTIONAL DATA UDFs
  • 28. The Kinetica cluster architecture VISUALIZATION via ODBC/JDBC APIs Java API JavaScript API REST API C++ API Node.js API Python API OPEN SOURCE INTEGRATION Apache NiFi Apache Kafka Apache Spark Apache Storm GEOSPATIAL CAPABILITIES Geometric Objects Tracks Geospatial Endpoints WMS WKT KINETICA CLUSTER On-Demand Scale Commodity Hardware w/ GPUs Disk A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4 Columnar In-memory HTTP Server Commodity Hardware w/ GPUs Disk A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4 Columnar In-memory HTTP Server Commodity Hardware w/ GPUs Disk A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4 Columnar In-memory HTTP Server Commodity Hardware w/ GPUs Disk A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4 Columnar In-memory HTTP Server OTHER INTEGRATION Message Queues ETL Tools Streaming Tools 28
  • 29. Parallel Ingest Provides High Performance Streaming 29 1 NODE (1TB/2GPU) PARALLEL INGEST 1 NODE (1TB/2GPU) 1 NODE (1TB/2GPU) Each node of the system can share the task of data ingest, which provides higher throughput. Ingest can be made faster by adding more nodes.
  • 30. 50-100x Faster on Queries with Large Datasets • Large retailer tested complex SQL queries on 3 years of retail data (150bn rows) • 10 node Kinetica cluster against 30TB+ cluster from next best alternative • GPU is able to perform many instructions in parallel. Huge performance gains on aggregations, group bys, joins, etc. • Kinetica sustained ingest of 1.3bn objects/minute with 70 attributes per row 30 WHEN COMPARED TO LEADING IN-MEMORY ALTERNATIVES
  • 31. Combined Strengths and Capabilities
  • 32. Kinetica | Combined Strengths and Capabilities Supercharge BI Taking advantage of the parallel nature of the GPU, Kinetica delivers low-latency, high-performance analytics on large and streaming data sets. Simultaneously ingest, explore, analyze, and visualize data within milliseconds to make critical decisions. User-defined functions (UDFs) allow for distributed custom compute directly from within the database. Easier to work with large geospatial data sets. Fast, Distributed Database Engine In-Database Analytics Native Geospatial & Visualization Pipeline 32
  • 33. Copyright (C) 2017 451 Research LLC New systems of intelligence 33
  • 35. FASTER BI WITH A GPU DATABASE 35 Tableau + Kinetica Kinetica combines GPU’s brute-force compute with the simplicity of a relational database for millisecond query response on massive data sets without extensive tuning. • Incredibly fast query performance. • Distributed design - ideal for large and streaming datasets. • SQL-92 compliant relational database – without limits. • More power means less need for tuning, indexing, and administration of the database. • No need to do pre-aggregation or build out cubes. • Reduce reliance on specialized skills to prep and set-up data.
  • 36. 36 Rethink interaction between business analyst & data scientist SPECIALIZED AI/ DATA SCIENCE TOOLS SUBSET DATA SCIENTISTS BUSINESS USERS EXTRACT EXTRACTING DATA FOR AI IS EXPENSIVE AND SLOW ENTERPRISES STRUGGLE TO MAKE AI MODELS AVAILABLE TO BUSINESS ??? • MapReduce • Spark • NoSQL DBs • SQL Databases • DFS • CPU Compute Nodes • GPU Compute Nodes Proliferation of Hardware & Software Components
  • 37. Kinetica | The Ideal Process – Consolidate the BI / AI stack 37 Monte Carlo Risk Custom Function 2 Custom Function 3 API EXPOSES CUSTOM FUNCTIONS WHICH CAN BE MADE AVAILABLE TO BUSINESS USERS BUSINESS USERS DATA SCIENTISTS UDFs • Analytics • AI/ML/Deep Learning • Power of in-memory SQL • Integrated CPU/GPU • Bomb with Streams Single Database Platform for AI + BI
  • 38. AI & BI on One GPU-Accelerated Database HIGH PERFORMANCE ANALYTICS DATABASE UDF UDF UDF ODBC / JDBC Native REST API WMS BUSINESS INTELLIGENCE CUSTOM APPLICATIONS HIGH FIDELITY GEOSPATIAL PIPELINE MACHINE LEARNING & DEEP LEARNING GPU-ACCELERATED DATA SCIENCE PREDICTIVE MODELS e.g. Risk Management, Sales Volume, Fraud. BIDMach SQL DATA SCIENTISTS / DEVELOPERS BUSINESS USERS 38
  • 39. Distributed Geospatial Pipeline 39 NATIVE VISUALIZATION IS DESIGNED FOR FAST MOVING, LOCATION-BASED DATA Native Geospatial Object Types • Points, Shapes, Tracks, Labels Native Geospatial Functions • Filters (by area, by series, by geometry, etc.) • Aggregation (histograms) • Geofencing - triggers • Video generation (based on dates/times) Generate Map Overlay Imagery (via WMS) • Rasterize points • Style based on attributes (class-break) • Heat maps
  • 41. ENTERTAINMENT | Customer 360 41 CASE STUDY : BI ACCELERATION BUSINESS OBJECTIVE • Accelerate Tableau dashboards for faster customer 360 analytics NEW CAPABILITIES DELIVERED • 24X faster dashboard loads • 3.5X faster slice and dice, drilldowns, filters SOLUTION OVERVIEW • Tableau Server and Kinetica running on Google Cloud Platform • Kinetica accelerates EDW workload • Simply point to Kinetica using Tableau’s replace data source feature
  • 42. 42 AD TECH | Real-time reporting & ad delivery CASE STUDY : REAL-TIME DATA AND ANALYTICS BUSINESS OBJECTIVE • Be first to market with game changing technologies that put publishers’ needs first • Support PubMatic’s real-time campaign reporting NEW CAPABILITIES DELIVERED • High-speed ingest, store, and persist data processing capabilities • Ad-hoc analytics on ad impression and bid data SOLUTION OVERVIEW • Kinetica considered as a functional replacement for a 40-node Apache Apex cluster -> smaller HW footprint • Hi-speed data ingestion via native Kafka integration • Python access to Kinetica data store for simplified data science discovery • Contributed fast data capabilities to long term retention and archive Hadoop Data Lake “At PubMatic, we are consistently focused on being early to market with leading technologies that put publishers’ needs first. Processing over one trillion ad impressions monthly, PubMatic provides omni-channel revenue automation technology for publishers and programmatic tools for media buyers. Leveraging leading edge data and technology innovation, Kinetica contributes high-speed ingest, store, and persist data processing capabilities in support of PubMatic’s real-time reporting and ad pacing engine.” - Vasu Cherlopalle, Vice President of Big Data and Analytics
  • 43. “One of the things I like about Kinetica is it gives us more of a general-purpose use of the technology. There has been a lot of software created to answer certain questions [but] highly specialized tools have limited functionality and are tuned to do a certain workload.” Mark Ramsey, Chief Data Officer at GSK BUSINESS OBJECTIVE • Faster processing of transcriptomics to run simulations of chemical reactions for drug discovery, research, and development NEW CAPABILITIES DELIVERED • In-database processing to develop models, leveraging GPU acceleration for performance, and direct access to CUDA APIs via UDFs deployed within Kinetica • Seek out signals from massive collection of drug targets combined from external data, historical data from experiments, and clinical trials SOLUTION OVERVIEW • Kinetica running on-premises on a cluster of 7 HPE DL 380 servers • Familiar relational database with GPU acceleration LIFE SCIENCES : GENOMICS RESEARCH CASE STUDY : ADVANCED IN-DATABASE ANALYTICS 43
  • 44. PIPELINE & WELL ANALYTICS 44 CASE STUDY : LOCATION BASED ANALYTICS BUSINESS OBJECTIVE • Augment SaaS offering to provide research data and analytics on oil and gas to energy investors and operators with geospatial query, visualization, and analytics NEW CAPABILITIES DELIVERED • Geospatial visualization and analytics of massive number of wells, pipelines by land ownership, region etc. • Custom visualizations and charts for data-driven insights • Embedded solution with seamless Node.js integration, GPU acceleration SOLUTION OVERVIEW • Kinetica running in RSEG’s Amazon Web Services VPC deployment
  • 45. LOGISTICS | Workforce optimization BUSINESS OBJECTIVE • Deliver better business services, optimize operations, and save costs across 600,000 employees, 215,000 delivery vehicles, and 500 million pieces of mail delivered daily NEW CAPABILITIES DELIVERED • Real-time delivery and pickup notifications, shipment routing, just-in-time supplies • Real-time route optimization - route planning, rerouting • Geospatial analytics to uncover overlapping coverage areas, uncovered areas, and distribution bottlenecks SOLUTION OVERVIEW • USPS runs Kinetica as a 70 TB in-memory database on an HPE DL 380 200 node system. Each node consists of a single X86 blade server with 1TB RAM, 2 NVIDIA K80 GPUs • Kinetica collects, processes, and analyzes 200,000 messages per minute for real-time streaming analytics. 15,000 daily sessions with 5 9’s uptime 45
  • 46. PERFORMANCE SCALABLE CONVERGED AI AND BI INDUSTRY-STANDARD CONNECTIVITY • Distributed • Columnar • In-Memory • Relational • GPU Accelerated • Ingest, Query, Compute • Commodity Hardware • On-premises or Cloud • Scales to 100s of TB • Less Infrastructure • More Compute • Predictable, Linear • Machine Learning • Artificial Intelligence • In-Database • Self-Service • Open Source • Kafka, Storm, NiFi, Spark • ODBC, JDBC • ANSI SQL/92 • APIs for Java, JS, C++, Python, Node.js, REST Summary | Kinetica GPU Accelerated Analytics 46
  • 47. Thank you! Dipti Borkar | VP Product Marketing | dborkar@kinetica.com
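
Slides 16, 26 and 30 describe data held in columns and processed by many cores at once, with the largest gains on aggregations and group-bys. As a rough illustration only, the sketch below shows that data-parallel pattern in plain Python: a columnar table is split into chunks, each chunk is aggregated independently, and the partial results are merged. This is not Kinetica's API or implementation; the table contents and the names `columns`, `partial_sum` and `parallel_group_sum` are hypothetical, and Python threads stand in for GPU cores purely to show the structure.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical columnar table: each column is stored as its own list.
columns = {
    "store": ["A", "B", "A", "C", "B", "A"],
    "sales": [10, 20, 30, 40, 50, 60],
}

def partial_sum(keys, values):
    """Aggregate one chunk independently of every other chunk."""
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
    return totals

def parallel_group_sum(keys, values, workers=3):
    """Split the columns into chunks, aggregate them concurrently, then merge."""
    size = max(1, -(-len(keys) // workers))  # ceiling division
    chunks = [(keys[i:i + size], values[i:i + size])
              for i in range(0, len(keys), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: partial_sum(*chunk), chunks))
    # Merge step: combine the per-chunk totals into one result.
    merged = defaultdict(int)
    for part in partials:
        for key, total in part.items():
            merged[key] += total
    return dict(merged)

# Equivalent in spirit to: SELECT store, SUM(sales) ... GROUP BY store
result = parallel_group_sum(columns["store"], columns["sales"])
```

Because each chunk touches only its own slice of the two columns, the per-chunk work needs no coordination until the final merge, which is why this kind of query tends to scale with the number of available cores.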