Making the Most of In-Memory: More than Speed

The Briefing Room
Welcome

Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com

Twitter Tag: #briefr

Mission

•  Reveal the essential characteristics of enterprise software, good and bad
•  Provide a forum for detailed analysis of today’s innovative technologies
•  Give vendors a chance to explain their product to savvy analysts
•  Allow audience members to pose serious questions... and get answers!

Topics

This Month: DATA PROCESSING
November: DATA DISCOVERY & VISUALIZATION
December: INNOVATORS

Data Processing

“Efficiency is doing things right; effectiveness is doing the right things.”
~Peter Drucker

Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group

robin.bloor@bloorgroup.com

Kognitio
•  Founded in 1989, Kognitio is both an in-memory database and an analytical engine
•  The Kognitio Analytical Platform can be deployed as software, as an appliance, or in the cloud
•  The platform enables flexible, ad hoc queries on complex data sets, including data from Hadoop, and it offers scale-up and scale-out capabilities

Guest: Roger Gaskell

 
Roger Gaskell is the Chief Technology Officer and one of the founding members
of the Kognitio Development Team. He has overall responsibility for all product
development, strategic direction and roadmap of new innovation for the
Kognitio Analytical Platform. Roger has been instrumental in all generations of
the product to date. Over this time, it has evolved from an appliance-based
system in the original beta offering in 1989, to hardware-independent
software for x86 processing, then to a cloud-based Platform-as-a-Service
offering in the mid-1990s. Prior to Kognitio, Roger was test and development
manager at AB Electronics. During this time his primary responsibility was for
the famous BBC Micro Computer and the development and testing of the first
mass production of personal computers for IBM.

Making the most of
in-memory platforms
October 2013
What is an “In-memory” analytical platform?

A database where queries are run from data held in
computer memory (RAM) rather than mechanical disk

Memory = Fast / Disk = Slow
Analytics go much quicker – SIMPLE?

Unfortunately, it’s not as simple as that….
Why in-memory: RAM is faster than disk (really!)
Actually, this is only part of the story:

Workload: filtering vs. crunching
•  Analytics completely change the workload characteristics on the database
•  Simple reporting & transactional processing is all about “filtering” the data of interest
•  Analytics is all about complex “crunching” of the data once it is filtered

CPU cycles
•  Crunching needs processing power & consumes CPU cycles

Storing
•  Storing data on physical disks severely limits the rate at which data can be provided to the CPUs

Access
•  Accessing data directly from RAM allows much more CPU power to be deployed
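
A back-of-envelope sketch of the point above: the rate at which data can be supplied caps how many CPU cores can be kept busy. The bandwidth and per-core figures are illustrative assumptions, not measurements from the presentation, and the function names are hypothetical.

```python
def scan_seconds(data_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to stream `data_gb` of data at a sustained bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return data_gb / bandwidth_gb_per_s


def cores_kept_busy(bandwidth_gb_per_s: float, per_core_gb_per_s: float) -> float:
    """How many cores the data source can feed before cores start to wait."""
    if per_core_gb_per_s <= 0:
        raise ValueError("per-core rate must be positive")
    return bandwidth_gb_per_s / per_core_gb_per_s


# Assumed, illustrative figures only (not taken from the slides):
DISK_GB_PER_S = 0.2      # one mechanical disk, sequential read
RAM_GB_PER_S = 20.0      # memory bandwidth of one server
PER_CORE_GB_PER_S = 1.0  # data one core can crunch per second

if __name__ == "__main__":
    for name, bw in (("disk", DISK_GB_PER_S), ("RAM", RAM_GB_PER_S)):
        print(f"{name}: scan 100 GB in {scan_seconds(100, bw):.0f} s, "
              f"feeds about {cores_kept_busy(bw, PER_CORE_GB_PER_S):.1f} cores")
```

Under these assumed numbers a single disk cannot keep even one core busy, while memory can feed many; the real ratio depends entirely on the hardware and the workload.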
Analytics is about crunching through data
•  To understand what is happening in the data
•  “Crunching”: analytical functions, joins, aggregations, sorts, grouping
•  Crunching is CPU cycle-intensive & CPU-bound
•  The more complex the analytics, the more pronounced this becomes


•  In-memory analytical platforms are therefore CPU-bound
–  Assume disk I/O speeds not a bottleneck
–  In-memory removes the disk I/O bottleneck
For analytics, the CPU is king
Being CPU-bound fundamentally changes
a system’s design philosophy

Disk I/O bound
•  CPUs wait for data from disk
•  No need for efficient coding
•  Parallelization ineffective

CPU bound
•  Every CPU cycle is precious – efficient coding
•  Parallelization = scalable performance
•  Advanced techniques minimize CPU cycles

Interactive / ad hoc analytics:
THINK data to core ratios ≈ <10GB data per CPU core
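
The data-to-core rule of thumb can be turned into a quick sizing check. This is a minimal sketch: only the 10 GB-per-core target comes from the slide (which quotes it as an approximate upper bound); the function names and the example data volume are hypothetical.

```python
import math

GB_PER_CORE_TARGET = 10.0  # rule of thumb from the slide: roughly < 10 GB per core


def data_to_core_ratio(data_gb: float, cores: int) -> float:
    """GB of in-memory data that each CPU core has to crunch."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return data_gb / cores


def min_cores_for(data_gb: float, gb_per_core: float = GB_PER_CORE_TARGET) -> int:
    """Smallest core count that keeps the ratio at or under the target."""
    if gb_per_core <= 0:
        raise ValueError("gb_per_core must be positive")
    return max(1, math.ceil(data_gb / gb_per_core))


if __name__ == "__main__":
    data_gb = 2000  # hypothetical 2 TB working set
    cores = min_cores_for(data_gb)
    print(cores, data_to_core_ratio(data_gb, cores))
```

For the hypothetical 2 TB working set this gives 200 cores at exactly 10 GB per core; adding cores beyond that pushes the ratio under the target.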
Why now?

[Chart: interest in in-memory and price of RAM (logarithmic scale, base 10); time axis marked 1987, 1995, 2000, 2005, 2010]
Mature BI being overtaken
Numbers, tables, charts, indicators
Historical information, latency
…accessed with ease and simplicity

Decision Support
But BI and BI tools have plateaued!
Progression into advanced analytics & data science

It’s now all about doing more math
…a lot more math
Thus more complex methods – real-time
[Chart: Analytical Complexity against Technology/Automation. Items shown: machine learning algorithms, behaviour modelling, statistical analysis, dynamic simulation, clustering, dynamic interaction, fraud detection, reporting & BPM, campaign management]
How to efficiently exploit RAM
•  A large cache is not in-memory
–  In-memory platforms hold data in structures that take advantage of the
properties of RAM
–  Caches are copies of frequently used disk blocks

•  Platform designed to specifically exploit the random
access nature of memory
–  Different algorithms
–  CPU cycles are precious – code efficiency paramount
–  Advanced techniques used to reduce code path length
•  Dynamic Machine Code Generation
•  Extended CPU instruction sets

•  Parallelize everything
–  Scale-out and Scale-up
–  Fully and efficiently use every CPU
core, in every CPU, in every server
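
The "parallelize everything" idea can be sketched as partition, aggregate each partition independently, then merge. This is an illustrative pattern only, not Kognitio's implementation; all names are hypothetical. Note that CPython threads do not run Python bytecode on several cores at once, so the sketch shows the shape of the computation rather than the speedup, which a real platform would get from native code spread across cores and servers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows round-robin into n_parts roughly equal partitions."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_sum(rows):
    """Aggregate one partition: total amount per key."""
    totals = Counter()
    for key, amount in rows:
        totals[key] += amount
    return totals


def parallel_group_sum(rows, n_workers=4):
    """GROUP BY key, SUM(amount): computed per partition, then merged."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, parts))
    merged = Counter()
    for p in partials:
        merged.update(p)  # Counter.update adds values for matching keys
    return dict(merged)


if __name__ == "__main__":
    sales = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
    print(parallel_group_sum(sales, n_workers=2))
```

Because each partition is aggregated without touching the others, the same pattern applies whether the workers are cores in one server (scale-up) or separate servers (scale-out); only the merge step needs to see all partial results.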
Analytical Platform Reference Architecture
•  Application & Client Layer: All BI Tools, All OLAP Clients, Excel, Reporting
•  Analytical Platform Layer; Near-line Storage (optional)
•  Persistence Layer: Kognitio Storage, Hadoop Clusters, Cloud Storage, Enterprise Data Warehouses, Legacy Systems
Perceptions & Questions

Analyst:
Robin Bloor

Big Data, Maybe — Big Parallelism, Yes
Many latency-reducing changes are afoot:
•  Hadoop is a data lake – It’s about latency
•  CPU and memory rule – The old database is dying
•  Grids, not clusters – A server is now a cluster
•  Scaling Up AND Scaling Out – “Only scaling out” is last year’s story
•  SSD will replace spinning disk – But it will never compete with RAM
Why the Excitement?
What are the “new” applications?
BIG DATA capture and staging
BIG DATA ANALYTICS
LITTLE DATA ANALYTICS
OPERATIONAL INTELLIGENCE
A “Modern” Workload

Query Light
&
Math Heavy
Where the Rubber Meets the Road
It isn’t really about application latency any more, it’s
about business process latency (business time!). This
can have many aspects:
u  The

collapse of data flows – take the processing
to the data

u  Data
u  Full

warehouse offload

process automation

u  Lower

latency = NEW BUSINESS PROCESSES
The Question
The question for most organizations is:

Exactly how do we take advantage of these changes?
This is a BUSINESS question AND a TECHNICAL question.
u  Low

latency is exciting, but where do you see the
clear business opportunities?

u  There

seems to be a conundrum about where to
store “slow” data:
Ø  Hadoop?
Ø  Traditional data warehouse?
Ø  New data warehouse?

u  Is

the split between the application and the data
real any more?
u  In

your opinion, does the Enterprise need a new
architecture?

u  How

is it possible to define and monitor service
levels with in-memory applications?

u  Whither

data governance?
Upcoming Topics

This Month: DATA PROCESSING
November: DATA DISCOVERY & VISUALIZATION
December: INNOVATORS

www.insideanalysis.com

Thank You for Your Attention

Microservices, Docker deploy and Microservices source code in C#
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 

Making the Most of In-Memory Speed and Crunching

  • 1. Making the Most of In-Memory: More than Speed The Briefing Room
  • 3. Mission: Reveal the essential characteristics of enterprise software, good and bad; provide a forum for detailed analysis of today's innovative technologies; give vendors a chance to explain their product to savvy analysts; allow audience members to pose serious questions... and get answers! Twitter Tag: #briefr The Briefing Room
  • 4. Topics. This Month: DATA PROCESSING; November: DATA DISCOVERY & VISUALIZATION; December: INNOVATORS
  • 5. Data Processing: “Efficiency is doing things right; effectiveness is doing the right things.” ~Peter Drucker
  • 6. Analyst: Robin Bloor. Robin Bloor is Chief Analyst at The Bloor Group. robin.bloor@bloorgroup.com
  • 7. Kognitio. Founded in 1989, Kognitio is both an in-memory database and an analytical engine. The Kognitio Analytical Platform can be deployed as software, as an appliance, or in the cloud. The platform enables flexible, ad hoc queries on complex data sets, including data from Hadoop, and it offers scale-up and scale-out capabilities.
  • 8. Guest: Roger Gaskell. Roger Gaskell is the Chief Technology Officer and one of the founding members of the Kognitio Development Team. He has overall responsibility for all product development, strategic direction and roadmap of new innovation for the Kognitio Analytical Platform. Roger has been instrumental in all generations of the product to date. Over this time, it has evolved from an appliance-based system in the original beta offering in 1989, to hardware-independent software for x86 processing, then to a cloud-based Platform-as-a-Service offering in the mid-1990s. Prior to Kognitio, Roger was test and development manager at AB Electronics. During this time his primary responsibility was for the famous BBC Micro Computer and the development and testing of the first mass production of personal computers for IBM.
  • 9. Making the most of in-memory platforms. October 2013
  • 10. What is an “in-memory” analytical platform? A database where queries are run against data held in computer memory (RAM) rather than on mechanical disk. Memory = fast / disk = slow, so analytics go much quicker. Simple? Unfortunately, it’s not as simple as that…
  • 11. Why in-memory: RAM is faster than disk (really!) Actually, this is only part of the story. Analytics completely change the workload characteristics on the database. Simple reporting and transactional processing is all about “filtering” the data of interest; analytics is all about complex “crunching” of the data once it is filtered. Crunching needs processing power and consumes CPU cycles. Storing data on physical disks severely limits the rate at which data can be provided to the CPUs; accessing data directly from RAM allows much more CPU power to be deployed. [Diagram labels: workload, filtering, crunching, CPU cycles, storing, access]
  • 12. Analytics is about crunching through data to understand what is happening in the data. “CRUNCHING” (analytical functions, joins, aggregations, sorts, grouping) is CPU cycle-intensive and CPU-bound; the more complex the analytics, the more pronounced this becomes. In-memory analytical platforms are therefore CPU-bound: assume disk I/O speeds are not a bottleneck, because in-memory removes the disk I/O bottleneck.
  • 13. For analytics, the CPU is king. Being CPU-bound fundamentally changes a system’s design philosophy. Disk I/O bound: CPUs wait for data from disk; no need for efficient coding; parallelization ineffective. CPU bound: every CPU cycle is precious, so efficient coding matters; parallelization = scalable performance; advanced techniques minimize CPU cycles. Interactive / ad hoc analytics: THINK data-to-core ratios ≈ <10GB data per CPU core.
  • 14. Why now? Interest in in-memory. [Chart: price of RAM, logarithmic (base 10) scale, 1987 to 2010]
  • 15. Mature BI being overtaken. Decision support: numbers, tables, charts, indicators; historical information, latency; …accessed with ease and simplicity. But BI and BI tools have plateaued! Progression into advanced analytics & data science: it’s now all about doing more math …a lot more math.
  • 16. Thus more complex methods – real-time. [Chart: Analytical Complexity against Technology/Automation. Labels: machine learning algorithms, behaviour modelling, statistical analysis, dynamic simulation, clustering, dynamic interaction, fraud detection, reporting & BPM, campaign management]
  • 17. How to efficiently exploit RAM. A large cache is not in-memory: in-memory platforms hold data in structures that take advantage of the properties of RAM, whereas caches are copies of frequently used disk blocks. Platform designed specifically to exploit the random-access nature of memory: different algorithms; CPU cycles are precious, so code efficiency is paramount; advanced techniques used to reduce code path length (dynamic machine code generation, extended CPU instruction sets). Parallelize everything: scale-out and scale-up; fully and efficiently use every CPU core, in every CPU, in every server.
  • 18. Analytical Platform Reference Architecture. [Diagram] Application & Client Layer: all BI tools, all OLAP clients, Excel, reporting. Analytical Platform Layer: near-line storage (optional). Persistence Layer: Kognitio storage, Hadoop clusters, cloud storage, enterprise data warehouses, legacy systems.
  • 19. Perceptions & Questions. Analyst: Robin Bloor
  • 20.
  • 21. Big Data, Maybe; Big Parallelism, Yes. Many latency-reducing changes are afoot: Hadoop is a data lake (it’s about latency); CPU and memory rule (the old database is dying); grids, not clusters (a server is now a cluster); scaling up AND scaling out (“only scaling out” is last year’s story); SSD will replace spinning disk (but it will never compete with RAM).
  • 22. Why the Excitement? What are the “new” applications? BIG DATA capture and staging; BIG DATA ANALYTICS; LITTLE DATA ANALYTICS; OPERATIONAL INTELLIGENCE
  • 23. A “Modern” Workload: Query Light & Math Heavy
  • 24. Where the Rubber Meets the Road. It isn’t really about application latency any more, it’s about business process latency (business time!). This can have many aspects: the collapse of data flows (take the processing to the data); data warehouse offload; full process automation; lower latency = NEW BUSINESS PROCESSES.
  • 25. The Question. The question for most organizations is: exactly how do we take advantage of these changes? This is a BUSINESS question AND a TECHNICAL question.
  • 26. Low latency is exciting, but where do you see the clear business opportunities? There seems to be a conundrum about where to store “slow” data: Hadoop? Traditional data warehouse? New data warehouse? Is the split between the application and the data real any more?
  • 27. In your opinion, does the Enterprise need a new architecture? How is it possible to define and monitor service levels with in-memory applications? Whither data governance?
  • 28. Twitter Tag: #briefr The Briefing Room
  • 29. Upcoming Topics. This Month: DATA PROCESSING; November: DATA DISCOVERY & VISUALIZATION; December: INNOVATORS. www.insideanalysis.com
  • 30. Thank You for Your Attention
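
The rule of thumb on slide 13 (roughly under 10GB of data per CPU core for interactive or ad hoc analytics) lends itself to a quick sizing calculation. What follows is only a minimal sketch, not Kognitio's sizing method: the function names `min_cores_for_interactive` and `data_per_core` are hypothetical, and the 10GB-per-core default is an assumption lifted from that single slide figure.

```python
import math

# Rule of thumb quoted on slide 13: roughly <10 GB of data per CPU core
# for interactive / ad hoc analytics. Treated here as an assumed upper bound.
GB_PER_CORE = 10.0


def min_cores_for_interactive(data_gb: float, gb_per_core: float = GB_PER_CORE) -> int:
    """Return the smallest core count that keeps data per core at or below gb_per_core."""
    if data_gb < 0 or gb_per_core <= 0:
        raise ValueError("data_gb must be >= 0 and gb_per_core must be > 0")
    return math.ceil(data_gb / gb_per_core)


def data_per_core(data_gb: float, cores: int) -> float:
    """Return the data-to-core ratio in GB per core for a given deployment."""
    if cores <= 0:
        raise ValueError("cores must be > 0")
    return data_gb / cores
```

For example, `min_cores_for_interactive(2000)` suggests at least 200 cores for 2,000GB of in-memory data, and `data_per_core(2000, 200)` confirms that deployment sits at 10GB per core. Real sizing would also depend on query complexity and concurrency, which the slide does not quantify.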