New Business Applications Powered by In-Memory Technologies from Academia
MIT Forum for Supply Chain Innovation
HPI, February 2011
Paul Hofmann, PhD
VP, Group of the Chief Scientist
SAP Chief Scientist Group
Examples of Projects with In-Memory Technology in Collaboration with Academia
© SAP AG 2011. All rights reserved. / Page 2 
Taking In-Memory Computing Seriously
Coherent Shared Memory – BigIron in Palo Alto
Keep programming model simple AND solve very complex problems
Princeton University
• Optimal pricing for energy management, online pricing, truck scheduling, …
Infinite DRAM - RAMCloud
Stanford University and HPI
• Extremely low latency and very high bandwidth
• Facebook-like problems with high read AND write rates
• Advanced analytics, what-if scenarios, demand planning, ...
Hybrid In-Memory Store
MIT CSAIL and HPI
• Aggregate column store – the best of both worlds
Multithreading Real Time Event Platform
MIT Auto-ID Lab and HPI
• 500k events/s and millions of threads in-memory or distributed
• Automatic meter reading, online billing, mobile billing, Smart Grid
Taking In-Memory Computing Seriously 
Chief Scientist Group 
Princeton University – Operations Research and Financial Engineering 
Warren Powell, et al 
Taking In-Memory Computing Seriously!
Basic Assumptions
• Disk is tape - active data must be in DRAM
• Data locality is king → avoid cache misses and stalled CPUs
Problems and Opportunities for In-Memory Computing
• Addressable DRAM per box is limited, unlike hard disk capacity.
We need to scale memory independently from physical boxes
• Scaling Architecture
– Arbitrary scaling of the amount of data stored in DRAM
– Arbitrary & independent scaling of number of active users & associated computing load
• Inter-Process Communication is slow and hard to program (latencies are in the range of 0.5-1 ms)
We can do better
Taking In-Memory Computing Seriously!
How can we do better?
• Coherent Shared Memory or ccNUMA
All CPUs can access all memory and all I/O channels in about 1 μs
We can scale DRAM and CPUs independently
You need more computing power – add another board …
You need more DRAM – add another board …
• Merge Application Server & DB Server – reference memory directly from app
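The gap between the 0.5-1 ms inter-process latency quoted earlier and the roughly 1 μs coherent-memory access can be made concrete with some back-of-the-envelope arithmetic. The following is only a sketch: the latencies are the slide figures, while the function names and the workload of one million dependent accesses are illustrative assumptions.

```python
# Back-of-the-envelope latency comparison. The latencies are the slide figures;
# the number of dependent accesses is an assumed workload.

IPC_LATENCY_S = (0.5e-3, 1.0e-3)  # inter-process communication: 0.5-1 ms
SHARED_MEMORY_LATENCY_S = 1.0e-6  # coherent shared memory: about 1 microsecond


def total_time_s(accesses: int, latency_s: float) -> float:
    """Time for `accesses` round trips issued strictly one after another."""
    return accesses * latency_s


def speedup_range() -> tuple:
    """Ratio of IPC latency to shared-memory latency (low end, high end)."""
    low, high = IPC_LATENCY_S
    return low / SHARED_MEMORY_LATENCY_S, high / SHARED_MEMORY_LATENCY_S


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of dependent accesses
    print("IPC, best case:  %.0f s" % total_time_s(n, IPC_LATENCY_S[0]))
    print("IPC, worst case: %.0f s" % total_time_s(n, IPC_LATENCY_S[1]))
    print("Shared memory:   %.0f s" % total_time_s(n, SHARED_MEMORY_LATENCY_S))
```

Under these assumptions, a chain of one million dependent accesses takes roughly 500 to 1,000 seconds over inter-process communication and about one second through coherent shared memory, a factor of 500 to 1,000.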
BigIron - A System We Architected for HANA with Leading-Edge Cluster Server Components
Big Iron 2
Extreme Performance, Scalability, and much simpler system model
• Large shared coherent memory (5TB) across servers via ScaleMP
• 160 cores (320 HT)
System Specifications
Research Server Cluster
• 5 x 4U Servers (4 Intel Xeon X7560 2.26 GHz)
• 160 cores (32 cores/server)
• 5TB memory (64 x 16GB DDR3/server)
• 30TB SSD (solid state disk) storage
• 5 Networks
– VPN of ScaleMP (40-160Gb IB)
– VPN of Server Cluster (10GbE)
– VPN of Storage Array (10GbE)
– VPN of SAP Internal Network (10MbE metered)
– Firewalled GW to Internet (1GbE Expandable)
• 1 NAS (72TB Expandable to 180)
• 1 x 48U Rack
• System Software
– SLES11 Linux OS Licenses
– ScaleMP vSMP Licenses
• System cost: $618K with tax and support
Architecture, Assembly, & Hosting
System architecture: SAP Technology Infrastructure Research Practice
Assembly and Test: Colfax International
Hosting: Bay Area Internet Solutions, Santa Clara, CA
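The headline figures on this slide can be cross-checked against the per-server specification. The sketch below assumes 8 cores per Xeon X7560 socket (which follows from 32 cores per server over 4 sockets), two hardware threads per core, and 16 GB DIMMs, the module size that yields the stated 5 TB total. The constant and function names are illustrative.

```python
# Consistency check of the cluster figures quoted on the slide.
SERVERS = 5
SOCKETS_PER_SERVER = 4    # 4 Intel Xeon X7560 per 4U server
CORES_PER_SOCKET = 8      # assumed: 32 cores/server spread over 4 sockets
DIMMS_PER_SERVER = 64
DIMM_SIZE_GB = 16         # assumed: the size that adds up to 5 TB in total
SYSTEM_COST_KUSD = 618.0  # $618K with tax and support


def cluster_totals():
    """Return (cores, hardware threads, memory in TB) for the whole cluster."""
    cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
    threads = cores * 2  # Hyper-Threading: two hardware threads per core
    memory_tb = SERVERS * DIMMS_PER_SERVER * DIMM_SIZE_GB / 1024
    return cores, threads, memory_tb


def cost_per_tb_kusd():
    """System cost divided by total DRAM, in thousands of dollars per TB."""
    return SYSTEM_COST_KUSD / cluster_totals()[2]


if __name__ == "__main__":
    print(cluster_totals())
    print(cost_per_tb_kusd())
```

With these inputs the totals come to 160 cores, 320 hardware threads, and 5 TB of DRAM, matching the slide. Note that the system cost covers the whole cluster, so the per-terabyte figure is an upper bound on what the memory itself costs.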
Coherent Shared Memory – Improved Productivity at Low Price
Before: Traditional server clusters – Distributed memory
[Figure: SAP developer for in-memory computing; programming of queries and distribution of data across Physical Server #1, Physical Server #2, … Physical Server n]
• Developers need to distribute queries and data across physical servers; access to data requires mastering complex communications protocols
• Design trapped at a single “scale” – platform growth forces redesign every couple of years
• Specialized programming skills held only by SAP’s top developers
After: Server Cluster with Coherent Shared Memory
[Figure: traditional SAP developer; coherent shared memory via software approach spanning Physical Server #1, Physical Server #2, … Physical Server n]
• Developers can treat system as one “big” server and let the operating system and lower level software handle the problem
• Initial design is timeless – hardware scaling handled below app design layer
• Developers do not need additional skills for in-memory computing
Solve Very Compute-Intensive Problems Like Stochastic Optimization
We need to juggle intermittent energy from wind or solar and volatile electricity prices to meet time-varying loads – Princeton has the necessary algorithms
[Figure: time series of wind speed, electricity prices, and load]
We can reduce compute time from days to minutes!
Modeling uncertainty in power scheduling
The effect of modeling uncertainty in wind
[Figure: bar chart comparing uncertain forecast, perfect forecast, and constant wind for 2% wind and 40% wind; vertical axis from 0 to 12]
Modeling Uncertainty In Power Scheduling 
Designing energy portfolios…. 
… is like building a stone wall. You can do a perfect job with a perfect forecast. The 
challenge is dealing with uncertainty. 
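The difference between planning with a single forecast and planning under uncertainty can be illustrated with a toy scheduling problem. This is a minimal sketch using sample average approximation over random wind scenarios, not the Princeton algorithms mentioned in the slides. The load, prices, and wind distribution are all assumed values.

```python
import random

LOAD_MW = 100.0     # demand to be met (assumed)
GEN_COST = 30.0     # $/MWh for generation committed in advance (assumed)
SPOT_PRICE = 120.0  # $/MWh for covering a shortfall in real time (assumed)


def cost(commit_mw, wind_mw):
    """Cost of one scenario: committed generation plus any spot purchase."""
    shortfall = max(LOAD_MW - wind_mw - commit_mw, 0.0)
    return GEN_COST * commit_mw + SPOT_PRICE * shortfall


def expected_cost(commit_mw, scenarios):
    """Average cost of a commitment over all wind scenarios."""
    return sum(cost(commit_mw, w) for w in scenarios) / len(scenarios)


def plan(scenarios):
    """Return (mean-forecast commitment, scenario-based commitment)."""
    mean_wind = sum(scenarios) / len(scenarios)
    q_det = max(LOAD_MW - mean_wind, 0.0)  # treats the mean as a perfect forecast
    candidates = [float(q) for q in range(0, int(LOAD_MW) + 1)] + [q_det]
    q_sto = min(candidates, key=lambda q: expected_cost(q, scenarios))
    return q_det, q_sto


rng = random.Random(0)  # fixed seed so runs are repeatable
scenarios = [max(rng.gauss(40.0, 15.0), 0.0) for _ in range(2000)]
q_det, q_sto = plan(scenarios)

if __name__ == "__main__":
    print("mean-forecast plan:", q_det, expected_cost(q_det, scenarios))
    print("scenario-based plan:", q_sto, expected_cost(q_sto, scenarios))
```

Because the scenario-based plan is chosen from a candidate set that includes the mean-forecast commitment, its average cost over the scenarios can never be higher. With a shortfall priced well above committed generation, it tends to commit more than the mean forecast suggests, which is the hedge against uncertainty.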
In the beginning…. 
Infinite DRAM with RAM Cloud 
Stanford University and HPI 
John Ousterhout, Mendel Rosenblum, Christian Tinnefeld, et al 
Impact of Latency for Internet Applications
Large-scale Apps Struggle with High Latency
[Figure: Web Application; Application Server and Storage Server; 0.5-10 ms latency]
RAMCloud – Stanford University
Create a distributed storage system that keeps data entirely in DRAM
Combine the main memories of thousands of servers
Use a high-end network (10 Gigabit Ethernet / InfiniBand)
Replicate data synchronously into the main memories of other nodes
Write to disk asynchronously, only for backup/archival purposes
This results in:
• Gracefully scaling storage solution
• Very low access latency AND high bandwidth
• Latency of memory access via network 1-5 μs
• Eventually consistent data storage – consistency is sub-second
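The write path described on this slide (synchronous replication into the memory of other nodes, asynchronous writes to disk) can be sketched in a few lines. This is a toy, single-process model and not RAMCloud's API: the class, its methods, and the use of dictionaries in place of remote servers are all hypothetical.

```python
import queue
import threading


class ToyDramStore:
    """Toy model of the write path; not RAMCloud's API, all names hypothetical."""

    def __init__(self, replicas=2):
        self.primary = {}                          # stands in for the primary's DRAM
        self.replicas = [{} for _ in range(replicas)]  # DRAM of other nodes
        self.disk_log = []                         # stands in for backup storage
        self._pending = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:      # synchronous: done before returning
            replica[key] = value
        self._pending.put((key, value))    # asynchronous: disk catches up later

    def read(self, key):
        return self.primary[key]           # reads are served from memory only

    def _drain(self):
        """Background thread that moves queued writes to the disk log."""
        while True:
            item = self._pending.get()
            self.disk_log.append(item)
            self._pending.task_done()

    def flush(self):
        """Block until every queued write has reached the disk log."""
        self._pending.join()


if __name__ == "__main__":
    store = ToyDramStore(replicas=2)
    store.write("order:1", {"value": 250.0})
    print(store.read("order:1"))
    store.flush()
    print(len(store.disk_log))
```

The point of the design is that a write returns as soon as the in-memory copies exist, so disk latency never sits on the critical path of a request.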
RAMCloud – Stanford University
RAMCloud is ideal for apps with millions of concurrent users that need more complex data structures than a key-value store, e.g. Facebook, Quora, Yelp, etc.
Research Questions in the Context of Enterprise Applications
What-If Analytics
• Ideally, companies want to do what-if analytics on their complete history of transactional data, not on a subset (think of WalMart or Unilever)
• What-if analytics need high bursts of read AND write access
• How can the needed data be modeled and placed in a RAMCloud?
• How can the needed data operations be implemented in a scalable way?
Transactional Properties
• Enterprise applications rely on transactional properties
• RAMCloud provides extremely low latency, but is only eventually consistent
• Are the reduced transaction times sufficient to avoid lock wait times or reduce aborted transactions?
Hybrid In-Memory Store HYRISE
MIT CSAIL and HPI 
Sam Madden, Philippe Cudre-Mauroux, Jens Krueger, Martin Grund, et al 
Hybrid Partitioning for Mixed Workloads
[Figure: the same table (Row 1 to Row 4; columns Doc Num, Doc Date, Sold-To, Value, Sales Org, Status) shown as Row Store, Column Store, and Hybrid Store, with OLTP and OLAP workloads]
• Row Store: columns in the order Doc Num, Doc Date, Sold-To, Value, Sales Org, Status
• Column Store: columns in the order Doc Num, Doc Date, Sold-To, Value, Sales Org, Status
• Hybrid Store: columns regrouped in the order Doc Num, Sold-To, Doc Date, Value, Sales Org, Status
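The trade-off between the three layouts can be sketched by storing one small table as a set of vertical partitions. The column names follow the slide; the sample rows, the helper names, and in particular the hybrid grouping (Doc Num with Sold-To, the remaining four columns together) are assumptions made for illustration.

```python
COLUMNS = ["DocNum", "DocDate", "SoldTo", "Value", "SalesOrg", "Status"]

# Made-up sample rows for illustration only.
ROWS = [
    (1, "2011-02-01", "C100", 250.0, "US01", "open"),
    (2, "2011-02-01", "C200", 975.5, "DE01", "paid"),
    (3, "2011-02-02", "C100", 40.0, "US01", "open"),
    (4, "2011-02-03", "C300", 610.0, "FR01", "paid"),
]

ROW_LAYOUT = [COLUMNS]                  # one wide partition
COLUMN_LAYOUT = [[c] for c in COLUMNS]  # one partition per attribute
HYBRID_LAYOUT = [["DocNum", "SoldTo"],  # assumed grouping
                 ["DocDate", "Value", "SalesOrg", "Status"]]


def partition(rows, layout):
    """Store the table as one list of tuples per column group."""
    index = {name: i for i, name in enumerate(COLUMNS)}
    return {
        tuple(group): [tuple(row[index[c]] for c in group) for row in rows]
        for group in layout
    }


def access_cost(layout, needed, n_rows):
    """Return (partitions touched, values read) when reading `needed` for every row."""
    touched = [g for g in layout if set(g) & set(needed)]
    return len(touched), n_rows * sum(len(g) for g in touched)


if __name__ == "__main__":
    for name, layout in [("row", ROW_LAYOUT), ("column", COLUMN_LAYOUT),
                         ("hybrid", HYBRID_LAYOUT)]:
        print(name, access_cost(layout, ["Value"], len(ROWS)),
              access_cost(layout, COLUMNS, len(ROWS)))
```

Scanning a single attribute reads the fewest values in the column layout, while reconstructing whole tuples touches the fewest partitions in the row layout. A hybrid layout sits between the two, which is why the grouping should follow the workload.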
HYRISE Architecture
■ Query Processor - chooses the best possible query plan for the hybrid data storage structure
■ Layout Manager - given a workload, it performs an evaluation based on a cache-miss-based cost model and generates the best possible layout
■ Storage Manager - main memory hybrid storage manager capable of vertical partitioning in single relations
Cost Model
Used to describe the layout-dependent costs based on cache misses
Combine complex access patterns from simpler ones (scan, lookup, …)
Measure
• All different access variants to the hybrid storage layer - full projection, partial projection, selection, tuple reconstruction
• Different performance parameters - compiler settings, hardware pre-fetcher, …
Observe
• Combined behavior of all components – look at total workload
• Count cache misses using the CPU hardware performance counter
Understand
• Develop the cost model
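To give a feel for what a cache-miss-based cost model estimates, here is a deliberately simplified sketch for one access pattern, a full scan that projects a few bytes from every row of a partition. It is not the HYRISE cost model: it assumes 64-byte cache lines and ignores alignment, prefetching, and rows that straddle lines. The function names are illustrative.

```python
CACHE_LINE_BYTES = 64  # typical x86 cache line size (assumed)


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def scan_cache_misses(n_rows: int, partition_width: int, projected_bytes: int) -> int:
    """Estimate cache lines loaded when scanning `projected_bytes` of every row.

    `partition_width` is the number of bytes one row occupies in the partition.
    """
    total_lines = ceil_div(n_rows * partition_width, CACHE_LINE_BYTES)
    if partition_width <= CACHE_LINE_BYTES:
        # Rows are no wider than a line, so a full scan touches every line.
        return total_lines
    # Wide rows: only the lines holding the projected bytes are touched.
    per_row = ceil_div(projected_bytes, CACHE_LINE_BYTES)
    return min(n_rows * per_row, total_lines)


if __name__ == "__main__":
    n = 1_000_000  # hypothetical table size
    for width in (4, 16, 256):  # narrow column, small group, wide row
        print(width, scan_cache_misses(n, width, 4))
```

For a 4-byte attribute over one million rows, the estimate is 62,500 lines when the attribute is stored alone, 250,000 lines in a 16-byte group, and 1,000,000 lines in a 256-byte row. A layout manager could compare such estimates across candidate layouts for a given workload.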
Results 
Multithreading Real Time Event Platform 
MIT Auto-ID Lab and HPI
John Williams, Sergio Herrero, Abel Sanchez, et al 
Motivation: Rapid Growth of Events and Messaging Platforms
A Comparative Study of Data Storage and Processing Architectures
Verizon and T-Mobile: 2-3 days to generate phone bill
iTunes: 24 hours to generate bill
Uninterrupted growth of online billing systems (Hulu, Netflix, …)
Dynamic pricing on the Smart Grid requires the design of an infrastructure capable of ingesting millions of events in quasi-real time
Goal: Design a multi-threaded system that produces the electricity consumption bill of a city of 1M households
8 hours → seconds
Smart Meter Reading Problem
[Figure: Users and Energy Producers; Data Generation, Data Persistence, Data Processing]
Multithreaded Simulator for Smart Meter
[Figure: simulator architecture]
• Layers: Management Layer, Generation Layer, Storage Layer, Consumption Layer
• Components: Simulation Manager, Meter Read Generators, Head Ends, MDUS (Meter Data Unification System), Query Clients, Web Service Interfaces
• Systems shown: DRYAD, HADOOPDB, SAP HANA
Conclusion
Platform that handles billions of events/day AND large numbers of threads on one machine (> 1 million), e.g. Siemens 500k events/s
RDBMS (used by today’s MDUS vendors) provides good query performance but does not scale to millions of households (8 h)
Distributed File System (DFS): provides the scalability, reliability & insert performance necessary for the storage of Smart Meter Reader data
Using MapReduce on top of a DFS (HDFS or Ceph) is good for batch processing systems
Loading data from DFS and executing queries in-memory using HANA provides very good performance results for real-time queries → RAPTOR
Prototype for Smart Grid that ingests smart meter data in real time, does dynamic pricing (4 buckets), stores it in DFS & does real-time analytics
Bill for 1 M households in seconds
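Billing with four price buckets decomposes naturally per household, which is what makes a multi-threaded design attractive. The sketch below illustrates that decomposition only and is not the RAPTOR prototype: the bucket boundaries, prices, reading format, and function names are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed time-of-use buckets: (start hour, end hour, price per kWh).
BUCKETS = [(0, 6, 0.08), (6, 12, 0.15), (12, 18, 0.20), (18, 24, 0.12)]


def price_for_hour(hour):
    """Look up the price bucket that contains the given hour of day."""
    for start, end, price in BUCKETS:
        if start <= hour < end:
            return price
    raise ValueError("hour must be in the range 0-23")


def bill_household(readings):
    """Bill one household from (hour of day, kWh consumed) meter readings."""
    return sum(kwh * price_for_hour(hour) for hour, kwh in readings)


def bill_all(readings_by_household, workers=8):
    """Bill every household; each household is an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        totals = pool.map(bill_household, readings_by_household.values())
        return dict(zip(readings_by_household.keys(), totals))


if __name__ == "__main__":
    sample = {
        "household-1": [(1, 10.0), (13, 5.0)],
        "household-2": [(7, 2.0), (20, 1.0)],
    }
    print(bill_all(sample))
```

In CPython, threads do not speed up pure computation like this, so the thread pool here only demonstrates how the work splits. A production system would spread the same per-household tasks over processes or machines.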
ELECTRONIC NERVOUS SYSTEM
Sense
• Collect current conditions at fine grain in real time
Analyze
• Access real-time data – analyze data, learn, build models
Actuate
• Control (soft, hard, persuasive, personal)
[Figure: Analytical Approach (Knowledge) versus Inductive Approach (Data); axes from No Data to Complete Data and from No Prior Knowledge to Perfect Knowledge]
Knowledge:
$y_i = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x_i^2}{2}\right)$
Data:
X1  -1    Y1  0.02540487
X2  -0.9  Y2  0.02779527
X3  -0.8  Y3  0.03010825
X4  -0.7  Y4  0.03228947
X5  -0.6  Y5  0.03428442
..  ..    ..  ..
[Figure labels: GPS, SIM Card]
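The contrast between knowing a formula and holding only samples can be checked numerically. The sketch below assumes the formula on this slide is the standard normal density. Under that assumption, the tabulated Y values appear to be the probability mass of the bin of width 0.1 that starts at each X, an inference from the numbers rather than something the slide states. The function names are illustrative.

```python
import math


def normal_pdf(x):
    """Standard normal density, the assumed closed-form 'knowledge'."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bin_mass(x, width=0.1):
    """Probability that a standard normal value falls in [x, x + width]."""
    return normal_cdf(x + width) - normal_cdf(x)


# X/Y sample values as they appear on the slide.
SLIDE_DATA = [
    (-1.0, 0.02540487),
    (-0.9, 0.02779527),
    (-0.8, 0.03010825),
    (-0.7, 0.03228947),
    (-0.6, 0.03428442),
]

if __name__ == "__main__":
    for x, y in SLIDE_DATA:
        print(x, y, bin_mass(x), normal_pdf(x))
```

With the formula in hand, a single expression yields every sample. With only the samples, the same curve has to be learned from the data.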

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

New Business Applications Powered by In-Memory Technology @MIT Forum for Supply Chain Innovation, 2011

  • 1. New Business Applications Powered by In-Memory Technologies from Academia. MIT Forum for Supply Chain Innovation, HPI, February 2011. Paul Hofmann, PhD, VP, Group of the Chief Scientist. SAP Chief Scientist Group.
  • 2. Examples of Projects with In-Memory Technology in Collaboration with Academia (Taking In-Memory Computing Seriously)
      - Coherent Shared Memory – BigIron in Palo Alto (Princeton University): keep the programming model simple AND solve very complex problems; optimal pricing for energy management, online pricing, truck scheduling, …
      - Infinite DRAM – RAMCloud (Stanford University and HPI): extremely low latency and very high bandwidth; Facebook-like problems with high read AND write rates; advanced analytics, what-if scenarios, demand planning, ...
      - Hybrid In-Memory Store (MIT CSAIL and HPI): aggregate column store – the best of both worlds
      - Multithreading Real Time Event Platform (MIT Auto-ID Lab and HPI): 500k events/s and millions of threads, in-memory or distributed; automatic meter reading, online billing, mobile billing, Smart Grid
    © SAP AG 2011. All rights reserved.
  • 3. Taking In-Memory Computing Seriously. Chief Scientist Group. Princeton University – Operations Research and Financial Engineering. Warren Powell, et al.
  • 4. Taking In-Memory Computing Seriously! Basic Assumptions
      - Disk is tape: active data must be in DRAM
      - Data locality is king: avoid cache misses and stalled CPUs
    Problems and Opportunities for In-Memory Computing
      - Addressable DRAM per box is limited, unlike hard disks. We need to scale memory independently from physical boxes
      - Scaling architecture: arbitrary scaling of the amount of data stored in DRAM; arbitrary and independent scaling of the number of active users and the associated computing load
      - Inter-process communication is slow and hard to program (latencies are around 0.5-1 ms)
    We can do better
  • 5. Taking In-Memory Computing Seriously! How can we do better?
      - Coherent Shared Memory or ccNUMA: all CPUs can access all memory and all I/O channels in about 1 μs. We can scale DRAM and CPUs independently. You need more computing power: add another board. You need more DRAM: add another board.
      - Merge application server and DB server: reference memory directly from the app
  • 6. BigIron – A System We Architected for HANA with Leading-Edge Cluster Server Components. Big Iron 2: extreme performance, scalability, and a much simpler system model
      - Architecture, assembly, hosting: system architecture by SAP Technology Infrastructure Research Practice; assembly and test by Colfax International; hosting by Bay Area Internet Solutions, Santa Clara, CA
      - Highlights: large shared coherent memory (5 TB) across servers via ScaleMP; 160 cores (320 HT)
      - Research server cluster: 5 x 4U servers (4 Intel Xeon X7560, 2.26 GHz); 160 cores (32 cores/server); 5 TB memory (64 x 16 GB DDR3/server); 30 TB SSD (solid state disk) storage
      - 5 networks: VPN of ScaleMP (40-160 Gb IB); VPN of server cluster (10 GbE); VPN of storage array (10 GbE); VPN of SAP internal network (10 MbE metered); firewalled gateway to Internet (1 GbE, expandable)
      - 1 NAS (72 TB, expandable to 180); 1 x 48U rack
      - System software: SLES 11 Linux OS licenses; ScaleMP vSMP licenses
      - System cost: $618K with tax, support, and hosting
  • 7. Coherent Shared Memory – Improved Productivity at Low Price
      - Before: traditional server clusters with distributed memory (physical servers #1, #2, …, n). Queries are programmed and data distributed by an SAP developer for in-memory computing. Developers need to distribute queries and data across physical servers; access to data requires mastering complex communication protocols. The design is trapped at a single "scale": platform growth forces a redesign every couple of years. The specialized programming skills are held only by SAP's top developers.
      - After: server cluster with coherent shared memory via a software approach (physical servers #1, #2, …, n), programmed by a traditional SAP developer. Developers can treat the system as one "big" server and let the operating system and lower-level software handle the distribution problem. The initial design is timeless: hardware scaling is handled below the app design layer. Developers do not need additional skills for in-memory computing.
  • 8. Solve Very Compute-Intensive Problems Like Stochastic Optimization. We need to juggle intermittent energy from wind or solar and volatile electricity prices to meet time-varying loads – Princeton has the necessary algorithms. [Charts: wind speed, electricity prices, load.] We can reduce compute time from days to minutes!
  • 9. Modeling Uncertainty in Power Scheduling. [Chart: the effect of modeling uncertainty in wind. Labels: uncertain forecast, perfect forecast, constant wind, 2% wind, 40% wind.]
  • 10. Modeling Uncertainty in Power Scheduling. Designing energy portfolios … is like building a stone wall. You can do a perfect job with a perfect forecast. The challenge is dealing with uncertainty.
  • 11. In the beginning…
  • 12. Infinite DRAM with RAMCloud. Stanford University and HPI. John Ousterhout, Mendel Rosenblum, Christian Tinnefeld, et al.
  • 13. Impact of Latency for Internet Applications. Large-scale apps struggle with high latency. [Diagram: web application, application server, storage server; 0.5-10 ms latency.]
  • 14. RAMCloud – Stanford University
      - Create a distributed storage system that keeps data entirely in DRAM
      - Combine the main memories of thousands of servers
      - Use a high-end network (10 Gigabit Ethernet / InfiniBand)
      - Replicate data synchronously into the main memories of other nodes
      - Write to disk asynchronously, for backup/archival purposes only
    This results in:
      - A gracefully scaling storage solution
      - Very low access latency AND high bandwidth
      - Latency of memory access via the network of 1-5 μs
      - Eventually consistent data storage – consistency is sub-second
  • 15. RAMCloud – Stanford University. RAMCloud is ideal for apps with millions of concurrent users that need more complex data structures than a key-value store, e.g. Facebook, Quora, Yelp. Research Questions in the Context of Enterprise Applications
      - What-if analytics: ideally, companies want to run what-if analytics on their complete history of transactional data, not on a subset (think of WalMart or Unilever). What-if analytics need high bursts of read AND write access. How can the needed data be modeled and placed in a RAMCloud? How can the needed data operations be implemented in a scalable way?
      - Transactional properties: enterprise applications rely on transactional properties. RAMCloud provides extremely low latency but is only eventually consistent. Are the reduced transaction times sufficient to avoid lock wait times or reduce aborted transactions?
  • 16. Hybrid In-Memory Store HYRISE. MIT CSAIL and HPI. Sam Madden, Philippe Cudre-Mauroux, Jens Krueger, Martin Grund, et al.
  • 17. Hybrid Partitioning for Mixed Workloads. [Diagram: row store, column store, and hybrid store layouts (labeled OLTP and OLAP) for a four-row table with columns Doc Num, Doc Date, Sold-To, Value, Sales Org, Status.]
  • 18. HYRISE Architecture
      - Query Processor: chooses the best possible query plan for the hybrid data storage structure
      - Layout Manager: given a workload, evaluates candidate layouts with a cost model based on cache misses and generates the best possible layout
      - Storage Manager: main-memory hybrid storage manager capable of vertical partitioning within single relations
  • 19. Cost Model. Used to describe the layout-dependent costs based on cache misses. Combines complex access patterns from simpler ones (scan, lookup, …)
      - Measure: all access variants to the hybrid storage layer (full projection, partial projection, selection, tuple reconstruction); different performance parameters (compiler settings, hardware prefetcher, …)
      - Observe: combined behavior of all components, looking at the total workload; count cache misses using the CPU hardware performance counters
      - Understand: develop the cost model
  • 20. Results
  • 21. Multithreading Real Time Event Platform. MIT Auto-ID Lab and HPI. John Williams, Sergio Herrero, Abel Sanchez, et al.
  • 22. Motivation: Rapid Growth of Events and Messaging Platforms. A Comparative Study of Data Storage and Processing Architectures
      - Verizon and T-Mobile: 2-3 days to generate a phone bill
      - iTunes: 24 hours to generate a bill
      - Uninterrupted growth of online billing systems (Hulu, Netflix, …)
      - Dynamic pricing on the Smart Grid requires an infrastructure capable of ingesting millions of events in quasi-real time
      - Goal: design a multi-threaded system that produces the electricity consumption bill for a city of 1M households, cutting the time from 8 hours to seconds
  • 23. Smart Meter Reading Problem. [Diagram: users and energy producers; data generation, data persistence, data processing.]
  • 24. Multithreaded Simulator for Smart Meter. [Diagram: management layer, generation layer, consumption layer, storage layer; components: simulation manager, meter read generators, head ends, query clients, MDUS (Meter Data Unification System), web service interfaces; storage back ends: Dryad, HadoopDB, SAP HANA.]
  • 25. Conclusion. A Comparative Study of Data Storage and Processing Architectures
      - A platform that handles billions of events/day AND large numbers of threads on one machine (> 1 million), e.g. Siemens, 500k events/s
      - RDBMS (used by today's MDUS vendors) provides good query performance but does not scale to millions of households (8 h)
      - Distributed file system (DFS): provides the scalability, reliability, and insert performance necessary for storing smart meter reader data
      - MapReduce on top of a DFS (HDFS or Ceph) is good for batch processing systems
      - Loading data from the DFS and executing queries in memory using HANA provides very good performance for real-time queries
      - RAPTOR: prototype for the Smart Grid that ingests smart meter data in real time, does dynamic pricing (4 buckets), stores the data in a DFS, and does real-time analytics
      - Bill for 1M households in seconds
  • 26. Electronic Nervous System
      - Sense: collect current conditions at fine grain in real time
      - Analyze: access real-time data; analyze data, learn, build models
      - Actuate: control (soft, hard, persuasive, personal)
    [Figure: analytical approach vs. inductive approach, spanning no data to complete data and no prior knowledge to perfect knowledge. Labels: Data, Knowledge, GPS, SIM card. Shown alongside an exponential formula for y_i in terms of x_i and sample (x, y) pairs: (-1, 0.02540487), (-0.9, 0.02779527), (-0.8, 0.03010825), (-0.7, 0.03228947), (-0.6, 0.03428442), …]
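
The cache-miss-based cost model on slides 18 and 19 can be illustrated with a minimal sketch. This is not the HYRISE cost model itself: the attribute widths, the 64-byte cache line, the hybrid partitioning, the workload mix, and all function names below are assumptions chosen for illustration, and the miss estimate is a deliberately crude count of cache lines touched.

```python
CACHE_LINE = 64  # bytes per cache line (assumed)
N_ROWS = 1_000_000  # table size (assumed)

# Hypothetical attribute widths in bytes for the sales table on slide 17.
WIDTHS = {"doc_num": 8, "doc_date": 8, "sold_to": 8,
          "value": 8, "sales_org": 8, "status": 8}
ALL = list(WIDTHS)

def ceil_div(a, b):
    """Integer ceiling division."""
    return (a + b - 1) // b

def partition_misses(kind, partition, attrs, n_rows):
    """Estimate cache lines touched in one vertical partition."""
    used = sum(WIDTHS[a] for a in partition if a in attrs)
    if used == 0:
        return 0  # this partition is not accessed at all
    width = sum(WIDTHS[a] for a in partition)
    lines_per_tuple = ceil_div(used, CACHE_LINE)
    if kind == "lookup":
        # Reconstructing one tuple touches each needed partition once.
        return lines_per_tuple
    # Scan: either stream the whole partition, or (for very wide
    # partitions) touch only the lines holding the used bytes per tuple.
    return min(ceil_div(n_rows * width, CACHE_LINE),
               n_rows * lines_per_tuple)

def workload_misses(layout, workload, n_rows=N_ROWS):
    """Sum estimated misses over all operations and partitions."""
    return sum(freq * partition_misses(kind, part, attrs, n_rows)
               for kind, attrs, freq in workload
               for part in layout)

layouts = {
    "row": [ALL],                      # one wide partition
    "column": [[a] for a in ALL],      # one partition per attribute
    "hybrid": [["doc_num", "doc_date", "sold_to", "status"],
               ["value", "sales_org"]],  # hypothetical grouping
}

# Assumed mixed workload: one analytical scan plus many tuple lookups.
workload = [
    ("scan", {"value", "sales_org"}, 1),
    ("lookup", set(ALL), 100_000),
]

costs = {name: workload_misses(layout, workload)
         for name, layout in layouts.items()}
print(costs)
```

Under these assumed numbers the row and column layouts each come to 850,000 estimated misses and the hybrid layout to 450,000: the hybrid layout keeps the scanned attributes together without forcing six separate accesses per tuple lookup. A layout manager in the spirit of slide 18 would evaluate many candidate partitionings this way and keep the cheapest.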