Complex Analytics with NoSQL Data Store in Real Time
Nested Queries, Projection, Transactions and more
Nati Shalom 
@natishalom 
slideshare.net/giganati
What are we here to discuss?
Making Sense of the Exploding Data World
What that World Could Look Like if Disk is No Longer the Bottleneck
Live Demo
Making Sense of The Exploding Data World
Capacity and Performance Drive New Data Management Technologies
[Chart: Data Volume (GB, TB, PB) against Data Velocity (Yr, Mo, Day, Hr, Min, Sec, ms, μs). Regions shown: Data Mining, Machine Learning, Data Warehouse, Business Intelligence, High Throughput OLTP, Operational Intelligence, Exploratory Analytics, OLTP, Streaming.]
Let’s Look at the Tradeoffs of Some Selected Solutions
SQL Queries
• Query: SQL
• Semantics:
  – CRUD
  – Aggregation
  – Projection
  – Partial update
• Performance: 100s/sec
• Consistency: Transactional
• Scaling: Mostly Scale-Up
• Availability: Disk Based
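To make the four semantics on this slide concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical and exist only for illustration; they are not from the talk.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the slide's semantics.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL, status TEXT)"
)

# CRUD: create three rows.
conn.executemany(
    "INSERT INTO orders (id, customer, amount, status) VALUES (?, ?, ?, ?)",
    [(1, "alice", 30.0, "new"), (2, "bob", 20.0, "new"), (3, "alice", 50.0, "new")],
)

# Projection: read only the columns that are needed.
projected = conn.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()

# Aggregation: total amount per customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

# Partial update: change one column of one row, leaving the rest untouched.
conn.execute("UPDATE orders SET status = ? WHERE id = ?", ("shipped", 2))
status = conn.execute("SELECT status FROM orders WHERE id = 2").fetchone()[0]
```

The following slides compare other kinds of data stores against these same four operations.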
NoSQL
• Query: Proprietary but rich
• Semantics:
  – CRUD
  – Limited Aggregation (Map/Reduce)
  – No Projection
  – No Partial update
• Performance: 1000s/sec
• Consistency: Eventual
• Scaling: Mostly Scale-Out
• Availability: Based on replication
IMDG (In-Memory Data Grid)
• Query: Proprietary but rich
• Semantics:
  – CRUD
  – Aggregation API + Map/Reduce
  – Projection (GigaSpaces)
  – Partial Update (GigaSpaces)
• Performance: 100k/sec
• Consistency: Transactional
• Scaling: Mostly Scale-Out
• Availability: Replication
Key/Value
• Query: Key, Value
• Semantics:
  – Mostly Read
  – No Aggregation
  – No Projection
  – No Partial update
• Performance: 1Ms/sec
• Consistency: Atomic
• Scaling: Mostly Scale-Out
• Availability: Limited (varies quite substantially between implementations)
Stream Processing (Storm)
• Semantics
  – Event-driven data processing
• Used for continuous updates
  – No need for a costly “SELECT FOR UPDATE”
• Performance: 10s of millions of updates/sec
[Diagram labels: Spouts, Bolt]
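The spout-and-bolt model can be sketched in plain Python. This is not the Storm API: the function names (`sentence_spout`, `split_bolt`, `count_bolt`, `run_topology`) are hypothetical and generators stand in for Storm's distributed tuple streams. It only shows how tuples flow from a spout through bolts that keep running state, so each event updates a count directly rather than going through a read-lock-write cycle.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def sentence_spout(sentences: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout: the input source. Emits one tuple per sentence."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: splits each sentence tuple into one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word.lower(),)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt: keeps a running count per word and emits the updated count."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences: Iterable[str]) -> Dict[str, int]:
    """Wire spout -> split bolt -> count bolt and collect the latest counts."""
    latest: Dict[str, int] = {}
    for word, count in count_bolt(split_bolt(sentence_spout(sentences))):
        latest[word] = count
    return latest


result = run_topology(["disk is slow", "flash is fast"])
```

In a real Storm topology the spouts and bolts run in parallel across a cluster; the single-process chain above only illustrates the data flow.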
Common Assumption: Disk is the Bottleneck
[Chart: "Performance^10" over the years 2000, 2010 and 2020, with 100X and 10,000X markers. HDD Latency (Seek & Rotate) = Little Improvement.]
Source: GigaOM Research
Capacity and Performance Drive New Data Management Technologies
(Source: IDC, 2013)
[Chart labels: Big Data (Hadoop); NoSQL; In Memory, Stream Processing; RDBMS]
There’s No One Size Fits All
A Typical App Looks Like This..
[Diagram labels: Front End, Analytics, RT (STORM), Batch]
The Data Flow Complexity
What if Disk Was no Longer the Bottleneck?
FLASH Closes the CPU to Storage Gap
Our Application Could Look Like This..
[Diagram labels: Front End; High Speed Data Store (Using Flash/NVM); Key/Value, SQL, Document, Graph, Map/Reduce, Transactional; StreamBase]
Disk Becomes the new Tape
Common Data Store serving Multiple Semantics/API
We're not there yet... But...
We Can Use a High Speed Data Bus for Integrating All of Our Data Sources
[Diagram labels: Front End, Analytics, RT (STORM), Batch; High Speed Data Bus (Built-In Caching); RT Transactional Data Access, Direct Access, RT Streaming; Hadoop Synch, MySQL Synch, Mongo Synch]
High Speed Data Bus (Zoom In)
Designed for Transactional and Analytics Scenarios..
• Homeland Security
• Real Time Search
• Social
• eCommerce
• User Tracking & Engagement
• Financial Services
Many APIs – Same Data
Key/Value, SQL, Document, Graph, Map/Reduce, Transactional
Let’s take a closer look..
Nested Queries & Projections
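The slide itself only carries the title (the original showed a demo). As a rough illustration of the two ideas, the sketch below matches on a field nested inside each record and returns only the requested fields. The helpers `get_path` and `query`, the dotted-path syntax, and the sample `users` data are all hypothetical; this is not the GigaSpaces API.

```python
from typing import Any, Dict, Iterable, List


def get_path(record: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'address.city' into nested dicts."""
    value: Any = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def query(
    records: Iterable[Dict[str, Any]],
    where: Dict[str, Any],
    fields: List[str],
) -> List[Dict[str, Any]]:
    """Nested query plus projection: filter on nested paths, return only `fields`."""
    results = []
    for record in records:
        if all(get_path(record, path) == expected for path, expected in where.items()):
            results.append({field: get_path(record, field) for field in fields})
    return results


# Hypothetical sample data.
users = [
    {"name": "alice", "address": {"city": "London", "zip": "N1"}, "visits": 12},
    {"name": "bob", "address": {"city": "Paris", "zip": "75001"}, "visits": 3},
    {"name": "carol", "address": {"city": "London", "zip": "E2"}, "visits": 7},
]

londoners = query(users, where={"address.city": "London"}, fields=["name", "visits"])
```

Projection matters most when records are large: returning two fields instead of the whole object reduces the data moved between the store and the client.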
Aggregations.
Fast Update … While Retaining Strong Consistency!
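One common way to combine fast partial updates with strong consistency is to send only the changed fields and guard the write with a version check. The sketch below assumes that approach; the slide does not say how the product implements it. The `VersionedStore` class and its methods are hypothetical names for a single-process illustration.

```python
import threading
from typing import Any, Dict, Tuple


class VersionConflict(Exception):
    """Raised when an update is based on a stale version of the record."""


class VersionedStore:
    """Hypothetical in-memory store with versioned partial updates."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[int, Dict[str, Any]]] = {}

    def write(self, key: str, value: Dict[str, Any]) -> int:
        """Store a full record and return its new version number."""
        with self._lock:
            version = self._data.get(key, (0, {}))[0] + 1
            self._data[key] = (version, dict(value))
            return version

    def read(self, key: str) -> Tuple[int, Dict[str, Any]]:
        """Return (version, copy of record)."""
        with self._lock:
            version, value = self._data[key]
            return version, dict(value)

    def partial_update(self, key: str, changes: Dict[str, Any], expected_version: int) -> int:
        """Apply only the changed fields, and only if the version still matches."""
        with self._lock:
            version, value = self._data[key]
            if version != expected_version:
                raise VersionConflict(f"expected v{expected_version}, found v{version}")
            value.update(changes)
            self._data[key] = (version + 1, value)
            return version + 1


store = VersionedStore()
v1 = store.write("user:1", {"name": "alice", "visits": 1})
v2 = store.partial_update("user:1", {"visits": 2}, expected_version=v1)
final = store.read("user:1")
```

A second writer that still holds version 1 would get a `VersionConflict` instead of silently overwriting the newer value.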
Transaction Support
The Performance of RAM at a Cost/Capacity Closer to Disk
Provides 2x – 3.6x Better TPS/$; 1:50 More Capacity
[Chart 1: ZetaScale-GigaSpaces on SSDs (legend also reads "FDF-GigaSpaces on SSDs") vs. Stock GigaSpaces in DRAM, for two workloads: No Read / 100% Write and 100% Read / No Write. Bar labels: 62, 121, 17, 56. Axis: 0 to 160.]
- 1KB object size and uniform distribution
- 2 sockets 2.8GHz CPU with total 24 cores, CentOS 5.8, 2 FusionIO SLC PCIe cards RAID
- YCSB measurements performed by SanDisk
Assumptions: 1TB Flash = $2K; 1TB RAM = $20K
[Chart 2: Capacity, XAP vs. XAP Extend (ZetaScale™ – XAP MemoryXtend). Bar labels: 20 and 1000 (1:50). Axis: 0 to 1200.]
242k Read/Sec
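The headline ratios can be checked against the bar labels with a little arithmetic. The pairing below is an assumption: the extracted slide does not say which bar carries which label, but assigning 62 and 121 to the flash-backed series and 17 and 56 to the DRAM series is the reading that reproduces the stated "2x – 3.6x" range.

```python
# Bar labels as they appear on the slide. Which series each label belongs to
# is an ASSUMPTION chosen because it matches the "2x - 3.6x" headline.
zetascale_on_ssd = {"write": 62, "read": 121}  # assumed: flash-backed series
stock_in_dram = {"write": 17, "read": 56}      # assumed: DRAM-only series

# Relative TPS/$ advantage per workload under that assumption.
ratios = {
    workload: zetascale_on_ssd[workload] / stock_in_dram[workload]
    for workload in zetascale_on_ssd
}

# Cost assumptions stated on the slide: 1TB Flash = $2K; 1TB RAM = $20K.
flash_cost_per_tb = 2_000
ram_cost_per_tb = 20_000
cost_ratio = ram_cost_per_tb / flash_cost_per_tb

# Capacity chart labels: 20 vs. 1000, which the slide summarizes as 1:50.
capacity_ratio = 1000 / 20
```

Under this reading the write workload gives roughly 3.6x and the read workload roughly 2.2x, RAM costs 10x more per TB than flash, and the capacity bars differ by 50x.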
Data is Moving to Cloud 
Source: Managing Storage: Trends, Challenges, and Options (2013-2014). (EMC, 2013)
Orchestration needs to be integrated into the database solution to make it cloud ready
Demo References
Click on the relevant box to get the demo:
• Many APIs, Same Data
• Data Bus (Integration with Storm)
• Built-In Orchestration
Summary
Nati Shalom 
Check out the slides at http://www.slideshare.net/giganati

Editor's Notes

  1. Some of the emerging NewSQL and NoSQL disk-based databases may be able to handle the more demanding data volume and variety, but disk-based databases have always been I/O bound, so keeping up with the new velocity demands of data is much harder. Disk has always limited database velocity, or throughput. The closer to real time that transaction throughput or analytics must be, the harder it is for disk-based approaches to keep up.
  2. Storm constructs a processing graph that feeds data from an input source through processing nodes. The processing graph is called a "topology". The input data sources are called "spouts", and the processing nodes are called "bolts". The data model consists of tuples. Tuples flow from the spouts to the bolts, which execute user code.
  3. http://www.zdnet.com/storage-in-2014-an-overview-7000024712/
  4. http://blogs.technet.com/b/dataplatforminsider/archive/2013/05/01/leveraging-flash-across-the-microsoft-sql-server-stack.aspx
  5. http://www.zdnet.com/storage-in-2014-an-overview-7000024712/