© 2015 MapR Technologies 1
Maintaining Low Latency while Maximizing
Throughput
Yuliya Feldman
February 19, 2015
Top-Ranked NoSQL
Top-Ranked Hadoop Distribution
Top-Ranked SQL-on-Hadoop Solution
What We Have – Cluster per Use Case
[Diagram labels: YARN cluster; Web Servers; YARN cluster]
Too much isolation and poor resource utilization
Need Datacenter-wide Resource Manager
What choices do we have?
•  YARN (capacity/fair scheduler)
•  Omega
•  Mesos
•  Others (e.g. Quasar)
YARN
•  Motivated by Mesos, but is a Hadoop resource manager
•  Manages Hadoop resources well – “retail”
•  Pluggable schedulers for Hadoop
•  Started handling long-lived tasks
•  Can pre-empt tasks
•  YARN-1051 - YARN Admission Control/Planner: enhancing the
resource allocation model with time
Mesos
•  Data-center wide resource manager – negotiator between
frameworks
•  Manages resources for all frameworks rather than for one particular
framework (e.g. Hadoop) – “wholesale”
•  Uses two-level scheduling
•  Excellent Docker support
•  Schedules, allocates, and isolates cpu, mem, disk, network, and
arbitrary custom resource types
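A minimal sketch of the two-level scheduling idea named above, under stated assumptions: the master (first level) offers each node's free resources to frameworks in turn, and each framework (second level) decides how much of an offer to take. The names `Offer`, `Framework`, and `offer_round` are hypothetical and are not the Mesos API; real Mesos offers, allocation policy, and accept/decline calls are considerably richer.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Free resources on one node, as offered by the master (hypothetical shape)."""
    node: str
    cpus: float
    mem_gb: float


@dataclass
class Framework:
    """Second level: the framework decides what to do with each offer."""
    name: str
    cpus_needed: float
    mem_gb_needed: float

    def consider(self, offer):
        """Return the (cpus, mem_gb) taken from the offer, or None to decline."""
        if self.cpus_needed <= 0 or self.mem_gb_needed <= 0:
            return None  # nothing left to schedule, so decline
        cpus = min(offer.cpus, self.cpus_needed)
        mem_gb = min(offer.mem_gb, self.mem_gb_needed)
        self.cpus_needed -= cpus
        self.mem_gb_needed -= mem_gb
        return cpus, mem_gb


def offer_round(free, frameworks):
    """First level: offer each node's free resources to the frameworks in order.

    `free` maps node name -> (cpus, mem_gb) and is updated in place.
    Returns a list of (framework, node, cpus, mem_gb) placements.
    """
    placements = []
    for node, (cpus, mem_gb) in list(free.items()):
        for fw in frameworks:
            if cpus <= 0 or mem_gb <= 0:
                break  # node exhausted, nothing left to offer
            taken = fw.consider(Offer(node, cpus, mem_gb))
            if taken is None:
                continue  # framework declined, offer goes to the next one
            cpus -= taken[0]
            mem_gb -= taken[1]
            placements.append((fw.name, node, taken[0], taken[1]))
        free[node] = (cpus, mem_gb)
    return placements


free = {"node1": (8.0, 8.0), "node2": (8.0, 8.0)}
frameworks = [Framework("yarn", 10.0, 10.0), Framework("web", 4.0, 4.0)]
for placement in offer_round(free, frameworks):
    print(placement)
```

The point of the split is that the master never needs to understand what a framework runs; it only tracks what is free and who took it.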
Can we…
–  Continue leveraging YARN resource scheduling capabilities
for YARN-based applications?
–  Treat YARN as “yet another” framework within Mesos?
–  Relieve YARN of managing coexistence with non-YARN
applications?
Introducing
Myriad
Apache Myriad: True Multi-tenancy
•  Open-source project launched Oct '14
–  MapR, eBay, Mesosphere, others participating
•  Allows Mesos and YARN to cooperate with each other
•  Mesos: datacenter-wide resource manager
–  Dockerized containers and/or cgroups used for isolation
•  Hadoop is launched inside cgroup containers
•  Myriad manages conversation between RM and Mesos master
and between NM and Mesos slaves
Why Myriad
•  Run many types of compute frameworks side-by-side
–  Hadoop family, etc. (YARN, Spark, Kafka, Storm)
–  Web-server farm
–  MPP databases (e.g., Vertica)
–  Other services: SOA web-services, Jenkins/build-farm, cron-jobs, shell
scripts, Kubernetes, Cassandra, ElasticSearch, etc.
–  Each compute framework is a cluster in itself
•  Need to break up a physical cluster into many virtual clusters
–  Using Docker (containers) for good isolation
–  But most schedulers can only manage individual nodes inside a cluster
•  Move resources between virtual clusters on-demand
Utilize Excess Capacity for Analytics
[Chart: utilization of the DC server farm and Hadoop analytics, highlighting long-lived excess capacity situations]
•  “Scale up” Hadoop during long periods of low utilization
•  “Scale down” Hadoop ahead of anticipated high utilization
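One way to read the two bullets above is as a look-ahead policy, sketched here under stated assumptions: given an hourly forecast of server-farm utilization, Hadoop may use whatever the farm is not expected to need, and it shrinks before an anticipated peak rather than during it. The function name `hadoop_share`, the hourly forecast, and the `lookahead` and `headroom_pct` parameters are all hypothetical; the slides do not specify a policy.

```python
def hadoop_share(forecast_pct, hour, capacity_cpus, lookahead=1, headroom_pct=10):
    """CPUs Hadoop may use at `hour`.

    forecast_pct: forecast server-farm utilization per hour, in whole percent.
    lookahead:    how many hours ahead to look, so Hadoop scales down
                  before anticipated high utilization arrives.
    headroom_pct: extra safety margin kept for the server farm.
    Integer percentages keep the arithmetic exact.
    """
    window = forecast_pct[hour:hour + lookahead + 1]
    reserved_pct = min(100, max(window) + headroom_pct)
    return capacity_cpus * (100 - reserved_pct) // 100


forecast = [20, 20, 80, 80, 30]  # illustrative numbers, not from the slides
for hour in range(len(forecast)):
    print(hour, hadoop_share(forecast, hour, capacity_cpus=100))
```

With this forecast, Hadoop's share already drops at hour 1, one hour before the farm's utilization rises, and recovers once the peak has passed.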
Myriad Again
•  Mesos creates virtual clusters
•  YARN uses resources provided by Mesos
•  Myriad can ask YARN to release some resources
•  Or give it more
[Diagram labels: Mesos; YARN cluster; YARN cluster; Web Servers]
Myriad Services Architecture
[Diagram: Resource Manager containing a Mesos Scheduler and a YARN Scheduler (fairshare); Node Manager with an Executor and Containers; Mesos. Labeled flows: Offers; Launch Tasks; Task Status; Launch containers via HB; Submit; App; Map<Node, Capacity>]
How it works
[Diagram: Mesos Master; YARN Resource Manager with Mesos scheduler, Framework and REST API; Mesos Slave on a Node with a YARN Node Manager. Labeled steps: Launch Node Manager (2.5 CPU, 2.5 GB); Advertise Resources (2 CPU, 2 GB)]
[Diagram: Mesos Master; YARN Resource Manager with Mesos scheduler, Framework and REST API; Mesos Slave on a Node with a YARN Node Manager. Labeled step: Launch Containers (C1, C2)]
Use Case – Web Traffic spikes
[Diagram: Mesos Master; YARN Resource Manager with Mesos scheduler, Framework and REST API; Node1 and Node2, each a Mesos Slave running a YARN Node Manager. Labels: Web Traffic spike; Resize NodeManager(s); Node Manager 8 CPU, 8 GB and 6 CPU, 6 GB on each node; WebService 2 CPU, 2 GB on each node]
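The arithmetic of the web-traffic-spike use case can be sketched as follows. This is an illustration under stated assumptions, not Myriad's API: the `Node` class and `resize_for_web` function are hypothetical, and the sketch only models capacities, with each 8 CPU, 8 GB node giving 2 CPU, 2 GB to the web service and the NodeManager shrinking to 6 CPU, 6 GB.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One Mesos slave hosting a YARN NodeManager and, optionally, a web service."""
    name: str
    total_cpus: int
    total_mem_gb: int
    web_cpus: int = 0
    web_mem_gb: int = 0

    @property
    def nm_capacity(self):
        """(cpus, mem_gb) left for the NodeManager after the web service's share."""
        return (self.total_cpus - self.web_cpus,
                self.total_mem_gb - self.web_mem_gb)


def resize_for_web(nodes, web_cpus, web_mem_gb):
    """Give the web service its share on every node and return the
    resulting NodeManager capacity per node."""
    for node in nodes:
        if web_cpus > node.total_cpus or web_mem_gb > node.total_mem_gb:
            raise ValueError(f"web service does not fit on {node.name}")
        node.web_cpus, node.web_mem_gb = web_cpus, web_mem_gb
    return {node.name: node.nm_capacity for node in nodes}


nodes = [Node("Node1", 8, 8), Node("Node2", 8, 8)]
print(resize_for_web(nodes, 2, 2))  # spike: NodeManagers shrink
print(resize_for_web(nodes, 0, 0))  # spike over: NodeManagers grow back
```

In a real deployment the shrink step would also have to deal with YARN containers already running on the released capacity; that is outside this sketch.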
[Diagram: Mesos Master; YARN Resource Manager with Mesos scheduler, Framework and REST API; Node1 and Node2, each a Mesos Slave running a YARN Node Manager. Labels: Web Traffic spike over; Resize NodeManager(s); Node Manager 8 CPU, 8 GB and 6 CPU, 6 GB on each node; WebService 2 CPU, 2 GB on each node]
Myriad Demo
At MapR booth 1009
Maintaining Low Latency while Maximizing
Throughput
on a single cluster
Batch and Real-time Analytics Together
[Diagram: Compute Cluster of eight nodes, each running an NM and a DrillBit, under a Cluster/DC Scheduler]
Sharing Resources between Batch and Real-Time
•  Real-time services resource usage pattern can be unpredictable
–  Analysts use services during the day
–  Analysts on the other side of the globe work during the night
–  There are steady states, spikes and dips in the workloads
•  Batch resource usage is more or less predictable
–  The same jobs run over and over, with occasional spikes and dips
Real-time Services Resource Utilization/Provisioning
Aggressive resource provisioning: < 10% utilization
Moderate resource provisioning: < 60% utilization
Conservative resource provisioning: > 80% utilization
What Can We Do To Provision Conservatively?
[Diagram: Compute Cluster of four nodes, each running an NM and a DrillBit; Cluster/DC Resource Manager; Drill Service Watcher. Labels: Monitors Drill Performance; Latency increase; Latency decrease; Need additional Containers (YARN); Accept Offers (Mesos); Allocate Resources (Preempt if needed); Dummy containers C1, C2, C3]
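The watcher loop in this slide can be sketched as a small control step. Everything here is an assumption layered on the diagram labels: the sketch supposes that rising latency leads the watcher to request one more dummy container (which holds resources for Drill), that falling latency leads it to release one, and that two thresholds are used so the count does not flap. The function name `watcher_step` and the threshold values are hypothetical.

```python
def watcher_step(latency_ms, high_ms, low_ms, dummy_containers, max_containers):
    """Return the dummy-container count after one latency observation.

    latency_ms > high_ms: latency increase, request one more container
                          (the resource manager may preempt batch work).
    latency_ms < low_ms:  latency decrease, release one container
                          back to batch workloads.
    Otherwise the count is left unchanged.
    """
    if latency_ms > high_ms and dummy_containers < max_containers:
        return dummy_containers + 1
    if latency_ms < low_ms and dummy_containers > 0:
        return dummy_containers - 1
    return dummy_containers


containers = 0
history = []
for latency in [900, 950, 400, 100, 100]:  # illustrative samples in ms
    containers = watcher_step(latency, high_ms=800, low_ms=200,
                              dummy_containers=containers, max_containers=3)
    history.append(containers)
print(history)
```

The gap between `low_ms` and `high_ms` is a design choice made for this sketch: without it, a latency hovering around a single threshold would request and release containers on every sample.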
SHOWTIME
Q&A
@mapr maprtech
yfeldman@mapr.com
Engage with us!
MapR
maprtech
mapr-technologies

Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Recently uploaded (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

Maintaining Low Latency While Maximizing Throughput on a Single Cluster

  • 1. Maintaining Low Latency while Maximizing Throughput. Yuliya Feldman, February 19, 2015. © 2015 MapR Technologies
  • 2. Top-Ranked NoSQL; Top-Ranked Hadoop Distribution; Top-Ranked SQL-on-Hadoop Solution
  • 3. What We Have: Cluster per Use Case
      [Diagram: separate YARN clusters and web servers]
      •  Too much isolation and poor resource utilization
  • 4. Need Datacenter-wide Resource Manager. What choices do we have?
      •  YARN (capacity/fair scheduler)
      •  Omega
      •  Mesos
      •  Others (e.g. Quasar)
  • 5. YARN
      •  Motivated by Mesos, but is a Hadoop resource manager
      •  Manages Hadoop resources well: “retail”
      •  Pluggable schedulers for Hadoop
      •  Started handling long-lived tasks
      •  Can pre-empt tasks
      •  YARN-1051 - YARN Admission Control/Planner: enhancing the resource allocation model with time
  • 6. Mesos
      •  Data-center wide resource manager: negotiator between frameworks
      •  Manages all resources for frameworks well, not a particular framework (e.g. Hadoop): “wholesale”
      •  Does two-level scheduling
      •  Excellent Docker support
      •  Schedules, allocates, and isolates CPU, memory, disk, network, and arbitrary custom resource types
  • 7. Can we...
      –  Continue leveraging YARN resource scheduling capabilities for YARN-based applications?
      –  Treat YARN as “yet another” framework within Mesos?
      –  Let YARN not bother about coexistence with non-YARN applications?
  • 8. Introducing Myriad
  • 9. Apache Myriad: True Multi-tenancy
      •  Open-source project launched Oct '14
          –  MapR, eBay, Mesosphere, others participating
      •  Allows Mesos and YARN to cooperate with each other
      •  Mesos: datacenter-wide resource manager
          –  Dockerized containers and/or cgroups used for isolation
      •  Hadoop is launched inside cgroup containers
      •  Myriad manages conversation between RM and Mesos master and between NM and Mesos slaves
  • 10. Why Myriad
      •  Run many types of compute frameworks side-by-side
          –  Hadoop family, etc. (YARN, Spark, Kafka, Storm)
          –  Web-server farm
          –  MPP databases (e.g., Vertica)
          –  Other services: SOA web-services, Jenkins/build-farm, cron-jobs, shell scripts, Kubernetes, Cassandra, ElasticSearch, etc.
          –  Each compute framework is a cluster in itself
      •  Need to break up a physical cluster into many virtual clusters
          –  Using Docker (containers) for good isolation
          –  But most schedulers can only manage individual nodes inside a cluster
      •  Move resources between virtual clusters on-demand
  • 11. Utilize Excess Capacity for Analytics
      [Chart: utilization of the DC server farm and of Hadoop analytics]
      Long-lived excess capacity situations:
      •  “Scale up” Hadoop during long periods of low utilization
      •  “Scale down” Hadoop ahead of anticipated high utilization
  • 12. Myriad Again
      •  Mesos creates virtual clusters
      •  YARN uses resources provided by Mesos
      •  Myriad can ask YARN to release some resources
      •  Or give it more
      [Diagram: Mesos hosting two YARN clusters and web servers]
  • 13. Myriad Services Architecture
      [Diagram. Component labels: Resource Manager, Mesos Scheduler, YARN Scheduler (fairshare), Mesos, Node Manager, Executor, Container (x2), App. Arrow labels: Offers; Launch Tasks (x2); Task Status; Launch containers via HB; Submit; Map<Node, Capacity>]
  • 14. How it works
      [Diagram. Component labels: Resource Manager (YARN) with Mesos scheduler and REST API; Framework + Master (Mesos); Mesos Slave and Node Manager (YARN) on a Mesos node. Arrow labels: Launch Node Manager, 2.5 CPU, 2.5 GB; Advertise Resources, 2 CPU, 2 GB]
  • 15. [Same diagram as slide 14. Arrow label: Launch Containers, with containers C1 and C2 on the Node Manager]
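
The figures on the "How it works" slide can be read as simple arithmetic: the Mesos task that hosts a NodeManager (2.5 CPU, 2.5 GB) is larger than the capacity the NodeManager advertises to the YARN Resource Manager (2 CPU, 2 GB). The sketch below is only an illustration of that gap. It assumes the 0.5 CPU / 0.5 GB difference is overhead set aside for the NodeManager and executor processes, which the slide does not state, and `Resources` and `advertised_capacity` are hypothetical names rather than Myriad APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A CPU / memory pair, as used on the slide (hypothetical type)."""
    cpus: float
    mem_gb: float

    def minus(self, other: "Resources") -> "Resources":
        # Refuse to produce a negative capacity.
        if other.cpus > self.cpus or other.mem_gb > self.mem_gb:
            raise ValueError("overhead exceeds the task size")
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


def advertised_capacity(task: Resources, overhead: Resources) -> Resources:
    """Capacity left for YARN containers after the assumed overhead."""
    return task.minus(overhead)


# Numbers from the slide; the overhead value is an assumption.
nm_task = Resources(cpus=2.5, mem_gb=2.5)
assumed_overhead = Resources(cpus=0.5, mem_gb=0.5)
print(advertised_capacity(nm_task, assumed_overhead))
```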
  • 16. Use Case: Web Traffic Spikes
      [Diagram: Resource Manager (YARN) with Mesos scheduler and REST API; Framework + Master (Mesos); two Mesos nodes (Node1, Node2), each 8 CPU, 8 GB, each running a Mesos Slave and a Node Manager (YARN). Labels: Web Traffic spike; Resize NodeManager(s); Node Manager 6 CPU, 6 GB and WebService 2 CPU, 2 GB on each node]
  • 17. [Same diagram as slide 16, with the label "Web Traffic spike over" and Resize NodeManager(s)]
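
The web traffic spike use case amounts to re-splitting each node between the NodeManager and the web service. The sketch below shows one way to express that split in Python. It is an assumption-laden illustration, not Myriad's resize logic: the policy that the web service is served first, the idea that the NodeManager grows back to the full node once the spike is over, and the names `Share` and `resize_for_web_demand` are all hypothetical.

```python
from typing import NamedTuple, Tuple


class Share(NamedTuple):
    """Whole CPUs and GB of memory on one node (hypothetical type)."""
    cpus: int
    mem_gb: int


def resize_for_web_demand(node: Share, web_demand: Share) -> Tuple[Share, Share]:
    """Return (node_manager_share, web_service_share) for one node.

    Assumed policy: the web service gets what it asks for, capped at the
    node size, and the NodeManager keeps whatever remains.
    """
    web = Share(min(web_demand.cpus, node.cpus),
                min(web_demand.mem_gb, node.mem_gb))
    nm = Share(node.cpus - web.cpus, node.mem_gb - web.mem_gb)
    return nm, web


node = Share(cpus=8, mem_gb=8)

# During the spike: 2 CPU / 2 GB per node go to the web service,
# leaving 6 CPU / 6 GB for the NodeManager, as on the slide.
print(resize_for_web_demand(node, Share(2, 2)))

# Spike over (assumed): demand drops and the NodeManager takes the node back.
print(resize_for_web_demand(node, Share(0, 0)))
```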
  • 18. Myriad Demo at MapR booth 1009
  • 19. Maintaining Low Latency while Maximizing Throughput on a single cluster
  • 20. Batch and Real-time Analytics Together
      [Diagram: a compute cluster in which each node runs a Node Manager (NM) and a DrillBit, under a Cluster/DC Scheduler]
  • 21. Sharing Resources between Batch and Real-Time
      •  Resource usage of real-time services can be unpredictable
          –  Analysts use services during the day
          –  Analysts on the other side of the globe work during the night
          –  There are steady states, spikes and dips in the workloads
      •  Batch resource usage is more or less predictable
          –  Running the same jobs over and over, with occasional spikes and dips
  • 22. Real-time Services Resource Utilization/Provisioning
      •  Aggressive resource provisioning: < 10% utilization
      •  Moderate resource provisioning: < 60% utilization
      •  Conservative resource provisioning: > 80% utilization
  • 23. What Can We Do To Provision Conservatively?
      [Diagram: a compute cluster in which each node runs a Node Manager (NM) and a DrillBit; a Cluster/DC Resource Manager; a Drill Service Watcher that monitors Drill performance. Labels: Latency increase; Latency decrease; Accept Offers (Mesos); Need additional Containers (YARN); Allocate Resources (Preempt if needed); dummy containers C1, C2, C3]
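
The watcher on the "provision conservatively" slide behaves like a feedback loop on Drill latency. A minimal Python sketch of such a loop follows. The slide only names the signals (latency increase, latency decrease) and the resource being held (dummy containers), so the direction of the reaction, the millisecond thresholds, the one-container step, and the cap of three containers (chosen to match C1 to C3) are all assumptions, and `containers_wanted` is a hypothetical name.

```python
from typing import Iterable, List


def containers_wanted(current: int, latency_ms: float,
                      high_ms: float = 500.0, low_ms: float = 100.0,
                      max_containers: int = 3) -> int:
    """Number of dummy containers to hold after one latency sample.

    Assumed policy: latency above high_ms asks for one more container,
    latency below low_ms gives one back, anything in between holds steady.
    """
    if latency_ms > high_ms:
        return min(current + 1, max_containers)
    if latency_ms < low_ms:
        return max(current - 1, 0)
    return current


def replay(samples: Iterable[float], start: int = 0) -> List[int]:
    """Apply the policy to a sequence of latency samples."""
    held, history = start, []
    for latency_ms in samples:
        held = containers_wanted(held, latency_ms)
        history.append(held)
    return history


# Made-up latency trace: quiet, spike, recovery.
print(replay([50, 600, 700, 300, 80]))
```

The gap between the two thresholds keeps the watcher from requesting and releasing containers on every small fluctuation; how wide that gap should be depends on the workload and is not something the slides specify.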
  • 24. SHOWTIME
  • 26. Q&A. Engage with us! @mapr, maprtech, MapR, mapr-technologies; yfeldman@mapr.com