MapR Learning Guide
Selvaraaju Murugesan
May 6, 2017
Selvaraaju Murugesan MapR Learning Guide
Storage Pool
MapR-FS groups disks into storage pools, usually made up of
two or three disks
The Stripe Width parameter lets you configure the number of disks
per storage pool
Each node in a MapR cluster can support up to 36 storage
pools
Use the mrconfig command to create, remove, and manage storage
pools, disk groups, and disks
Example 1
If you have 11 disks in a node, how many storage pools will be
created by default?
Example 1 Solution
If you have 11 disks in a node, how many storage pools will be
created by default?
3 storage pools of 3 disks each
1 storage pool of 2 disks
Example 2
If you have 9 disks in a node, how many storage pools will be
created by default?
Example 2 Solution
If you have 9 disks in a node, how many storage pools will be
created by default?
3 storage pools of 3 disks each
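The grouping in the two examples above can be written as a short calculation. This is a minimal sketch, assuming a default stripe width of 3 and assuming that leftover disks form one final, smaller pool (as in Example 1). The function name default_storage_pools is hypothetical and is not part of any MapR tool.

```python
def default_storage_pools(num_disks, stripe_width=3):
    """Return the size of each storage pool for a node with num_disks disks.

    Assumption: disks are grouped stripe_width at a time, and any
    leftover disks form one final, smaller pool.
    """
    if num_disks < 0 or stripe_width < 1:
        raise ValueError("num_disks must be >= 0 and stripe_width >= 1")
    full_pools, leftover = divmod(num_disks, stripe_width)
    pools = [stripe_width] * full_pools
    if leftover:
        pools.append(leftover)
    return pools


print(default_storage_pools(11))  # [3, 3, 3, 2]
print(default_storage_pools(9))   # [3, 3, 3]
```

The two calls reproduce the answers to Example 1 and Example 2.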
Tradeoffs
If a disk fails in a storage pool, the entire storage pool is
taken offline and MapR automatically begins data
replication
More disks per storage pool means more data to replicate when a disk
fails
The ideal scenario is 3 disks per storage pool
Use disk drives of the same size and speed within a
storage pool for good performance
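The tradeoff can be made concrete with rough arithmetic. This is a sketch only, assuming that one failed disk takes the whole pool offline (as stated above) and that the data to replicate is roughly the used space on every disk in that pool. The function name and the sample disk sizes are illustrative.

```python
def data_to_rereplicate_tb(disks_per_pool, disk_size_tb, fraction_used):
    """Estimate the data (in TB) replicated again when one disk in a pool fails.

    Assumption: the whole pool goes offline, so all used space on every
    disk in the pool has to be replicated elsewhere in the cluster.
    """
    if not 0.0 <= fraction_used <= 1.0:
        raise ValueError("fraction_used must be between 0 and 1")
    return disks_per_pool * disk_size_tb * fraction_used


# Hypothetical 4 TB disks that are half full.
for disks in (2, 3, 6):
    print(disks, "disks per pool ->", data_to_rereplicate_tb(disks, 4, 0.5), "TB")
```

Under these assumptions the amount of data to replicate grows linearly with the number of disks in the pool.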
List of Ports
Port Number    Service
7221           CLDB
8443           MCS
9443           MapR Installer
8888           Hue
8047           Drill
5181           ZooKeeper
19888          JobHistory Server
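A quick way to see whether a service is listening is to try a TCP connection to its port. The sketch below uses only the Python standard library. The helper names is_port_open and check_node are hypothetical, and only a few of the ports listed above are included.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# A few of the ports from the list above.
PORTS = {7221: "CLDB", 8443: "MCS", 5181: "ZooKeeper"}


def check_node(host):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for port, name in PORTS.items()}
```

Calling check_node with the hostname of a cluster node returns a dictionary such as service name to True or False. An open port only shows that something is listening, not that the service is healthy.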
Default Settings
If a disk fails, then the data replication starts immediately
If a node fails, then the data replication starts after an hour
(60 minutes)
The default node maintenance timeout is 1 hour, after which data
replication starts (the timeout is configurable)
To see or change the configuration, use the maprcli config commands
(for example, maprcli config load)
If the CLDB heartbeat is greater than 5 seconds, an alarm is
raised and must be cleared manually
Secondary CLDB in a node will perform read operations
CLDB
The name container holds the metadata for the files and directories
in the volume, and the first 64 KB of each file
Data containers and the name container can have different
replication factors
Data replication happens at the volume level
For high availability, install ZooKeeper on more nodes
/opt/mapr/roles
Contains the list of configured services on a given node
/opt/cores
Core files are copies of the contents of memory when certain
anomalies are detected. Core files are located in /opt/cores,
and the name of the file will include the name of the service
that experienced an issue. When a core file is created, an
alarm is raised
Zookeeper
If you want to start zookeeper
service mapr-zookeeper start
If you want to stop zookeeper
service mapr-zookeeper stop
If you want to know the status of zookeeper
service mapr-zookeeper qstatus
ZooKeeper should always be the first service that is started
MapR Commands
To list the services on a node
maprcli service list
maprcli node list -columns id,ip,svc
To list CLDBs
maprcli node listcldbs
CLDB master
maprcli node cldbmaster
Node topology
maprcli node topo
Cluster Permissions
Log into the MCS (login)
This level also includes permissions to use the API and
command-line interface, and grants read access on the cluster
and its volumes
Start and stop services (SS)
Create volumes (CV)
Edit and view Access Control Lists, or permissions (A)
Full control gives the user the ability to do everything except edit
permissions (FC)
Volume Permissions
Dump or back up the volume (dump)
Mirror or restore the volume (restore)
Modify volume properties, which includes creating and deleting
snapshots (m)
Delete the volume (d)
View and edit volume permissions (A)
Perform all operations except view and edit volume
permissions (FC)
MapR Utilities
configure.sh
To set up a cluster node
To change services such as ZooKeeper, CLDB, etc.
disksetup
Formats specified disks for use by MapR storage
fsck
Used to find and fix inconsistencies in the filesystem
to make the metadata consistent on the next load of the
storage pool
gfsck
performs a scan and repair operation on a cluster, volume, or
snapshot
MapR Utilities
mrconfig
Create, remove, and manage storage pools, disk groups, and
disks; and provide information about containers
mapr-support-collect.sh
Collects diagnostic information from all nodes in the cluster
mapr-support-dump.sh
Collects node and cluster-level information about the node
where the script is invoked
cldbguts
monitor the activity of the CLDB
NTP Server
All nodes should synchronize to one internal NTP server
systemctl command
ntpq command
Logs
Centralised logging
Logs kept for 30 days by default
symbolic links to the logs
Local logging
logs kept for 3 hours by default
YARN logs expire after 3 hours
time starts after the job begins
Logs stored in /opt/mapr/logs are deleted after 10 days by default
Change the settings in the yarn-site.xml file
Retention times are given in seconds
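Because retention times are given in seconds, it helps to convert from days or hours before editing yarn-site.xml. The sketch below does the conversion and prints property entries. The two property names are standard Apache Hadoop YARN settings; that they are the ones behind the 30-day and 3-hour defaults described above is an assumption, and the helper names are hypothetical.

```python
def to_seconds(days=0, hours=0):
    """Convert a retention period to the seconds value yarn-site.xml expects."""
    return days * 24 * 3600 + hours * 3600


def yarn_property(name, seconds):
    """Render one <property> element for yarn-site.xml."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{seconds}</value>\n"
        "</property>"
    )


# Assumed mapping of the defaults above to standard YARN properties.
print(yarn_property("yarn.log-aggregation.retain-seconds", to_seconds(days=30)))
print(yarn_property("yarn.nodemanager.log.retain-seconds", to_seconds(hours=3)))
```

Thirty days is 2592000 seconds and three hours is 10800 seconds.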
Space Requirements
/opt - 128GB
/tmp - 10GB
/opt/mapr/zkdata - 500MB
Swap space
110% physical memory
Minimum of 24GB and maximum of 128GB
Use LVM for boot drives
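The swap space guideline can be expressed as a small calculation. This is a sketch, assuming the three swap lines above combine into one rule: take 110% of physical memory, then keep the result between 24 GB and 128 GB. The function name is hypothetical.

```python
def recommended_swap_gb(physical_memory_gb):
    """Swap size: 110% of physical memory, clamped to the 24-128 GB range."""
    if physical_memory_gb <= 0:
        raise ValueError("physical_memory_gb must be positive")
    target = physical_memory_gb * 110 / 100
    return min(max(target, 24), 128)


for ram in (16, 64, 256):
    print(ram, "GB RAM ->", recommended_swap_gb(ram), "GB swap")
```

With this reading, a 16 GB node is raised to the 24 GB minimum and a 256 GB node is capped at 128 GB.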
Volume Quota
Once the Advisory Quota is reached
alarm raised
Once Hard Quota is reached
no further data is written
Only compressed data size is counted against the volume quota
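The two quota levels behave like thresholds on the compressed size of the volume. The sketch below models that behaviour. It is an illustration of the rules above, not MapR code, and the function name and return strings are made up for the example.

```python
def quota_status(used_mb, advisory_quota_mb, hard_quota_mb):
    """Classify a volume by its compressed size against its two quotas.

    used_mb is the compressed data size, since only compressed size
    counts against the volume quota.
    """
    if used_mb >= hard_quota_mb:
        return "hard quota reached: no further data is written"
    if used_mb >= advisory_quota_mb:
        return "advisory quota reached: alarm raised"
    return "ok"


print(quota_status(500, advisory_quota_mb=800, hard_quota_mb=1000))
print(quota_status(850, advisory_quota_mb=800, hard_quota_mb=1000))
print(quota_status(1000, advisory_quota_mb=800, hard_quota_mb=1000))
```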
Pre / Post-Installation Check
Pre-installation check
Stream - CPU
IOzone - I/O speed memory (destructive write/read)
Rpctest - network speed
Post-installation check
DFSIO - I/O speed memory (MapReduce job)
RWspeedtest
TeraGen / TeraSort - MapReduce job
TeraSort job results may suggest a possible problem with a hard drive or
controller
Snapshot / Mirror
Snapshots are stored at top level of every volume (hidden
directory)
Scheduled snapshots expire automatically
Mirror start - start the mirror operation between source and
destination
Mirror push - push updates from the source volume to all mirror
volumes
Mirror operation uses
70% network bandwidth
files are compressed
Role / Disk Balancer
Disk balancer
redistributes the data across all nodes
use the disk balancer after you have added many new nodes
% concurrent disk rebalancer - 2 to 30%
Role balancer
evenly distributes master containers
off by default; starts 30 minutes after the CLDB starts (can be
configured)
Delay for active data: 120 sec to 1800 sec (2 min to 30 min)
Job Scheduler
Fair scheduler is default
FIFO and Capacity schedulers are also available
Scheduling can be based on memory, and also on CPU
Each user has their own queue
Weights are used to set resources
Edit the allocation file (reloaded every 10 seconds) to modify resource
management
/opt/mapr/Hadoop/version/etc/hadoop/fair-scheduler.xml
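The effect of queue weights can be illustrated with a simple proportional split. This is a simplified sketch, assuming every queue has demand and ignoring minimum and maximum resource settings, so it shows the idea of weighted fair sharing rather than the scheduler's full algorithm. The function name and queue names are hypothetical.

```python
def fair_shares(weights, total_memory_mb):
    """Split total_memory_mb between queues in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one queue needs a positive weight")
    return {
        queue: total_memory_mb * weight / total_weight
        for queue, weight in weights.items()
    }


# Hypothetical queues: two users with weight 1 and one ETL queue with weight 2.
shares = fair_shares({"alice": 1, "bob": 1, "etl": 2}, total_memory_mb=8192)
for queue, share in shares.items():
    print(queue, share)
```

A queue with twice the weight receives twice the share of the resource when all queues are busy.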
Selvaraaju Murugesan MapR Learning Guide

Mais conteúdo relacionado

Mais procurados

Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Eric Sun
 
Oracle RAC 19c and Later - Best Practices #OOWLON
Oracle RAC 19c and Later - Best Practices #OOWLONOracle RAC 19c and Later - Best Practices #OOWLON
Oracle RAC 19c and Later - Best Practices #OOWLONMarkus Michalewicz
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanVerverica
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introductionPooyan Mehrparvar
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache KuduJeff Holoman
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseDataWorks Summit/Hadoop Summit
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...Flink Forward
 
Rman Presentation
Rman PresentationRman Presentation
Rman PresentationRick van Ek
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsSpark Summit
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceSijie Guo
 
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming dataUsing Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming dataMike Percy
 

Mais procurados (20)

Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)
 
Oracle RAC 19c and Later - Best Practices #OOWLON
Oracle RAC 19c and Later - Best Practices #OOWLONOracle RAC 19c and Later - Best Practices #OOWLON
Oracle RAC 19c and Later - Best Practices #OOWLON
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
Big data architectures
Big data architecturesBig data architectures
Big data architectures
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...
 
Rman Presentation
Rman PresentationRman Presentation
Rman Presentation
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage Service
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 
Kudu Deep-Dive
Kudu Deep-DiveKudu Deep-Dive
Kudu Deep-Dive
 
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming dataUsing Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
 

Destaque

MapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase APIMapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase APImcsrivas
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR Technologies
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkDatabricks
 
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)Amazon Web Services
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsAnton Kirillov
 
Apache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and SmarterApache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and SmarterDatabricks
 
MapR Data Analyst
MapR Data AnalystMapR Data Analyst
MapR Data Analystselvaraaju
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark InternalsPietro Michiardi
 

Destaque (12)

Deep Learning for Fraud Detection
Deep Learning for Fraud DetectionDeep Learning for Fraud Detection
Deep Learning for Fraud Detection
 
MapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase APIMapR M7: Providing an enterprise quality Apache HBase API
MapR M7: Providing an enterprise quality Apache HBase API
 
Apache Spark & Hadoop
Apache Spark & HadoopApache Spark & Hadoop
Apache Spark & Hadoop
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
Modern Data Architecture
Modern Data ArchitectureModern Data Architecture
Modern Data Architecture
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
 
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Apache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and SmarterApache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and Smarter
 
MapR Data Analyst
MapR Data AnalystMapR Data Analyst
MapR Data Analyst
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 

Semelhante a MapR Tutorial Series

Best Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on SolarisBest Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on SolarisJignesh Shah
 
MySQL 内存分析
MySQL 内存分析MySQL 内存分析
MySQL 内存分析YUCHENG HU
 
Feed me more: MySQL Memory analysed
Feed me more: MySQL Memory analysedFeed me more: MySQL Memory analysed
Feed me more: MySQL Memory analysedRaghavendra Prabhu
 
General commands for navisphere cli
General commands for navisphere cliGeneral commands for navisphere cli
General commands for navisphere climsaleh1234
 
MongoDB Replication and Sharding
MongoDB Replication and ShardingMongoDB Replication and Sharding
MongoDB Replication and ShardingTharun Srinivasa
 
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...Joao Galdino Mello de Souza
 
제2회난공불락 오픈소스 세미나 커널튜닝
제2회난공불락 오픈소스 세미나 커널튜닝제2회난공불락 오픈소스 세미나 커널튜닝
제2회난공불락 오픈소스 세미나 커널튜닝Tommy Lee
 
LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)Pekka Männistö
 
LizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webLizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webSzymon Haly
 
SO-Memoria.pdf
SO-Memoria.pdfSO-Memoria.pdf
SO-Memoria.pdfKadu37
 
Pain points with M3, some things to address them and how replication works
Pain points with M3, some things to address them and how replication worksPain points with M3, some things to address them and how replication works
Pain points with M3, some things to address them and how replication worksRob Skillington
 
Champion Fas Deduplication
Champion Fas DeduplicationChampion Fas Deduplication
Champion Fas DeduplicationMichael Hudak
 
Advanced Namespaces and cgroups
Advanced Namespaces and cgroupsAdvanced Namespaces and cgroups
Advanced Namespaces and cgroupsKernel TLV
 
Shift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL PerformanceShift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL PerformanceMapR Technologies
 

Semelhante a MapR Tutorial Series (20)

Best Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on SolarisBest Practices with PostgreSQL on Solaris
Best Practices with PostgreSQL on Solaris
 
MySQL 内存分析
MySQL 内存分析MySQL 内存分析
MySQL 内存分析
 
Feed me more: MySQL Memory analysed
Feed me more: MySQL Memory analysedFeed me more: MySQL Memory analysed
Feed me more: MySQL Memory analysed
 
General commands for navisphere cli
General commands for navisphere cliGeneral commands for navisphere cli
General commands for navisphere cli
 
MongoDB Replication and Sharding
MongoDB Replication and ShardingMongoDB Replication and Sharding
MongoDB Replication and Sharding
 
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
 
Cassandra admin
Cassandra adminCassandra admin
Cassandra admin
 
제2회난공불락 오픈소스 세미나 커널튜닝
제2회난공불락 오픈소스 세미나 커널튜닝제2회난공불락 오픈소스 세미나 커널튜닝
제2회난공불락 오픈소스 세미나 커널튜닝
 
LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)
 
LizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webLizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-web
 
Cs8493 unit 4
Cs8493 unit 4Cs8493 unit 4
Cs8493 unit 4
 
Tune hadoop
Tune hadoopTune hadoop
Tune hadoop
 
Operating Systems
Operating SystemsOperating Systems
Operating Systems
 
Vmfs
VmfsVmfs
Vmfs
 
SO-Memoria.pdf
SO-Memoria.pdfSO-Memoria.pdf
SO-Memoria.pdf
 
SO-Memoria.pdf
SO-Memoria.pdfSO-Memoria.pdf
SO-Memoria.pdf
 
Pain points with M3, some things to address them and how replication works
Pain points with M3, some things to address them and how replication worksPain points with M3, some things to address them and how replication works
Pain points with M3, some things to address them and how replication works
 
Champion Fas Deduplication
Champion Fas DeduplicationChampion Fas Deduplication
Champion Fas Deduplication
 
Advanced Namespaces and cgroups
Advanced Namespaces and cgroupsAdvanced Namespaces and cgroups
Advanced Namespaces and cgroups
 
Shift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL PerformanceShift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
Shift into High Gear: Dramatically Improve Hadoop & NoSQL Performance
 

Último

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Último (20)

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...

MapR Tutorial Series

  • 1. MapR Learning Guide. Selvaraaju Murugesan, May 6, 2017.
  • 2. Storage Pool: MapR-FS groups disks into storage pools, usually made up of two or three disks. The Stripe Width parameter lets you configure the number of disks per storage pool. Each node in a MapR cluster can support up to 36 storage pools. Use the mrconfig command to create, remove, and manage storage pools, disk groups, and disks.
  • 3. Example 1: If you have 11 disks in a node, how many storage pools will be created by default?
  • 4. Example 1 Solution: With 11 disks in a node, four storage pools are created by default: 3 storage pools of 3 disks each and 1 storage pool of 2 disks.
  • 5. Example 2: If you have 9 disks in a node, how many storage pools will be created by default?
  • 6. Example 2 Solution: With 9 disks in a node, three storage pools are created by default: 3 storage pools of 3 disks each.
  • 7. Tradeoffs: If a disk fails in a storage pool, the entire storage pool is taken offline and MapR automatically begins data replication. More disks per storage pool mean more data to be replicated in case of a disk failure. The ideal scenario is 3 disks per storage pool. Use disk drives of the same size and speed within a storage pool for good performance.
  • 8. List of Ports (port number, service): 7221, CLDB; 8443, MCS; 9443, MapR Installer; 8888, Hue; 8047, Drill; 5181, ZooKeeper; 19888, ResourceManager.
  • 9. Default Settings: If a disk fails, data replication starts immediately. If a node fails, data replication starts after an hour (60 minutes). The default node maintenance timeout is 1 hour, after which data replication starts (the timeout is configurable). To see or change the configuration, use the command maprcli config load. If the CLDB heartbeat is greater than 5 seconds, an alarm is raised and must be cleared manually. A secondary CLDB in a node will perform read operations.
  • 10. CLDB: The name container holds the metadata for the files and directories in the volume, and the first 64 KB of each file. The data container and name container can have different replication factors. Data replication happens at the volume level. For high availability, install ZooKeeper on more nodes. /opt/mapr/roles contains the list of configured services on a given node. /opt/cores holds core files, which are copies of the contents of memory taken when certain anomalies are detected; the name of each core file includes the name of the service that experienced an issue. When a core file is created, an alarm is raised.
  • 11. ZooKeeper: To start ZooKeeper: service mapr-zookeeper start. To stop ZooKeeper: service mapr-zookeeper stop. To check the status of ZooKeeper: service mapr-zookeeper qstatus. ZooKeeper should always be the first service that is started.
  • 12. MapR Commands: To list the services on a node: maprcli service list, or maprcli node list -columns id,ip,svc. To list CLDBs: maprcli node listcldbs. CLDB master: maprcli node cldbmaster. Node topology: maprcli node topo.
  • 13. Cluster Permissions: Log into the MCS (login); this level also includes permissions to use the API and command-line interface, and grants read access on the cluster and its volumes. Start and stop services (SS). Create volumes (CV). Edit and view Access Control Lists, or permissions (A). Full control gives the user the ability to do everything except edit permissions (FC).
  • 14. Volume Permissions: Dump or back up the volume (dump). Mirror or restore the volume (restore). Modify volume properties, which includes creating and deleting snapshots (m). Delete the volume (d). View and edit volume permissions (A). Perform all operations except view and edit volume permissions (FC).
  • 15. MapR Utilities: configure.sh sets up a cluster node and changes services such as ZooKeeper, CLDB, etc. disksetup formats specified disks for use by MapR storage. fsck is used to find and fix inconsistencies in the filesystem, to make the metadata consistent on the next load of the storage pool. gfsck performs a scan and repair operation on a cluster, volume, or snapshot.
  • 16. MapR Utilities: mrconfig creates, removes, and manages storage pools, disk groups, and disks, and provides information about containers. mapr-support-collect.sh collects diagnostic information from all nodes in the cluster. mapr-support-dump.sh collects node and cluster-level information about the node where the script is invoked. cldbguts monitors the activity of the CLDB.
  • 17. NTP Server: All nodes should synchronize to one internal NTP server. Commands: systemctl, ntpq.
  • 18. Logs: Centralised logging: logs are kept for 30 days by default, with symbolic links to the logs. Local logging: logs are kept for 3 hours by default; YARN logs expire after 3 hours, and the time starts after the job begins. Logs stored in /opt/mapr/logs are deleted after 10 days by default. Change the settings in the yarn-site.xml file. Retention times are given in seconds.
  • 19. Space Requirements: /opt: 128 GB. /tmp: 10 GB. /opt/mapr/zkdata: 500 MB. Swap space: 110% of physical memory, with a minimum of 24 GB and a maximum of 128 GB. Use LVM for boot drives.
  • 20. Volume Quota: Once the advisory quota is reached, an alarm is raised. Once the hard quota is reached, no further data is written. Only the compressed data size is counted against the volume quota.
  • 21. Pre / Post-Installation Check: Pre-installation checks: Stream (CPU); Iozone (I/O speed memory, destructive write/read); Rpctest (network speed). Post-installation checks: DFSIO (I/O speed memory, MapReduce job); RWspeedtest; TeraGen / TeraSort (MapReduce job). A TeraSort job can suggest a possible problem with a hard drive or controller.
  • 22. Snapshot / Mirror: Snapshots are stored at the top level of every volume (in a hidden directory). Scheduled snapshots expire automatically. Mirror start: starts the mirror operation between source and destination. Mirror push: pushes updates from the source volume to all mirror volumes. The mirror operation uses 70% of network bandwidth; files are compressed.
  • 23. Role / Disk Balancer: The disk balancer redistributes data across all nodes; use it after you have added many new nodes. Percentage of concurrent disk rebalancing: 2 to 30%. The role balancer evenly distributes master containers; it is off by default and starts 30 minutes after the CLDB (configurable). Delay for active data: 120 sec to 1800 sec (2 min to 30 min).
  • 24. Job Scheduler: The Fair Scheduler is the default; FIFO and Capacity schedulers are the alternatives. Scheduling can be based on memory, and also on CPU. Each user has their own queue. Weights are used to set resources. The allocation file (reloaded every 10 seconds) is used to modify resource managers: /opt/mapr/Hadoop/version/etc/hadoop/fair-scheduler.xml.
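
The default grouping worked through in Examples 1 and 2 (slides 3 to 6) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming disks are grouped three at a time and any leftover disks form one smaller pool, as in the guide's solutions. The function name `default_storage_pools` and the `stripe_width` parameter are hypothetical and are not part of any MapR tool.

```python
def default_storage_pools(num_disks, stripe_width=3):
    """Return the sizes of the storage pools formed from num_disks disks.

    Assumption: disks are grouped stripe_width at a time, and leftover
    disks form one smaller pool, matching the guide's worked examples.
    """
    if num_disks < 0 or stripe_width < 1:
        raise ValueError("num_disks must be >= 0 and stripe_width >= 1")
    full_pools, leftover = divmod(num_disks, stripe_width)
    pools = [stripe_width] * full_pools
    if leftover:
        pools.append(leftover)
    return pools


print(default_storage_pools(11))  # [3, 3, 3, 2]
print(default_storage_pools(9))   # [3, 3, 3]
```

The two calls reproduce the answers on the solution slides: 11 disks give three pools of 3 disks plus one pool of 2, and 9 disks give three pools of 3.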
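
The swap space rule on slide 19 (110% of physical memory, with a minimum of 24 GB and a maximum of 128 GB) amounts to a clamp. The sketch below assumes the minimum and maximum apply to the swap size itself, which is one reading of the slide; the function name `recommended_swap_gb` is hypothetical.

```python
def recommended_swap_gb(physical_memory_gb):
    """Return a swap size in GB following the guide's rule of thumb.

    Assumption: swap is 110% of physical memory, clamped to the
    range 24 GB to 128 GB stated on the Space Requirements slide.
    """
    if physical_memory_gb <= 0:
        raise ValueError("physical_memory_gb must be positive")
    target = 1.1 * physical_memory_gb
    return min(max(target, 24.0), 128.0)


# A small node is raised to the 24 GB minimum,
# and a large node is capped at the 128 GB maximum.
print(recommended_swap_gb(16))   # 24.0
print(recommended_swap_gb(256))  # 128.0
```

For a node between the two limits, such as one with 64 GB of memory, the result is simply 110% of the physical memory.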