1. © Hortonworks Inc. 2013
Managing Your Hadoop Clusters
with Apache Ambari
Hadoop Summit
June 2013
2. © Hortonworks Inc. 2013
Hello!
• Yusaku Sako
–Committer / PPMC member, Apache Ambari
–Member of Technical Staff @ Hortonworks
–yusaku@hortonworks.com
• Jeff Sposetti
–Contributor, Apache Ambari
–Director of Product Management @ Hortonworks
–jeff@hortonworks.com
3. © Hortonworks Inc. 2013
Today, We’ll Go Over…
• Intro
• Open Source Activity
• Demo
• Futures
• Architecture
• Recent Developments
• Q & A
4. © Hortonworks Inc. 2013
Ambari: Enterprise Hadoop Operations
Ambari is the only 100% open source framework for
provisioning, managing and monitoring Apache
Hadoop clusters
[Diagram labels: HADOOP (Storage & Process at Scale); AMBARI (PROVISION, MANAGE, MONITOR); AMBARI WEB]
5. © Hortonworks Inc. 2013
Features Today
Provisioning: Simplified deployment across platforms
- Wizard-driven cluster install experience
- Deploy 10s, 100s, or 1000s of Hadoop servers
- Cloud, virtual and physical environments
Managing: Consistent controls across the Stack
- Single point for cluster operations
- Customize without dealing with Hadoop complexities
- Advanced configurations and host controls
Monitoring: Visibility into key cluster metrics
- Single pane of glass for Hadoop & System status
- Pre-configured metrics & alerts
6. © Hortonworks Inc. 2013
Apache Ambari – 100% Open Source!
• Active community
• 50+ Contributors / 20+ Committers
• 140+ Ambari User Group Members
• Steady progress/release cycle
Release Version   Release Date   JIRAs Resolved
0.9.0             Sep 2012       402
1.2.0             Feb 2013       441
1.2.1             Mar 2013       134
1.2.2             Apr 2013       106
1.2.3             Jun 2013       515
1.2.4             Jul 2013       109+
1.2.5             Jul 2013       131+
Current Release
Today’s Demo
7. © Hortonworks Inc. 2013
Ambari System Architecture
[Diagram labels: Ambari Web; REST (/clusters); Ambari Server; DB; Hosts, each with an Agent and gmond; Ganglia Server (gmetad, gmond, Agent); Nagios Server (Agent)]
10. © Hortonworks Inc. 2013
Host Group Configuration Controls
• Set custom configuration properties at the host level
for one or more hosts
• Important for handling “heterogeneous” clusters
• AMBARI-1509 and AMBARI-1370
HEAPSIZE= 1024
HEAPSIZE= 2048
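The HEAPSIZE example above can be sketched as a simple merge of cluster-wide defaults with per-host overrides. This is only an illustration of the idea, not Ambari's implementation: the function name `effective_config`, the host names, and the data layout are all hypothetical.

```python
from typing import Dict, Mapping


def effective_config(cluster_defaults: Mapping[str, str],
                     host_overrides: Mapping[str, Mapping[str, str]],
                     host: str) -> Dict[str, str]:
    """Return the cluster-wide properties with any overrides for `host` applied."""
    merged = dict(cluster_defaults)              # start from the shared defaults
    merged.update(host_overrides.get(host, {}))  # host-level values win
    return merged


# Hypothetical data: most hosts use 1024, one larger host uses 2048.
defaults = {"HEAPSIZE": "1024"}
overrides = {"bighost1": {"HEAPSIZE": "2048"}}

small = effective_config(defaults, overrides, "host1")
big = effective_config(defaults, overrides, "bighost1")
```

Hosts without an entry in `overrides` simply inherit the defaults, which is what keeps a mostly uniform cluster easy to manage.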
11. © Hortonworks Inc. 2013
Cluster Blueprints
• Perform “Headless Install”
• Perform “Cluster Takeover”
• Export blueprint from cluster
• Boot & save wizard w/blueprint
• AMBARI-1783
[Diagram labels: BLUEPRINT (<stack>, <host>, <service>, <component>, <config>); Ambari Server; HOST MANIFEST (<host>, <meta>); SERVICE CONFIGS (<props>)]
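As a rough illustration of what an exported blueprint might contain, the sketch below combines a stack name, a host-to-component layout, and service configurations into one JSON document. The structure and every name in it (`build_blueprint`, `EXAMPLE-STACK`, `example.property`) are assumptions made for illustration; the slide does not define the blueprint format.

```python
import json


def build_blueprint(stack, hosts, configs):
    """Combine stack name, host-to-component layout, and service configs."""
    return {
        "stack": stack,
        "hosts": [
            {"host": name, "components": sorted(components)}
            for name, components in sorted(hosts.items())
        ],
        "configs": configs,
    }


# Hypothetical inputs, loosely mirroring the diagram labels.
blueprint = build_blueprint(
    stack="EXAMPLE-STACK",
    hosts={"host1": ["NAMENODE"], "host2": ["DATANODE"]},
    configs={"core-site": {"example.property": "value"}},
)

# Serialize for export, then load it back as a "headless install" input would.
text = json.dumps(blueprint, indent=2, sort_keys=True)
restored = json.loads(text)
```

Keeping the blueprint as plain JSON means the same document could be exported from one cluster and fed to another without going through the wizard.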
12. © Hortonworks Inc. 2013
Hadoop 2.0 Support
• Provision, manage, and monitor the Hadoop 2.0 Stack
• HDFS2, YARN, Tez
• Rolling Cluster Upgrades
–Enable cluster upgrades, one host at a time, in such a way that
services and resources offered by the cluster are always available
throughout the upgrade process
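The one-host-at-a-time idea can be sketched as a loop that upgrades a host, checks its health, and halts on the first failure so the rest of the cluster keeps serving. This is a minimal sketch under assumed names (`rolling_upgrade`, `fake_upgrade`, the version strings); it is not how Ambari implements rolling upgrades.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(hosts: Iterable[str],
                    upgrade_host: Callable[[str], None],
                    is_healthy: Callable[[str], bool]) -> List[str]:
    """Upgrade hosts one at a time; halt at the first host that fails its check."""
    done = []
    for host in hosts:
        upgrade_host(host)
        if not is_healthy(host):
            # Stop here so the not-yet-upgraded hosts keep serving requests.
            raise RuntimeError(f"{host} unhealthy after upgrade; halting rollout")
        done.append(host)
    return done


# Stand-in state and callbacks for illustration only.
versions = {"host1": "1.x", "host2": "1.x", "host3": "1.x"}


def fake_upgrade(host: str) -> None:
    versions[host] = "2.0"


upgraded = rolling_upgrade(sorted(versions), fake_upgrade,
                           lambda h: versions[h] == "2.0")
```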
13. © Hortonworks Inc. 2013
Ambari Architecture
[Diagram labels: Ambari Web (javascript; Web Client, 100% REST); REST API for integration (/clusters, /services, /hosts, /workflows/jobs, /users, …); Ambari Server (java): Request Dispatcher, Orchestrator SPI, Cluster Configurations, Metrics, Alerts, AuthProvider; DB (RDBMS); User Store (RDBMS, AD/LDAP); Auth Provider; Ambari Agents (python, puppet); Pluggable Service Providers (ganglia, nagios, jmx); falcon (Data Mgmt)]
14. © Hortonworks Inc. 2013
REST API – Centralized & Consistent
[Diagram labels: Ambari REST API on :8080; requests via HTTP GET, POST, PUT, DELETE; responses as HTTP Status Code / JSON; resource groups: Alerts, Job History, Metrics, Configurations, Hosts / Services, Cluster; back ends: Nagios Server, Ganglia Server, JMX (Realtime and Historical metrics), Config DB, Job History DB, Config files (core-site.xml, *-site.xml…)]
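A client of this API only needs an HTTP verb, a path, and JSON handling. The sketch below builds (but does not send) requests with Python's standard library and decodes a response body by status code. The host name `ambari.example.com` is hypothetical, the `/api/v1` prefix is an assumption based on the docs path cited in this deck, and authentication is left out.

```python
import json
import urllib.request

# Hypothetical server; port 8080 is the one shown on the slide.
AMBARI_BASE = "http://ambari.example.com:8080/api/v1"


def build_request(method, path, body=None):
    """Build an HTTP request for the given verb and resource path (not sent)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(AMBARI_BASE + path, data=data, method=method)


def parse_response(status, raw_body):
    """Return decoded JSON for a 2xx status; raise for anything else."""
    if not 200 <= status < 300:
        raise RuntimeError(f"request failed with HTTP {status}")
    return json.loads(raw_body) if raw_body else {}


get_clusters = build_request("GET", "/clusters")
parsed = parse_response(200, b'{"items": []}')
```

To actually issue the call, the request object would be passed to `urllib.request.urlopen`, with credentials added as the server requires.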
15. © Hortonworks Inc. 2013
REST API Resource Tree
• Resources
• Clusters
• Services (HDFS, MR, HIVE…)
• Components (NAMENODE, DATANODE…)
• Hosts
• Host Components (DATANODE on host1…)
• Configurations (core-site, mapred-site, …)
• Workflows (Hive queries, Pig scripts, MR programs)
• Jobs (spawned MR jobs…)
• Task Attempts (Map, Shuffle, Reduce…)
• Stacks (HDP, other distros)
• https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md
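Because the resources nest (cluster, then service, then component, and so on), paths can be assembled from alternating collection names and identifiers. The helper below is a hypothetical sketch; the collection names it uses (`services`, `components`, `hosts`, `host_components`) and the cluster name `c1` are assumptions modeled on the resource names above, so check the linked API docs for the exact paths.

```python
from urllib.parse import quote


def resource_path(*segments):
    """Join alternating collection names and resource ids into a URL path."""
    return "/" + "/".join(quote(s, safe="") for s in segments)


# Hypothetical cluster "c1" with a DATANODE running on "host1".
cluster = resource_path("clusters", "c1")
datanode_component = resource_path(
    "clusters", "c1", "services", "HDFS", "components", "DATANODE")
host_component = resource_path(
    "clusters", "c1", "hosts", "host1", "host_components", "DATANODE")
```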
16. © Hortonworks Inc. 2013
Ambari + Teradata Viewpoint Integration
• Ambari = Key enabler for
integrating Hadoop monitoring
capabilities into Viewpoint
• Viewpoint uses the Ambari REST API
and Custom Service Providers to
get Hadoop metrics from a cluster
that was not deployed by Ambari
17. © Hortonworks Inc. 2013
Stack Definitions
• Design Goals
–Ambari should be able to support choice of Hadoop stacks
–Ambari should enable adding new components to an existing
stack
• Define which Services are available (services)
• Define where to get the packages (repos)
[Diagram labels: Stack A (repos, services); Stack B (repos, services); Stack C extends Stack B (repos, services, plus an added service S+)]
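The "Stack C extends Stack B" idea can be sketched as resolving a stack definition by walking its chain of parents. The data layout, the merge rules (a child's repos replace the parent's, services are merged), and all names here are assumptions for illustration, not Ambari's actual stack definition format.

```python
def resolve_stack(name, stacks):
    """Return repos and services for `name`, inheriting from any `extends` parent."""
    chain, seen, current = [], set(), name
    while current is not None:
        if current in seen:
            raise ValueError(f"circular 'extends' involving {current}")
        seen.add(current)
        chain.append(stacks[current])
        current = stacks[current].get("extends")

    resolved = {"repos": [], "services": {}}
    for definition in reversed(chain):          # apply parents first
        if definition.get("repos"):
            resolved["repos"] = list(definition["repos"])   # child repos replace
        resolved["services"].update(definition.get("services", {}))  # child adds
    return resolved


# Hypothetical definitions: Stack C reuses Stack B and adds one service.
stacks = {
    "B": {"repos": ["repo-b"], "services": {"S1": "1.0", "S2": "1.0"}},
    "C": {"extends": "B", "services": {"S3": "1.0"}},
}
stack_c = resolve_stack("C", stacks)
```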
18. © Hortonworks Inc. 2013
Ambari + Red Hat GlusterFS Integration
• Using Ambari to deploy / manage a cluster with a
distributed file system other than HDFS
–HCFS: GlusterFS as first implementation
–Pluggability with other HCFS implementations
–See AMBARI-1817
[Diagram labels: MapReduce, Hive, HBase, Pig on top of a Distributed File System layer (HDFS, GlusterFS, Other HCFS …)]
19. © Hortonworks Inc. 2013
Ambari + Accumulo Integration
• Using Ambari to deploy / manage a cluster with
Accumulo
–Google Summer of Code project
–See AMBARI-1930
[Diagram labels: MapReduce, Hive, HBase, Pig on top of a Distributed File System layer]
20. © Hortonworks Inc. 2013
Ambari + Splunk Integration
• Head over to Splunk’s Expo booth to learn about
Ambari integrated into Splunk’s Management UI
21. © Hortonworks Inc. 2013
Get Involved!
• Project Website
– http://incubator.apache.org/ambari/
• Check out Ambari
– Try installing your own cluster! (See project website for instructions)
• Mailing Lists
– ambari-user@incubator.apache.org
– ambari-dev@incubator.apache.org
• IRC Channel
– @apacheambari