How Ambari manifest files
are used by System Center
and Windows Azure
Brian Swan
Program Manager, HDInsight Team
Microsoft
Manifest Files - Overview
PackageDefinition.json
A representation of the software packages to be installed on a cluster
(typically Hadoop, but also any custom packages, such as Java or
Python). This representation captures all the invariants, such as the
services, components, and properties associated with a specific package.
Authored by the package distributor.
HostComponentMapping.json
A mapping between a package component and one or more logical
host groups defined in the host manifest.
Authored by the Hadoop Admin.
HostManifest.json
Contains a list of logical host definitions, system-level resources, and
(optionally) the actual hosts that fall into the host definition categories.
When actual hosts are not described, the manifest instead includes
references that are realized by on-demand services (such as a cloud
provider). A logical host group may contain one or more hosts.
Authored by the System Admin.
PackageConfiguration.json
Captures the specific configuration for a deployment at the cluster
level, as well as overrides at the service and component levels.
Authored by the Hadoop Admin.
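How the four manifests relate to each other can be sketched in Python. The slides do not show the JSON schema, so every field name and value below (host group contents, property names such as `log_level`) is a hypothetical stand-in, not the actual manifest format.

```python
from typing import Dict, List

# Hypothetical, simplified stand-ins for the four manifest files.
# PackageDefinition: services and the components they contain.
PACKAGE_DEFINITION: Dict[str, List[str]] = {
    "HDFS": ["NameNode", "DataNode"],
    "MapReduce": ["JobTracker", "TaskTracker"],
}
# HostManifest: logical host groups and (optionally) the actual hosts.
HOST_MANIFEST: Dict[str, List[str]] = {
    "MASTER_HOSTS": ["VM1", "VM2"],
    "SLAVE_HOSTS": ["VM3", "VM4"],
}
# HostComponentMapping: component -> one or more logical host groups.
HOST_COMPONENT_MAPPING: Dict[str, List[str]] = {
    "NameNode": ["MASTER_HOSTS"],
    "JobTracker": ["MASTER_HOSTS"],
    "DataNode": ["SLAVE_HOSTS"],
    "TaskTracker": ["SLAVE_HOSTS"],
}
# PackageConfiguration: cluster-level values plus service/component overrides.
PACKAGE_CONFIGURATION = {
    "cluster": {"log_level": "INFO", "heap_mb": 1024},
    "services": {"HDFS": {"heap_mb": 2048}},
    "components": {"DataNode": {"log_level": "DEBUG"}},
}


def hosts_for(component: str) -> List[str]:
    """Resolve a component to actual hosts via its logical host groups."""
    hosts: List[str] = []
    for group in HOST_COMPONENT_MAPPING[component]:
        for host in HOST_MANIFEST[group]:
            if host not in hosts:
                hosts.append(host)
    return hosts


def effective_config(service: str, component: str) -> Dict[str, object]:
    """Apply overrides in order: cluster, then service, then component."""
    config = dict(PACKAGE_CONFIGURATION["cluster"])
    config.update(PACKAGE_CONFIGURATION["services"].get(service, {}))
    config.update(PACKAGE_CONFIGURATION["components"].get(component, {}))
    return config
```

Here `hosts_for("DataNode")` resolves through SLAVE_HOSTS, and `effective_config("HDFS", "DataNode")` layers the service and component overrides on top of the cluster-level values.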
Deployment using System Center
Note: The tools described here for deploying Hadoop clusters using System
Center are prototype tools used internally at Microsoft. The intent here is to
demonstrate one consumer of cluster manifest files.
System Center – Prerequisites
• System Center 2013
• VM running Virtual Machine Manager (VMM) with…
• Hadoop Service Template (HadoopServiceTemplate.xml)
• Windows Server VHD (Win.vhd)
• HDInsight Deployment Tool (HDInsightDeployment.exe)
• Deployment Database (SQL Server)
Phase 1: Parse, Validate, Populate DB
• Copy manifest files to Deployment Tool directory.
• Update the Deployment Tool configuration file (HDInsightDeployment.exe.config).
• Start deployment with HDInsightDeployment.exe.
• Deployment tool reads and validates manifest files.
  • Schema validation.
  • Dependency validation.
• Deployment DB is populated with steps for creating system
resources on hosts (e.g. Users/Groups/Firewall Rules/etc.)
• Deployment DB is populated with ordered steps for installing
Hadoop (and other packages).
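Dependency validation and the generation of ordered install steps can be sketched as follows. This is only an illustration of the idea, not the prototype tool's code: the dependency table is made up, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever ordering logic the Deployment Tool uses.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Hypothetical dependency table: package -> packages it requires.
DEPENDENCIES: Dict[str, Set[str]] = {
    "Java": set(),
    "HDFS": {"Java"},
    "MapReduce": {"HDFS", "Java"},
}


def validate_dependencies(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one message per dependency that is not itself defined."""
    errors: List[str] = []
    for package, requires in deps.items():
        for required in sorted(requires):
            if required not in deps:
                errors.append(f"{package} depends on undefined package {required}")
    return errors


def ordered_install_steps(deps: Dict[str, Set[str]]) -> List[str]:
    """Validate, then order steps so that dependencies are installed first."""
    errors = validate_dependencies(deps)
    if errors:
        raise ValueError("; ".join(errors))
    try:
        order = list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc
    return [f"install {name}" for name in order]
```

In this sketch the returned list plays the role of the ordered steps written to the Deployment DB; an undefined or circular dependency stops the run before anything is stored.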
Phase 2: Download Packages
• Deployment tool downloads/copies packages to VMM based on
information in PackageDefinition.json.
Phase 3: Provision VMs, Install Packages
• VMM does VM provisioning based on HostManifest.json file.
• Hadoop Service Template (a VMM template) specifies which
system components to install (e.g. Deployment Agent)
  • Starts Deployment Agent
• Deployment Agents pull packages from SCVMM
[Diagram: VMM provisions VM1 to VM4 in the host groups MASTER_HOSTS and SLAVE_HOSTS, with a Deployment Agent on each VM.]
Phase 4: Create System Resources, Install Packages
• Deployment Agents create system resources
(Users/Groups/Firewall Rules/etc.) from steps in
Deployment DB
• Deployment Agents work through steps for
installing Hadoop (and other packages)
  • Packages contain scripts that will be invoked
  for installing custom components (e.g. Java,
  Python, etc.)
• Deployment Agents store the state of each step so that
steps can be retried after failures.
[Diagram: users such as hdfs_user, mapred_user, and hadoop_admin are created on the VMs. NameNode (HDFS) and JobTracker (MapReduce) are installed on two VMs; the other two each get DataNode and TaskTracker.]
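Storing step state so that a failed run can be retried might look like the sketch below. It is an assumption-heavy illustration: a local JSON file stands in for wherever the agents actually record state, and the step names and the `run_steps` helper are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_steps(steps: List[Step], state_file: Path) -> Dict[str, str]:
    """Run steps in order, skipping any already recorded as done."""
    state: Dict[str, str] = {}
    if state_file.exists():
        state = json.loads(state_file.read_text())
    for name, action in steps:
        if state.get(name) == "done":
            continue  # completed in an earlier run, do not repeat it
        try:
            action()
        except Exception:
            # Record the failure before giving up so a retry knows where to resume.
            state[name] = "failed"
            state_file.write_text(json.dumps(state))
            raise
        state[name] = "done"
        state_file.write_text(json.dumps(state))
    return state
```

Calling `run_steps` again with the same state file after a failure resumes at the failed step instead of repeating the steps that already succeeded.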
Deployment in Windows Azure
Phase 1: Submit request, generate manifest files
• Cluster creation request submitted via Windows Azure Portal.
• Deployment Service generates and validates manifest files.
• DA stores manifest files in Blob Storage.
• (Hadoop package files are already in Blob Storage.)
Phase 2: Generate/submit deployment files
• Deployment Service generates Cloud Service deployment files.
• .cspkg: contains Deployment Agent
• .cscfg: contains instance counts for VMs and location of
generated manifest files.
• Cloud Service deployment files are submitted to Windows Azure
Fabric.
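Reading instance counts and the manifest location back out of a .cscfg file could be sketched like this. The element names assume the standard Cloud Service configuration schema; the role names, the `ManifestLocation` setting name, and the URL are hypothetical, since the slides do not show the generated file.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

NS = {"sc": "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"}
MANIFEST_SETTING = "ManifestLocation"  # hypothetical setting name

# Hypothetical example of a generated .cscfg file.
SAMPLE_CSCFG = """<ServiceConfiguration serviceName="HadoopCluster"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="HeadNode">
    <Instances count="1" />
    <ConfigurationSettings>
      <Setting name="ManifestLocation" value="https://example.invalid/manifests/" />
    </ConfigurationSettings>
  </Role>
  <Role name="WorkerNode">
    <Instances count="3" />
    <ConfigurationSettings>
      <Setting name="ManifestLocation" value="https://example.invalid/manifests/" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>"""


def read_cscfg(xml_text: str) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Return (instance count per role, manifest location per role)."""
    root = ET.fromstring(xml_text)
    counts: Dict[str, int] = {}
    locations: Dict[str, str] = {}
    for role in root.findall("sc:Role", NS):
        name = role.get("name")
        counts[name] = int(role.find("sc:Instances", NS).get("count"))
        for setting in role.findall("sc:ConfigurationSettings/sc:Setting", NS):
            if setting.get("name") == MANIFEST_SETTING:
                locations[name] = setting.get("value")
    return counts, locations
```

A Deployment Agent could use the second mapping to find the manifest files in Blob Storage for its own role.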
Phase 3: Provision VMs, Deployment Agent
• Windows Azure Fabric provisions VMs and deploys Deployment
Agent on VMs
[Diagram: Windows Azure Fabric provisions VM1 to VM4, grouped as WEB_ROLES and WORKER_ROLES, each running a Deployment Agent.]
Phase 4: Get manifest files, install components
• Deployment Agent determines environment and VM type.
• Deployment Agent gets manifest files based on location in .cscfg
file.
• Deployment Agent generates in-memory list of activities for
installing components.
• Deployment Agent retrieves packages (based on repo location in
PackageDefinition file).
• Deployment Agent installs components.
[Diagram: NameNode and JobTracker are installed on two of the VMs; the other two each get DataNode and TaskTracker.]
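The in-memory activity list can be sketched as a filter over the component mapping for the VM's own type. All data below is hypothetical: the assignment of components to WEB_ROLES and WORKER_ROLES, the package file name, and the repository URL are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Activity = Tuple[str, str]  # (action, target)


def build_activities(
    vm_type: str,
    component_roles: Dict[str, List[str]],
    component_package: Dict[str, str],
    repo_url: str,
) -> List[Activity]:
    """List the downloads and installs needed for one VM type."""
    activities: List[Activity] = []
    fetched = set()
    for component, roles in component_roles.items():
        if vm_type not in roles:
            continue  # this component does not belong on this VM type
        package = component_package[component]
        if package not in fetched:
            # Download each package once, before its first component.
            activities.append(("download", f"{repo_url}/{package}"))
            fetched.add(package)
        activities.append(("install", component))
    return activities


# Hypothetical mapping data, as it might be read from the manifest files.
COMPONENT_ROLES = {
    "NameNode": ["WEB_ROLES"],
    "JobTracker": ["WEB_ROLES"],
    "DataNode": ["WORKER_ROLES"],
    "TaskTracker": ["WORKER_ROLES"],
}
COMPONENT_PACKAGE = {name: "hadoop.zip" for name in COMPONENT_ROLES}
REPO_URL = "https://example.invalid/repo"
```

For a worker VM, `build_activities("WORKER_ROLES", COMPONENT_ROLES, COMPONENT_PACKAGE, REPO_URL)` yields one download followed by the DataNode and TaskTracker installs.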

TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Último

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 

Último (20)

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 

Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013

  • 1. How Ambari manifest files are used by System Center and Windows Azure Brian Swan Program Manager, HDInsight Team Microsoft
  • 2. Manifest Files - Overview • PackageDefinition.json: A representation of a software package to be installed on a cluster (typically Hadoop, but also any custom package, such as Java or Python). This representation captures all the invariants, such as the services, components, and properties associated with a specific package. Authored by the package distributor. • HostComponentMapping.json: A mapping between a package component and one or more logical host groups defined in the host manifest. Authored by the Hadoop admin. • HostManifest.json: Contains a list of logical host definitions, system-level resources, and (optionally) the actual hosts that fall into the host definition categories. When actual hosts are not described, references that are realized by on-demand services (such as a cloud provider) are included. A logical group may contain one or more hosts. Authored by the system admin. • PackageConfiguration.json: Captures the specific configuration for a deployment at the cluster level, as well as overrides at the service and component levels. Authored by the Hadoop admin.
  • 7. Deployment using System Center Note: The tools described here for deploying Hadoop clusters using System Center are prototype tools used internally at Microsoft. The intent here is to demonstrate one consumer of cluster manifest files.
  • 8. System Center – Prerequisites • System Center 2013 • VM running Virtual Machine Manager (VMM) with: • Hadoop Service Template (HadoopServiceTemplate.xml) • Windows Server VHD (Win.vhd) • HDInsight Deployment Tool (HDInsightDeployment.exe) • Deployment Database (SQL Server)
  • 9. Phase 1: Parse, Validate, Populate DB • Copy manifest files to Deployment Tool directory.
  • 10. Phase 1: Parse, Validate, Populate DB • Copy manifest files to Deployment Tool directory. • Update the Deployment Tool configuration file.
  • 11. Phase 1: Parse, Validate, Populate DB • Copy manifest files to Deployment Tool directory. • Update HDInsightDeployment.exe.config. • Start deployment with HDInsightDeployment.exe. • Deployment tool reads and validates manifest files. • Schema validation. • Dependency validation.
  • 12. Phase 1: Parse, Validate, Populate DB • Copy manifest files to Deployment Tool directory. • Update HDInsightDeployment.exe.config. • Start deployment with HDInsightDeployment.exe. • Deployment tool reads and validates manifest files. • Schema validation. • Dependency validation. • Deployment DB is populated with steps for creating system resources on hosts (e.g. users, groups, firewall rules). • Deployment DB is populated with ordered steps for installing Hadoop (and other packages).
  • 13. Phase 2: Download Packages • Deployment tool downloads/copies packages to VMM based on information in PackageDefinition.json.
  • 14. Phase 3: Provision VMs, Install Packages • VMM provisions VMs based on the HostManifest.json file.
  • 15. Phase 3: Provision VMs, Install Packages • VMM provisions VMs based on the HostManifest.json file. (Diagram: VM1 to VM4 in the logical host groups MASTER_HOSTS and SLAVE_HOSTS.)
  • 16. Phase 3: Provision VMs, Install Packages • VMM provisions VMs based on the HostManifest.json file. • Hadoop Service Template (a VMM template) specifies which system components to install (e.g. Deployment Agent). • Starts Deployment Agent.
  • 17. Phase 3: Provision VMs, Install Packages • VMM provisions VMs based on the HostManifest.json file. • Template specifies which system components to install (e.g. Deployment Agent). • Starts Deployment Agent. (Diagram: a Deployment Agent running on each of VM1 to VM4.)
  • 18. Phase 3: Provision VMs, Install Packages • VMM provisions VMs based on the HostManifest.json file. • Template specifies which system components to install (e.g. Deployment Agent). • Starts Deployment Agent. • Deployment Agents pull packages from SCVMM.
  • 19. Phase 4: Create System Resources, Install Packages • Deployment Agents create system resources (users, groups, firewall rules, etc.) from steps in Deployment DB. (Diagram: hdfs_user, mapred_user, and hadoop_admin accounts shown on the VMs.)
  • 20. Phase 4: Create System Resources, Install Packages • Deployment Agents create system resources (users, groups, firewall rules, etc.) from steps in Deployment DB. • Deployment Agents work through steps for installing Hadoop (and other packages). • Packages contain scripts that will be invoked for installing custom components (e.g. Java, Python). (Diagram: one VM runs the HDFS NameNode, one runs the MapReduce JobTracker, and two run an HDFS DataNode plus a MapReduce TaskTracker.)
  • 21. Phase 4: Create System Resources, Install Packages • Deployment Agents create system resources (users, groups, firewall rules, etc.) from steps in Deployment DB. • Deployment Agents work through steps for installing Hadoop (and other packages). • Packages contain scripts that will be invoked for installing custom components (e.g. Java, Python). • Deployment Agents store the state of each step so that it can be retried after a failure.
  • 23. Phase 1 (Windows Azure): Submit request, generate manifest files • Cluster creation request submitted via Windows Azure Portal. • Deployment Service generates and validates manifest files. • DA stores manifest files in Blob Storage. • (Hadoop package files are already in Blob Storage.)
  • 24. Phase 2 (Windows Azure): Generate/submit deployment files • Deployment Service generates Cloud Service deployment files. • .cspkg: contains Deployment Agent. • .cscfg: contains instance counts for VMs and location of generated manifest files. • Cloud Service deployment files are submitted to Windows Azure Fabric.
  • 25. Phase 3 (Windows Azure): Provision VMs, Deployment Agent • Windows Azure Fabric provisions VMs and deploys Deployment Agent on VMs.
  • 26. Phase 3 (Windows Azure): Provision VMs, Deployment Agent • Windows Azure Fabric provisions VMs and deploys Deployment Agent on VMs. (Diagram: VM1 to VM4 in WEB_ROLES and WORKER_ROLES, each running a Deployment Agent.)
  • 27. Phase 4 (Windows Azure): Get manifest files, install components • Deployment Agent determines environment and VM type. • Deployment Agent gets manifest files based on location in .cscfg file.
  • 28. Phase 4 (Windows Azure): Get manifest files, install components • Deployment Agent generates in-memory list of activities for installing components. • Deployment Agent retrieves packages (based on repo location in PackageDefinition file).
  • 29. Phase 4 (Windows Azure): Get manifest files, install components • Deployment Agent installs components. (Diagram: one VM runs NameNode, one runs JobTracker, and two run DataNode plus TaskTracker.)
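
The dependency validation step in Phase 1 can be pictured with a small sketch. The slides do not show the manifest schemas, so the dictionaries below are hypothetical stand-ins for PackageDefinition.json, HostManifest.json, and HostComponentMapping.json: every key name, the `requires` lists, and the function `validate_manifests` are assumptions made for illustration, not the deployment tool's actual format or code. The three checks mirror the examples given in the editor's notes: the mapped package has a definition, host groups agree across files, and each selected component's dependencies are also selected.

```python
# Hypothetical manifest contents as Python dicts. The real JSON schemas are not
# shown in the slides, so every key name and dependency below is an assumption.
PACKAGE_DEFINITION = {
    "name": "hadoop",
    "components": {
        "NAMENODE": {"requires": []},
        "DATANODE": {"requires": ["NAMENODE"]},
        "JOBTRACKER": {"requires": ["NAMENODE"]},
        "TASKTRACKER": {"requires": ["JOBTRACKER"]},
    },
}

# Logical host group name -> number of instances in that group.
HOST_MANIFEST = {"host_groups": {"MASTER_HOSTS": 2, "SLAVE_HOSTS": 2}}

# Component name -> logical host groups it should be installed on.
HOST_COMPONENT_MAPPING = {
    "package": "hadoop",
    "mapping": {
        "NAMENODE": ["MASTER_HOSTS"],
        "JOBTRACKER": ["MASTER_HOSTS"],
        "DATANODE": ["SLAVE_HOSTS"],
        "TASKTRACKER": ["SLAVE_HOSTS"],
    },
}


def validate_manifests(package_definition, host_manifest, mapping_manifest):
    """Return a list of problems; an empty list means the manifests agree."""
    problems = []

    # Check 1: the mapped package must have a matching package definition.
    if mapping_manifest["package"] != package_definition["name"]:
        problems.append(f"no package definition for {mapping_manifest['package']}")

    known_components = package_definition["components"]
    known_groups = host_manifest["host_groups"]
    selected = set(mapping_manifest["mapping"])

    for component, groups in mapping_manifest["mapping"].items():
        if component not in known_components:
            problems.append(f"unknown component {component}")
            continue
        # Check 2: host groups must be consistent across the two files.
        for group in groups:
            if group not in known_groups:
                problems.append(
                    f"{component} is mapped to undefined host group {group}"
                )
        # Check 3: every dependency of a selected component must be selected.
        for dependency in known_components[component]["requires"]:
            if dependency not in selected:
                problems.append(
                    f"{component} requires {dependency}, which is not selected"
                )
    return problems


if __name__ == "__main__":
    for problem in validate_manifests(
        PACKAGE_DEFINITION, HOST_MANIFEST, HOST_COMPONENT_MAPPING
    ):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets an admin fix all inconsistencies in a single pass before any VM is provisioned.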

Editor's Notes

  1. Dependency validation is validation to make sure the cluster can run once it is deployed. In Azure, the Deployment DB is replaced by in-memory storage of info. In Azure and VMM, the HostManifest only specifies the number of instances in each logical host group. The host groups are defined in the template (in VMM) or by Azure. PackageDefinition: specifies settings for components selected in the Host-Component-Mapping file.
  2. Dependency validation is validation to make sure the cluster can run once it is deployed. In Azure, the Deployment DB is replaced by in-memory storage of info. In Azure and VMM, the HostManifest only specifies the number of instances in each logical host group. The host groups are defined in the template (in VMM) or by Azure. PackageDefinition: specifies settings for components selected in the Host-Component-Mapping file. Note that SQL Authentication is shown in the sqlConnectionString. In a production environment, Integrated Authentication is/should be used.
  3. Dependency validation is validation to make sure the cluster can run once it is deployed. Examples include: Is there a Package Definition that matches the package specified in the Host-Component-Mapping? Are host groups consistent across the Host-Component-Mapping and Host Manifest files? If Hive is selected for installation, are its dependencies selected and available? In Azure, the Deployment DB is replaced by in-memory storage of info. In Azure and VMM, the HostManifest only specifies the number of instances in each logical host group. The host groups are defined in the template (in VMM) or by Azure. PackageDefinition: specifies settings for components selected in the Host-Component-Mapping file.
  4. Deployment DB is populated with ordered steps for installing Hadoop (and other packages). For example: install the HDFS service before MapReduce; install the NameNode component before the DataNode component.
  5. Deployment Agents store the state of each step for retries upon failure. E.g. if the NameNode install fails, it will be retried. If the NameNode install fails, the DataNode install will not proceed. Once the issue is resolved, the Deployment Agent picks up from the last successful step.
  6. Deployment Service is transparent to users. Deployment Service is a Cloud Service running in Windows Azure. Currently, the manifest files are mostly static. The HostManifest file isn’t used at all. VM information is handled by Azure Fabric. We have flexibility going forward to incorporate user input (e.g. configuration overrides). Manifest files are stored in the user's storage account. HDP and other packages are in the HDInsight blob storage account.
  7. Web/Worker Roles are logical host groups in Windows Azure (the types of VMs). VM sizes are fixed (for now).
  8. Deployment Agent is the same code that is used in the System Center scenario. Logic is forked based on environment.
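
The ordering and retry behaviour described in notes 4 and 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Deployment Agent's code: the step names, the `INSTALL_AFTER` table, and the functions `ordered_steps` and `run_steps` are all hypothetical, and the `completed` list stands in for the step state that the notes say is kept in the Deployment DB (or in memory on Azure). The ordering uses Python's standard `graphlib.TopologicalSorter`, which is a swapped-in technique; the source does not say how the real tool orders its steps.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical table: step -> steps that must be installed before it.
# It encodes the two examples from the notes: HDFS before MapReduce,
# and NameNode before DataNode.
INSTALL_AFTER = {
    "HDFS/NameNode": [],
    "HDFS/DataNode": ["HDFS/NameNode"],
    "MapReduce/JobTracker": ["HDFS/DataNode"],
    "MapReduce/TaskTracker": ["MapReduce/JobTracker"],
}


def ordered_steps(install_after):
    """Return the steps in an order that respects every prerequisite."""
    return list(TopologicalSorter(install_after).static_order())


def run_steps(steps, install, completed):
    """Run steps in order, skipping those already recorded in `completed`.

    Stops at the first failure so that later steps do not proceed, and
    returns the failed step (or None if every step succeeded). Calling the
    function again with the same `completed` list resumes after the last
    successful step.
    """
    for step in steps:
        if step in completed:
            continue  # already installed in an earlier run
        try:
            install(step)
        except Exception:
            return step  # leave `completed` untouched for this step
        completed.append(step)
    return None


if __name__ == "__main__":
    done = []
    failed = run_steps(ordered_steps(INSTALL_AFTER), print, done)
    print("failed step:", failed)
```

Keeping the completed-step record outside the runner is what makes resumption possible: a second call with the same record skips the NameNode install and starts again at the step that failed.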