SlideShare a Scribd company logo
1 of 31
Reliable open-source
99.9% availability SLA
Monitoring
(OMS)
Visual Studio, IntelliJ and Eclipse support for
developers and data scientists
Enterprise grade Security Kerberos
Apache Ranger
Install & use big data applications
Azure Marketplace
Azure
HDInsight
Cloud Spark and Hadoop
service for your enterprise
(Spark, Hive, MR, LLAP,
Kafka, HBase, Storm)
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
and many more…
• Managed Kafka clusters with 99.9% service level
SLA
• Native integration with Azure Managed Disks.
Allows for exponentially lower costs, and higher
scale.
• Scalable On Demand clusters - Kafka clusters
with 16 TB/node and Zookeeper up and running
in 15 minutes
• Rack awareness for Kafka on the Azure cloud
• Alerting and predictive cluster maintenance
through Azure Operations Management Suite
• Extensibility via one click deploy of leading ISVs
such as StreamSets
• Disaster recovery support via MirrorMaker
• Deploy End to End streaming pipelines with
Storm, Spark, Storage via automated ARM
templates in the same VNET.
Modern Data Warehouse: Real-time analytics
Unstructured data
Azure storage
Azure HDInsight (LLAP)
Azure HDInsight
(Kafka)
Analytic Dashboards
Model & ServePrep & TrainStoreIngest Intelligence
SQL DW
Azure Databricks
(Spark)
Azure HDInsight
(Spark)
Kafka is a distributed, horizontally-scalable, fault-tolerant pub-sub store
Broker 1
Producer 1
IoT Hub
Storm
Spark
Streaming
1
2
3
ZK 1 ZK 2 ZK 3
Broker 2
Broker 3
3
1
2
Topic 1
Topic 2 Topic 1
Topic 2
Topic 2
Topic 1
4 5
Setup the broker
configuration
Publish the
message
The consumer
reads the messages
Azure
Gateway
Services
Open source Stream Processing on Azure HDInsight
Real-time applications
Long term storage
Real-time dashboards
IoT Hubs
Azure VNet Boundary
Siphon on HDInsight Kafka 8 million
EVENTS PER SECOND PEAK INGRESS
800 TB (10 GB per Sec)
INGRESS PER DAY
1,800; 450
PRODUCTION KAFKA BROKERS; TOPICS
15 Sec
99th PERCENTILE LATENCY
KEY CUSTOMER
SCENARIOS
Ads Monetization (Fast BI)
O365 Customer Fabric NRT – Tenant & User insights
BingNRT Operational Intelligence
Presto (Fast SML) interactive analysis
Delve Analytics
0
5
10
15
20
25
30
35
40
45
Jan-15
Feb-15
Mar-15
Apr-15
May-15
Jun-15
Jul-15
Aug-15
Sep-15
Oct-15
Nov-15
Dec-15
Jan-16
Feb-16
Mar-16
Apr-16
May-16
Jun-16
Jul-16
Aug-16
Sep-16
Oct-16
Nov-16
Dec-16
Throughput(inGBps)
Siphon Data Volume (Ingress and Egress)
Volume published (GBps) Volume subscribed (GBps)
0
5
10
15
20
25
Jan-15
Feb-15
Mar-15
Apr-15
May-15
Jun-15
Jul-15
Aug-15
Sep-15
Oct-15
Nov-15
Dec-15
Jan-16
Feb-16
Mar-16
Apr-16
May-16
Jun-16
Jul-16
Aug-16
Sep-16
Oct-16
Nov-16
Dec-16
Throughput(eventspersec)Millions
Siphon Events per second (Ingress and Egress)
EPS In Eps Out
Getting Started with Kafka for HDInsight
Structured Streaming with HDInsight Kafka and Spark
Deploy HDInsight Kafka + Storm
Stream data from on-premise to HDInsight Kafka in the cloud
https://academy.microsoft.com/en-us/professional-program/big-data/
https://www.pluralsight.com/courses/spark-kafka-cassandra-applying-lambda-architecture
https://azure.microsoft.com/en-us/blog/announcing-apache-kafka-for-azure-
hdinsight-general-availability/
https://azure.microsoft.com/en-us/blog/announcing-public-preview-of-apache-kafka-
on-hdinsight-with-azure-managed-disks/
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-kafka-high-
availability
Costin$
Throughput MBps
Kafka Cost Estimator
Non Managed Disks Managed Disks
#KAFKANODES
THROUGHPUT MBPS
Kafka scale forecast
Kafka nodes (OS VHDs) Kafka nodes (managed disks)
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-kafka-mirroring
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-kafka-connect-vpn-gateway
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-kafka-connect-vpn-gateway
Azure VNet Boundary
Message Rate 10,000 messages/sec
Message size 150 KB upperbound
Replica count 3
Retention Policy 12 hours
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight

More Related Content

What's hot

Airbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackAirbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackMichel Tricot
 
Observability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing PrimerObservability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing PrimerVMware Tanzu
 
Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)Kangaroot
 
Monitoring with Dynatrace Presentation.pptx
Monitoring with Dynatrace Presentation.pptxMonitoring with Dynatrace Presentation.pptx
Monitoring with Dynatrace Presentation.pptxKnoldus Inc.
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...Splunk
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Lift SSIS package to Azure Data Factory V2
Lift SSIS package to Azure Data Factory V2Lift SSIS package to Azure Data Factory V2
Lift SSIS package to Azure Data Factory V2Manjeet Singh
 
An Introduction to Confluent Cloud: Apache Kafka as a Service
An Introduction to Confluent Cloud: Apache Kafka as a ServiceAn Introduction to Confluent Cloud: Apache Kafka as a Service
An Introduction to Confluent Cloud: Apache Kafka as a Serviceconfluent
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOpsDatabricks
 
Explore your prometheus data in grafana - Promcon 2018
Explore your prometheus data in grafana - Promcon 2018Explore your prometheus data in grafana - Promcon 2018
Explore your prometheus data in grafana - Promcon 2018Grafana Labs
 
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and ConfluentEvent Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and Confluentconfluent
 
Monolithic vs Microservices Architecture
Monolithic vs Microservices ArchitectureMonolithic vs Microservices Architecture
Monolithic vs Microservices ArchitectureSimplesolve
 
Server monitoring using grafana and prometheus
Server monitoring using grafana and prometheusServer monitoring using grafana and prometheus
Server monitoring using grafana and prometheusCeline George
 
A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?confluent
 

What's hot (20)

Airbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackAirbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stack
 
Observability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing PrimerObservability, Distributed Tracing, and Open Source: The Missing Primer
Observability, Distributed Tracing, and Open Source: The Missing Primer
 
Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)Elastic SIEM (Endpoint Security)
Elastic SIEM (Endpoint Security)
 
MULTI-CLOUD ARCHITECTURE
MULTI-CLOUD ARCHITECTUREMULTI-CLOUD ARCHITECTURE
MULTI-CLOUD ARCHITECTURE
 
Monitoring with Dynatrace Presentation.pptx
Monitoring with Dynatrace Presentation.pptxMonitoring with Dynatrace Presentation.pptx
Monitoring with Dynatrace Presentation.pptx
 
Modern Data Platform on AWS
Modern Data Platform on AWSModern Data Platform on AWS
Modern Data Platform on AWS
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
 
Cloud Monitoring tool Grafana
Cloud Monitoring  tool Grafana Cloud Monitoring  tool Grafana
Cloud Monitoring tool Grafana
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Lift SSIS package to Azure Data Factory V2
Lift SSIS package to Azure Data Factory V2Lift SSIS package to Azure Data Factory V2
Lift SSIS package to Azure Data Factory V2
 
An Introduction to Confluent Cloud: Apache Kafka as a Service
An Introduction to Confluent Cloud: Apache Kafka as a ServiceAn Introduction to Confluent Cloud: Apache Kafka as a Service
An Introduction to Confluent Cloud: Apache Kafka as a Service
 
Cloud Computing Using OpenStack
Cloud Computing Using OpenStack Cloud Computing Using OpenStack
Cloud Computing Using OpenStack
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
 
Introduction to MuleSoft
Introduction to MuleSoftIntroduction to MuleSoft
Introduction to MuleSoft
 
Explore your prometheus data in grafana - Promcon 2018
Explore your prometheus data in grafana - Promcon 2018Explore your prometheus data in grafana - Promcon 2018
Explore your prometheus data in grafana - Promcon 2018
 
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and ConfluentEvent Streaming in the Telco Industry with Apache Kafka® and Confluent
Event Streaming in the Telco Industry with Apache Kafka® and Confluent
 
Monolithic vs Microservices Architecture
Monolithic vs Microservices ArchitectureMonolithic vs Microservices Architecture
Monolithic vs Microservices Architecture
 
Server monitoring using grafana and prometheus
Server monitoring using grafana and prometheusServer monitoring using grafana and prometheus
Server monitoring using grafana and prometheus
 
A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?A Kafka journey and why migrate to Confluent Cloud?
A Kafka journey and why migrate to Confluent Cloud?
 

Similar to Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight

Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
StreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics PlatformStreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics PlatformAtul Sharma
 
Fom io t_to_bigdata_step_by_step-final
Fom io t_to_bigdata_step_by_step-finalFom io t_to_bigdata_step_by_step-final
Fom io t_to_bigdata_step_by_step-finalLuis Filipe Silva
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Kai Wähner
 
Devoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basFlorent Ramiere
 
DS_2016_StreamAnalytix_real_time_streaming_analytics_platform
DS_2016_StreamAnalytix_real_time_streaming_analytics_platformDS_2016_StreamAnalytix_real_time_streaming_analytics_platform
DS_2016_StreamAnalytix_real_time_streaming_analytics_platformAditya Singh
 
Reinventing Kafka in the Data Streaming Era - Jun Rao
Reinventing Kafka in the Data Streaming Era - Jun RaoReinventing Kafka in the Data Streaming Era - Jun Rao
Reinventing Kafka in the Data Streaming Era - Jun Raoconfluent
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAAndrew Morgan
 
Webinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBWebinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBMongoDB
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Timothy Spann
 
Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...
Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...
Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...HostedbyConfluent
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBconfluent
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...HostedbyConfluent
 
OpenStack for VMware Administrators
OpenStack for VMware AdministratorsOpenStack for VMware Administrators
OpenStack for VMware AdministratorsTrevor Roberts Jr.
 
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...Kai Wähner
 
Partner Keynote: Intel - The New Frontier of Cloud Computing
Partner Keynote: Intel - The New Frontier of Cloud ComputingPartner Keynote: Intel - The New Frontier of Cloud Computing
Partner Keynote: Intel - The New Frontier of Cloud ComputingAmazon Web Services
 
IoT and Event Streaming at Scale with Apache Kafka
IoT and Event Streaming at Scale with Apache KafkaIoT and Event Streaming at Scale with Apache Kafka
IoT and Event Streaming at Scale with Apache Kafkaconfluent
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...confluent
 
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaTop 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaKai Wähner
 

Similar to Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight (20)

Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
StreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics PlatformStreamAnalytix - Multi-Engine Streaming Analytics Platform
StreamAnalytix - Multi-Engine Streaming Analytics Platform
 
Fom io t_to_bigdata_step_by_step-final
Fom io t_to_bigdata_step_by_step-finalFom io t_to_bigdata_step_by_step-final
Fom io t_to_bigdata_step_by_step-final
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
 
Devoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en bas
 
DS_2016_StreamAnalytix_real_time_streaming_analytics_platform
DS_2016_StreamAnalytix_real_time_streaming_analytics_platformDS_2016_StreamAnalytix_real_time_streaming_analytics_platform
DS_2016_StreamAnalytix_real_time_streaming_analytics_platform
 
Reinventing Kafka in the Data Streaming Era - Jun Rao
Reinventing Kafka in the Data Streaming Era - Jun RaoReinventing Kafka in the Data Streaming Era - Jun Rao
Reinventing Kafka in the Data Streaming Era - Jun Rao
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEA
 
Webinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBWebinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDB
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...
Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...
Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Co...
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDB
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
 
OpenStack for VMware Administrators
OpenStack for VMware AdministratorsOpenStack for VMware Administrators
OpenStack for VMware Administrators
 
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
 
Partner Keynote: Intel - The New Frontier of Cloud Computing
Partner Keynote: Intel - The New Frontier of Cloud ComputingPartner Keynote: Intel - The New Frontier of Cloud Computing
Partner Keynote: Intel - The New Frontier of Cloud Computing
 
IoT and Event Streaming at Scale with Apache Kafka
IoT and Event Streaming at Scale with Apache KafkaIoT and Event Streaming at Scale with Apache Kafka
IoT and Event Streaming at Scale with Apache Kafka
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
 
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaTop 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
 

More from Microsoft Tech Community

Removing Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment SuccessRemoving Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment SuccessMicrosoft Tech Community
 
Building mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and XamarinBuilding mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and XamarinMicrosoft Tech Community
 
Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Microsoft Tech Community
 
Interactive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive CardsInteractive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive CardsMicrosoft Tech Community
 
Unlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph APIUnlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph APIMicrosoft Tech Community
 
Break through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable FunctionsBreak through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable FunctionsMicrosoft Tech Community
 
Multiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container InstancesMultiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container InstancesMicrosoft Tech Community
 
Media Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and XamarinMedia Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and XamarinMicrosoft Tech Community
 
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexityReal-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexityMicrosoft Tech Community
 
Getting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AIGetting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AIMicrosoft Tech Community
 
Mobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing MapsMobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing MapsMicrosoft Tech Community
 
Cognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detectionCognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detectionMicrosoft Tech Community
 
LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1
LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1
LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1Microsoft Tech Community
 

More from Microsoft Tech Community (20)

100 ways to use Yammer
100 ways to use Yammer100 ways to use Yammer
100 ways to use Yammer
 
10 Yammer Group Suggestions
10 Yammer Group Suggestions10 Yammer Group Suggestions
10 Yammer Group Suggestions
 
Removing Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment SuccessRemoving Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment Success
 
Building mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and XamarinBuilding mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and Xamarin
 
Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...
 
Interactive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive CardsInteractive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive Cards
 
Unlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph APIUnlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph API
 
Break through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable FunctionsBreak through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable Functions
 
Multiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container InstancesMultiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container Instances
 
Explore Azure Cosmos DB
Explore Azure Cosmos DBExplore Azure Cosmos DB
Explore Azure Cosmos DB
 
Media Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and XamarinMedia Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and Xamarin
 
DevOps for Data Science
DevOps for Data ScienceDevOps for Data Science
DevOps for Data Science
 
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexityReal-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexity
 
Azure Functions and Microsoft Graph
Azure Functions and Microsoft GraphAzure Functions and Microsoft Graph
Azure Functions and Microsoft Graph
 
Getting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AIGetting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AI
 
Using AML Python SDK
Using AML Python SDKUsing AML Python SDK
Using AML Python SDK
 
Mobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing MapsMobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing Maps
 
Cognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detectionCognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detection
 
Speech Devices SDK
Speech Devices SDKSpeech Devices SDK
Speech Devices SDK
 
LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1
LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1
LinkedIn Learning presents: Securing web applications in ASP.NET Core 2.1
 

Recently uploaded

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 

Recently uploaded (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight

  • 1.
  • 2.
  • 3.
  • 4. Reliable open-source 99.9% availability SLA Monitoring (OMS) Visual Studio, IntelliJ and Eclipse support for developers and data scientists Enterprise grade Security Kerberos Apache Ranger Install & use big data applications Azure Marketplace Azure HDInsight Cloud Spark and Hadoop service for your enterprise (Spark, Hive, MR, LLAP, Kafka, HBase, Storm) *IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
  • 5.
  • 7. • Managed Kafka clusters with 99.9% service level SLA • Native integration with Azure Managed Disks. Allows for exponentially lower costs, and higher scale. • Scalable On Demand clusters - Kafka clusters with 16 TB/node and Zookeeper up and running in 15 minutes • Rack awareness for Kafka on the Azure cloud • Alerting and predictive cluster maintenance through Azure Operations Management Suite • Extensibility via one click deploy of leading ISVs such as StreamSets • Disaster recovery support via MirrorMaker • Deploy End to End streaming pipelines with Storm, Spark, Storage via automated ARM templates in the same VNET.
  • 8. Modern Data Warehouse: Real-time analytics Unstructured data Azure storage Azure HDInsight (LLAP) Azure HDInsight (Kafka) Analytic Dashboards Model & ServePrep & TrainStoreIngest Intelligence SQL DW Azure Databricks (Spark) Azure HDInsight (Spark)
  • 9. Kafka is a distributed, horizontally-scalable, fault-tolerant pub-sub store Broker 1 Producer 1 IoT Hub Storm Spark Streaming 1 2 3 ZK 1 ZK 2 ZK 3 Broker 2 Broker 3 3 1 2 Topic 1 Topic 2 Topic 1 Topic 2 Topic 2 Topic 1
  • 10. 4 5 Setup the broker configuration Publish the message The consumer reads the messages
  • 11.
  • 12. Azure Gateway Services Open source Stream Processing on Azure HDInsight Real-time applications Long term storage Real-time dashboards IoT Hubs Azure VNet Boundary
  • 13. Siphon on HDInsight Kafka 8 million EVENTS PER SECOND PEAK INGRESS 800 TB (10 GB per Sec) INGRESS PER DAY 1,800; 450 PRODUCTION KAFKA BROKERS; TOPICS 15 Sec 99th PERCENTILE LATENCY KEY CUSTOMER SCENARIOS Ads Monetization (Fast BI) O365 Customer Fabric NRT – Tenant & User insights BingNRT Operational Intelligence Presto (Fast SML) interactive analysis Delve Analytics 0 5 10 15 20 25 30 35 40 45 Jan-15 Feb-15 Mar-15 Apr-15 May-15 Jun-15 Jul-15 Aug-15 Sep-15 Oct-15 Nov-15 Dec-15 Jan-16 Feb-16 Mar-16 Apr-16 May-16 Jun-16 Jul-16 Aug-16 Sep-16 Oct-16 Nov-16 Dec-16 Throughput(inGBps) Siphon Data Volume (Ingress and Egress) Volume published (GBps) Volume subscribed (GBps) 0 5 10 15 20 25 Jan-15 Feb-15 Mar-15 Apr-15 May-15 Jun-15 Jul-15 Aug-15 Sep-15 Oct-15 Nov-15 Dec-15 Jan-16 Feb-16 Mar-16 Apr-16 May-16 Jun-16 Jul-16 Aug-16 Sep-16 Oct-16 Nov-16 Dec-16 Throughput(eventspersec)Millions Siphon Events per second (Ingress and Egress) EPS In Eps Out
  • 14.
  • 15.
  • 16.
  • 17. Getting Started with Kafka for HDInsight Structured Streaming with HDInsight Kafka and Spark Deploy HDInsight Kafka + Storm Stream data from on-premise to HDInsight Kafka in the cloud https://academy.microsoft.com/en-us/professional-program/big-data/ https://www.pluralsight.com/courses/spark-kafka-cassandra-applying-lambda-architecture https://azure.microsoft.com/en-us/blog/announcing-apache-kafka-for-azure- hdinsight-general-availability/ https://azure.microsoft.com/en-us/blog/announcing-public-preview-of-apache-kafka- on-hdinsight-with-azure-managed-disks/
  • 18.
  • 19.
  • 20.
  • 22.
  • 23.
  • 24. Costin$ Throughput MBps Kafka Cost Estimator Non Managed Disks Managed Disks #KAFKANODES THROUGHPUT MBPS Kafka scale forecast Kafka nodes (OS VHDs) Kafka nodes (managed disks)
  • 25.
  • 29.
  • 30. Message Rate 10,000 messages/sec Message size 150 KB upperbound Replica count 3 Retention Policy 12 hours

Editor's Notes

  1. 4