SlideShare a Scribd company logo
1 of 37
AZURE CLOUD AND CORE
SERVICES
• Azure Synapse Analytics-HDInsight-Azure Data bricks -
Usage of Internet of Things (IoT) Hub-IoT Central-Azure
Sphere-Azure Cloud shell and Mobile Apps.
Azure Synapse Analytics
• Azure Synapse Analytics is a limitless analytics service that
brings together data integration, enterprise data warehousing and
big data analytics.
• It gives you the freedom to query data on your terms, using either
server less or dedicated options—at scale.
• Azure Synapse brings these worlds together with a unified
experience to ingest, explore, prepare, transform, manage and
serve data for immediate BI and machine learning needs.
Azure Synapse Analytics
Azure Synapse Components
and Features
• Azure Synapse Analytics is a limitless analytics service that
brings together data integration, enterprise data warehousing and
big data analytics.
• It gives you the freedom to query data on your terms, using either
server less or dedicated options—at scale.
• Azure Synapse brings these worlds together with a unified
experience to ingest, explore, prepare, transform, manage and
serve data for immediate BI and machine learning needs.
Azure Synapse Components
and Features
• Synapse Analytics is basically an analytics service that has a virtually unlimited scale to
support analytics workloads
• Synapse Workspaces (in preview as of Sept 2020) provides an integrated console to
administer and operate different components and services of Azure Synapse Analytics
• Synapse Analytics Studio is a web-based IDE to enable code-free or low-code developer
experience to work with Synapse Analytics
• Synapse supports a number of languages like SQL, Python, .NET, Java, Scala, and R that are
typically used by analytic workloads
• Synapse supports two types of analytics runtimes – SQL and Spark (in preview as of Sept
2020) based that can process data in a batch, streaming, and interactive manner
• Synapse is integrated with numerous Azure data services as well, for example, Azure Data
Catalog, Azure Lake Storage, Azure Databricks, Azure HDInsight, Azure Machine Learning,
and Power BI
• Synapse also provides integrated management, security, and monitoring related services to
support monitoring and operations on the data and services supported by Synapse
• Data Lake Storage is suited for big data scale of data volumes that are modeled in a data
lake model. This storage layer acts as the data source layer for Synapse. Data is typically
populated in Synapse from Data Lake Storage for various analytical purposes
Core Pillars of Synapse
• Azure Synapse Studio
• Azure Synapse Data Integration
• Synapse SQL Pools
• Apache Spark for Azure Synapse
• Azure Synapse Security
Features
• Centralized Data Management
• Workload Isolation
• Machine Learning Integration
• Supply chain forecasting
• Inventory reporting
• Predictive maintenance
• Anomaly detection
Benefits
• Accelerate Analytics & Reporting
• Better BI & Data Visualization
• Increased IT productivity
• Limitless Scaling
Azure HDInsight
• What is Azure HDInsight?
• Azure HDInsight Features
• Azure HDInsight Architecture
• Azure HDInsight Metastore Best Practices
• Azure HDInsight Migration
• Azure HDInsight Security and DevOps
• Azure HDInsight Uses
• Data Warehousing
• Internet of Things (IoT)
• Data Science
• Hybrid Cloud
Azure HDInsight
• Azure HDInsight is a service offered by Microsoft, that enables
us to use open source frameworks for big data analytics.
• Azure HDInsight allows the use of frameworks like Hadoop,
Apache Spark, Apache Hive, LLAP, Apache Kafka, Apache
Storm, R, etc., for processing large volumes of data.
• These tools can be used on data to perform extract, transform,
and load (ETL,) data warehousing, machine learning, and IoT.
Azure HDInsight Features
• Cloud and on-premises availability
• Scalable and economical
• Security
• Monitoring and analytics
• Global availability
• Highly productive
Azure HDInsight
Architecture
• It is recommended that you migrate an on-premises Hadoop
cluster to Azure HDInsight using multiple workload clusters
rather than a single cluster.
• On-demand transient clusters are used so that the clusters are
deleted after the workload is complete.
• In HDInsight clusters, as storage-and-compute can be used from
Azure Storage, Azure Data Lake Storage, or both, it is best to
separate data storage from processing.
Azure HDInsight Metastore
• The Apache Hive Metastore is an important aspect of the Apache
Hadoop architecture since it serves as a central schema
repository for other big data access resources including Apache
Spark, Interactive Query (LLAP), Presto, and Apache Pig.
Types of HDInsight
Metastore
• A default metastore can be created for free for any cluster type,
but if one is created it cannot be shared.
• The use of custom metastores is recommended for production
clusters since they can be created and removed without loss of
metadata. It is suggested to use a custom metastore to isolate
compute and metadata and to periodically back it up.
Azure HDInsight Migration
• Script migration or replication can be used to migrate Hive
metastore.
• Migration Using DB Replication
• Migrating over TLS
• Migrating offline
Azure HDInsight Uses
• Data Warehousing
• Internet of Things (IoT)
• Data Science
• Hybrid Cloud
Azure Data bricks
• Azure Databricks is a data analytics platform optimized for the
Microsoft Azure cloud services platform.
• Azure Databricks offers three environments:
1.Databricks SQL
2.Databricks data science and engineering
3.Databricks machine learning
Databricks SQL
• Databricks SQL provides a user-friendly platform.
• This helps analysts, who work on SQL queries, to run queries on
Azure Data Lake, create multiple virtualizations, and build and
share dashboards.
Databricks Data Science and
Engineering
• Databricks data science and engineering provide an interactive
working environment for data engineers, data scientists, and
machine learning engineers.
• The two ways to send data through the big data pipeline are:
1.Ingest into Azure through Azure Data Factory in batches
2.Stream real-time by using Apache Kafka, Event Hubs, or
IoT Hub
Databricks Machine
Learning
• Databricks machine learning is a complete machine learning
environment.
• It helps to manage services for experiment tracking, model
training, feature development, and management.
• It also does model serving.
Pros and Cons of Azure
Databricks
Pros
• It can process large amounts of data with Databricks and since it
is part of Azure; the data is cloud-native.
• The clusters are easy to set up and configure.
• It has an Azure Synapse Analytics connector as well as the
ability to connect to Azure DB.
• It is integrated with Active Directory.
• It supports multiple languages. Scala is the main language, but it
also works well with Python, SQL, and R.
Pros and Cons of Azure
Databricks
Cons
• It does not integrate with Git or any other versioning tool.
• It, currently, only supports HDInsight and not Azure Batch or
AZTK.
Databricks SQL
• Databricks SQL allows you to run quick ad-hoc SQL queries on Data
Lake. Integrating with Azure Active Directory enables to run of
complete Azure-based solutions by using Databricks SQL.
• The interface that allows the automation of Databricks SQL objects is
REST API.
1. Data Management
2. Computation Management
3. Authorization
Data Management
It has three parts:
• Visualization: A graphical presentation of the result of running a query
• Dashboard: A presentation of query visualizations and commentary
• Alert: A notification that a field returned by a query has reached a
threshold
Computation Management
Here, we will know about the terms that will help to run SQL queries in
Databricks SQL.
• Query: A valid SQL statement
• SQL endpoint: A resource where SQL queries are executed
• Query history: A list of previously executed queries and their
characteristics
Authorization
• User and group: The user is an individual who has access to the system.
The set of multiple users is known as a group.
• Personal access token: An opaque string is used to authenticate to the
REST API.
• Access control list: Set of permissions attached to a principal that
requires access to an object. ACL (Access Control List) specifies the
object and actions allowed in it.
Databricks Data Science &
Engineering
• Databricks Data Science & Engineering is, sometimes, also called
Workspace. It is an analytics platform that is based on Apache Spark.
• Databricks Data Science & Engineering comprises complete open-
source Apache Spark cluster technologies and capabilities.
Databricks Data Science &
Engineering
Spark in Databricks Data Science & Engineering includes the following
components:
• Spark SQL and DataFrames: This is the Spark module for working with
structured data. A DataFrame is a distributed collection of data that is organized
into named columns. It is very similar to a table in a relational database or a data
frame in R or Python.
• Streaming: This integrates with HDFS, Flume, and Kafka. Streaming is real-
time data processing and analysis for analytical and interactive applications.
• MLlib: It is short for Machine Learning Library consisting of common learning
algorithms and utilities including classification, regression, clustering,
collaborative filtering, dimensionality reduction as well as underlying
optimization primitives.
• GraphX: Graphs and graph computation for a broad scope of use cases from
cognitive analytics to data exploration.
• Spark Core API: This has the support for R, SQL, Python, Scala, and Java.
Databricks Machine
Learning
With Databricks machine learning, we can:
• Train models either manually or with AutoML
• Track training parameters and models by using experiments with
MLflow tracking
• Create feature tables and access them for model training and
inference
• Share, manage, and serve models by using Model Registry
Usage of Internet of Things
(IoT) Hub
• Azure IoT Hub allows full-featured and scalable IoT solutions.
Virtually, any device can be connected to Azure IoT Hub and it
can scale up to millions of devices.
• Events can be tracked and monitored, such as the creation,
failure, and connection of devices.
• Azure IoT hub is a managed IoT service which is hosted in the
cloud. It allows bi-directional communication between IoT
applications and the devices it manages.
• This cloud-to-device connectivity means that you can receive
data from your devices, but you can also send commands and
policies back to the devices
Usage of Internet of Things
(IoT) Hub
Azure IoT Hub provides,
• • Device libraries for the most commonly used platforms and
languages for easy device connectivity.
• • Secure communications with multiple options for device-to-
cloud and cloud-to-device hyper-scale communication.
• • Queryable storage of per-device state information as well as
meta-data.
The lifecycle of devices with
IoT Hub
The lifecycle of devices with
IoT Hub
• Plan
Operators can create a device metadata scheme that allows them to
easily carry out bulk management operations.
•
Provision
New devices can be securely provisioned to IoT Hub and operators can
quickly discover device capabilities. The IoT Hub identity registry is
used to create device identities and credentials.
•
Configure
Device management operations, such as configuration changes and
firmware updates can be done in bulk or by direct methods, while still
maintaining system security.
The lifecycle of devices with
IoT Hub
• Monitor
Operators can be easily alerted to any issues arising and at the same
time the device collection health can be monitored, as well as the status
of any ongoing operations.
•
Retire
Devices need to be replaced, retired or decommissioned. The IoT Hub
identity registry is used to withdraw device identities and credentials.
Device management patterns
IoT Hub supports a range of device management patterns including,
• Reboot
• Factory reset
• Configuration
• Firmware update
• Reporting progress and status
These patterns can be extended to fit your exact situation. Alternatively,
new patterns can be designed based on these templates.
Connecting your devices
You can build applications which run on your devices and interact with IoT
Hub using the Azure IoT device SDK. Windows, Linux distributions, and
real-time operating systems are supported platforms. Supported languages
currently include,
• C
• C#
• Java
• Python
• Node.js.
Services
• Messaging Patterns
• Device data can be routed
• Message routing and event grid
• Building end-to-end solutions
• Security

More Related Content

Similar to CC -Unit4.pptx

Similar to CC -Unit4.pptx (20)

USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 
Azure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptxAzure Data Engineering course in hyderabad.pptx
Azure Data Engineering course in hyderabad.pptx
 
Azure Data Engineering Course in Hyderabad
Azure Data Engineering  Course in HyderabadAzure Data Engineering  Course in Hyderabad
Azure Data Engineering Course in Hyderabad
 
"Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad ""Azure Data Engineering Course in Hyderabad "
"Azure Data Engineering Course in Hyderabad "
 
Azure Data Engineering.pptx
Azure Data Engineering.pptxAzure Data Engineering.pptx
Azure Data Engineering.pptx
 
1 Introduction to Microsoft data platform analytics for release
1 Introduction to Microsoft data platform analytics for release1 Introduction to Microsoft data platform analytics for release
1 Introduction to Microsoft data platform analytics for release
 
Best Practices: Hadoop migration to Azure HDInsight
Best Practices: Hadoop migration to Azure HDInsightBest Practices: Hadoop migration to Azure HDInsight
Best Practices: Hadoop migration to Azure HDInsight
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Big Data Analytics .pptx
Big Data Analytics .pptxBig Data Analytics .pptx
Big Data Analytics .pptx
 
Azure Data Platform Overview.pdf
Azure Data Platform Overview.pdfAzure Data Platform Overview.pdf
Azure Data Platform Overview.pdf
 
Scalable relational database with SQL Azure
Scalable relational database with SQL AzureScalable relational database with SQL Azure
Scalable relational database with SQL Azure
 
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
 
Migration to Databricks - On-prem HDFS.pptx
Migration to Databricks - On-prem HDFS.pptxMigration to Databricks - On-prem HDFS.pptx
Migration to Databricks - On-prem HDFS.pptx
 
Azure Lowlands: An intro to Azure Data Lake
Azure Lowlands: An intro to Azure Data LakeAzure Lowlands: An intro to Azure Data Lake
Azure Lowlands: An intro to Azure Data Lake
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Big Data_Architecture.pptx
Big Data_Architecture.pptxBig Data_Architecture.pptx
Big Data_Architecture.pptx
 
Accelerating Business Intelligence Solutions with Microsoft Azure pass
Accelerating Business Intelligence Solutions with Microsoft Azure   passAccelerating Business Intelligence Solutions with Microsoft Azure   pass
Accelerating Business Intelligence Solutions with Microsoft Azure pass
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Big Data training
Big Data trainingBig Data training
Big Data training
 

More from Revathiparamanathan (20)

UNIT 1 NOTES.docx
UNIT 1 NOTES.docxUNIT 1 NOTES.docx
UNIT 1 NOTES.docx
 
Unit 3,4.docx
Unit 3,4.docxUnit 3,4.docx
Unit 3,4.docx
 
UNIT II.docx
UNIT II.docxUNIT II.docx
UNIT II.docx
 
UNIT V.docx
UNIT V.docxUNIT V.docx
UNIT V.docx
 
COMPILER DESIGN.docx
COMPILER DESIGN.docxCOMPILER DESIGN.docx
COMPILER DESIGN.docx
 
UNIT -III.docx
UNIT -III.docxUNIT -III.docx
UNIT -III.docx
 
UNIT -IV.docx
UNIT -IV.docxUNIT -IV.docx
UNIT -IV.docx
 
UNIT - II.docx
UNIT - II.docxUNIT - II.docx
UNIT - II.docx
 
UNIT -V.docx
UNIT -V.docxUNIT -V.docx
UNIT -V.docx
 
UNIT - I.docx
UNIT - I.docxUNIT - I.docx
UNIT - I.docx
 
CC -Unit3.pptx
CC -Unit3.pptxCC -Unit3.pptx
CC -Unit3.pptx
 
CC.pptx
CC.pptxCC.pptx
CC.pptx
 
Unit 4 notes.pdf
Unit 4 notes.pdfUnit 4 notes.pdf
Unit 4 notes.pdf
 
Unit 3 notes.pdf
Unit 3 notes.pdfUnit 3 notes.pdf
Unit 3 notes.pdf
 
Unit 1 notes.pdf
Unit 1 notes.pdfUnit 1 notes.pdf
Unit 1 notes.pdf
 
Unit 2 notes.pdf
Unit 2 notes.pdfUnit 2 notes.pdf
Unit 2 notes.pdf
 
Unit 5 notes.pdf
Unit 5 notes.pdfUnit 5 notes.pdf
Unit 5 notes.pdf
 
CC.pptx
CC.pptxCC.pptx
CC.pptx
 
Unit-4 Day1.pptx
Unit-4 Day1.pptxUnit-4 Day1.pptx
Unit-4 Day1.pptx
 
Scala Introduction.pptx
Scala Introduction.pptxScala Introduction.pptx
Scala Introduction.pptx
 

Recently uploaded

Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
AldoGarca30
 

Recently uploaded (20)

Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 

CC -Unit4.pptx

  • 1. AZURE CLOUD AND CORE SERVICES • Azure Synapse Analytics-HDInsight-Azure Data bricks - Usage of Internet of Things (IoT) Hub-IoT Central-Azure Sphere-Azure Cloud shell and Mobile Apps.
  • 2. Azure Synapse Analytics • Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. • It gives you the freedom to query data on your terms, using either server less or dedicated options—at scale. • Azure Synapse brings these worlds together with a unified experience to ingest, explore, prepare, transform, manage and serve data for immediate BI and machine learning needs.
  • 4. Azure Synapse Components and Features • Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. • It gives you the freedom to query data on your terms, using either server less or dedicated options—at scale. • Azure Synapse brings these worlds together with a unified experience to ingest, explore, prepare, transform, manage and serve data for immediate BI and machine learning needs.
  • 5. Azure Synapse Components and Features • Synapse Analytics is basically an analytics service that has a virtually unlimited scale to support analytics workloads • Synapse Workspaces (in preview as of Sept 2020) provides an integrated console to administer and operate different components and services of Azure Synapse Analytics • Synapse Analytics Studio is a web-based IDE to enable code-free or low-code developer experience to work with Synapse Analytics • Synapse supports a number of languages like SQL, Python, .NET, Java, Scala, and R that are typically used by analytic workloads • Synapse supports two types of analytics runtimes – SQL and Spark (in preview as of Sept 2020) based that can process data in a batch, streaming, and interactive manner • Synapse is integrated with numerous Azure data services as well, for example, Azure Data Catalog, Azure Lake Storage, Azure Databricks, Azure HDInsight, Azure Machine Learning, and Power BI • Synapse also provides integrated management, security, and monitoring related services to support monitoring and operations on the data and services supported by Synapse • Data Lake Storage is suited for big data scale of data volumes that are modeled in a data lake model. This storage layer acts as the data source layer for Synapse. Data is typically populated in Synapse from Data Lake Storage for various analytical purposes
  • 6. Core Pillars of Synapse • Azure Synapse Studio • Azure Synapse Data Integration • Synapse SQL Pools • Apache Spark for Azure Synapse • Azure Synapse Security
  • 7. Features • Centralized Data Management • Workload Isolation • Machine Learning Integration • Supply chain forecasting • Inventory reporting • Predictive maintenance • Anomaly detection
  • 8. Benefits • Accelerate Analytics & Reporting • Better BI & Data Visualization • Increased IT productivity • Limitless Scaling
  • 9. Azure HDInsight • What is Azure HDInsight? • Azure HDInsight Features • Azure HDInsight Architecture • Azure HDInsight Metastore Best Practices • Azure HDInsight Migration • Azure HDInsight Security and DevOps • Azure HDInsight Uses • Data Warehousing • Internet of Things (IoT) • Data Science • Hybrid Cloud
  • 10. Azure HDInsight • Azure HDInsight is a service offered by Microsoft, that enables us to use open source frameworks for big data analytics. • Azure HDInsight allows the use of frameworks like Hadoop, Apache Spark, Apache Hive, LLAP, Apache Kafka, Apache Storm, R, etc., for processing large volumes of data. • These tools can be used on data to perform extract, transform, and load (ETL,) data warehousing, machine learning, and IoT.
  • 11. Azure HDInsight Features • Cloud and on-premises availability • Scalable and economical • Security • Monitoring and analytics • Global availability • Highly productive
  • 12. Azure HDInsight Architecture • It is recommended that you migrate an on-premises Hadoop cluster to Azure HDInsight using multiple workload clusters rather than a single cluster. • On-demand transient clusters are used so that the clusters are deleted after the workload is complete. • In HDInsight clusters, as storage-and-compute can be used from Azure Storage, Azure Data Lake Storage, or both, it is best to separate data storage from processing.
  • 13. Azure HDInsight Metastore • The Apache Hive Metastore is an important aspect of the Apache Hadoop architecture since it serves as a central schema repository for other big data access resources including Apache Spark, Interactive Query (LLAP), Presto, and Apache Pig.
  • 14. Types of HDInsight Metastore • A default metastore can be created for free for any cluster type, but if one is created it cannot be shared. • The use of custom metastores is recommended for production clusters since they can be created and removed without loss of metadata. It is suggested to use a custom metastore to isolate compute and metadata and to periodically back it up.
  • 15. Azure HDInsight Migration • Script migration or replication can be used to migrate Hive metastore. • Migration Using DB Replication • Migrating over TLS • Migrating offline
  • 16. Azure HDInsight Uses • Data Warehousing • Internet of Things (IoT) • Data Science • Hybrid Cloud
  • 17. Azure Data bricks • Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. • Azure Databricks offers three environments: 1.Databricks SQL 2.Databricks data science and engineering 3.Databricks machine learning
  • 18. Databricks SQL • Databricks SQL provides a user-friendly platform. • This helps analysts, who work on SQL queries, to run queries on Azure Data Lake, create multiple virtualizations, and build and share dashboards.
  • 19. Databricks Data Science and Engineering • Databricks data science and engineering provide an interactive working environment for data engineers, data scientists, and machine learning engineers. • The two ways to send data through the big data pipeline are: 1.Ingest into Azure through Azure Data Factory in batches 2.Stream real-time by using Apache Kafka, Event Hubs, or IoT Hub
  • 20. Databricks Machine Learning • Databricks machine learning is a complete machine learning environment. • It helps to manage services for experiment tracking, model training, feature development, and management. • It also does model serving.
  • 21. Pros and Cons of Azure Databricks Pros • It can process large amounts of data with Databricks and since it is part of Azure; the data is cloud-native. • The clusters are easy to set up and configure. • It has an Azure Synapse Analytics connector as well as the ability to connect to Azure DB. • It is integrated with Active Directory. • It supports multiple languages. Scala is the main language, but it also works well with Python, SQL, and R.
  • 22. Pros and Cons of Azure Databricks Cons • It does not integrate with Git or any other versioning tool. • It, currently, only supports HDInsight and not Azure Batch or AZTK.
  • 23. Databricks SQL • Databricks SQL allows you to run quick ad-hoc SQL queries on Data Lake. Integrating with Azure Active Directory enables to run of complete Azure-based solutions by using Databricks SQL. • The interface that allows the automation of Databricks SQL objects is REST API. 1. Data Management 2. Computation Management 3. Authorization
  • 24. Data Management It has three parts: • Visualization: A graphical presentation of the result of running a query • Dashboard: A presentation of query visualizations and commentary • Alert: A notification that a field returned by a query has reached a threshold
  • 25. Computation Management Here, we will know about the terms that will help to run SQL queries in Databricks SQL. • Query: A valid SQL statement • SQL endpoint: A resource where SQL queries are executed • Query history: A list of previously executed queries and their characteristics
  • 26. Authorization • User and group: The user is an individual who has access to the system. The set of multiple users is known as a group. • Personal access token: An opaque string is used to authenticate to the REST API. • Access control list: Set of permissions attached to a principal that requires access to an object. ACL (Access Control List) specifies the object and actions allowed in it.
  • 27. Databricks Data Science & Engineering • Databricks Data Science & Engineering is, sometimes, also called Workspace. It is an analytics platform that is based on Apache Spark. • Databricks Data Science & Engineering comprises complete open- source Apache Spark cluster technologies and capabilities.
  • 28. Databricks Data Science & Engineering Spark in Databricks Data Science & Engineering includes the following components: • Spark SQL and DataFrames: This is the Spark module for working with structured data. A DataFrame is a distributed collection of data that is organized into named columns. It is very similar to a table in a relational database or a data frame in R or Python. • Streaming: This integrates with HDFS, Flume, and Kafka. Streaming is real- time data processing and analysis for analytical and interactive applications. • MLlib: It is short for Machine Learning Library consisting of common learning algorithms and utilities including classification, regression, clustering, collaborative filtering, dimensionality reduction as well as underlying optimization primitives. • GraphX: Graphs and graph computation for a broad scope of use cases from cognitive analytics to data exploration. • Spark Core API: This has the support for R, SQL, Python, Scala, and Java.
  • 29. Databricks Machine Learning With Databricks machine learning, we can: • Train models either manually or with AutoML • Track training parameters and models by using experiments with MLflow tracking • Create feature tables and access them for model training and inference • Share, manage, and serve models by using Model Registry
  • 30. Usage of Internet of Things (IoT) Hub • Azure IoT Hub allows full-featured and scalable IoT solutions. Virtually, any device can be connected to Azure IoT Hub and it can scale up to millions of devices. • Events can be tracked and monitored, such as the creation, failure, and connection of devices. • Azure IoT hub is a managed IoT service which is hosted in the cloud. It allows bi-directional communication between IoT applications and the devices it manages. • This cloud-to-device connectivity means that you can receive data from your devices, but you can also send commands and policies back to the devices
  • 31. Usage of Internet of Things (IoT) Hub Azure IoT Hub provides, • • Device libraries for the most commonly used platforms and languages for easy device connectivity. • • Secure communications with multiple options for device-to- cloud and cloud-to-device hyper-scale communication. • • Queryable storage of per-device state information as well as meta-data.
  • 32. The lifecycle of devices with IoT Hub
  • 33. The lifecycle of devices with IoT Hub • Plan Operators can create a device metadata scheme that allows them to easily carry out bulk management operations. • Provision New devices can be securely provisioned to IoT Hub and operators can quickly discover device capabilities. The IoT Hub identity registry is used to create device identities and credentials. • Configure Device management operations, such as configuration changes and firmware updates can be done in bulk or by direct methods, while still maintaining system security.
  • 34. The lifecycle of devices with IoT Hub • Monitor Operators can be easily alerted to any issues arising and at the same time the device collection health can be monitored, as well as the status of any ongoing operations. • Retire Devices need to be replaced, retired or decommissioned. The IoT Hub identity registry is used to withdraw device identities and credentials.
  • 35. Device management patterns IoT Hub supports a range of device management patterns including, • Reboot • Factory reset • Configuration • Firmware update • Reporting progress and status These patterns can be extended to fit your exact situation. Alternatively, new patterns can be designed based on these templates.
  • 36. Connecting your devices You can build applications which run on your devices and interact with IoT Hub using the Azure IoT device SDK. Windows, Linux distributions, and real-time operating systems are supported platforms. Supported languages currently include, • C • C# • Java • Python • Node.js.
  • 37. Services • Messaging Patterns • Device data can be routed • Message routing and event grid • Building end-to-end solutions • Security