SlideShare uma empresa Scribd logo
1 de 21
An Introduction to
Azure Data Factory
v1
Everything you need to know to start developing today
Who are these people?!
• Parinaz Kallick – Business Intelligence Consultant
 Working with BI for 10 years (Origins in databases and reporting &
 MSBI Stack)
 B.S in Computer Science
 MBA-IT
• Eric Bragas – Business Intelligence Consultant, MCP
 Working with Microsoft BI for 5+ years
 Azure and Power BI for 3+ years
 California native, based in San Francisco
 Eastern cuisine aficionado
Agenda
What is Data Factory?
How does it work?
Core Components
How to Develop
• Demo
Monitoring & Management
Use Cases
Challenges Best Practices
What is Azure Data Factory (ADF)?
• "[Azure Data Factory] is a cloud-based data integration service that
allows you to create data-driven workflows in the cloud that
orchestrate and automate data movement and data transformation.“
• In short - it's Azure's PaaS service for time series data integration
How Does it Work?
• Leverages cloud resources to Extract, Load, and Transform your data
 Storage - Azure Blob Storage, HDInsight, Azure SQL DW, etc.
 Compute - Hive Query, Azure SQL DW, etc.
• ELT over ETL
• Time-series paradigm, ie. web logs, social sentiment, sensor data
v1 Supported Services
Components
• Pipeline - the unit of orchestration, and container for activities
• Activity - a data movement or transformation component
 ie. Copy, HiveQuery, StoredProcedure, etc.
• Linked Service - connection manager
 i.e. Azure Blob Storage, Azure SQL DW, etc.
• Data Set - a data structure within a linked service
 i.e. a table or storage container, etc.
Components
Why is Data Factory Different than
Other Integration Tools (*cough* *cough* SSIS)
• Extract, Load, then Transform
 Leverage scale out compute resources to do you transforms instead of a
VM running your integration service which is bound by resource limits
• PaaS - pay-as-you-go
 Don't need a server constantly running and accruing charges
• Scheduling is time-series based and implicitly defined
 Major paradigm shift; kind of complex initially
• Built in task scheduler
• Works with structured and unstructured data
• Destinations are called "sinks"?
Scheduling in ADFv1
Scheduling in ADFv1
Developing Data Factories
Azure Portal
• Non-Microsoft clients
• Exploration
Visual Studio
• Mature development
environments
• Multiple
environments
• Team development –
easier collaboration
PowerShell
• Monitoring and
Management
• Quick setup and tear
down
Demo Architecture
Blog Storage -
Daily Sales
Files
Azure SQL -
Sales DataMart
Data Factory
Staging
Table
Summary
Table
Demo!
• Tools and extensions:
 Microsoft Azure Data Factory Tools for Visual Studio 2015
 Cloud Explorer for Visual Studio 2015
• Spin up an Azure Data Factory
 Azure Storage with files and empty Azure SQL DB should be ready to go
• Copy Azure Blob Storage to Azure SQL Database
 Use SQL write cleanup script
How do we Monitor our New
Pipeline?
• Azure Portal > Data Factory > Monitor & Manage
• PowerShell
Use Cases
• Time-series, ie. web logs, social sentiment, etc.
• Hybrid integrations
• Advanced Analytics workflows
• Cloud migration
When ADF is NOT the Best Option
• Required data sources are not supported
• Loading Azure Data Warehouse
 Polybase is more performant
• Extracting from a non-time series source
• Anytime before v2 is Generally Available!
Challenges and Best Practices
Challenges
• The scheduling component can be very challenging to work with
• The lack of expressions and variables within a control flow is a big
gap
Best Practices
• Use consistent naming conventions
• Always publish pipelines with isPaused: True
• Test thoroughly before promoting to production
Azure Data Factory v2
High-level
 ADFv1 – is a service designed for the batch data processing of time series data
 ADFv2 – is a general purpose, hybrid data integration service with very flexible execution
patterns
New Features:
• Integration Runtime (publish SSIS
packages)
• Branching logic (On success, On failure, On
Completion, On skip)
• Web Development UI
• Expressions and Parameters
• System Variables
• Event and Scheduled Triggers
• Additional Activity Types
• Way more data sources! Eg. BigQuery,
Dynamics 365, and way more
Fin.

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

1- Introduction of Azure data factory.pptx
1- Introduction of Azure data factory.pptx1- Introduction of Azure data factory.pptx
1- Introduction of Azure data factory.pptx
 
Azure Data Factory presentation with links
Azure Data Factory presentation with linksAzure Data Factory presentation with links
Azure Data Factory presentation with links
 
Azure data factory
Azure data factoryAzure data factory
Azure data factory
 
Azure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekAzure Data Factory for Azure Data Week
Azure Data Factory for Azure Data Week
 
Azure Data Factory Introduction.pdf
Azure Data Factory Introduction.pdfAzure Data Factory Introduction.pdf
Azure Data Factory Introduction.pdf
 
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview SlidesMicrosoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview Slides
 
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
Azure Data Factory | Moving On-Premise Data to Azure Cloud | Microsoft Azure ...
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
 
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
 
Azure data bricks by Eugene Polonichko
Azure data bricks by Eugene PolonichkoAzure data bricks by Eugene Polonichko
Azure data bricks by Eugene Polonichko
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Migrating on premises workload to azure sql database
Migrating on premises workload to azure sql databaseMigrating on premises workload to azure sql database
Migrating on premises workload to azure sql database
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the Cloud
 
Azure Data Factory Data Flows Training (Sept 2020 Update)
Azure Data Factory Data Flows Training (Sept 2020 Update)Azure Data Factory Data Flows Training (Sept 2020 Update)
Azure Data Factory Data Flows Training (Sept 2020 Update)
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22
 
Lakehouse in Azure
Lakehouse in AzureLakehouse in Azure
Lakehouse in Azure
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 

Semelhante a Intro to Azure Data Factory v1

Geek Sync | Deployment and Management of Complex Azure Environments
Geek Sync | Deployment and Management of Complex Azure EnvironmentsGeek Sync | Deployment and Management of Complex Azure Environments
Geek Sync | Deployment and Management of Complex Azure Environments
IDERA Software
 

Semelhante a Intro to Azure Data Factory v1 (20)

Geek Sync | Deployment and Management of Complex Azure Environments
Geek Sync | Deployment and Management of Complex Azure EnvironmentsGeek Sync | Deployment and Management of Complex Azure Environments
Geek Sync | Deployment and Management of Complex Azure Environments
 
Migrate a successful transactional database to azure
Migrate a successful transactional database to azureMigrate a successful transactional database to azure
Migrate a successful transactional database to azure
 
Understanding System Design and Architecture Blueprints of Efficiency
Understanding System Design and Architecture Blueprints of EfficiencyUnderstanding System Design and Architecture Blueprints of Efficiency
Understanding System Design and Architecture Blueprints of Efficiency
 
Monitorando performance no Azure SQL Database
Monitorando performance no Azure SQL DatabaseMonitorando performance no Azure SQL Database
Monitorando performance no Azure SQL Database
 
SQL Server 2019 hotlap - WARDY IT Solutions
SQL Server 2019 hotlap - WARDY IT SolutionsSQL Server 2019 hotlap - WARDY IT Solutions
SQL Server 2019 hotlap - WARDY IT Solutions
 
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL DatabaseModern ETL: Azure Data Factory, Data Lake, and SQL Database
Modern ETL: Azure Data Factory, Data Lake, and SQL Database
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
 
Microsoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMicrosoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the Cloud
 
Tech-Spark: Azure SQL Databases
Tech-Spark: Azure SQL DatabasesTech-Spark: Azure SQL Databases
Tech-Spark: Azure SQL Databases
 
Accelerating Business Intelligence Solutions with Microsoft Azure pass
Accelerating Business Intelligence Solutions with Microsoft Azure   passAccelerating Business Intelligence Solutions with Microsoft Azure   pass
Accelerating Business Intelligence Solutions with Microsoft Azure pass
 
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSISMicrosoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
 
Azure Analysis Services (Azure Bootcamp 2018)
Azure Analysis Services (Azure Bootcamp 2018)Azure Analysis Services (Azure Bootcamp 2018)
Azure Analysis Services (Azure Bootcamp 2018)
 
AZURE Data Related Services
AZURE Data Related ServicesAZURE Data Related Services
AZURE Data Related Services
 
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudSQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
 
Building cloud native data microservice
Building cloud native data microserviceBuilding cloud native data microservice
Building cloud native data microservice
 
SQL Azure - the good, the bad and the ugly.
SQL Azure - the good, the bad and the ugly.SQL Azure - the good, the bad and the ugly.
SQL Azure - the good, the bad and the ugly.
 
Afternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data ServicesAfternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data Services
 
A lap around Azure Data Factory
A lap around Azure Data FactoryA lap around Azure Data Factory
A lap around Azure Data Factory
 
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
Hybrid Analytics in Healthcare: Leveraging Power BI and Office 365 to Make Sm...
 

Último

Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Último (20)

Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

Intro to Azure Data Factory v1

  • 1. An Introduction to Azure Data Factory v1 Everything you need to know to start developing today
  • 2. Who are these people?! • Parinaz Kallick – Business Intelligence Consultant  Working with BI for 10 years (Origins in databases and reporting &  MSBI Stack)  B.S in Computer Science  MBA-IT • Eric Bragas – Business Intelligence Consultant, MCP  Working with Microsoft BI for 5+ years  Azure and Power BI for 3+ years  California native, based in San Francisco  Eastern cuisine aficionado
  • 3. Agenda What is Data Factory? How does it work? Core Components How to Develop • Demo Monitoring & Management Use Cases Challenges Best Practices
  • 4. What is Azure Data Factory (ADF)? • "[Azure Data Factory] is a cloud-based data integration service that allows you to create data-driven workflows in the cloud that orchestrate and automate data movement and data transformation.“ • In short - it's Azure's PaaS service for time series data integration
  • 5. How Does it Work? • Leverages cloud resources to Extract, Load, and Transform your data  Storage - Azure Blob Storage, HDInsight, Azure SQL DW, etc.  Compute - Hive Query, Azure SQL DW, etc. • ELT over ETL • Time-series paradigm, ie. web logs, social sentiment, sensor data
  • 7. Components • Pipeline - the unit of orchestration, and container for activities • Activity - a data movement or transformation component  ie. Copy, HiveQuery, StoredProcedure, etc. • Linked Service - connection manager  i.e. Azure Blob Storage, Azure SQL DW, etc. • Data Set - a data structure within a linked service  i.e. a table or storage container, etc.
  • 9.
  • 10. Why is Data Factory Different than Other Integration Tools (*cough* *cough* SSIS) • Extract, Load, then Transform  Leverage scale out compute resources to do you transforms instead of a VM running your integration service which is bound by resource limits • PaaS - pay-as-you-go  Don't need a server constantly running and accruing charges • Scheduling is time-series based and implicitly defined  Major paradigm shift; kind of complex initially • Built in task scheduler • Works with structured and unstructured data • Destinations are called "sinks"?
  • 13. Developing Data Factories Azure Portal • Non-Microsoft clients • Exploration Visual Studio • Mature development environments • Multiple environments • Team development – easier collaboration PowerShell • Monitoring and Management • Quick setup and tear down
  • 14. Demo Architecture Blog Storage - Daily Sales Files Azure SQL - Sales DataMart Data Factory Staging Table Summary Table
  • 15. Demo! • Tools and extensions:  Microsoft Azure Data Factory Tools for Visual Studio 2015  Cloud Explorer for Visual Studio 2015 • Spin up an Azure Data Factory  Azure Storage with files and empty Azure SQL DB should be ready to go • Copy Azure Blob Storage to Azure SQL Database  Use SQL write cleanup script
  • 16. How do we Monitor our New Pipeline? • Azure Portal > Data Factory > Monitor & Manage • PowerShell
  • 17. Use Cases • Time-series, ie. web logs, social sentiment, etc. • Hybrid integrations • Advanced Analytics workflows • Cloud migration
  • 18. When ADF is NOT the Best Option • Required data sources are not supported • Loading Azure Data Warehouse  Polybase is more performant • Extracting from a non-time series source • Anytime before v2 is Generally Available!
  • 19. Challenges and Best Practices Challenges • The scheduling component can be very challenging to work with • The lack of expressions and variables within a control flow is a big gap Best Practices • Use consistent naming conventions • Always publish pipelines with isPaused: True • Test thoroughly before promoting to production
  • 20. Azure Data Factory v2 High-level  ADFv1 – is a service designed for the batch data processing of time series data  ADFv2 – is a general purpose, hybrid data integration service with very flexible execution patterns New Features: • Integration Runtime (publish SSIS packages) • Branching logic (On success, On failure, On Completion, On skip) • Web Development UI • Expressions and Parameters • System Variables • Event and Scheduled Triggers • Additional Activity Types • Way more data sources! Eg. BigQuery, Dynamics 365, and way more
  • 21. Fin.

Notas do Editor

  1. All supported services in v1: https://docs.microsoft.com/en-us/azure/data-factory/v1/data-factory-create-datasets Supported Services in v2: https://docs.microsoft.com/en-us/azure/data-factory/concepts-datasets-linked-services
  2. https://docs.microsoft.com/en-us/azure/data-factory/v1/data-factory-scheduling-and-execution
  3. ADF Tools: https://marketplace.visualstudio.com/items?itemName=AzureDataFactory.MicrosoftAzureDataFactoryToolsforVisualStudio2015 Cloud Explorer: https://marketplace.visualstudio.com/items?itemName=MicrosoftCloudExplorer.CloudExplorerforVisualStudio2015
  4. https://docs.microsoft.com/en-us/azure/data-factory/compare-versions https://www.purplefrogsystems.com/paul/2017/09/whats-new-in-azure-data-factory-version-2-adfv2/