SlideShare uma empresa Scribd logo
1 de 22
Getting
to Know
AIRFLOW
Rosie Hoyem
PyMNtos
04/27/2017
Me.
Data Scientist
Web Developer
Landlord
Cyclist
Traveler
rosiehoyem@gmail.com
rosiehoyem.com
0.
Airflow
huh?
???
Airflow
In a
Nutshell
⊙Data Engineering tool
⊙Pimped out Flask app
⊙Useful for building
functional data pipelines and
automating workflow
1.
Why do I
care?
It’s Popular.
History of
Airflow
2014
Maxime
Beauchemin
began building
a tool at Airbnb
in October of
2014
2016
Airflow entered
incubation as
an Apache
project
Now
Officially used
by dozens of
companies
large and small
Who
Already
uses it?
Airbnb [@mistercrunch, @artwr]
Agari [@r39132]
allegro.pl [@kretes]
AltX [@pedromduarte]
Apigee [@btallman]
Astronomer [@schnie]
Auth0 [@sicarul]
BandwidthX [@dineshdsharma]
Bellhops
BlaBlaCar [@puckel & @wmorin]
Bloc [@dpaola2]
BlueApron [@jasonjho & @matthewdavidhauser]
Blue Yonder [@blue-yonder]
Celect [@superdosh & @chadcelect]
Change.org [@change, @vijaykramesh]
Children's Hospital of Philadelphia Division of
Genomic Diagnostics [@genomics-geek]
City of San Diego [@MrMaksimize, @andrell81 &
@arnaudvedy]
Clairvoyant @shekharv
Clover Health [@gwax & @vansivallab]
Chartboost [@cgelman & @dclubb]
Cotap [@maraca & @richardchew]
Digital First Media [@duffn & @mschmo &
@seanmuth]
Easy Taxi [@caique-lima & @WesleyBatista]
FreshBooks [@DinoCow]
Gentner Lab [@neuromusic]
Glassdoor [@syvineckruyk]
HelloFresh [@tammymendt & @davidsbatista &
@iuriinedostup]
Holimetrix [@thibault-ketterer]
Hootsuite
IFTTT [@apurvajoshi]
iHeartRadio[@yiwang]
ING
Jampp
Kiwi.com [@underyx]
Kogan.com [@geeknam]
Lemann Foundation [@fernandosjp]
LendUp [@lendup]
liligo [@tromika]
LingoChamp [@haitaoyao]
Lucid [@jbrownlucid & @kkourtchikov]
Lumos Labs [@rfroetscher & @zzztimbo]
Lyft[@SaurabhBajaj]
Madrone [@mbreining & @scotthb]
Markovian [@al-xv, @skogsbaeck, @waltherg]
Mercadoni [@demorenoc]
MiNODES [@dice89, @diazcelsa]
MFG Labs
mytaxi [@mytaxi]
Nerdwallet
OfferUp
OneFineStay [@slangwald]
Open Knowledge International @vitorbaptista
PayPal [@jhsenjaliya]
Postmates [@syeoryn]
Sense360 [@kamilmroczek]
Shopkick [@shopkick]
Sidecar [@getsidecar]
SimilarWeb [@similarweb]
SmartNews [@takus]
Spotify [@znichols]
Stackspace
Stripe [@jbalogh]
Thumbtack [@natekupp]
T2 Systems [@unclaimedpants]
Vente-Exclusive.com [@alexvanboxel]
Vnomics [@lpalum]
WePay [@criccomini & @mtagle]
WeTransfer [@jochem]
Whistle Labs [@ananya77041]
WiseBanyan
Wooga
Xoom [@gepser & @omarvides]
Yahoo!
Zapier [@drknexus & @statwonk]
Zendesk
Zenly [@cerisier & @jbdalido]
99 [@fbenevides, @gustavoamigo & @mmmaia]
GovTech GDS [@chrissng & @datagovsg]
Gusto [@frankhsu]
Handshake [@mhickman]
Handy [@marcintustin / @mtustin-handy]
Qubole [@msumit]
2
What is it?
A Brief Overview
Before
Airflow,
there
was...
Cron Jobs.
(And a hodge-podge of other tools
people would duct tape together.)
What’s a Cron Job you say?
Cron
cron is a Linux utility
which schedules a
command or script on
your server to run
automatically at a
specified time and
date.
Schedule
Jobs
Cron Job
A cron job is the
scheduled task itself.
Cron jobs can be
very useful to
automate repetitive
tasks.
Airflow
DAGs as CODE
Directed Acyclic
Graph
Config file that outlines HOW to carry out a workflow
Contains a
collection of
tasks
Determines what
order tasks will
be implemented
Determines
when they will
be implemented
OPERATORS
Operators are the building blocks of workflows
Action
Performs an action, or
tell another system to
perform an action
(i.e., PythonOperator)
Transfer
Move data from one
system to another
(i.e., RedshiftToS3Transfer
Sensor
Will keep running until a
certain criterion is met
(i.e., S3KeySensor
Let’s
review
some
concepts
Operators
Classes provided by
Airflow. Building blocks
of DAGs.
DAGS
Directed Acyclic
Graphs. Specialized
config files for series of
tasks.
Tasks
Tasks are connected via
directed edges that
represent an
"execute_after"
relationship.
Life
Stream
Example
Rails
Application
Airflow Process
Manager
PostgreSQL
Data Store
3
Let’s Try It.
4
Final
Thoughts.
What’s It
Good For
It Can:
⊙Schedule complex
chains of tasks
⊙Manage
dependencies
between tasks
⊙Define complex
relations even in a
large distributed
environment
It Can’t:
⊙Store your data
⊙Clean your house
⊙Feed your pets while
you are gone on
vacation (yet)
Competito
rs Luigi
Came out of Spotify
Simpler in scope
More object oriented
*Complementary to
Airflow?
Pachyderm
Containerized data
pipeline framework
Azkaban
Created at LinkedIn
Batch workflow job
scheduler to run
Hadoop jobs
“ Airflow provides a load of
functionality, but like any
popular, fast-moving
project, the documentation
gap can be a challenge to
adoption.
Thanks!
Any questions?
Getting to Know Airflow

Mais conteúdo relacionado

Mais procurados

How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowLaura Lorenz
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow ArchitectureGerard Toonstra
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowYohei Onishi
 
Airflow Clustering and High Availability
Airflow Clustering and High AvailabilityAirflow Clustering and High Availability
Airflow Clustering and High AvailabilityRobert Sanders
 
Super Fast Gevent Introduction
Super Fast Gevent IntroductionSuper Fast Gevent Introduction
Super Fast Gevent IntroductionWalter Liu
 
(CMP310) Data Processing Pipelines Using Containers & Spot Instances
(CMP310) Data Processing Pipelines Using Containers & Spot Instances(CMP310) Data Processing Pipelines Using Containers & Spot Instances
(CMP310) Data Processing Pipelines Using Containers & Spot InstancesAmazon Web Services
 
Orchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWSOrchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWSDerrick Qin
 
Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...
Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...
Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...Kaxil Naik
 
From business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflowFrom business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflowDerrick Qin
 
Managing data workflows with Luigi
Managing data workflows with LuigiManaging data workflows with Luigi
Managing data workflows with LuigiTeemu Kurppa
 
Building a Data Ingestion & Processing Pipeline with Spark & Airflow
Building a Data Ingestion & Processing Pipeline with Spark & AirflowBuilding a Data Ingestion & Processing Pipeline with Spark & Airflow
Building a Data Ingestion & Processing Pipeline with Spark & AirflowTom Lous
 
Luigi presentation OA Summit
Luigi presentation OA SummitLuigi presentation OA Summit
Luigi presentation OA SummitOpen Analytics
 

Mais procurados (20)

How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
 
Airflow introduction
Airflow introductionAirflow introduction
Airflow introduction
 
Apache airflow
Apache airflowApache airflow
Apache airflow
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
 
Workflow Engines + Luigi
Workflow Engines + LuigiWorkflow Engines + Luigi
Workflow Engines + Luigi
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentation
 
Airflow Clustering and High Availability
Airflow Clustering and High AvailabilityAirflow Clustering and High Availability
Airflow Clustering and High Availability
 
Super Fast Gevent Introduction
Super Fast Gevent IntroductionSuper Fast Gevent Introduction
Super Fast Gevent Introduction
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
 
(CMP310) Data Processing Pipelines Using Containers & Spot Instances
(CMP310) Data Processing Pipelines Using Containers & Spot Instances(CMP310) Data Processing Pipelines Using Containers & Spot Instances
(CMP310) Data Processing Pipelines Using Containers & Spot Instances
 
Orchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWSOrchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWS
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
 
Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...
Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...
Contributing to Apache Airflow | Journey to becoming Airflow's leading contri...
 
From business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflowFrom business requirements to working pipelines with apache airflow
From business requirements to working pipelines with apache airflow
 
Managing data workflows with Luigi
Managing data workflows with LuigiManaging data workflows with Luigi
Managing data workflows with Luigi
 
Luigi future
Luigi futureLuigi future
Luigi future
 
Building a Data Ingestion & Processing Pipeline with Spark & Airflow
Building a Data Ingestion & Processing Pipeline with Spark & AirflowBuilding a Data Ingestion & Processing Pipeline with Spark & Airflow
Building a Data Ingestion & Processing Pipeline with Spark & Airflow
 
Go With The Flow
Go With The FlowGo With The Flow
Go With The Flow
 
Luigi presentation OA Summit
Luigi presentation OA SummitLuigi presentation OA Summit
Luigi presentation OA Summit
 

Semelhante a Getting to Know Airflow

Platform Engineering: Manage your infrastructure using Kubernetes and Crossplane
Platform Engineering: Manage your infrastructure using Kubernetes and CrossplanePlatform Engineering: Manage your infrastructure using Kubernetes and Crossplane
Platform Engineering: Manage your infrastructure using Kubernetes and CrossplaneAhmed AbouZaid
 
Deploying Web Apps with PaaS and Docker Tools
Deploying Web Apps with PaaS and Docker ToolsDeploying Web Apps with PaaS and Docker Tools
Deploying Web Apps with PaaS and Docker ToolsEddie Lau
 
Introduce Airflow.ppsx
Introduce Airflow.ppsxIntroduce Airflow.ppsx
Introduce Airflow.ppsxManKD
 
Building Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowBuilding Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowSid Anand
 
Building Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to HeroBuilding Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to HeroAlex Barbosa Coqueiro
 
How we spread out our service globally by utilizing AWS and open source soft...
How we spread out our service globally by utilizing  AWS and open source soft...How we spread out our service globally by utilizing  AWS and open source soft...
How we spread out our service globally by utilizing AWS and open source soft...Takashi Someda
 
DataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxDataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxJohn J Zhao
 
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas MongoDB
 
Crash Course in Open Source Cloud Computing
Crash Course in Open Source Cloud ComputingCrash Course in Open Source Cloud Computing
Crash Course in Open Source Cloud ComputingMark Hinkle
 
Automating your plugin with WP-Cron
Automating your plugin with WP-CronAutomating your plugin with WP-Cron
Automating your plugin with WP-CronDan Cannon
 
Building a Sustainable Data Platform on AWS
Building a Sustainable Data Platform on AWSBuilding a Sustainable Data Platform on AWS
Building a Sustainable Data Platform on AWSSmartNews, Inc.
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017Patrick Chanezon
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflowmutt_data
 
Automate the operation of your Oracle Cloud infrastructure v2.0
Automate the operation of your Oracle Cloud infrastructure v2.0Automate the operation of your Oracle Cloud infrastructure v2.0
Automate the operation of your Oracle Cloud infrastructure v2.0Nelson Calero
 
CloudFormation Dark Arts
CloudFormation Dark ArtsCloudFormation Dark Arts
CloudFormation Dark ArtsChase Douglas
 
OpenStack and serverless - long shot or sure thing
OpenStack and serverless - long shot or sure thingOpenStack and serverless - long shot or sure thing
OpenStack and serverless - long shot or sure thingCloudify Community
 
Airflow techtonic template
Airflow   techtonic templateAirflow   techtonic template
Airflow techtonic templateSampath Kumar
 
Annex1 kof hatem_9-11-2018
Annex1 kof hatem_9-11-2018Annex1 kof hatem_9-11-2018
Annex1 kof hatem_9-11-2018Hatem Wasfy
 

Semelhante a Getting to Know Airflow (20)

Platform Engineering: Manage your infrastructure using Kubernetes and Crossplane
Platform Engineering: Manage your infrastructure using Kubernetes and CrossplanePlatform Engineering: Manage your infrastructure using Kubernetes and Crossplane
Platform Engineering: Manage your infrastructure using Kubernetes and Crossplane
 
Deploying Web Apps with PaaS and Docker Tools
Deploying Web Apps with PaaS and Docker ToolsDeploying Web Apps with PaaS and Docker Tools
Deploying Web Apps with PaaS and Docker Tools
 
Introduce Airflow.ppsx
Introduce Airflow.ppsxIntroduce Airflow.ppsx
Introduce Airflow.ppsx
 
Building Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowBuilding Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache Airflow
 
Self-Service Supercomputing
Self-Service SupercomputingSelf-Service Supercomputing
Self-Service Supercomputing
 
Building Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to HeroBuilding Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to Hero
 
How we spread out our service globally by utilizing AWS and open source soft...
How we spread out our service globally by utilizing  AWS and open source soft...How we spread out our service globally by utilizing  AWS and open source soft...
How we spread out our service globally by utilizing AWS and open source soft...
 
DataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxDataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptx
 
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
 
Colloquium Report
Colloquium ReportColloquium Report
Colloquium Report
 
Crash Course in Open Source Cloud Computing
Crash Course in Open Source Cloud ComputingCrash Course in Open Source Cloud Computing
Crash Course in Open Source Cloud Computing
 
Automating your plugin with WP-Cron
Automating your plugin with WP-CronAutomating your plugin with WP-Cron
Automating your plugin with WP-Cron
 
Building a Sustainable Data Platform on AWS
Building a Sustainable Data Platform on AWSBuilding a Sustainable Data Platform on AWS
Building a Sustainable Data Platform on AWS
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
 
Automate the operation of your Oracle Cloud infrastructure v2.0
Automate the operation of your Oracle Cloud infrastructure v2.0Automate the operation of your Oracle Cloud infrastructure v2.0
Automate the operation of your Oracle Cloud infrastructure v2.0
 
CloudFormation Dark Arts
CloudFormation Dark ArtsCloudFormation Dark Arts
CloudFormation Dark Arts
 
OpenStack and serverless - long shot or sure thing
OpenStack and serverless - long shot or sure thingOpenStack and serverless - long shot or sure thing
OpenStack and serverless - long shot or sure thing
 
Airflow techtonic template
Airflow   techtonic templateAirflow   techtonic template
Airflow techtonic template
 
Annex1 kof hatem_9-11-2018
Annex1 kof hatem_9-11-2018Annex1 kof hatem_9-11-2018
Annex1 kof hatem_9-11-2018
 

Último

Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 

Último (20)

Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 

Getting to Know Airflow

  • 4. Airflow In a Nutshell ⊙Data Engineering tool ⊙Pimped out Flask app ⊙Useful for building functional data pipelines and automating workflow
  • 6. History of Airflow 2014 Maxime Beauchemin began building a tool at Airbnb in October of 2014 2016 Airflow entered incubation as an Apache project Now Officially used by dozens of companies large and small
  • 7. Who Already uses it? Airbnb [@mistercrunch, @artwr] Agari [@r39132] allegro.pl [@kretes] AltX [@pedromduarte] Apigee [@btallman] Astronomer [@schnie] Auth0 [@sicarul] BandwidthX [@dineshdsharma] Bellhops BlaBlaCar [@puckel & @wmorin] Bloc [@dpaola2] BlueApron [@jasonjho & @matthewdavidhauser] Blue Yonder [@blue-yonder] Celect [@superdosh & @chadcelect] Change.org [@change, @vijaykramesh] Children's Hospital of Philadelphia Division of Genomic Diagnostics [@genomics-geek] City of San Diego [@MrMaksimize, @andrell81 & @arnaudvedy] Clairvoyant @shekharv Clover Health [@gwax & @vansivallab] Chartboost [@cgelman & @dclubb] Cotap [@maraca & @richardchew] Digital First Media [@duffn & @mschmo & @seanmuth] Easy Taxi [@caique-lima & @WesleyBatista] FreshBooks [@DinoCow] Gentner Lab [@neuromusic] Glassdoor [@syvineckruyk] HelloFresh [@tammymendt & @davidsbatista & @iuriinedostup] Holimetrix [@thibault-ketterer] Hootsuite IFTTT [@apurvajoshi] iHeartRadio[@yiwang] ING Jampp Kiwi.com [@underyx] Kogan.com [@geeknam] Lemann Foundation [@fernandosjp] LendUp [@lendup] liligo [@tromika] LingoChamp [@haitaoyao] Lucid [@jbrownlucid & @kkourtchikov] Lumos Labs [@rfroetscher & @zzztimbo] Lyft[@SaurabhBajaj] Madrone [@mbreining & @scotthb] Markovian [@al-xv, @skogsbaeck, @waltherg] Mercadoni [@demorenoc] MiNODES [@dice89, @diazcelsa] MFG Labs mytaxi [@mytaxi] Nerdwallet OfferUp OneFineStay [@slangwald] Open Knowledge International @vitorbaptista PayPal [@jhsenjaliya] Postmates [@syeoryn] Sense360 [@kamilmroczek] Shopkick [@shopkick] Sidecar [@getsidecar] SimilarWeb [@similarweb] SmartNews [@takus] Spotify [@znichols] Stackspace Stripe [@jbalogh] Thumbtack [@natekupp] T2 Systems [@unclaimedpants] Vente-Exclusive.com [@alexvanboxel] Vnomics [@lpalum] WePay [@criccomini & @mtagle] WeTransfer [@jochem] Whistle Labs [@ananya77041] WiseBanyan Wooga Xoom [@gepser & @omarvides] Yahoo! Zapier [@drknexus & @statwonk] Zendesk Zenly [@cerisier & @jbdalido] 99 [@fbenevides, @gustavoamigo & @mmmaia] GovTech GDS [@chrissng & @datagovsg] Gusto [@frankhsu] Handshake [@mhickman] Handy [@marcintustin / @mtustin-handy] Qubole [@msumit]
  • 8. 2 What is it? A Brief Overview
  • 9. Before Airflow, there was... Cron Jobs. (And a hodge-podge of other tools people would duct tape together.) What’s a Cron Job you say?
  • 10. Cron cron is a Linux utility which schedules a command or script on your server to run automatically at a specified time and date. Schedule Jobs Cron Job A cron job is the scheduled task itself. Cron jobs can be very useful to automate repetitive tasks.
  • 12. Directed Acyclic Graph Config file that outlines HOW to carry out a workflow Contains a collection of tasks Determines what order tasks will be implemented Determines when they will be implemented
  • 13. OPERATORS Operators are the building blocks of workflows Action Performs an action, or tell another system to perform an action (i.e., PythonOperator) Transfer Move data from one system to another (i.e., RedshiftToS3Transfer Sensor Will keep running until a certain criterion is met (i.e., S3KeySensor
  • 14. Let’s review some concepts Operators Classes provided by Airflow. Building blocks of DAGs. DAGS Directed Acyclic Graphs. Specialized config files for series of tasks. Tasks Tasks are connected via directed edges that represent an "execute_after" relationship.
  • 18. What’s It Good For It Can: ⊙Schedule complex chains of tasks ⊙Manage dependencies between tasks ⊙Define complex relations even in a large distributed environment It Can’t: ⊙Store your data ⊙Clean your house ⊙Feed your pets while you are gone on vacation (yet)
  • 19. Competito rs Luigi Came out of Spotify Simpler in scope More object oriented *Complementary to Airflow? Pachyderm Containerized data pipeline framework Azkaban Created at LinkedIn Batch workflow job scheduler to run Hadoop jobs
  • 20. “ Airflow provides a load of functionality, but like any popular, fast-moving project, the documentation gap can be a challenge to adoption.

Notas do Editor

  1. functional pipelines
  2. biggest unicorns — Spotify, Lyft, Airbnb, Stripe
  3. A finite directed graph with no directed cycles Graph: Vertices and edges Acyclic: No cycles Directed: One direction, beginning and end
  4. Execute Python code UNLOAD command to s3 as a CSV with headers Waits for a key (a file-like instance on S3) to be present in a S3 bucket. S3 being a key/value it does not support folders. The path is just a key a resource.
  5. task A finishes, do both tasks B and C, and when B finishes execute tasks D and E