Ingesting click data for analytics
Francesco Furiani, CTO @
$ whoami
Francesco Furiani (@ilfurio):
 Backend Engineer
 Roamed these halls not too long ago
Loves:
 Studying new CS stuff
 PlayStation / Bike / Traveling / Soccer
 O RLY? books
How do I make a living:
 CTO @ ClickMeter
 Backend Engineer @ ClickMeter
 Enum.take_random(IT_ROLES,1) @ ClickMeter
ClickMeter
 100k+ customers
 Receiving events for customers at rates from 10 to 3000 req/sec
Getting the data
We receive data anytime someone:
 Clicks our links
 Views our pixels
 Calls our postbacks
Our customers use us:
 Inside a famous app on the day of the big release ✔
 Advertising on an extremely big video portal ✔
 A tiny travel blog ✔
 A physical device for advertising ✔
The challenge
We need to:
 Try not to lose the events we receive (duh)
 Show customers data for better insight into their campaigns
 Scale up/down according to the incoming traffic
 Improve the product by using the data we get
 Do it as fast as possible (wasn’t this ready a week ago?)
 Do it as cheaply as possible
Size
Find the size of the problem you’re trying to solve:
 How much data do you expect? At what rate?
 What do you have to do with it?
 Do you have to do something with ALL of it?
 How fast do you have to do it?
Answers to these questions are a starting point.
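The sizing questions on this slide lend themselves to a quick back-of-the-envelope calculation. Below is a minimal sketch that turns a request rate into daily event and storage volumes. It uses the 10 and 3000 req/sec figures quoted earlier, treated here as sustained rates; the 1 KB average event size is purely an assumption for illustration, not a number from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume(req_per_sec: float, avg_event_bytes: int = 1024) -> tuple[int, float]:
    """Return (events per day, gigabytes per day) for a sustained request rate.

    avg_event_bytes is a hypothetical average payload size.
    """
    events = int(req_per_sec * SECONDS_PER_DAY)
    gigabytes = events * avg_event_bytes / 1e9
    return events, gigabytes


if __name__ == "__main__":
    for rate in (10, 3000):
        events, gb = daily_volume(rate)
        print(f"{rate:>5} req/sec -> {events:,} events/day, ~{gb:.1f} GB/day")
```

Even a rough number like this helps decide whether ALL of the data has to be processed immediately or whether part of it can wait for a batch job.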
Design
Once we know how big and bad the beast is, we need to design the ranch that will keep it in check.
It’s an iterative process, prone to a lot of failures, but the world is out there to help us.
Think, write and draw a lot.
Design
… draw too much …
Which bricks should I use
Most of us will never have the joy (and the horror) of creating a new stack, novel in theory and practice.
Still, we need to understand the theory behind every brick.
Read the info, read the opinions, and try small proofs of concept of the moving parts: it helps a lot!
The cloud is a brick too
A very important brick.
Elastic computing power, the many *aaS offerings and managed solutions are a great help in terms of saved manpower and fast iterations.
This comes at a significant cost that you have to consider:
• $$$ (ymmv)
• Possible lock-ins
Design with bricks
… well, it’s never definitive …
How we did it
Obviously we haven’t followed those guidelines.
One becomes savvy after crashing and burning many times.
Still, thanks to those errors we got there and built, at every iteration, a better infrastructure.
How we did it
ClickMeter was already live and growing.
It needed an overhaul of its infrastructure/backend.
The growth fueled the need to be ready for more power to handle more data.
Obviously this had to be a tablecloth-trick migration.
How we did it
We were already on the cloud (AWS); we thought of having a hybrid approach, but it didn’t make sense.
We reviewed the old components already in production to see what to kill, keep or update.
We kept the good stuff and designed some new layers to make it work flawlessly in the new infrastructure.
Redirect engine
aka events collector
Pretty important; they need to:
• Stay up
• Scale up/down depending on the incoming traffic
• Never lose anything
• Be as fast as possible in processing
They’re custom web applications that undergo a lot of testing.
We used services like Beanstalk, scaling groups, load balancers and health routing offered by our cloud provider to manage the web app scaling/availability.
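To make the "never lose anything, be fast" requirements of the redirect engine concrete, here is a minimal sketch of a click handler that hands the event off before it answers with a redirect. It is not ClickMeter's implementation: the `LINKS` table, the `handle_click` function and the in-memory `queue.Queue` (standing in for whatever durable queue collects the events) are all hypothetical names chosen for illustration.

```python
import json
import queue
import time

# Hypothetical link table; a real system would look this up in a datastore.
LINKS = {"abc123": "https://example.com/landing"}


def handle_click(link_id: str, user_agent: str, events: "queue.Queue[str]") -> tuple[int, dict]:
    """Record the click, then answer with a redirect.

    The event is enqueued before the response is built, so a click is
    never answered without having been handed to the pipeline.
    """
    destination = LINKS.get(link_id)
    if destination is None:
        return 404, {}
    event = {"link_id": link_id, "user_agent": user_agent, "ts": time.time()}
    events.put(json.dumps(event))  # hand-off to the pipeline
    return 302, {"Location": destination}
```

Keeping the handler this small is the point of the design: the web tier only validates, enqueues and redirects, which makes it easy to scale horizontally, while the heavier processing happens downstream.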
Tracking engine
and friends
Most of this part uses our cloud provider’s technology.
This simplifies maintenance and provisioning, keeping the focus on the value of our product.
Some moving parts are custom-made by us to interact with the cloud technology (which might be proprietary or just a repackaged, well-known one).
Tracking engine
and friends
[Diagram; surviving labels: SQS, Kinesis, DynamoDB, Pipeline, Events, Preprocessing, Postprocessing]
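The slide names preprocessing and postprocessing stages sitting between the event queues and the datastore. A minimal sketch of such a staged flow follows, using plain Python generators and a dictionary in place of the managed services. The order of the stages, the field names and the bot check are assumptions made for illustration; the slide does not spell them out.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def preprocess(raw_events: Iterable[dict]) -> Iterator[dict]:
    """Drop malformed events and keep only the fields we need."""
    for event in raw_events:
        if "link_id" not in event or "ts" not in event:
            continue  # a real pipeline would likely park these for inspection
        yield {
            "link_id": str(event["link_id"]),
            "ts": int(event["ts"]),
            "user_agent": event.get("user_agent", ""),
        }


def postprocess(events: Iterable[dict]) -> Iterator[dict]:
    """Enrich each event; here we only flag obvious bots (hypothetical rule)."""
    for event in events:
        yield {**event, "is_bot": "bot" in event["user_agent"].lower()}


def store(events: Iterable[dict], table: dict) -> None:
    """Append events to an in-memory 'table' keyed by link_id."""
    for event in events:
        table[event["link_id"]].append(event)


table = defaultdict(list)
incoming = [
    {"link_id": "abc123", "ts": 1700000030, "user_agent": "Mozilla/5.0"},
    {"ts": 1700000031, "user_agent": "Mozilla/5.0"},  # malformed: no link_id
    {"link_id": "abc123", "ts": 1700000095, "user_agent": "ExampleBot/1.0"},
]
store(postprocess(preprocess(incoming)), table)
```

Because each stage only consumes and produces events, any of them can be swapped for a managed service or scaled on its own without touching the others.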
Pipeline
A combination of real-time and batch technologies.
One of the scaling parts that actually provides value to the customers.
It computes analyses on event data, from a simple count to some predictions.
Check the data produced by your processing system to improve the pipeline step by step!
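The "simple count" end of that spectrum can be sketched in a few lines. The example below counts clicks per link and per minute with `collections.Counter`; the event fields (`link_id`, `ts` in seconds) and the one-minute bucket are assumptions for illustration. The same function works on a whole batch, and its result can be merged into a running total for a near-real-time view.

```python
from collections import Counter


def clicks_per_minute(events):
    """Count events per (link_id, minute) bucket."""
    counts = Counter()
    for event in events:
        minute = event["ts"] - event["ts"] % 60  # start of the minute
        counts[(event["link_id"], minute)] += 1
    return counts


# Batch use: count a full set of events at once.
batch = [
    {"link_id": "a", "ts": 0},
    {"link_id": "a", "ts": 30},
    {"link_id": "a", "ts": 61},
    {"link_id": "b", "ts": 10},
]
totals = clicks_per_minute(batch)

# Incremental use: merge each new micro-batch into a running total.
running = Counter()
running.update(clicks_per_minute(batch[:2]))
running.update(clicks_per_minute(batch[2:]))
```

Comparing the batch result with the incrementally built one is also a cheap way to check the data produced by the pipeline, as the slide recommends.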
Storage and data delivery
We employ different storage systems depending on speed of delivery and data type.
All the data is accessible via a REST API.
This makes it relatively easy to develop a frontend layer and lets customers take control of the data and use it in ways we may not have considered.
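One way to picture "different storage behind one API" is a read path that picks a store based on how old the requested data is and returns the same JSON shape either way. This is only a sketch under stated assumptions: the `get_clicks` function, the hot/cold split, the 24-hour window and the response fields are all hypothetical and are not taken from the talk.

```python
import json

HOT_WINDOW_SECONDS = 24 * 60 * 60  # assumption: the last day is served from the fast store


def get_clicks(link_id, ts, now, hot_store, cold_store):
    """Return a JSON body with the click count for link_id in the bucket ts."""
    if now - ts <= HOT_WINDOW_SECONDS:
        source, store = "hot", hot_store
    else:
        source, store = "cold", cold_store
    count = store.get((link_id, ts), 0)
    return json.dumps(
        {"link_id": link_id, "ts": ts, "clicks": count, "source": source}
    )
```

The caller never needs to know which store answered, so a frontend or a customer script can consume the same response regardless of where the data lives.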
Operations
Managed services on the cloud help us a lot!
Most of the team can focus on improvements and shipping (users are happy, so is the CEO).
Some of us (me) still have to be the CloudOp/DevOp.
P.S.: always prepare a Plan B for when you break things!
Cloud co$t$
The cloud is typically more expensive than your own metal.
The extra money you have to spend is actually well spent:
• Flexibility
• Easier provisioning
• Easier management
• Easier operations
There are different types of clouds, so choose wisely.
Conclusions
Creating and managing a “big data” ready infrastructure is no easy task, but it can be done step by step, even by startups.
The cloud is a cool starting ground that provides many of the toys you need, so you can focus on the part of “big data” that gives you value!
Use the wisdom shared by the big/medium players that have already been there (and built most of the stuff you’re using).
Thank You
Any questions?
@il_furio
francesco@clickmeter.com

%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 

Ingesting Click Data for Analytics

  • 1. Ingesting click data for analytics Francesco Furiani, CTO @
  • 2. $ whoami Francesco Furiani (@ilfurio): • Backend Engineer • Roamed these halls not too long ago. Loves: • Studying new CS stuff • PlayStation / Bike / Traveling / Soccer • O RLY? books. How do I make a living: • CTO @ ClickMeter • Backend Engineer @ ClickMeter • Enum.take_random(IT_ROLES,1) @ ClickMeter
  • 3. ClickMeter
  • 4. ClickMeter. • 100k+ customers • Getting events for customers from 10 to 3000 req/sec
  • 5. Getting the data. We receive data anytime someone: • Clicks our links • Views our pixels • Calls our postbacks. Our customers use us: • Inside a famous app the day of the big release ✔ • Advertising on an extremely big video portal ✔ • A tiny travel blog ✔ • A physical device for advertising ✔
  • 6. The challenge. We need to: • Try not to lose the events we receive (duh) • Show customers data for better insight on their campaigns • Scale up/down according to the incoming traffic • Improve the product by using the data we get • Do it as fast as possible (wasn’t this ready a week ago?) • Do it as cheaply as possible
  • 7. Size. Find the size of the problem you’re trying to solve: • How much data do you expect? At what rate? • What do you have to do with it? • Do you have to do something with ALL of it? • How fast do you have to do it? Answers to these questions are a starting point.
  • 8. Design. Once we know how big and bad the beast is, we need to design the ranch that will keep it in check. It is an iterative process, prone to a lot of failures, but the world is out there to help us. Think, write and draw a lot.
  • 9. Design. … draw too much …
  • 10. Which bricks should I use. Most of us will never have the joy (and the horror) of creating a new stack, novel in theory and practice. Still, we need to understand the theory behind every brick. Read the info, read the opinions, try small proofs of concept of the moving parts; it helps a lot!
  • 11. The cloud is a brick too. A very important brick. Elastic computing power, the many *aaS offerings and managed solutions are a great help in terms of saved manpower and fast iterations. It comes at a great cost to consider: • $$$ (ymmv) • Possible lock-ins
  • 12. Design with bricks. … well it’s never definitive …
  • 13. How we did it. Obviously we didn’t follow those guidelines. One becomes savvy after crashing and burning many times. Still, thanks to those errors we got there and built, at every iteration, a better infrastructure.
  • 14. How we did it. ClickMeter was already live and growing. It needed an overhaul of its infrastructure/backend. The growth fueled the need to be ready for more power to handle more data. Obviously this had to be a tablecloth-trick migration.
  • 15. How we did it. Already on the cloud (AWS), we considered a hybrid approach but it didn’t make sense. We reviewed the old components already in production to see what to kill, keep or update. We kept the good stuff and designed some new layers to make it work flawlessly in the new infrastructure.
  • 16. Ingesting clicks data for analytics
  • 17. Redirect engine aka events collector. Pretty important, they need to: • Stay up • Scale up/down depending on the incoming traffic • Never lose anything • Be as fast as possible in processing. They’re a custom web application that undergoes a lot of testing. We used services like Beanstalk, Scaling groups, Load Balancers and Health routing offered by our cloud provider to manage the webapp scaling/availability.
  • 18. Tracking engine and friends. Pipeline: most of this part uses our cloud provider’s technology. This simplifies maintenance and provisioning, keeping the focus on the value of our product. Some moving parts are custom made by us to interact with the cloud technology (which might be proprietary or just a repackaged known one).
  • 19. Tracking engine and friends. SQS Pipeline Kinesis • Events • Preprocessing • Postprocessing • DynamoDB
  • 20. Pipeline. Combination of real-time and batch technologies. One of the scaling parts that actually provides value to the customers. Computes analyses on event data, from a simple count to some predictions. Check the data produced by your processing system to improve the pipeline step-by-step!
  • 21. Pipeline
  • 22. Storage and data delivery. We employ different storage based on speed of delivery and data type. All the data is accessible via a REST API. This makes it relatively easy to develop a frontend layer and lets customers take control of the data and use it in ways we may not have considered.
  • 23. Operations. Managed services on the cloud help us a lot! Most of the team can focus on improvements and shipping (users are happy, so is the CEO). Some of us (me) still have to be the CloudOp/DevOp. P.S.: always prepare a Plan B for when you break things!
  • 24. Cloud co$t$. Cloud is typically more expensive than your own metal. The extra money you have to spend is actually well spent: • Flexibility • Easier provisioning • Easier management • Easier operations. There are different types of clouds, so choose wisely.
  • 25. Conclusions. Creating and managing a “big data” ready infrastructure is no easy task, but it can be done step-by-step, even by startups. The cloud is a cool starting ground providing you with many of the toys you need, so you can focus on the part of “big data” that gives you value! Use the wisdom shared by the big/medium players that have already been there (and built most of the stuff you’re using).
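
The sizing questions on slide 7 ("How much data do you expect? Rate?") can be made concrete with a little arithmetic. The sketch below is not from the talk. It takes the 10 and 3000 req/sec figures from slide 4, treats them as sustained rates (an assumption, since real traffic is bursty), and uses a hypothetical average event size of 500 bytes. The function and constant names are illustrative only.

```python
# Back-of-the-envelope sizing for an event ingestion pipeline.
# The rates come from slide 4; everything else is an assumption.

SECONDS_PER_DAY = 24 * 60 * 60

LOW_RATE = 10      # req/sec, lower figure quoted on slide 4
PEAK_RATE = 3000   # req/sec, upper figure quoted on slide 4
ASSUMED_EVENT_BYTES = 500  # hypothetical average payload size, not from the talk


def events_per_day(req_per_sec: float) -> int:
    """Events received in one day if the rate were sustained all day."""
    return int(req_per_sec * SECONDS_PER_DAY)


def daily_volume_gib(req_per_sec: float, bytes_per_event: int) -> float:
    """Raw daily volume in GiB for a sustained rate and a fixed event size."""
    return events_per_day(req_per_sec) * bytes_per_event / 2**30


for rate in (LOW_RATE, PEAK_RATE):
    count = events_per_day(rate)
    volume = daily_volume_gib(rate, ASSUMED_EVENT_BYTES)
    print(f"{rate:>5} req/sec -> {count:,} events/day, ~{volume:.2f} GiB/day")
```

Under these assumptions the two rates work out to 864,000 and 259,200,000 events per day, a spread of 300x, which is one way to see why the talk insists on scaling both up and down rather than provisioning for the peak.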

Editor's Notes

  1. What we do: control marketing links and maximize conversion rates. A tool for customers to monitor, compare and optimize all their links in one place.
  2. The whole picture