Analytics for the
Real-Time Web
        Maria Grineva
   Systems @ ETH Zurich
Real-Time Web

• Web 2.0 + mobile devices = Real-Time Web
• People share what they are doing right now:
  discussing breaking news on Twitter, sharing
  their current locations on Foursquare...
Analytics for the Real-Time Web:
       new requirements
 • Batch processing (MapReduce) is too slow
 • New requirements:
    • real-time processing: aggregate values
       incrementally, as new data arrives
   • database-intensive: aggregate values
      are stored in a database that is
      constantly being updated
Our System: Triggy
•   Based on Cassandra, a distributed key-value store
•   Provides a programming model similar to MapReduce,
    adapted to push-style processing
•   Extends Cassandra with
    •   push-style procedures - to immediately propagate
        new data to the computations;
    •   synchronization - to ensure consistency of aggregate
        results (e.g., counters)
•   Easily scalable
Cassandra Overview
Data Model
•   Data model: key-value
•   Extends the basic key-value model
    with 2 levels of nesting
•   Super column - used when the
    second level of nesting is present
•   Column family ~ table;
    key-value pair ~ record
•   Keys are stored in sorted order
    (see the sketch below)
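To make the two nesting levels concrete, here is a minimal sketch that models the layout with plain Java collections, using the Tweets and User_URLs examples from the speaker notes. This is an illustration of the data model only, not Cassandra's actual API.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// A toy model of Cassandra's nested key-value structure.
public class DataModelSketch {
    // Column family without super columns:
    // row key -> (column name -> value). Keys kept sorted, as in Cassandra.
    static SortedMap<String, SortedMap<String, String>> tweets = new TreeMap<>();

    // Column family with super columns (second nesting level):
    // row key -> (super column name -> (column name -> value)).
    static SortedMap<String, SortedMap<String, SortedMap<String, String>>> userUrls = new TreeMap<>();

    public static void main(String[] args) {
        // One record (~ row) in the Tweets column family.
        SortedMap<String, String> tweet = new TreeMap<>();
        tweet.put("user_id", "42");
        tweet.put("body", "hello real-time web");
        tweets.put("tweet-0001", tweet);

        // In User_URLs the super column name is the URL and the nested
        // column names are the tweet IDs that contained it.
        userUrls.computeIfAbsent("42", k -> new TreeMap<>())
                .computeIfAbsent("http://example.com", k -> new TreeMap<>())
                .put("tweet-0001", "");
        System.out.println(userUrls);
    }
}
```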
Cassandra Overview
Incremental Scalability
•   Incremental scalability requires a
    mechanism to dynamically
    partition data over the nodes
•   Data is partitioned by key using
    consistent hashing
•   Advantage of consistent
    hashing: the departure or arrival of
    a node affects only its
    immediate neighbors; other
    nodes remain unaffected
    (see the sketch below)
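A minimal consistent-hashing ring in Java: nodes and keys are hashed onto the same circular space, and a key belongs to the first node clockwise from its position. The names and hash choice are ours; Cassandra's real partitioner is more elaborate.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashRing {
    // Ring position -> node name.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // First 8 bytes of the MD5 digest as the position on the ring.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xffL);
            return h;
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public void addNode(String node)    { ring.put(hash(node), node); }
    public void removeNode(String node) { ring.remove(hash(node)); }

    // Walk the ring clockwise: the first node at a position >= the key's
    // position owns the key; wrap around to the smallest position.
    public String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHashRing r = new ConsistentHashRing();
        r.addNode("node-A"); r.addNode("node-B"); r.addNode("node-C");
        System.out.println(r.nodeFor("user:42"));
        // Removing a node reassigns only the keys it owned;
        // all other keys keep their owner.
        r.removeNode("node-B");
        System.out.println(r.nodeFor("user:42"));
    }
}
```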
Cassandra Overview
Log-Structured Storage
• Optimized for write-intensive workloads:
  writes go to an in-memory buffer (memtable)
  that is periodically flushed to disk as
  immutable sorted files (sstables), which a
  background thread merges (compaction)
• Reads check the memtable first, then the
  sstables, starting from the most recent one
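The idea can be sketched in a few lines: writes go to an in-memory buffer, full buffers are frozen as immutable sorted runs, and reads consult the newest data first. This toy version (our names; no disk I/O, write-ahead log, or compaction) only illustrates the write-optimized layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LogStructuredSketch {
    private static final int MEMTABLE_LIMIT = 4;
    private TreeMap<String, String> memtable = new TreeMap<>();
    // Frozen sorted runs, newest first (stand-ins for on-disk sstables).
    private final List<TreeMap<String, String>> sstables = new ArrayList<>();

    public void put(String key, String value) {
        memtable.put(key, value);          // writes never touch old runs
        if (memtable.size() >= MEMTABLE_LIMIT) {
            sstables.add(0, memtable);     // "flush": freeze the buffer
            memtable = new TreeMap<>();
        }
    }

    public String get(String key) {
        String v = memtable.get(key);
        if (v != null) return v;
        for (Map<String, String> run : sstables) {   // newest to oldest
            v = run.get(key);
            if (v != null) return v;
        }
        return null;
    }

    public static void main(String[] args) {
        LogStructuredSketch db = new LogStructuredSketch();
        for (int i = 0; i < 10; i++) db.put("k" + i, "v" + i);
        System.out.println(db.get("k3"));
    }
}
```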
Triggy
Programming Model
•   Modified MapReduce to support push-style
    processing
•   Only the reduce function is modified: reduce*
•   reduce* incrementally applies a new input value
    to an already existing aggregate value
              Map(k1, v1) -> list(k2, v2)
              Reduce(k2, list(v2)) -> (k2, v3)
              Reduce*(k2, v3, v2) -> (k2, v3)
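One way the push-style signatures might look in Java. The interface names are ours, not Triggy's actual API: map is unchanged from MapReduce, while reduce* folds one new value into the stored aggregate instead of consuming the whole list(v2) at once.

```java
import java.util.List;
import java.util.Map;

// Map(k1, v1) -> list(k2, v2): unchanged, already push-friendly.
interface Mapper<K1, V1, K2, V2> {
    List<Map.Entry<K2, V2>> map(K1 key, V1 value);
}

// Reduce*(k2, v3, v2) -> (k2, v3): incremental reduce.
interface IncrementalReducer<K2, V2, V3> {
    // (key, existing aggregate, new input value) -> new aggregate.
    // A null aggregate means no value has been stored for this key yet.
    V3 reduceStar(K2 key, V3 aggregate, V2 newValue);
}
```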
Triggy
Programming Model
[Figure: setting up a map/reduce* job - the developer implements both
functions and defines the input table that feeds map and the output
table that receives the results of reduce*]
Triggy
Synchronization
•   reduce* executions have to be synchronized per key to guarantee
    correct results
•   we reuse Cassandra's partitioning strategy: all intermediate pairs with the
    same key are routed to the same node
•   synchronization within a node: locks are held on the keys currently being
    processed (see the sketch below)
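A minimal sketch of the per-key synchronization within a node. Using ConcurrentHashMap.compute, which serializes updates to the same key while allowing different keys to proceed in parallel, is one simple way to get the effect of the lock table described above; Triggy's actual implementation may differ.

```java
import java.util.concurrent.ConcurrentHashMap;

// Worker threads may update different keys concurrently, but updates
// to the same key are serialized, so counters stay consistent.
public class ReduceStarExecutor {
    private final ConcurrentHashMap<String, Long> aggregates = new ConcurrentHashMap<>();

    // Fold a new value into the stored aggregate for this key, atomically.
    public void apply(String key, long newValue) {
        aggregates.compute(key, (k, agg) -> agg == null ? newValue : agg + newValue);
    }

    public long get(String key) { return aggregates.getOrDefault(key, 0L); }
}
```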
Triggy
Fault Tolerance and Scalability
• No fault-tolerance guarantees
• Intermediate data and tasks waiting in queues
  can be lost
• Triggy is easily scalable because execution
  and data storage are tightly coupled
• A new node is placed near the most loaded
  node, and part of that node's data is
  transferred to it
Experiments
•   Generated workload: tweets with user ids drawn uniformly
    from 1 .. 100000
•   The load generator issues as many requests as the system with N
    nodes can handle
•   Application: count the number of words posted by each user
    (a sketch of this job follows)
    Map: tweet => (user_id, number_of_words_in_tweet)
    Reduce*: (user_id, number_of_words_total, number_of_words_in_tweet) =>
            (user_id, number_of_words_total)
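The experiment's word-count job, written against the interfaces sketched in the Programming Model section; the names are illustrative, not Triggy's API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class WordCountJob {
    // Map: tweet -> (user_id, number_of_words_in_tweet)
    static Map.Entry<String, Integer> map(String userId, String tweet) {
        return new SimpleEntry<>(userId, tweet.trim().split("\\s+").length);
    }

    // reduce*: (user_id, number_of_words_total, number_of_words_in_tweet)
    //          -> (user_id, number_of_words_total)
    static int reduceStar(String userId, Integer total, int wordsInTweet) {
        return (total == null ? 0 : total) + wordsInTweet;
    }

    public static void main(String[] args) {
        // Simulate two pushed tweets for the same user.
        Map.Entry<String, Integer> e = map("42", "analytics for the real-time web");
        int total = reduceStar("42", null, e.getValue());
        total = reduceStar("42", total, map("42", "hello again").getValue());
        System.out.println(total); // 5 + 2 = 7
    }
}
```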
Similar Systems: Yahoo!’s S4
• Distributed stream processing engine:
  •   Programming interface: Processing
      Elements written in Java
  •   Data is routed between Processing Elements by
      key
  •   No database; all processing happens in memory
• Used to estimate click-through rate from a
  user's behavior within a time window
Similar Systems:
Google’s Percolator
•   Percolator is database-intensive: based on BigTable
•   BigTable:
    •   the same data model as in Cassandra
    •   the same log-structured storage
    •   BigTable is a distributed system with a master; Cassandra is peer-to-peer
•   Percolator extends BigTable with
    •   observers (similar to database triggers, for push-style processing)
    •   ACID transactions
•   Triggy vs. Percolator:
    •   Triggy keeps the familiar MapReduce programming model
    •   No ACID transactions (intermediate data can be lost) - less overhead.
        (What is the real overhead of full transaction support?)
Application
Social Media Optimization for News Sites
•    A/B testing of headlines for news stories
•    Optimizing the front page to attract more clicks
Application
Real-Time News Recommendations
 • TwitterTim.es - news recommendations via
   Twitter's friends graph
 • Currently rebuilt every 2 hours; the goal is a
   newspaper that updates in real time
Application
Real-Time Advertising
•   Real-time bidding:
    •   Sites track your browsing behavior via cookies and sell it to
        advertising services
    •   Web publishers offer up display inventory to advertising services
    •   No fixed CPM; instead, each ad impression is sold to the highest
        bidder
•   Retargeting (remarketing)
    •   Advertisers can do remarketing after the following events: (1) the user
        visited your site and left (assuming the site is within the Google content
        network); (2) the user visited your site, added products to their
        shopping cart, then left; (3) the user went through the purchase
        process but stopped somewhere
    •   Potentially interesting to enrich this with information from social
        networks
Other Applications
•   Recommendations on location check-ins:
    Foursquare, Facebook Places...
•   Social games: monitoring events from millions
    of users in real time, and reacting in real time
What other
applications?


Speaker Notes

  2. The Web 2.0 era is characterized by the emergence of large amounts of user-generated content. People started to generate and contribute data on different Web services: blogs, social networks, Wikipedia.

  Today, with the emergence of mobile devices constantly connected to the Internet, the nature of user-generated content has changed. People now contribute more often, with smaller posts, and the life-span of these posts has become shorter.

  New Web services have appeared that encourage real-time usage:
  1) Twitter. The lifespan of a tweet is shorter than that of a blog post; the Twitter stream is almost real-time.
  2) Location-based social networks: Foursquare, Facebook Places. People share their current location (or check in) at real venues. This data is time-sensitive: the user reveals their current location, and recommendations of nearby friends and other interesting places must be made immediately, while the user is there.
  3. So far, analyzing and making use of Web 2.0 data has been accomplished using batch-style processing: data produced over a certain period of time is accumulated and then processed. MapReduce has become the state-of-the-art approach for analytical batch processing of user-generated data.

  Today, Web 2.0 data has become more real-time, and this change implies new requirements for analytical systems. Processing data in batches is too slow for time-sensitive data: accumulated data can lose its importance in hours or even minutes. Therefore, analytical systems must aggregate values in real time, incrementally, as new data arrives. It follows that workloads are database-intensive, because aggregate values are not produced at once, as in batch processing, but stored in a database that is constantly updated. For example, Google's new web indexing system, Percolator, is not based on MapReduce anymore; Percolator achieves lower document processing latencies by updating the web index incrementally (database-intensive).
  4. We are working on a system that can process analytical tasks in real time over large amounts of data.

  Our system is based on the Cassandra distributed key-value store. We add two extensions to Cassandra in order to turn it into a system for real-time analytics: push-style procedures and synchronization.

  Push-style procedures act like triggers: you can set one on a table and it fires when a new key-value record is inserted. They make the computation real-time, as they immediately propagate the inserted data to the analytical computations.

  Synchronization: Cassandra is a simple key-value store; there is no mechanism to update a value based on the existing value. For example, to maintain counters, when we need to increment the existing value we first have to query it and then insert a new value. Cassandra has no transactions, which means that between the query and the update another client can also update the value, leading to inconsistent counters. We add local synchronization to Cassandra that can synchronize data within a node.

  Furthermore, our system provides a programming model similar to MapReduce, adapted to push-style processing, and is scalable in terms of both computation and data storage.
  5. In a nutshell, the Cassandra data model can be described as follows:
  1) Cassandra is based on a key-value model. A database consists of column families. A column family is a set of key-value pairs. Drawing an analogy with relational databases, you can think of a column family as a table and a key-value pair as a record in a table.
  2) Cassandra extends the basic key-value model with two levels of nesting. At the first level, the value of a record is in turn a sequence of key-value pairs. These nested key-value pairs are called columns, where the key is the name of the column. In other words, a record in a column family has a key and consists of columns. At the second level, the value of a nested key-value pair can be a sequence of key-value pairs as well. When the second level of nesting is present, the outer key-value pairs are called super columns, with the key being the name of the super column, and the inner key-value pairs are called columns.

  Let's consider the classic example of a Twitter database to demonstrate these points.

  The column family Tweets contains records representing tweets. The key of a record is of TimeUUID type and is generated when the tweet is received (we will use this feature in the User_Timelines column family below). The record consists of columns (no super columns here); the columns simply represent attributes of tweets, so it is very similar to how one would store them in a relational database.

  The next example is User_Timelines (i.e., tweets posted by a user). Records are keyed by user IDs (referenced by the User_ID columns in the Tweets column family). User_Timelines demonstrates how column names can be used to store values - tweet IDs in this case. The type of the column names is defined as TimeUUID, which means the tweet IDs are kept ordered by posting time. That is very useful, as we usually want to show the last N tweets for a user. The values of all columns are set to an empty byte array (denoted "-") as they are not used.

  To demonstrate super columns, let us assume that we want to collect statistics about URLs posted by each user. For that we need to group all the tweets posted by a user by the URLs contained in the tweets. This can be stored using super columns as follows: in User_URLs the names of the super columns are used to store URLs, and the names of the nested columns are the corresponding tweet IDs.
  6. One of the key features of Cassandra is that it must scale incrementally. This requires a mechanism to dynamically partition the data over the set of nodes. Cassandra's partitioning scheme relies on consistent hashing to distribute the load across multiple storage hosts.

  In consistent hashing, the output range of a hash function (normally MD5) is treated as a fixed circular space or ring; that is, the largest hash value wraps around to the smallest hash value.

  Each node in the system is assigned a random value within this space which represents its position on the ring. Each data item identified by a key is assigned to a node by hashing the data item's key to yield its position on the ring, and then walking the ring clockwise to find the first node with a position larger than the item's position. That node is deemed the coordinator for the key. Thus, each node becomes responsible for the region of the ring between itself and the previous node on the ring.

  The principal advantage of consistent hashing is that the departure or arrival of a node affects only its immediate neighbors; other nodes remain unaffected.

  The problem with using an MD5 hash for node placement is that the random position assignment of each node on the ring leads to non-uniform load and data distribution. That is why Cassandra analyzes load information on the ring and inserts new nodes near highly loaded nodes, so that an overloaded node can transfer data onto the new node.
  7. Cassandra is optimized for write-intensive workloads, which is a useful feature for us, as computing aggregate values for analytical tasks implies heavy updates to the system.

  Cassandra uses so-called log-structured storage, which was successfully used in BigTable. The idea is that write operations go to a buffer in main memory; when the buffer is full, it is written to disk. The buffer is thus periodically flushed to disk, and a separate thread merges the different versions of the sstables. This process is called compaction.

  A read operation looks up the value first in the memtable and then, if it was not found there, in the different versions of the sstables, moving from the most recent one. Such storage is highly optimized for writes and of course makes queries slower, which is always a trade-off for databases.
  8. MapReduce is a well-established programming model for expressing analytical applications. To support real-time analytical applications, we modify this programming model to support push-style data processing. In particular, we modify the reduce function. Originally, reduce combined a list of input values into a single aggregate value. Our modified function, reduce*, incrementally applies a new input value to an already existing aggregate value. This modification allows a new input value to be applied to the aggregate value as soon as it is produced; this means we are able to push new values to the reduce function.

  Figure 1 depicts our modified programming model. reduce* takes as parameters a key, a new value, and the existing aggregate value. It outputs a key-value pair with the same key and the new aggregate value. We did not modify the map function, as it already allows push-style processing. The difference between map and reduce* is that multiple maps can be executed in parallel for the same key, while the execution of reduce* has to be synchronized for the same key to guarantee correct results.

  Note that reduce* exhibits some limitations in comparison to the original reduce. Not every reduce function can be converted to its incremental counterpart. For example, to compute the median of a set of values, the previous median and the new value are not enough: the complete set of values needs to be stored to compute the new median.
  9. In order to set up a map/reduce* job, the developer has to provide implementations for both functions and define the input table, from which the data is fed into map, and the output table, to which the output of reduce* is written.
  10. Example: an implementation of WordCountMapReducer.
  11. The difference between map and reduce* is that multiple maps can be executed in parallel for the same key, while the execution of reduce* has to be synchronized for the same key to guarantee correct results.

  For that, we extended the nodes of the key-value store with queues and worker threads. Figure 2 shows our extensions. Each node maintains a queue that buffers map and reduce* tasks. Worker threads drain the queues and execute the buffered tasks. Buffering map and reduce* tasks allows the system to handle bursts of input data; furthermore, the size of the queue allows a rough estimation of the load of a node.

  How map is executed: as described, for each map the developer has to define an input table. Whenever a new key-value pair is written to this table, the node handling this write schedules a new map task by putting it into its local queue. Eventually, a worker thread will execute the map task at this node. Map tasks can be executed in parallel at any node in the system and do not require synchronization because they do not share any data.

  How reduce* is executed: in contrast to map, the execution of reduce* needs to be synchronized, because several reduce* tasks can potentially update the same aggregate value in parallel, leading to inconsistent data. Cassandra does not provide any synchronization mechanisms. In our system, synchronization is realized in two steps: (1) routing all key-value pairs output by map with the same key to a single node, and (2) synchronizing the execution of reduce* within a node using locks. Routing is implemented by reusing Cassandra's partitioning strategy (consistent hashing): each key-value pair output by map is routed to the node that is primarily responsible for the respective key. At the receiver node, a new reduce* task is submitted to the queue. Multiple worker threads execute these reduce* tasks by reading and incrementing the latest aggregate value. Worker threads are synchronized such that only one worker executes a reduce* task for a given key; for that, we use a lock table that contains the keys being processed by each worker. The output of the reduce* task is written to the table specified in the reduce definition. The table may be replicated to achieve reliability. By writing the result, the node might fire a subsequent map/reduce* task. The result of reduce* can be queried using the key-value store's standard query interface.

  The figure shows the execution of map and reduce* inside our system. Two key-value pairs (k1, v1) and (k1, v2) are written to nodes N1 and N5 of the key-value store. These writes fire map tasks defined on the updated table. Therefore, receiver node N1 puts a map task for pair (k1, v1) into its queue (denoted by m in Figure 2). Similarly, node N5 puts a map task for pair (k1, v2) into its queue. The execution of the map tasks results in three intermediate key-value pairs. Determined by Cassandra's partitioning strategy, the intermediate pair with key k2 is routed to node N2, while the pairs with key k3 are routed to node N3. Nodes N2 and N3 put reduce* tasks into their respective queues (denoted by r*). As described, reduce* tasks are executed locally using locks. New aggregate values are computed and stored in the result table.
  12. Our implementation does not provide fault-tolerance guarantees for the execution of map/reduce* tasks. If the node responsible for executing a map fails while the map task is still in the queue, the map task will never be executed. Also, our synchronization mechanism requires intermediate key-value pairs to be routed to a single node; these intermediate pairs might be lost in case of failures. Nevertheless, once a map/reduce* task has been executed successfully, the results are stored reliably at a number of replica nodes. Thus, only intermediate data can be lost.

  There are a number of reasons for this design decision. First, for many analytical applications losing intermediate data is not critical: for such applications it is more important to see a general trend than exact numbers. Second, only those map/reduce* tasks that are waiting in the queue at the moment a node fails can be lost; if there is no burst of input data, queues are usually empty, so losing intermediate data happens rarely. Third, the execution of map and reduce* tasks is distributed across all nodes of the system, so only a portion of the intermediate data will be lost if a single node fails.

  In order to provide stronger consistency guarantees in case of node failures, we would have to provide exactly-once semantics. Relatively lightweight methods that provide at-least-once semantics are not suitable, as repeated executions invalidate aggregate values. Providing exactly-once semantics requires additional storage and computation overhead and is argued to be too expensive and not easy to scale.

  Scalability: in our system, the execution of map and reduce* is distributed across the nodes according to the data partitioning strategy of the key-value store. This allows the system to scale easily, as execution and data storage are tightly coupled. By default, Cassandra provides a mechanism for scaling the data storage: any new node is placed near the most loaded node of the system, and parts of the data from the loaded node are transferred to the new node, thus shedding load between the nodes. We extended Cassandra's load measurement formula to include execution load as well. As in the SEDA architecture, we use the length of the queue to measure execution load; it is a good criterion because it reflects any bottleneck at a node, such as CPU overload or network saturation.
  14. Yahoo! recently open-sourced S4, a system that is close to ours. The differences:

  1) Triggy has the MapReduce programming model that many developers are familiar with; the programming model of S4 is more general.

  2) Our system is tightly coupled with the database, while S4 processes tasks in memory. Why we think a database-intensive solution is important:

  a) With Triggy, you don't have to worry about the window. You can compute analytics using historical data, within a window or without one, and the window can have different sizes for different parameters. For example, when monitoring a user's browsing behavior via cookies for advertising: some users show enough interest in a certain ad within a short time period, while for other users you may want to monitor and wait much longer.

  b) Triggy is easily scalable. You don't have to scale the computation separately from the database; the tightly coupled solution allows scaling the system with a single knob.
  16. News sites use real-time analytics to optimize their sites and attract more readers.

  1) A/B testing of headlines for news stories. When a story is first published on the site, it has two different headlines: for the first 5 minutes, part of the readers get one headline while the rest get the other. The headline that attracts more clicks during those first 5 minutes is then chosen.

  2) Optimizing the news layout. The system analyzes clicks, likes and retweets to understand which news stories spark discussion in social media, then puts the most discussed stories on the front page to attract even more readers.
  17. The Twitter Tim.es, a personalized news service (http://twittertim.es), uses your friend relationships on Twitter to recommend news to you.

  Currently, The Twitter Tim.es newspapers are rebuilt every 2 hours (batch processing). It would be nice to have push-style processing, where a new news story reaches the newspaper as soon as it is published on Twitter.
  19. What is real-time bidding? Here's the basic gist:

  1) Sites across the web track your browsing behavior via cookies and sell basic data about you to ad service companies. For example, the Google Content Network covers 80% of internet users.

  2) Web publishers offer up display inventory to the RTB market through ad services; rather than signing up for a fixed CPM, they sell each individual ad impression to the highest bidder, based on whom that individual ad is being served to. For example, a retailer might agree to run a display ad campaign for a shoe sale at $5 per 1,000 impressions, but specify that it will pay $10 per 1,000 impressions for ads that include running shoes when it knows that the browser has previously visited the athletics section of its web site.

  The real-time bidding auction happens during the milliseconds while the page is loading; advertisers have to run their algorithms to decide what ad to show, and at what price, within this time.

  Google retargeting (or remarketing). What remarketing is: a travel company has a site where it features holiday vacations. Users may come to this website, browse the offers, and think about booking a trip, but decide that the deal is still not cheap enough; then they continue to browse the web. If the travel company later decides to offer discounted deals to the Caribbean, it can target the users who already visited its site (interested users) via display ads that these users will see later on other sites.

  Advertisers can do remarketing after the following events: (1) the user visited your site and left (assuming the site is within the Google content network); (2) the user visited your site, added products to their shopping cart, then left; (3) the user went through the purchase process but stopped somewhere; etc.

  These events could be extended with information from social networks. For example, the system could track what a user posts on Twitter and estimate their interest in different products, which can be advertised later.

  You can then pay per click for these people as they search and browse the web (ads will be shown in the search or content network). For retargeting, you need to aggregate information about a user in a database; a window approach is not applicable here, because there is no single time frame.