Trend Detection and Analysis on Twitter
2
Agenda
Motivation
Architecture
Data Preparation
Trend Analysis
Analyzed Trends
Conclusion
3
Motivation
Predict the stock market in real time
Detecting influenza epidemics
Automatic crime prediction
“Successful results of mainly research-based projects
helped to open up new business opportunities”
4
Twitter
5
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Bag of Words
Bags Count
#newyear 7
#christmas 6
@bigdata 2
@sap 3
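The bag-of-words step above can be sketched as a simple counter over hashtags and user mentions. This is a minimal illustration, not the authors' implementation: the regular expression, the `count_entities` helper, and the sample tweets are assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a hashtag or mention is '#' or '@' followed by word characters.
ENTITY_PATTERN = re.compile(r"[#@]\w+")

def count_entities(tweets):
    """Count lowercased hashtags and user mentions across tweet texts."""
    bag = Counter()
    for text in tweets:
        bag.update(match.lower() for match in ENTITY_PATTERN.findall(text))
    return bag

# Hypothetical sample tweets, invented for illustration.
sample_tweets = [
    "Happy #NewYear to everyone at @SAP!",
    "Counting down to #newyear with @bigdata",
    "Still full from #Christmas dinner",
]
bag = count_entities(sample_tweets)
```

Lowercasing before counting merges variants such as `#NewYear` and `#newyear` into one bag.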
6
Statistical Measurement
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Statistical Measurement
(growth, average usage, retweets, participating users…)
Report statistics (every 20 minutes):
• Total hashtags & user mentions
• Hashtag/mentions count
• Usage growth per hashtag/mention
• Participating users per hashtag/mention
• Retweet count per hashtag/mention
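A report like the one listed above could be built per 20-minute window roughly as follows. This is a sketch under stated assumptions: the tweet fields (`user`, `entities`, `is_retweet`), the `build_report` helper, and the definition of growth as the difference from the previous window's count are all hypothetical, since the slides do not specify them.

```python
from collections import Counter, defaultdict

def build_report(tweets, previous_counts=None):
    """Summarize one window of tweets per hashtag/mention."""
    previous_counts = previous_counts or {}
    counts = Counter()
    retweets = Counter()
    users = defaultdict(set)
    for tweet in tweets:
        # Count each entity at most once per tweet.
        for entity in set(tweet["entities"]):
            counts[entity] += 1
            users[entity].add(tweet["user"])
            if tweet["is_retweet"]:
                retweets[entity] += 1
    report = {"total": sum(counts.values()), "entities": {}}
    for entity, count in counts.items():
        report["entities"][entity] = {
            "count": count,
            # Assumption: growth is the change versus the previous window.
            "growth": count - previous_counts.get(entity, 0),
            "users": len(users[entity]),
            "retweets": retweets[entity],
        }
    return report

# Hypothetical window of already-parsed tweets.
window = [
    {"user": "alice", "entities": ["#newyear"], "is_retweet": False},
    {"user": "bob", "entities": ["#newyear", "@sap"], "is_retweet": True},
    {"user": "alice", "entities": ["#newyear"], "is_retweet": False},
]
report = build_report(window, previous_counts={"#newyear": 1})
```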
7
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Statistical Measurement
(growth, average usage, retweets, participating users…)
Anomaly Detection
Time Series Analysis
Calculated for every hashtag / user mention
Every 2 / 4 hours based on reports
Anomaly detection using:
• Relative & absolute fluctuation
• Total occurrences (sum)
• Minimum occurrences
• Maximum occurrences
• Average occurrences
Time Series Analysis
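The measures listed above can be combined into a simple threshold check. The sketch below assumes each window holds six 20-minute report counts (two hours) for one hashtag; the threshold values and the rule that both the absolute and the relative fluctuation must be exceeded are assumptions, not values taken from the slides.

```python
from statistics import mean

def summarize(counts):
    """Sum, minimum, maximum and average occurrences in one window."""
    return {"sum": sum(counts), "min": min(counts),
            "max": max(counts), "avg": mean(counts)}

def fluctuation(previous, current):
    """Absolute and relative change in total occurrences between windows."""
    absolute = current["sum"] - previous["sum"]
    if previous["sum"] > 0:
        relative = absolute / previous["sum"]
    else:
        # No earlier usage: treat any increase as unbounded relative growth.
        relative = float("inf") if absolute > 0 else 0.0
    return absolute, relative

def is_anomaly(previous_counts, current_counts,
               min_absolute=50, min_relative=2.0):
    """Flag a window whose growth exceeds both (hypothetical) thresholds."""
    absolute, relative = fluctuation(summarize(previous_counts),
                                     summarize(current_counts))
    return absolute >= min_absolute and relative >= min_relative

# Hypothetical 20-minute counts for one hashtag in two consecutive windows.
previous_window = [4, 6, 5, 5, 4, 6]
current_window = [10, 25, 40, 60, 45, 30]
flagged = is_anomaly(previous_window, current_window)
```

Requiring both thresholds keeps rarely used hashtags (large relative but tiny absolute growth) from being flagged.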
8
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
Lowercasing & tokenizing
URL & stopword removal
Stop Word Removal
This sample text shows which words will
be removed when applying stop word
removal. Mostly words like the, a or and.
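The preprocessing steps named on this slide (lowercasing, URL removal, tokenizing, stop word removal) can be sketched in plain Python. The deck uses Python NLTK; the version below swaps in regular expressions and a tiny hand-written stop word list so that it runs without extra downloads. The list, the patterns, and the sample URL are assumptions for illustration only.

```python
import re

# Assumption: a tiny illustrative stop word list, not NLTK's corpus.
STOPWORDS = {"a", "and", "be", "or", "the", "this", "when", "which", "will"}
URL_PATTERN = re.compile(r"https?://\S+")
# Keep a leading '#' or '@' so hashtags and mentions survive tokenizing.
TOKEN_PATTERN = re.compile(r"[#@]?\w+")

def preprocess(text):
    """Lowercase, strip URLs, tokenize, and drop stop words."""
    text = URL_PATTERN.sub(" ", text.lower())
    tokens = TOKEN_PATTERN.findall(text)
    return [token for token in tokens if token not in STOPWORDS]

sample = ("This sample text shows which words will be removed "
          "when applying stop word removal. https://example.com/x")
tokens = preprocess(sample)
```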
9
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Stemming
Amazing, Amazement, Amazed → amaze
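To illustrate the idea of stemming, here is a toy suffix stripper. It is not the algorithm NLTK provides; the suffix list and the `toy_stem` helper are assumptions chosen so that the slide's example words collapse to one stem. Note that stems need not be dictionary words: this sketch maps all four words to "amaz" rather than "amaze".

```python
# Assumption: a hand-picked suffix list, checked in this order.
SUFFIXES = ("ement", "ing", "ed", "e")

def toy_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

words = ["Amazing", "Amazement", "Amazed", "amaze"]
stems = {word: toy_stem(word) for word in words}
```

Mapping inflected forms to one stem lets later steps count them as the same term.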
10
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Sentiment Analysis
I love cookies
I hate cookies
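The slides do not say which sentiment method was used. As one possible reading, a lexicon-based classifier can separate the two example sentences. The word lists and the `classify` helper below are hypothetical and far smaller than anything usable in practice.

```python
import re

# Assumption: tiny hand-written lexicons for illustration only.
POSITIVE = {"love", "great", "amazing", "happy"}
NEGATIVE = {"hate", "horrible", "stupid", "terrible"}

def classify(text):
    """Label text by comparing counts of positive and negative words."""
    tokens = re.findall(r"\w+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

labels = [classify("I love cookies"), classify("I hate cookies")]
```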
11
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Topic Modeling (LDA)
Topic Modeling
Topics
• …
• …
• …
Trend Classification
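For the topic modeling step, the following is a compact sketch of LDA trained with collapsed Gibbs sampling. It is an assumption that this resembles what the authors ran; the slides only name LDA. The hyperparameters (`alpha`, `beta`), the iteration count, and the sample documents (built from keywords shown later in the deck) are illustrative.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Assign a topic to every token using collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per doc
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # tokens per topic overall
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        topics = [rng.randrange(num_topics) for _ in doc]
        assignments.append(topics)
        for word, k in zip(doc, topics):
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return topic_word

def top_words(topic_word, n=3):
    """Most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

# Hypothetical preprocessed tweets.
docs = [
    ["airasia", "missing", "flight", "indonesia"],
    ["airasia", "weather", "flight", "pilots"],
    ["theinterview", "sony", "movie", "korea"],
    ["theinterview", "movie", "funny", "sethrogen"],
]
topic_word = lda_gibbs(docs, num_topics=2)
topics = top_words(topic_word)
```

With so little data the resulting topics are not meaningful; the point is only to show how per-topic keyword lists like those on the later slides can arise from token-level topic assignments.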
14
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Topic Modeling (LDA)
Wordcloud Visualization
Wordfreq.js
Wordcloud2.js
GeoSpatial Visualization
CartoDB
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Anomaly Detection
Statistical Measurement
(growth, average usage, retweets, participating users…)
Time Series Analysis
Trend Classification
Twitter Streaming API (Twython)
Architecture
15
Analyzed Trends
16
Limitations
Tweets collected: 38 million (70GB)
Only English tweets from the USA
Twitter Streaming API
17
New Year
Time Series
18
New Year
Word Cloud
19
New Year
Geospatial Analysis
Midnight Los Angeles Midnight New York
20
New Year
Sentiment Analysis
Negative: Home sick on #nye. Horrible timing stupid cold. Ugh. My date is my couch & pillow watching. #HappyNewYear everyone.
Neutral: #HappyNewYear from the Youth for Astronomy and Engineering Program at Space Telescope Science Institute!
Positive: Happy New Year! Last year was amazing, and here’s to another great year of love & happiness! #NYE2015
21
Air Asia Tragedy
22
Air Asia Tragedy
Time Series
23
Air Asia Tragedy
Word Cloud
24
Air Asia Tragedy
Topic Modeling
News: airasia, missing, flight, air, Indonesia, singapore, asia
Search for the Plane: airasia, missing, plane, find, plane, world, technology
Sympathy: Prayers, families, thoughts, airasia, crash, thought, airfrance
Cause: airasia, weather, flight, pilots, fly, bad, path
International Help: raaf, butterworth, china, australia, Russia, trndnl, trending
25
Air Asia Tragedy
Sentiment Analysis
Negative: Prayers are USELESS! Stop repeating meaningless crap, pretending that you care … #PrayForAirAsia #QZ8501 #GrowABrain #ReligousNonsense
Neutral: #BREAKING #AirAsia Flight #8501 likely “at the bottom of the sea” rescue officials says.
Positive: May God’s great love shine on the families and loved ones of all passengers and crew #AirAsia #8501
26
Air Asia Tragedy
Google Trends Comparison
Google Trends Twitter Sample
27
Air Asia Tragedy
Google Trends Comparison
Google Trends Twitter Sample
28
Sony Hack
29
Sony Hack
Time Series
30
Sony Hack
Word Cloud
31
Sony Hack
Topic Modeling
Christmas Release: theinterview, christmas, day, theaters, freedom, theater, showing
Reviews: theinterview, jamesfrancotv, sethrogen, movie, interview, funny, hilarious
Suspicions: northkorea, sonyhack, korea, north, internet, sony, amp
News: theinterview, sonypictures, sony, movie, korea, north, interview
Insider Joke: theinterview, aint, hate, cuz, jealous, anus, peanutbutter
32
Sony Hack
Geospatial Analysis
33
Sony Hack
Sentiment Analysis
Negative: #TheInterview SUCKS!!! @sethrogen Like I knew it would #Stupid #NotFunny
Neutral: #Sony says #TheInterview made more than $1 million at the box office on in 1 single day on Dec. 25.
Positive: Happy I joined my fellow Americans in the great #TheInterview Christmas Day Viewing. Plus it was pretty funny, truth be told.
34
Network Outage
35
Network Outage
Time Series
36
Network Outage
Word Cloud
37
Network Outage
Topic Modeling
Network Error: xbox, psn, sign, connect, live, error, account, issues
Connection between Hacks: xbox, playstation, watch, movie, fuckcrucifix, north, korea, interview
Xbox Down: xbox, christmas, play, xboxlivedown, live, xboxlive, xboxsupport, day
Caused Damage: playstation, dollar, psn, company, lizardsquad, sony, billion, multi
Hacker Group: fuckcrucifix, lizardmafia, lizardsquad, fuck, lizard, squad, finestsquad, stop
Restored: psn, back, playstation, online, askplaystation, network, psndown, working
38
Network Outage
Sentiment Analysis
Negative: @XboxSupport f*** your servers, a big ass company like you should handle these teenage kids, terrible
Neutral: @AskPlayStation when will the service be back online because it says there’s maintenance?
Positive: @PlayStation thanks for the great year. I am sure this new year will be amazing. Don’t allow yourselves to be hacked ever again.
39
Conclusion
High-quality insights into the world’s interests
Twitter is well suited to detecting and predicting trends
Maintaining high data quality is important
40
#Questions
Benjamin Räthlein
@B3nRa
Henning Muszynski
@henningmus
Lukas Masuch
@LukasMasuch
