READING
THE RIOTS
ON TWITTER




  Challenges of new
  social media for
  quantitative
  researchers
     Rob Procter (University of Manchester)
        Farida Vis (University of Leicester)
    Alexander Voss (University of St Andrews)

   http://www.analysingsocialmedia.org/
                                                #readingtheriots

Methodology

   • Development of computer-based tools for
     sentiment and topic analysis of tweets is an
     active area of research.
   • Our methodology combines computer-based
     tools with established content analysis
     techniques in ways that play to their
     respective strengths.

Information Flows
   • Any collection of tweets can be divided into tweets
     that are ‘original’ and retweets.
   • If we are interested in how Twitter is used to
     communicate and share information, the only
     reliable evidence that a tweet has been read is that it
     has been retweeted.
   • We use computational tools to group a tweet (the
     parent) and its retweets (its children) into
     information flows.

Information Flow Analysis
   For N = 1 to CorpusMax
       InformationFlow[N-1] = {}
       If Corpus[N] == "RT @" . username . body
           (LevenshteinDistance, Parent) = LDMin(N-1, username, body)
           If LevenshteinDistance < 30
               InformationFlow[Parent] = InformationFlow[Parent] + Corpus[N]
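
The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. `LDMin` is not defined on the slide, so the sketch assumes it returns the smallest edit distance, and the index of the matching tweet, among earlier original tweets by the named user. The function names, the `(username, text)` tuple layout and the retweet pattern are all assumptions made for the example. Only the threshold of 30 comes from the slide.

```python
import re

# ASSUMPTION: retweets look like "RT @user: body" or "RT @user body".
RT_PATTERN = re.compile(r"^RT @(\w+):?\s*(.*)$", re.DOTALL)
MAX_DISTANCE = 30  # threshold taken from the slide


def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def build_information_flows(corpus):
    """Group retweets under their parent tweet.

    corpus: list of (username, text) tuples in chronological order
            (hypothetical layout, not from the slides).
    Returns {parent_index: [child_index, ...]}.
    """
    flows = {}
    for n, (_, text) in enumerate(corpus):
        match = RT_PATTERN.match(text)
        if not match:
            flows[n] = []  # an original tweet starts an empty flow
            continue
        rt_user, body = match.groups()
        best = None  # (distance, parent_index)
        for p in range(n):
            author, parent_text = corpus[p]
            # Only earlier originals by the named user are candidates.
            if p not in flows or author.lower() != rt_user.lower():
                continue
            distance = levenshtein(parent_text, body)
            if best is None or distance < best[0]:
                best = (distance, p)
        if best is not None and best[0] < MAX_DISTANCE:
            flows[best[1]].append(n)
    return flows
```

As in the pseudocode, a retweet with no parent inside the threshold is left out of every flow. The inner loop compares each retweet against all earlier candidates, so the cost grows quickly with corpus size.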

Example Information Flows
   Riots Corpus
     2.6M Tweets
     700,000 accounts

   • Great sight in my #Birmingham where #Pakistani lads are
     protecting temples while Sikh lads protecting the mosques: 758
   • someone calling themselves #punchcroft has just posted Go on
     Hackney! Fuck the feds! #hackney Can we have them arrested for
     incitement pls?: 5

Coding Frames

   • We use established methods of content
     analysis to understand how Twitter was being
     used in the context of topics we wished to
     analyse.
   • We inductively code information flows to
     develop a ‘code frame’ to categorise topics
     and examine relationships in the context of
     a given topic.

Rumours on Twitter

Sampling Issues
   • Riots corpus selected from Twitter firehose
     using set of hashtags:
       – Sample may systematically exclude some
         relevant data.
   • Twitter users not representative of the
     population as a whole:
       – Younger, better off, better educated, urban
       – How can we use profile info to counter bias?
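
One standard survey technique that could address the last question is post-stratification weighting. The slides do not say whether the authors used it. The sketch below assumes each account has already been assigned a demographic group, which Twitter profiles rarely state directly. The group labels and population shares are hypothetical values chosen for illustration.

```python
from collections import Counter


def post_stratification_weights(sample_groups, population_shares):
    """Return one weight per group: population share / sample share.

    sample_groups:     one group label per sampled account (hypothetical).
    population_shares: {group: share of the target population}.
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    weights = {}
    for group, share in population_shares.items():
        sample_share = counts[group] / total
        if sample_share == 0:
            # A group that never appears cannot be re-weighted.
            raise ValueError(f"no sampled accounts in group {group!r}")
        weights[group] = share / sample_share
    return weights


# Hypothetical example: younger users are over-represented in the sample.
sample = ["under_35"] * 70 + ["35_plus"] * 30
weights = post_stratification_weights(
    sample, {"under_35": 0.4, "35_plus": 0.6}
)
```

Here the over-represented group gets a weight below 1 and the under-represented group a weight above 1. Weighting cannot recover tweets that the hashtag filter excluded in the first place.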

Twitter Data APIs
   • Twitter offers a number of different APIs
     providing access to different sets of data.
   • Differences are in terms of:
       – Timescale
       – Real-time vs. retrospective
       – Completeness
       – Functionality to specify subsets of tweets

Search and REST APIs
   • Search API – unauthenticated use:
       – Search by keyword, account, etc.
       – No tweets older than about one week
       – Rate limited, but details not published
   • REST API – authenticated use, account centric:
       – Retrieve tweets, profile data, friends & followers,
         etc. and authenticated users’ direct messages
       – Searching public tweets also possible
       – Rate limited to 350 requests per hour

REST and Search Limitations
   • Users can delete tweets.
   • Twitter will retire tweets (depending on
     traffic an account generates).
   • Rate limiting means it is difficult to
     collect substantial corpora that are
     complete.
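
A rough calculation shows the scale of the rate-limiting problem. The sketch below uses the 350 requests per hour quoted for the REST API and the 2.6M-tweet size of the riots corpus. The page size of 200 tweets per request is an assumption made for illustration and does not come from the slides.

```python
import math

REQUESTS_PER_HOUR = 350    # REST API limit quoted on the slide
TWEETS_PER_REQUEST = 200   # ASSUMPTION: hypothetical page size


def hours_to_collect(n_tweets,
                     requests_per_hour=REQUESTS_PER_HOUR,
                     tweets_per_request=TWEETS_PER_REQUEST):
    """Best-case hours needed to fetch n_tweets under a fixed rate limit."""
    requests_needed = math.ceil(n_tweets / tweets_per_request)
    return requests_needed / requests_per_hour


# Riots corpus size from the slides: 2.6M tweets.
best_case_hours = hours_to_collect(2_600_000)
```

Under these assumptions the best case is a little over 37 hours of continuous requests. That figure supposes every request returns a full page of relevant tweets, which account-centric retrieval would not normally achieve.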

Twitter Streaming API
   • Twitter Streaming API allows streaming either a
     random sample or tweets selected by keyword
     (track), account (follow) or geo-location (but few
     tweets are geo-coded).
   • Track and follow can be rate-limited if too much
     traffic is generated, but in such a way as to produce
     a random sample (needs to be confirmed).

Reliability of Computer-based Tools

   • If, for a given corpus sample, a computer-
     based tool matches performance of human
     coders with a precision of y%:
       – what is the estimated precision over the whole
         corpus?
   • How would a representative corpus sample be
     specified?
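
One way to approach the first question is a confidence interval for a proportion. The sketch below computes a Wilson score interval, a standard statistical method that the slides do not mention. It is valid only if the coded sample is a simple random sample of the tool's output, which is exactly what the second question leaves open. The function name and the example counts are hypothetical.

```python
import math


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%).

    correct: tool decisions in the sample that human coders agreed with.
    n:       total tool decisions checked in the sample.
    """
    if n <= 0 or not 0 <= correct <= n:
        raise ValueError("need n > 0 and 0 <= correct <= n")
    p = correct / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical sample: coders agreed with 170 of 200 tool decisions (85%).
low, high = wilson_interval(170, 200)
```

For this hypothetical sample the interval runs from roughly 79% to 89%. The sample precision of 85% is therefore compatible with a noticeably lower or higher precision over the whole corpus.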

140m Tweets a Day…

Why Cloud Computing?

   St Andrews Cloud Collaboratory (StACC)

   Information flow analysis: 16 instances, one working day.

How to cope… distribution

Editor's Notes

  1. The cloud economics argument from Amazon shows how traditional forms of providing computational and storage resources are either wasteful or risk customer dissatisfaction. Using a cloud model, the level of resource provision can be adapted to current demand. The St Andrews Cloud Collaboratory (StACC) is a private cloud (actually, more than one) that allows us to allocate resources to a research project when needed and release them for other uses when not needed for the project. This allows St Andrews to do more research per server room / watt / CO2.