Thou Shalt not Share Collections of Tweets: Should we give a TOS?
“Thou Shalt not Share Collections ...”
  • Interest sparked by an AoIR discussion
  • Post by Prof Stuart Shulman on May 5th
The Original Post (OP) [Posted: Thu May 5 05:24:10 PDT 2011]
What Twitter said
Twitter-History a.k.a. ‘Twistory’
  • “We hope Twitter will realize the value of enabling researchers, journalists and citizens better ways to search, sort and analyze clusters of this important historical information.”
Twitter appears to think so too!
Twitter says “desist!”
  • Prohibited other services from offering archives (for download), e.g., 140kit, TwapperKeeper, DiscoverText, ...
  • Shut down 3rd party clients (Twidroyd & UberTwitter) for:
      • Private Direct Messages longer than 140 characters
      • Trademark infringement
      • Changing the content of users' Tweets in order to make money
Twitter responds ...
  • “... abide by a simple set of rules that are in the interests of our users, as well as the health and vitality of the platform as a whole.”
  • “... on an average day we turn off more than one hundred services that violate our API rules of the road.”
  • “You can download Twitter for Blackberry, Twitter for Android and other official Twitter apps here. You can also try our mobile web site or apps from other third-party developers.”
Why now?
  • Perspectives:
      • Online social messaging service (user)
      • Open ecosystem infrastructure (developer)
      • Historical social record (researchers)
  • Post “tweets” with max. 140 characters in real-time
  • Publicly accessible (cf. CB radios) with some privacy
  • Provides search (limited)
  • Uses & develops open-source software (e.g., Cassandra, Lucene, FlockDB, ...)
Some Twitter numbers
  • Valuation: $4 billion (January 2011)
  • Investment: $360 million ($200m in Dec 2010)
  • Employees: 400 (Jan 2011), of whom 200 are engineers
  • Revenue: ad revenue estimated at $150 million for 2011
  • No. of tweets: 140-150 million per day
  • Users/Accounts: 200 million (approx.)
  • Website ranking: Top 10 to Top 20
  • Twitter search: one billion queries per day
2006 (late)-2008
2009-2010
2011
A quick aside ...
Twitter Research
  • Services: 140kit, TwapperKeeper, DiscoverText, The Archivist, ...
  • Some hundreds of publications
  • Areas: social network analysis, recommendation systems, social influence, user sentiment, business strategy, disaster prediction & alerts, education, software engineering, politics, ...
  • Using: content analysis (narrative), ethnography, SVMs, TextRank, TF-IDF, BoW, POS, ...
The Twitter API
  • REST API uses HTTP protocol
  • All website features supported through API
  • Programming libraries available
  • Rate limiting (user & IP):
      • Anonymous: 150 requests per hour
      • OAuth: 350 requests per hour
      • Whitelist: e.g. 20,000 requests
  • Streaming offerings: Spritzer (1%), Gardenhose (10%), Firehose (100%)
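To get a feel for what these limits mean for data collection, the sketch below turns the quoted figures into request intervals and expected stream volumes. It is only arithmetic on the slide's 2011 numbers, not a call into any Twitter library. The function names are hypothetical, and treating the whitelist figure as an hourly limit is an assumption, since the slide does not state the period.

```python
# Hourly request limits quoted on the slide (2011 figures).
# Assumption: the whitelist figure is also per hour; the slide does not say.
RATE_LIMITS_PER_HOUR = {"anonymous": 150, "oauth": 350, "whitelist": 20_000}

# Share of all public tweets delivered by each streaming offering, in percent.
STREAM_SAMPLE_PERCENT = {"spritzer": 1, "gardenhose": 10, "firehose": 100}


def min_seconds_between_requests(tier: str) -> float:
    """Shortest steady interval between requests that stays within the limit."""
    return 3600 / RATE_LIMITS_PER_HOUR[tier]


def hours_to_complete(tier: str, total_requests: int) -> float:
    """Hours needed to issue total_requests when running at the hourly limit."""
    return total_requests / RATE_LIMITS_PER_HOUR[tier]


def expected_tweets_per_day(stream: str, total_per_day: int) -> int:
    """Approximate daily tweets a stream delivers, given the overall daily volume."""
    return total_per_day * STREAM_SAMPLE_PERCENT[stream] // 100


if __name__ == "__main__":
    for tier in RATE_LIMITS_PER_HOUR:
        print(tier, round(min_seconds_between_requests(tier), 2), "s between requests")
    # 140 million tweets per day is the lower bound quoted under "Some Twitter numbers".
    for stream in STREAM_SAMPLE_PERCENT:
        print(stream, expected_tweets_per_day(stream, 140_000_000), "tweets per day")
```

At the anonymous limit a client can make one request every 24 seconds, so any sizeable historical collection through the REST API takes a long time. This helps explain why researchers turned to archive services and streaming samples.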
General Terms of Service (Nov 2010)
  • Under “Your Rights”: “... You grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).”
TOS tips
  • “This license is you authorizing us to make your Tweets available to the rest of the world and to let others do the same. But what’s yours is yours – you own your content.”
  • “Twitter has an evolving set of rules for how API developers can interact with your content. These rules exist to enable an open ecosystem with your rights in mind.”
API TOS (Feb 2011)
  • Access to Twitter Content: You will not attempt or encourage others to: sell, rent, lease, sublicense, redistribute, or syndicate the Twitter API or Twitter Content to any third party for such party to develop additional products or services without prior written approval from Twitter
  • Content = “All use of the Twitter API and content, documentation, code, and related materials made available to you on or through Twitter.”
Authorised resyndication = GNIP
  • First authorized reseller of Twitter data, Nov 2010
  • Offerings:
      • Halfhose (50%, $30k / mo)
      • Decahose (10%, $5k / mo)
      • Power Track ($.10 per 1,000 Tweets)
      • Link Stream ($50k / mo)
      • User Mention Stream ($20k / mo)
      • Keyword Search
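A rough way to compare these price points is to ask how many Power Track tweets a month cost the same as a flat subscription. The sketch below does that arithmetic with the prices quoted above. The function and dictionary names are hypothetical, and it assumes Power Track is billed purely per tweet with no minimum fee, which the slide does not confirm.

```python
# Prices as quoted on the slide (2011). Integer cents keep the arithmetic exact.
POWER_TRACK_CENTS_PER_1000 = 10  # $.10 per 1,000 Tweets

# Flat monthly subscriptions in US dollars.
MONTHLY_FEE_USD = {
    "halfhose": 30_000,
    "decahose": 5_000,
    "link_stream": 50_000,
    "user_mention_stream": 20_000,
}


def power_track_cost_usd(tweets: int) -> float:
    """Cost in dollars of receiving the given number of tweets via Power Track.

    Assumption: pure per-tweet billing with no minimum or setup fee.
    """
    return tweets * POWER_TRACK_CENTS_PER_1000 / 1000 / 100


def break_even_tweets(product: str) -> int:
    """Monthly tweet count at which Power Track costs as much as a flat subscription."""
    return MONTHLY_FEE_USD[product] * 100 * 1000 // POWER_TRACK_CENTS_PER_1000


if __name__ == "__main__":
    print(power_track_cost_usd(1_000_000))  # one million filtered tweets
    print(break_even_tweets("decahose"))    # tweets per month matching the Decahose fee
```

Under these assumptions a filtered collection of one million tweets costs about $100, whereas the Decahose only becomes the cheaper option above roughly 50 million tweets a month.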
Potential consequences
  • Obstructs peer review of datasets
  • Prevents researchers from getting access to data (in a timely way, if at all)
  • Stifles innovation (most comes from the user community & 3rd party developers!)
  • Users become more cautious about using social media
  • Twitter becomes less useful (protest, reporting, ...)
  • Twitter services become hacking targets (unreliable, unstable, slow, ...)
  • Social science researchers twiddle their thumbs
One solution ... One solution?
Talking points
  • Is there a problem here?
  • Does Twitter have any obligation to users, developers & researchers?
  • Is it worthwhile (or even ethical) to violate Twitter’s TOS to get access to researchable data?
  • Should users’ content even be available to researchers?
Thanks!
