This document discusses how and why archives should approach archiving social media. It argues that social media such as Twitter provide a first draft of the present through unfiltered, first-hand accounts, and capture Australians' views on specific events and on long-term trends. This rich data is readily available but easily lost if it is not archived. The document proposes that archives could partner with Twitter to receive a full feed of tweets from Australia, store the raw datasets in standard formats and/or process them in-house, and show leadership in developing ethical usage protocols. Analysing these archives could provide insights into how online publics and networks form and interact in the Australian public sphere.
Archiving the Immediate: How and Why Archives Should Approach Social Media
1. Archiving the Immediate: How and Why Archives Should Approach Social Media
Assoc. Prof. Axel Bruns
ARC Centre of Excellence for Creative Industries and Innovation
Queensland University of Technology
Brisbane, Australia
a.bruns@qut.edu.au / @snurb_dot_info
http://mappingonlinepublics.net/
2. Why Social Media?
Social Media:
- Facebook: 10+ million users in Australia
- Twitter: 1-2 million users in Australia
- User-generated content and discussions
- Themes from the personal to the public
(news.com.au) (theage.com.au) (abc.net.au)
3. Background: Researching Twitter
Mapping Online Publics:
- ARC Discovery project, 2010-12: Assoc. Prof. Axel Bruns and Dr. Jean Burgess, QUT
- Methodology and outcomes: http://mappingonlinepublics.net/
- Further projects on social media and crisis communication under development
Tools for Twitter analysis:
- yourTwapperkeeper – API-based data capture
- Gawk – open source, multiplatform, programmable command-line tool for processing CSV documents
- WordStat – commercial, PC-only text analysis tool; generates concept co-occurrence data that can be exported for visualisation
- Gephi – open source, multiplatform network visualisation tool
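The CSV processing step listed above can be illustrated with a minimal sketch. The project itself names Gawk for this work; Python's standard `csv` module is swapped in here purely for illustration. The column names `from_user` and `text` and the sample rows are assumptions, not taken from the slides, and would need to match whatever the capture tool actually exports.

```python
import csv
import io
from collections import Counter


def tweets_per_user(csv_text, user_column="from_user"):
    """Count tweets per account in a CSV export that has a header row.

    `user_column` is an assumed column name; adjust it to the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[user_column] for row in reader)


# Hypothetical sample data standing in for a captured tweet archive.
sample = '''from_user,text
alice,"Flood update, stay safe #qldfloods"
bob,@alice thanks for the update
alice,RT @bob: roads closed in Brisbane
'''

counts = tweets_per_user(sample)
for user, n in counts.most_common():
    print(user, n)
```

Using a CSV parser rather than splitting on commas matters here because tweet text frequently contains commas and quotation marks.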
14. Why Do We Care?
Historical significance:
- Social media coverage as a first draft of the present
- Especially Twitter: flat, open, self-organising network
- First-hand, unfiltered, direct insights into Australians’ views
- Rich data on specific events and on long-term trends
- We archive journalistic publications, so why not this?
Readily available, but easily lost:
- Access to rich data (and metadata) through standard APIs
- Especially on Twitter, limited immediate ethical concerns
- Ephemeral content which is lost to posterity unless archived
- ‘Big data’, but far from unmanageable
- Better to start archiving now than to make up for lost material later
15. How to Archive Social Media
Twitter:
- U.S. Library of Congress already receives full feed of all tweets
  - Made accessible after six-month delay (from when? to whom?)
  - Potential to join partnership or set up similar deal for Australia?
- Twitter access to high-volume tweet feeds via Gnip.com
  - Flat fee + volume cost of US$1/10,000 tweets received
  - Potential to negotiate discount for Australian public archives?
- Different levels of inclusiveness in tracking
  - Track all Australian Twitter users? Top 500,000? Top 100,000?
- Raw datasets in standard formats, and/or in-house processing
- Show leadership in developing ethical usage protocols
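The pricing quoted on this slide (a flat fee plus US$1 per 10,000 tweets received) lends itself to a rough budget estimate for the different tracking levels. The sketch below is only an illustration: the slide does not state the flat fee, so it defaults to zero, and the posting rate of two tweets per account per day is a hypothetical figure chosen for the example, not a measured one.

```python
def estimate_feed_cost(tweets_received, usd_per_10k=1.0, flat_fee_usd=0.0):
    """Estimate the cost in US$ of receiving a given number of tweets.

    usd_per_10k comes from the slide; flat_fee_usd is unknown and
    defaults to 0 as a placeholder.
    """
    return flat_fee_usd + (tweets_received / 10_000) * usd_per_10k


# Hypothetical scenario: track the top 500,000 accounts for one year,
# assuming (not measured) an average of 2 tweets per account per day.
accounts = 500_000
tweets_per_account_per_day = 2
annual_tweets = accounts * tweets_per_account_per_day * 365
annual_cost = estimate_feed_cost(annual_tweets)

print(f"{annual_tweets:,} tweets -> US${annual_cost:,.2f} plus flat fee")
```

Under these assumed figures the volume charge comes to US$36,500 per year for 365 million tweets; the real figure would depend on actual posting rates, the flat fee, and any negotiated discount.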
16. Australia on Twitter
[Figure: follower/followee network map of the 140,000 most connected Australian users, of 550,000 processed so far. Labels recovered from the figure: Filipinos; Marketing / PR; Adelaide; Wine; Perth / PR; News / Business; Latika Bourke; Food; Journalism / Politics / News; Mumbrella; Annabel Crabb; Leigh Sales; Fashion / Style / Parenting; Malcolm Turnbull; Marie Claire; ABC News; Crikey; Fashion / Magazines; Celebrities / Media; Arts; Joe Hockey; Mia Freedman; Laurie Oakes; Sunrise on 7; Music / Triple J; Tony Abbott; Julia Gillard; Matt Preston; Kevin Rudd; Triple J; Teens / TV Hits; Wil Anderson; TV; Football (Soccer); 7pm Project; AFL; Radio; Teens / Short Stack; NRL; Cricket; Sports; Hamish and Andy.]
17. The Promise of ‘Big Data’ Research
Insights on Australian public communication on Twitter:
- Micro: @reply and retweet conversations
- Meso: hashtag ‘communities’
- Macro: follower/followee networks
- Multiple overlapping publics / networks
Evidence of processes in the Australian public sphere:
- What drives the formation and dissipation of online publics?
- How do they interact and interweave?
- How are they interleaved with the wider media ecology?
- How is information disseminated across complex networks?
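The micro and meso levels named above both start from markers inside the tweet text itself. As a minimal sketch of how such markers might be pulled out of an archive, the following classifies a tweet as a retweet, an @reply or an original post, and lists the accounts and hashtags it mentions. The patterns are simplifying assumptions (a leading "RT @user" for manual retweets, a leading "@" for replies) and are not the project's own method.

```python
import re

# Simplified patterns; real tweets have more edge cases than these cover.
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")
RETWEET = re.compile(r"^RT @(\w+)", re.IGNORECASE)


def classify_tweet(text):
    """Return the tweet's kind plus the accounts and hashtags it mentions."""
    if RETWEET.match(text):
        kind = "retweet"
    elif text.startswith("@"):
        kind = "reply"
    else:
        kind = "original"
    return {
        "kind": kind,
        "mentions": [m.lower() for m in MENTION.findall(text)],
        "hashtags": [h.lower() for h in HASHTAG.findall(text)],
    }


# Hypothetical example tweets.
for tweet in [
    "RT @bob: roads closed in Brisbane #qldfloods",
    "@alice thanks for the update",
    "Question time is on #AusPol",
]:
    print(classify_tweet(tweet))
```

Pairs of (author, mentioned account) from replies and retweets could then serve as edges for a conversation network, while shared hashtags group tweets into candidate hashtag ‘communities’.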