15. Cnut knew trying to control the tide was silly
cloudofdata.com www.flickr.com/photos/30591976@N05/3402395112/
17. why not use language of opportunity?
“data-driven organisations look at big data as a solution, not a problem”
Release 2.0, February 2009
www.flickr.com/photos/73645804@N00/2712985768/
21. emerging from lots of places, and being combined
More data, faster...
quantity, nature, expectations...
www.flickr.com/photos/98274023@N00/2102067531/
50. Data LINKED to other places outside the firewall
e.g. BBC trusts and relies upon MusicBrainz
bit.ly/9tBJGH
www.flickr.com/photos/foxypar4/2124673642/
52. “the Web done right”
Sir Tim Berners-Lee, 2008
harks back to TimBL’s original vision for a Read/Write Web
www.flickr.com/photos/tanaka/3212373419/
53. Use URIs to name things
Use HTTP URIs so that they can be followed
When someone follows a URI, provide useful information
Include links to other URIs, so that more can be discovered.
www.w3.org/DesignIssues/LinkedData.html
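A minimal sketch of the four principles above, using an in-memory dictionary as a stand-in for the Web. Every URI, label, and function name here is hypothetical and made up for illustration; a real client would fetch each URI over HTTP instead of looking it up in a dictionary.

```python
from collections import deque

# Hypothetical stand-in for the Web: each HTTP URI names one thing and maps
# to a small description plus links to other URIs. All URIs are invented.
WEB = {
    "http://example.org/artist/1": {
        "label": "An example artist",
        "links": ["http://example.org/album/1", "http://example.org/album/2"],
    },
    "http://example.org/album/1": {
        "label": "First example album",
        "links": ["http://example.org/artist/1", "http://example.org/label/1"],
    },
    "http://example.org/album/2": {
        "label": "Second example album",
        "links": ["http://example.org/artist/1"],
    },
    "http://example.org/label/1": {
        "label": "An example record label",
        "links": [],
    },
}


def dereference(uri):
    """Following a URI yields useful information (None if nothing is there)."""
    return WEB.get(uri)


def discover(start_uri):
    """Follow links breadth-first, returning every URI reachable from the start."""
    seen = []
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        if uri in seen:
            continue
        description = dereference(uri)
        if description is None:
            continue
        seen.append(uri)
        queue.extend(description["links"])
    return seen


found = discover("http://example.org/artist/1")
for uri in found:
    print(uri, "-", dereference(uri)["label"])
```

Starting from a single artist URI, the links alone are enough to reach the albums and the record label, which is the "so that more can be discovered" part of the fourth principle.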
59. Web-scale tools
NoSQL data manipulation with Hadoop, Cassandra, etc
Web-scale storage and compute
Separate archival role from analysis, dissemination and use
“too cheap to meter” may be measuring the wrong things
Leverage connections
between archives, and with the wider world
embrace the Web, and its architecture
61. cloud of data
Thank you
Download this presentation: slideshare.net/cloudofdata
Dr Paul Miller
The Cloud of Data
paul.miller@cloudofdata.com
skype: cloudofdata
phone: +44 7769 740083
Made on a Mac
Except where otherwise noted, this work is licensed under the Creative Commons Attribution Licence.
To view a copy of this licence, visit creativecommons.org/licenses/by/2.0/uk/ or send a letter to
Creative Commons, 171 Second St, San Francisco, CA 94105, United States of America
Editor's Notes
Data explosion. Not necessarily like this anymore. Tables, and spreadsheets, and databases.
or even - for you - this. Not just about STORING and SERVING streams. Detect and exploit CONNECTIONS - sometimes in near-real-time.
774 connections. Clusters are inferred. What does it mean?
Automatic Clusters reflect my world with remarkable precision. Labels are my own. Bigger dots (the ones you can see, at this scale!) = bigger influence in network. More than 50 contacts? Get your own.
Explore... Connections shared with Seamus Ross; Blue Culture, Green JISC, 1 Orange Librarian.
Google CEO Eric Schmidt, speaking at Techonomy, Lake Tahoe, in August 2010. Plenty to quibble with… data v. information, ‘dawn of civilisation,’ etc. But. HUGE shift. Autonomous sensors, 24 hour multi-channel TV, social networks, finance, commerce...
Lewis Strauss, Chairman of US Atomic Energy Commission (and Pres. Eisenhower). Power “Too cheap to meter.” Reckoned in 1954 we’d get there. Have we?
Storage too cheap to meter by mid 1990s? Not quite - question of resolution. $300,000 in 1981. $10,000 in 1990. $10 in 2000. $0.10 last year. Soon will be too cheap to meter. Changes the value proposition. In many domains, cheaper to keep everything than to selectively manage. BUT quantities increasing faster than costs are falling… and mechanics of storage a small fraction of the ‘cost’ of keeping data. Tracked by a website in Nova Scotia, Canada. Data extracted by David Isenberg.
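A rough sketch of the arithmetic behind the dated price points in this note. The notes do not state the unit of storage being priced, so the figures are treated as bare prices; the function name is illustrative, and the undated "last year" figure is left out rather than guessed at.

```python
def annual_fall_factor(old_price, new_price, years):
    """Average factor by which the price is divided each year over the span."""
    return (old_price / new_price) ** (1 / years)


# Dated price points quoted in the note (unit of storage not stated there).
points = [(1981, 300_000.0), (1990, 10_000.0), (2000, 10.0)]

for (start_year, start_price), (end_year, end_price) in zip(points, points[1:]):
    factor = annual_fall_factor(start_price, end_price, end_year - start_year)
    print(f"{start_year}-{end_year}: price divided by about {factor:.2f} each year")
```

On these figures the price fell by a factor of roughly 1.5 a year through the 1980s and roughly 2 a year through the 1990s, so the decline itself sped up.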
COMPUTE gradually becoming too cheap to meter, too. Cloud Computing - pay for the computers you need, for the time you need them. And then stop paying. Separate STORAGE from PROCESSING from USE/DELIVERY/ACCESS. Not a bad thing to do anyway! Microsoft, Dublin.
The Economist, Feb/March 2010. Science and others also write about this. O’Reilly, GigaOM and others organise events around this. Companies scrambling to ‘own’ this...
Is A/V “Big Data”? Maybe. Sometimes. GB and GB of digitised film probably not. Real-time stream from 24 hour News… or the Big Brother House… could be. Analyse? Find patterns? Compare, Contrast, Explore.
Mike Driscoll & Roger Ehrenberg spoke at recent Strata - and in GigaOM piece - and used tar sands/oil sands analogy. Plenty of oil/value - but previously too expensive to extract.
Now for something different. Linked Data rarely - so far - ‘Big’ Data. Might people in this room be the ones to change that?
mostly human readable. mostly unconnected, except by hyperlinks that say nothing more structured than ‘see also…’
been coming a long time. Heavily cited SciAm article is from 2001...
Plenty of people solving hard - focussed - problems with ‘semantics’
W3C ‘Semantic Web Stack.’ Perceived as complex, but contains powerful, flexible elements...
simple principles. simple power.
subject, predicate, object.
Add URIs and each is unambiguous. You can also link, and link and link - ALL Tolkien’s books, etc. The other stuff in the stack just makes this happen.
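A small sketch of the idea in these notes: statements held as three-part triples, with URIs as the names so that each one is unambiguous and can be linked to from elsewhere. The `example.org` namespace, the vocabulary terms, and the helper function are all assumptions made up for illustration; this is plain Python, not a real RDF library.

```python
from collections import namedtuple

# One statement: a subject, a predicate (the relationship), and an object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

EX = "http://example.org/"  # hypothetical namespace, not a real dataset

TOLKIEN = EX + "person/tolkien"
AUTHOR = EX + "vocab/author"
NAME = EX + "vocab/name"

graph = [
    Triple(EX + "book/the-hobbit", AUTHOR, TOLKIEN),
    Triple(EX + "book/the-silmarillion", AUTHOR, TOLKIEN),
    Triple(TOLKIEN, NAME, "J. R. R. Tolkien"),
]


def subjects_with(predicate, obj, triples):
    """Return every subject that has the given predicate and object."""
    return [t.subject for t in triples if t.predicate == predicate and t.object == obj]


# Because the author is named by one URI, finding all of that author's books
# is a matter of following the links that point at it.
books = subjects_with(AUTHOR, TOLKIEN, graph)
print(books)
```

Any other dataset that uses the same URI for the author could add its own triples, and the same lookup would pick them up without further matching on names.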
data.gov.uk, data.gov, and many more
also World Cup, Natural History, Programmes, and more… Data-driven organisation. Record once, use in many places. Become a nodal point on the web.
linked and open is better. JISC Linked Data Horizon Scan.
most recent version, from September 2010. Size of the Cloud demonstrates interest… but you’ll rarely, if ever, use them all - look how sparse the connections are.
Archive or Agora? Preservation or New Use? NOT mutually exclusive.