Tools & techniques
for working with
datasets
                             Tony Hirst
              Dept of Communication and Systems
                            The Open University
Quick wins and
half-hour hacks
Building a
toolbox…
http://mashe.hawksey.info/2012/11/mining-and-openrefineing-jiscmail-a-look-at-oer-discuss/

/via Martin Hawksey/@mhawksey
“You can quickly create an online 3-D
visualisation (with Google Earth) of
these rare documents”
RStudio
All at once
      or
one at a time?
Macroscopes
@mediaczar




             (Accession Plot)
Google Maps, 1884 edition?
Overview first,
            zoom and filter,
    then details-on-demand
From: The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations
•   X and Y (at a push, Z)
•   Node size and colour
•   (Node label size and colour)
•   Edge thickness and colour
•   (Edge label and colour)
•   Node proximity/grouping
•   Clustering

• Filtering and differential
  application of the above
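
As a rough illustration of the channels listed above, the sketch below maps node degree to node size and edge weight to edge thickness, with a simple degree filter. It is only a sketch under stated assumptions: the function name `encode_graph`, the `min_degree` parameter, the size formula and the dictionary layout are all hypothetical and do not come from the slides or from any particular graph tool.

```python
from collections import Counter


def encode_graph(edges, min_degree=0):
    """Map simple graph measures onto visual channels (illustrative only).

    edges: list of (source, target, weight) tuples.
    Returns (nodes, links), where nodes maps a node name to its visual
    attributes and links is a list of edge attribute dictionaries.
    """
    # Count how many edges touch each node (its degree).
    degree = Counter()
    for source, target, _weight in edges:
        degree[source] += 1
        degree[target] += 1

    # Filtering: keep only nodes that meet the degree threshold.
    kept = {name for name, d in degree.items() if d >= min_degree}

    # Node size channel: an arbitrary linear mapping from degree (assumption).
    nodes = {name: {"size": 5 + 2 * degree[name]} for name in kept}

    # Edge thickness channel: use the edge weight directly,
    # dropping edges that touch a filtered-out node.
    links = [
        {"source": s, "target": t, "thickness": w}
        for s, t, w in edges
        if s in kept and t in kept
    ]
    return nodes, links


edges = [("a", "b", 1), ("a", "c", 2), ("b", "c", 1), ("c", "d", 3)]
nodes, links = encode_graph(edges, min_degree=2)
print(nodes)
print(links)
```

Raising `min_degree` is one way to apply the channels differentially: low-degree nodes drop out of view while the remaining ones keep their size and thickness encodings.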
Group by  Hierarchy inside


(implied) containment
Treemap in R
Similarities
    and
differences
Single page
   app +
  linkage
Templated data views
blog.ouseful.info
 @psychemedia

Editor's Notes

  1. Let p_{i,j} be the rate at which word i occurs in document j, and p_j be the average across documents (sum of p_{i,j} / ndocs). The size of each word is mapped to its maximum deviation, max_i(p_{i,j} - p_j), and its angular position is determined by the document where that maximum occurs.
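
The calculation in this note can be sketched in a few lines of plain Python. This is a minimal sketch, not the code of any particular word cloud tool. It reads the note as follows, which is an assumption: the average is taken per word across documents, and the maximum deviation is taken over documents. The function name `max_deviation` and the input layout (a dictionary of per-document word counts) are hypothetical.

```python
def max_deviation(counts):
    """Return {word: (size, document)} for a comparison-style word cloud.

    counts: dict mapping document name -> dict mapping word -> raw count.
    size is the word's largest deviation from its mean rate, and document
    is the one where that largest deviation occurs.
    """
    docs = list(counts)
    vocab = sorted({word for doc in docs for word in counts[doc]})

    # Rate of each word within each document (count / document total).
    rates = {}
    for doc in docs:
        total = sum(counts[doc].values())
        if total == 0:
            raise ValueError(f"document {doc!r} has no words")
        rates[doc] = {word: counts[doc].get(word, 0) / total for word in vocab}

    result = {}
    for word in vocab:
        # Average rate of this word across documents (assumed reading).
        mean_rate = sum(rates[doc][word] for doc in docs) / len(docs)
        # Document where the word most exceeds its average rate.
        best_doc = max(docs, key=lambda doc: rates[doc][word] - mean_rate)
        result[word] = (rates[best_doc][word] - mean_rate, best_doc)
    return result


counts = {
    "doc_a": {"cat": 3, "dog": 1},
    "doc_b": {"cat": 1, "dog": 3},
}
print(max_deviation(counts))
```

In this toy input, "cat" is over-represented in doc_a and "dog" in doc_b, so each word would be drawn in the angular region of the document where it stands out most.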