A Tinkerer’s Toolbox:
Data Driven Journalism




                                           Tony Hirst
                     Dept of Communication and Systems
                                   The Open University
               Visiting Senior Research Fellow, University of Lincoln
@psychemedia

blog.ouseful.info

      #???
Where I situate myself…
Visualising data helps me make
sense of the world around me
Do you know
   what’s
 possible?
#ddj
Google
Spreadsheets
Data Distributions




                     Outliers
Trends and (anti)correlations...
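A minimal sketch of the three checks named on these slides, using only the Python standard library rather than a spreadsheet. The function names, the 1.5 x IQR rule for flagging outliers, and the use of Pearson's coefficient for (anti)correlation are illustrative assumptions, not something the slides specify.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation: near +1 for a shared trend, near -1 for anticorrelation."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

Calling `iqr_outliers` on a column gives a quick list of values worth a second look; `pearson` on two columns gives a first hint of whether they rise and fall together.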
Explanatory visualization
Data visualizations that are used to
transmit information or a point of
view from the designer to the
reader. Explanatory visualizations
typically have a specific “story” or
information that they are intended
to transmit.

Exploratory visualization
Data visualizations that are used by
the designer for self-informative
purposes to discover patterns,
trends, or sub-problems in a
dataset. Exploratory visualizations
typically don’t have an already-
known story.
Exploiting
Structure
Hierarchical data and treemaps - medals




Pivot tables
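As a rough illustration of what a pivot table does, the sketch below groups rows by one field, spreads a second field across columns, and sums a third. The `pivot` helper, the field names and the medal counts are all made up for the example.

```python
from collections import defaultdict


def pivot(rows, index, column, value):
    """Sum `value` for every (index, column) pair, like a spreadsheet pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[index]][row[column]] += row[value]
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(cells) for key, cells in table.items()}


# Hypothetical rows: team names and counts are placeholders, not real results.
MEDALS = [
    {"team": "Team A", "medal": "Gold", "count": 2},
    {"team": "Team A", "medal": "Silver", "count": 1},
    {"team": "Team B", "medal": "Gold", "count": 1},
    {"team": "Team A", "medal": "Gold", "count": 1},
]

summary = pivot(MEDALS, index="team", column="medal", value="count")
```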
Templated data views
Macroscopes
Look for
Differences
Data Can Tell a
    Story
http://www.musik-therapie.at/PederHill/Structure&Plot.htm
Visual Data
Summaries
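library(ggplot2)
# d1 is assumed to hold one row per car with ymin/ymax positions;
# d2 is assumed to be long-format data (car, variable, value).
ggplot() +
geom_linerange(data = d1, aes(x = car, ymin = ymin, ymax = ymax)) +
geom_point(data = d2, aes(x = car, y = value, shape = variable), size = 2) +
# opts() and theme_text() were removed from ggplot2; use ggtitle(), theme() and element_text()
ggtitle("F1 2011 Korea\nRace Summary Chart") +
theme(axis.text.x = element_text(angle = -90, hjust = 0)) +
labs(x = NULL, y = "Position", shape = "")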
Data
Clean(s)ing
Google Refine
(Inner) Joins &
 Reconciliation
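A small sketch of an inner join with a crude reconciliation step, assuming the two datasets label the same thing slightly differently (for example "Liberal Democrat" and "Liberal Democrat Group"). The `normalise` rule, the councillor names and the colour table are hypothetical; tools such as Google Refine offer far richer matching.

```python
def normalise(label):
    """Crude reconciliation key: lower-case, collapse spaces, drop a trailing 'group'."""
    words = label.lower().split()
    if words and words[-1] == "group":
        words = words[:-1]
    return " ".join(words)


def inner_join(left, right, key):
    """Keep only left rows whose reconciled key also appears in right."""
    index = {}
    for row in right:
        index.setdefault(normalise(row[key]), []).append(row)
    joined = []
    for row in left:
        for match in index.get(normalise(row[key]), []):
            # Values from the left row win where both rows share a field name.
            joined.append({**match, **row})
    return joined


# Hypothetical data for illustration only.
COUNCILLORS = [
    {"name": "Councillor A", "party": "Liberal Democrat Group"},
    {"name": "Councillor B", "party": "Liberal Democrat"},
    {"name": "Councillor C", "party": "Independent"},
]
COLOURS = [{"party": "liberal democrat", "colour": "orange"}]

joined = inner_join(COUNCILLORS, COLOURS, key="party")
```

Because the join is an inner join, Councillor C is dropped: there is no matching row on the other side.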
Google Fusion
Tables
Google Refine
OpenHeatMap
“Data Flow”
“Analog Synth Meeting”, Todd Huffman
Wikipedia -> HTML -> Google Spreadsheet (=importHTML) -> CSV -> Yahoo! Pipe (Import CSV) -> KML -> Google Map -> <embed> -> Embedded object
Find the data…

Get the data as data…

Transform the data…

Enrich the data and transform again…

Display the data…

Publish the displayed data…
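The same find / get / transform / display chain can be sketched offline in Python, assuming a tiny inline HTML table in place of a Wikipedia page. Everything specific here is hypothetical: the place name, the value, the coordinates, and the `COORDS` lookup table, which stands in for whatever geocoding the real flow relies on. The parser only approximates what =importHTML does.

```python
import csv
import io
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class TableParser(HTMLParser):
    """Collect the text of each table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """HTML table -> CSV text (get the data as data)."""
    parser = TableParser()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


def csv_to_kml(csv_text, coords):
    """CSV text -> KML placemarks (transform and enrich); KML wants lon,lat order."""
    placemarks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat, lon = coords[row["Place"]]
        placemarks.append(
            "<Placemark><name>{}</name><description>{}</description>"
            "<Point><coordinates>{},{}</coordinates></Point></Placemark>".format(
                escape(row["Place"]), escape(row["Value"]), lon, lat))
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            + "".join(placemarks) + "</Document></kml>")


# Made-up sample input and coordinates, for illustration only.
HTML = ("<table><tr><th>Place</th><th>Value</th></tr>"
        "<tr><td>Exampleton</td><td>123</td></tr></table>")
COORDS = {"Exampleton": (52.0, -1.0)}

csv_text = table_to_csv(HTML)
kml = csv_to_kml(csv_text, COORDS)
```

The resulting KML string could then be saved to a file and loaded into a mapping tool for the display and publish steps.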

The online CSV file
      becomes a spreadsheet
          becomes A DATABASE
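One way to see the "CSV becomes a database" idea outside a spreadsheet is to load CSV text into an in-memory SQLite table and query it. This is only a sketch: the `csv_to_db` helper, the table name and the sample rows are hypothetical, and in practice the CSV text would first be fetched from its web address.

```python
import csv
import io
import sqlite3


def csv_to_db(csv_text, table="data"):
    """Load CSV text (first row as header) into an in-memory SQLite table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    # Quote column names so headers with spaces still work.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    marks = ", ".join("?" for _ in header)
    db = sqlite3.connect(":memory:")
    db.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    db.executemany('INSERT INTO "{}" VALUES ({})'.format(table, marks), body)
    return db


# Made-up sample data, for illustration only.
CSV_TEXT = "dept,amount\nA,10\nB,5\nA,7\n"

db = csv_to_db(CSV_TEXT)
# CSV values arrive as text, so cast before summing.
totals = db.execute(
    "SELECT dept, SUM(CAST(amount AS INTEGER)) FROM data "
    "GROUP BY dept ORDER BY dept").fetchall()
```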
Finding data…
site:.gov.uk
filetype:xls
underspend
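The search limits above are just text in the query box, so they are easy to assemble programmatically. The sketch below builds the same query string and a search URL; the helper names are hypothetical, and the `q` parameter on `https://www.google.com/search` is the usual query parameter.

```python
from urllib.parse import urlencode


def build_query(terms, **operators):
    """Join operator:value pairs (site, filetype, ...) with plain search terms."""
    parts = ["{}:{}".format(name, value) for name, value in operators.items()]
    return " ".join(parts + list(terms))


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query so it can be pasted into a browser."""
    return base + "?" + urlencode({"q": query})


query = build_query(["underspend"], site=".gov.uk", filetype="xls")
url = search_url(query)
```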
inurl:http://phx.corporate-ir.net/phoenix.zhtml?
intitle:press
site:phx.corporate-ir.net
Tapping the
Data Burden
Reporting body              Receiving body


                      Data tap




Data Burdens and FOI
Opening Data
 Up via FOI
“Public Data” &
 Social Media
   Mapping
Emergent views
 of structural
  properties
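A minimal sketch of one such emergent view: given the followers of some account, or the users of a hashtag, count which accounts they commonly follow. The function name, the `min_count` threshold and the account names are hypothetical, and collecting the follow lists from a social network's API is out of scope here.

```python
from collections import Counter


def commonly_followed(followers, friends_of, min_count=2):
    """Project a set of followers onto the accounts that several of them follow."""
    counts = Counter()
    for follower in followers:
        # Use a set so a repeated entry in a follow list is counted once.
        counts.update(set(friends_of.get(follower, ())))
    return {account: n for account, n in counts.items() if n >= min_count}


# Made-up follow lists, for illustration only.
FRIENDS_OF = {
    "a": ["x", "y"],
    "b": ["x", "z"],
    "c": ["x", "y"],
}

shared = commonly_followed(["a", "b", "c"], FRIENDS_OF)
```

Accounts with high counts are the ones that position the group socially; plotting them as a network is what gives the structural view.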
My “journalism” is tracking down
tools and working out recipes that
     help datasets tell stories
http://delicious.com/stacks/view/CROBXt
Build lazy…
Electrical Safety 101

We get a lot of stuff from
Asia, so it all comes with
funny plugs, travelling just
adds to the fun.

Left to right top to bottom we
have:

Singapore wall socket UK
Adapter UK -> NZ/AU
Double adapter NZ/AU
My cell charger NZ/AU
Adapter NZ/AU -> everything
Andreas cell charger Euro
Camera charger US




                    tolomea
“Hands Passing Baton at Sporting Event”, tableatny
@psychemedia

blog.ouseful.info

Lincoln ddj

Editor's Notes

  1. Do we have a hashtag for the workshop?
  2. Collaborative commentary
  3. Through the provision of an API on top of the aggregated local council data, OpenlyLocal can also be treated as a database in its own right. In the example shown here, committee membership is displayed via a treemap showing party affiliations of committee members. (Hovering over a particular grouping displays a list of names of council members on that committee from that party political grouping.) Whilst it would be a major task to take data from every council website in a variety of formats in order to generate similar views for other councils, the work done by OpenlyLocal in aggregating this data and then republishing it via a single API in a single format means that the treemap view can be applied to each council whose data is stored in OpenlyLocal. In passing, it is also worth mentioning how the use of visualisations can be helpful in cleaning data or identifying possible errors in it. In the above example, we see that party affiliations for councillors on the Isle of Wight Council are declared as both Liberal Democrat and Liberal Democrat Group.
  4. The top, blue strip shows the gear (1 to 7); the green strip shows the throttle pedal depression (0-100%), and the red strip shows the brake (0-100%). The light blue strip is a composite of the previous three strips. The whiter the pixel, the closer it is to 100% throttle in 7th gear with no braking. The bottom two traces show the longitudinal and lateral g-force respectively. For the longitudinal trace, red shows braking – being forced into the steering wheel; green shows acceleration – being forced back into your seat. You’ll see the greatest g-force under braking occurs when the brakes are slapped full on… (the red bits in the third and fifth traces line up). For the lateral g-force, the red shows the driver being flung to the left (i.e. a right-hand corner), the green shows them being pushed out to the right.
  5. Analog synth – pretty much ultimate freedom to link audio processing effects modules together. Simplified by having a common plug.
  6. Some scene setting about what I mean by “flow”…
  7. Suppose we have a table of numerical data associated with placenames on something like Wikipedia. How do we knock up a quick map view of the data?
  8. UK city population search on Wikipedia
  9. This can all be a bit flakey – a bit like balancing stones… But it can also be surprisingly stable (for a time at least!)
  10. Here we see the result of pulling data into a Google Spreadsheet from a CSV file published at a particular web address. We now have the ability to run the full range of spreadsheet tools over the data – data which is being pulled in from the datastore, remember. (Similar functionality presumably exists in Microsoft Excel?)
  11. Emergent Social Positioning. Origins: 1.5 degree egonet (how followers follow each other, how hashtaggers follow each other); projection maps from followers to folk they commonly follow; projection maps from hashtaggers to folk they commonly follow; projection maps from friends to folk who commonly follow them.
  12. Lots of the time, things don’t quite fit: the import format for one tool does not match up with the export formats of another… so sometimes we need an adapter. (Cf. also the notion of impedance mismatch.)
  13. Do we have a hashtag for the workshop?