1. ICEGOV
Open Data Tutorial
Jim Hendler, @jahendler
Jeanne Holm, @JeanneHolm
22 October 2012
Co-author: Hadley Beeman, @HadleyBeeman
2. Introductions!
• Please introduce yourself
– Name
– Organization
– Three (3) words that explain either why you are
here or what you hope to learn
2012 INTERNATIONAL OPEN GOVERNMENT DATA CONFERENCE—OPEN GOV DATA TUTORIAL
9 July 2012 ICEGOV Open Data Tutorial 2
3. Understanding the
Foundations of Open Data
• Why do countries and people share data?
• What will citizens, businesses, scientists, and
journalists do with the data?
• How can we manage it?
4. Why Countries Share Data
• Meet regulatory compliance
• Provide transparency into government
operations
5. Why Countries Share Data
• Anticipate economic development
• Initiate innovation
7. Why People Want Open Data
Swati Ramanathan
8. Real Outcomes = Better Lives
• In health care
– Data empowers communities to make changes
that improve the quality of life of citizens
• In California, ReLeaf plants trees in areas identified as
danger areas for asthma sufferers
– Companies use government data to innovate and
create high-value jobs
– Civic Commons has a strong collection of open data
use cases: http://civiccommons.org/
9. Energy Drives Innovation
• Communities like Energy.Data.gov connect innovators, industry,
academia, and government at federal, state, and local levels
10. Challenges Spark Ideas
• Energy.Data.gov works with challenges across the nation to
integrate federal data and bring government personnel to code-a-thons
11. Data Drives Decisions
• Apps transform data into understandable forms
that help people make decisions
12. Changing Economic
Equations
Study from Malaysian government:
http://www.transknowformance.com/article.cfm?id=53
13. Why People Want Open Data
14. What Makes Data Open
• Open Format
– The US Government through the Open
Government Directive defines an open format
as “one that is platform independent, machine
readable, and made available to the public
without restrictions that would impede the re-use
of that information.”
• http://www.whitehouse.gov/omb/assets/memoranda_2010/m10-06.pdf
15. What Makes Data Open
• Example Open Formats
– PDF for documents (but not data)
– CSV for data
– Web standards for publishing, sharing or linking
• HTML, XML, RDF
– Web standards for syndication
• RSS, Atom, JSON
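A minimal sketch of why machine-readable open formats matter: data published as CSV can be re-expressed as JSON using only standard tooling. The sample rows and column names below are hypothetical and are not taken from any real catalog.

```python
import csv
import io
import json

# Hypothetical sample data; a real dataset would be read from a file or URL.
SAMPLE_CSV = """agency,dataset,year
Example Energy Office,Fuel prices,2012
Example Health Office,Hospital ratings,2011
"""

def csv_to_records(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]

records = csv_to_records(SAMPLE_CSV)

# Re-publish the same rows as JSON; all values stay strings, as in the CSV.
as_json = json.dumps(records, indent=2)
print(as_json)
```

The same conversion is not possible for tabular data locked inside a PDF without error-prone extraction, which is the point of "PDF for documents (but not data)".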
16. What Makes Data Open
• Metadata
– The information about the data being shared
• Who produced it
• Where
• When
• Use restrictions
• Etc.
– Use standards such as ADMS or Dublin Core
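As an illustration, the who / where / when / use-restrictions questions above could be captured in a record keyed by Dublin Core term names. This is a sketch only: the choice of which terms to require is an assumption, and every value is hypothetical.

```python
# Dublin Core term names used as keys; the "required" set is an assumption
# made for this sketch, not a rule from Dublin Core or ADMS.
REQUIRED_TERMS = (
    "dcterms:title",
    "dcterms:publisher",  # who produced it
    "dcterms:spatial",    # where
    "dcterms:issued",     # when
    "dcterms:license",    # use restrictions
)

def missing_terms(record):
    """Return the required metadata terms that are absent or empty."""
    return [term for term in REQUIRED_TERMS if not record.get(term)]

# Hypothetical metadata record for a single dataset.
record = {
    "dcterms:title": "Example air quality readings",
    "dcterms:publisher": "Example Environment Agency",
    "dcterms:spatial": "Example County",
    "dcterms:issued": "2012-07-09",
    "dcterms:license": "http://creativecommons.org/licenses/",
}

print(missing_terms(record))
print(missing_terms({"dcterms:title": "Untitled upload"}))
```

A check like this could run before a dataset is accepted into a catalog, so that incomplete descriptions are caught early.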
17. Dataset extension to Schema.org
(pending): Google, MS (Bing), Yahoo!
• Improve SEO
• Improve international
search and federation
• Unique opportunity
for public/private
partnership
Express your support at:
http://blog.schema.org/2012/07/describing-datasets-with-schemaorg.html
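A rough sketch of what a dataset description for search engines might look like. The extension was still pending when these slides were written, so the property names below follow the schema.org Dataset vocabulary as later published, and the JSON-LD syntax is an assumption; the function name and all values are hypothetical.

```python
import json

def dataset_markup(name, description, page_url, license_url, download_url, media_type):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": page_url,
        "license": license_url,
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": download_url,
                "encodingFormat": media_type,
            }
        ],
    }

# Hypothetical dataset; example.org URLs are placeholders.
markup = dataset_markup(
    name="Example fuel prices",
    description="Weekly fuel prices by region (illustrative only).",
    page_url="http://example.org/datasets/fuel-prices",
    license_url="http://creativecommons.org/licenses/",
    download_url="http://example.org/datasets/fuel-prices.csv",
    media_type="text/csv",
)
print(json.dumps(markup, indent=2))
```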
18. What Topics of Data Are
Published
• Analytics based on over 1,000,000 datasets
from around the world can be seen at
– http://logd.tw.rpi.edu/iogds_data_analytics
• The examples that follow are from that page
19. Countries Sharing Data
Important note:
quantity is not really the
most important issue
20. Countries Sharing Data
Important note:
quantity is not really the most
important issue
21. Example: U.S.
Data.gov
22. Example: UK
Data.gov.uk
23. Example: Spain
24. Topics (Across All Catalogs)
25. Topics (Across All Catalogs)
26. Data “Mashups” of Many Kinds
More than 50 at
http://logd.tw.rpi.edu
27. Making Data
Open, Accessible, and
Discoverable
• Architecture for systems and technology
• Processes for publishing data
• Policies for ensuring data is
open, accessible, and obtainable
28. Creating an Open Data
Architecture
• Key components
– Workflow for release approval (often overlooked)
– Dataset storage
• Can be centralized or via linking
– Data cataloging
• Metadata critical to a good open
data site
– Data API
• Can be via download or via access
• Technical issues with syndication, usage rules, etc.
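The key components above can be sketched as a toy in-memory catalog: datasets are submitted, pass a release-approval step, are described by metadata, and are then reachable through a search call standing in for a data API. All class, method, and field names here are hypothetical; a production site would add persistence, access rules, and syndication.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    identifier: str
    title: str
    url: str  # where the data lives: stored centrally or linked at the source
    keywords: list = field(default_factory=list)  # catalog metadata
    approved: bool = False  # state of the release-approval workflow

class Catalog:
    """Toy catalog: only approved datasets are visible through search."""

    def __init__(self):
        self._datasets = {}

    def submit(self, dataset):
        """Register a dataset; it stays hidden until approved."""
        self._datasets[dataset.identifier] = dataset

    def approve(self, identifier):
        """Mark a submitted dataset as cleared for release."""
        self._datasets[identifier].approved = True

    def search(self, keyword):
        """Return approved datasets tagged with the keyword (case-insensitive)."""
        wanted = keyword.lower()
        return [
            d for d in self._datasets.values()
            if d.approved and wanted in (k.lower() for k in d.keywords)
        ]

catalog = Catalog()
catalog.submit(Dataset("d1", "Example fuel prices",
                       "http://example.org/fuel.csv", ["energy", "prices"]))
before = catalog.search("energy")   # not yet approved, so nothing is returned
catalog.approve("d1")
after = catalog.search("Energy")    # approved, so the dataset is now visible
print(len(before), [d.title for d in after])
```

Putting the approval flag in the catalog itself is one way to keep the often-overlooked release workflow from being bypassed by the API.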
29. Processes
• Publication (and cleaning)
• Data reuse and integration
• Community input
30. Policies Become Essential
• Policies help drive the ecosystem and “motivate”
departments to continue to share data openly
• Build the policies based around issues that are universal
• Licensing, provenance
http://creativecommons.org/licenses/
Open data on food, security, transportation, and transparency
31. Semantic Web and Linked Data
County Council
Royal Mail
Ordnance Survey
32. Linking Data Via Common
Naming (Usually URLs)
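A minimal sketch of linking by common naming, assuming two publishers refer to the same agency by the same URL. Once both datasets share that identifier, combining them is a simple join. The URLs, field names, and figures are hypothetical.

```python
# Hypothetical URLs naming two agencies; both datasets use them as shared keys.
ENERGY = "http://example.org/id/agency/energy"
HEALTH = "http://example.org/id/agency/health"

# Dataset published by one organization.
budgets = [
    {"agency": ENERGY, "budget_millions": 120},
    {"agency": HEALTH, "budget_millions": 340},
]

# Dataset published independently by another organization.
contacts = [
    {"agency": ENERGY, "contact_page": "http://example.org/energy/contact"},
]

def link_by_name(left, right, key="agency"):
    """Merge rows from two datasets that share the same identifier URL."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

linked = link_by_name(budgets, contacts)
print(linked)
```

If the two publishers had used different spellings of the agency name instead of a shared URL, the join would silently miss the match, which is the problem common naming avoids.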
33. Example: Agency Names
34. Can Be Lots of Things
35. “Linking” Data
Government data currently makes up over half of the
linked data cloud (~17B triples), with tens of
thousands of links to other data (internal and external)
http://linkeddata.org/
36. 5 Star Data
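The slide itself carries only a title, so the following is a sketch based on the widely cited 5-star open data scheme (open licence on the web, machine-readable structure, non-proprietary format, URIs to name things, links to other data). The function and parameter names are hypothetical, and treating the levels as strictly cumulative is an assumption of this sketch.

```python
def star_rating(open_license, machine_readable, open_format, uses_uris, links_out):
    """Count stars cumulatively: each level requires all the levels before it."""
    stars = 0
    for satisfied in (open_license, machine_readable, open_format, uses_uris, links_out):
        if not satisfied:
            break
        stars += 1
    return stars

# A CSV file under an open licence, without URIs for the things it describes.
csv_release = star_rating(True, True, True, False, False)

# A scanned PDF table under an open licence.
pdf_release = star_rating(True, False, False, False, False)

print(csv_release, pdf_release)
```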
37. Creating the Open Data
Community
38. Creating Community
• Communities are public-facing
spaces that present
data, information, and subject
matter knowledge about a single
topic from many organizations in
one place
– Topics for communities can be chosen
based on priorities from the public,
on departments' missions, or on issues
of national importance
39. Community Vision
• These questions help to guide early discussions
1. Vision: What will the community connection and collaboration look
like in the future?
2. Leaders: Who will help to lead the community?
3. Participants: Who will participate?
4. Outcome: What are the expected outcomes, metrics, and
measurements that will show success? How will this community
work to improve the lives of citizens?
5. Functionality: What types of activities will be conducted on the site
(forums, blogs, wikis, ranking, rating, challenges, or apps)?
6. Content: What content should be displayed?
7. Interactivity: In what ways will the community interact with the
leaders, with each other, and with the public?
40. Open Communities
Community
Developers ✓
Open Data ✓
Semantic Web ✓
Health ✓
Law ✓
Energy ✓
Education ✓
Ocean ✓
Safety ✓
Manufacturing ✓
Business ✓
Ethics ✓
Smart Disclosure ✓
Sustainable Supply Chain ✓
Cities ✓
+ many more…
41. Supporting Global Events
Japanese
tsunami, earthquake, and
radiation monitoring
Restore the Gulf:
Deepwater Horizon
Response
42. Health.Data.gov
Champion: Todd Park
U.S. Chief Technology
Officer
Apps Forums
Blogs
Challenges
43. Publicizing Data to Innovators
• Challenges and code‐a-thons
(health2challenge.org)
• Many innovator “meetups” and
conferences
• Annual health data-paloozas
• Over 139 applications
• 50 new businesses
• Thousands of lives improved
each day
• 1700 attendees at the Health
Data Palooza in 2012
44. Creating Apps That Improve Lives:
Asthmapolis
45. Creating Apps That Save
Lives: iTriage and Hospital
Compare
46. Additional Topics
• Licensing, provenance, languages
• Metadata design (international)
• Trust – government data is controversial, who
controls it?
• Scaling – over 1M datasets and growing fast
– How to search, store, link, translate, and archive
• Versioning and updating
• Visualization beyond the single dataset
• Boundaries of open data
47. Summary
• More and more government agencies throughout the world are
sharing “raw data” with their citizens
– Enhance transparency
– Increase innovation
– “Crowd source” government services
• Open data in open formats allows governments, agencies, and third
parties to develop analyses, information graphics, and other ways
to share information
• Development of sound processes and policies is an important
aspect of Open Government Data sharing
– Need to support, not squelch, information sharing
– Need to find appropriate balance of data release with
privacy, security, and other citizen/government mandates
• Community mechanisms are an important aspect of an open data
ecosystem
48. Questions
49. Summary and Next Steps
• Join a community
– W3C eGovernment Interest Group
• http://www.w3.org/egov/wiki/Main_Page
– Open Data Innovation Network on LinkedIn
• http://bit.ly/ODNetwork
Editor's Notes
• A parent has a child who is ill
– Ask questions online at HealthTap
– Find a hospital and compare (Hospital Compare)
– That doctor recommends a GPS-powered inhaler (Asthmapolis)
– Monitor asthma levels at school through Public School Records
– Know in advance the best places to play, how to get to school, and how to plan your day
• The data delivered through the 172 agencies participating in Data.gov eases the burden on families in caring for a sick child
• More importantly, the data as it is aggregated empowers communities to make changes that improve the quality of life of citizens
– ReLeaf plants trees in areas identified by Together We Breathe as danger areas for asthma sufferers
– Cities see hot spots that trigger asthma problems for their citizens
• Uses asthma patients' aggregated GPS notations to create hot spots in communities where there are asthma issues
• Changes individual behavior
– From 65% daily incidence to 25% daily incidence of inhaler use over a six-month study