This is a presentation that will be given at the 2012 Health Datapalooza (http://hdiforum.org), describing the new healthdata.gov site, its PaaS/DaaS direction, and related i2/ONC developer challenges.
HDI III - Healthdata.gov - Now, Next and Challenges
1. healthdata.gov
now and next
challenges overview
hhs ocio, health datapalooza 2012
2. session agenda
• now
– tools and features
• next
– target architecture
• challenges
– explanations in sequence
3. now – tools and features
• Drupal
– publishing workflow and community engagement
• Solr
– faceted search
• CKAN
– ‘on demand resources’ (RESTful API and feeds)
• EC2
– powered by GovCloud
• github.com/hhs
– public repos coming soon!
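The ‘on demand resources’ bullet refers to CKAN's RESTful API. A minimal sketch of building a dataset search request, assuming the `package_search` action of CKAN's action API as found in current CKAN releases. The base URL is hypothetical; the slides do not say where the API is served.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the slides do not state where the CKAN API is served.
CKAN_BASE = "https://example.org/ckan"

def package_search_url(query, rows=10, base=CKAN_BASE):
    """Build a URL for CKAN's package_search action (dataset search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/api/3/action/package_search?{params}"

print(package_search_url("hospital compare", rows=5))
```

The function only builds the URL, so any HTTP client can issue the request.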
12. next – target architecture
• linked data
– (closed) google knowledge graph
– open health knowledge graph
• integration framework
– top down modeling
– bottom up mapping
– social curation
13. #gkg – (closed) ‘things, not strings’
“The Knowledge Graph helps us
understand the relationships
between things [… that are] linked
in our graph. […] It’s not just a
catalog of objects; it also models
all these inter-relationships.” source
20. i2 challenges
• two types
– three domain specific
• improve the integration and liquidity of data made available
– four platform specific
• enhance the capabilities of the technology components
• 3 release rounds
– sequenced to leverage dependencies
• round 1: June through October 2012
• round 2: November 2012 through May 2013
• round 3: June through December 2013
21. round 1 challenges
• June 2012 through October 2012
– domain specific
• [1.1] cross domain and domain specific metadata
– voluntary consensus standards organizations, de facto
standards, other
– platform specific
• [1.2] Simplified Sign On (SSO)
– WebID identity provider and relying parties, HDP infrastructure
components
– $35K: $20K 1st, $10K 2nd, $5K 3rd place prizes
22. round 2 challenges
• November 2012 through May 2013
– domain specific
• [2.3] Mapping, Reconciliation and Correlation
– structural variety, authoritative URIs, linking heuristics
– platform specific
• [2.4] Faceted Browsing and Visualization
– D3 (backbone, jQuery, etc.)
• [2.5] Custom API
– Linked Data API ‘configurator’ for dataset resources
» each of these builds on [1.1] results
23. round 3 challenges
• June 2013 through December 2013
– domain specific
• [3.6] Correlating HHS and NHS Classifications
– structural variety, authoritative URIs, linking heuristics
– platform specific
• [3.7] Linked Data API based Data Element Access Services
– ‘securing the data, not just the device’
» builds on [1.1], [1.2], and [2.5]
24. domain challenge [1.1]
• Metadata
– requests the application of existing voluntary
consensus standards for metadata common to all
open government data
– and invites new designs for health domain specific
metadata to classify datasets in our growing catalog,
creating entities, attributes and relations
– that form the foundations for better discovery,
integration and liquidity.
• 374 on challenge.gov
37. platform challenge [1.2]
• WebID based SSO
– will improve community engagement
– by providing simplified sign on (SSO) for external
users interacting across multiple HDP technology
components,
– making it easier for community collaborators to
contribute,
– leveraging new approaches to decentralized
authentication.
• 375 on challenge.gov
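One way to picture the decentralized authentication step: in the WebID approach, the relying party compares the public key presented by the user with the keys published in the profile document at the user's WebID URI. The sketch below covers only that comparison. Certificate parsing and profile retrieval are omitted, and the class, function, and key values are hypothetical illustrations rather than part of the challenge specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RsaKey:
    """RSA public key as (modulus, exponent); values below are illustrative."""
    modulus: int
    exponent: int

def webid_claim_verified(cert_key, profile_keys):
    """True if the key from the client certificate is listed in the WebID profile."""
    return cert_key in set(profile_keys)

# Key taken from the client certificate (hypothetical values).
presented = RsaKey(modulus=0xABCDEF, exponent=65537)
# Keys published in the profile document at the WebID URI (hypothetical values).
published = [RsaKey(modulus=0x123456, exponent=65537), presented]

print(webid_claim_verified(presented, published))
```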
42. domain challenge [2.3]
• Mapping, Reconciliation and Correlation
– builds on the Metadata domain challenge [1.1]
– begins by acknowledging disparate open government publishing
practices
– and seeks the demonstration of an innovative and automated
solution for transforming semi-structured data into structured data,
– reconciles decentralized distributions about the same data entity
against the master identity of an authoritative source,
– and correlates these master identities when multiple authoritative
sources exist,
– enabling the network effect by introducing strong identity resolution
techniques that ease the ability to aggregate different data about
the same entities from independent publishers.
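The reconciliation step can be sketched as follows: records from independent publishers are matched to the master identity (URI) of an authoritative source by a normalized name. The registry, URIs, and field names are hypothetical, and exact matching on a normalized name is the simplest possible linking heuristic, not the solution the challenge asks for.

```python
import re

# Hypothetical authoritative registry: normalized name -> master URI.
AUTHORITATIVE = {
    "cedars sinai medical center": "http://example.org/id/hospital/1",
    "ucla medical center": "http://example.org/id/hospital/2",
}

def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", name.lower()).split())

def reconcile(records):
    """Attach the master URI (or None when unmatched) to each published record."""
    return [dict(r, uri=AUTHORITATIVE.get(normalize(r["name"]))) for r in records]

records = [
    {"name": "Cedars-Sinai Medical Center", "publisher": "A"},
    {"name": "CEDARS SINAI MEDICAL CENTER ", "publisher": "B"},
    {"name": "Unknown Clinic", "publisher": "C"},
]
for rec in reconcile(records):
    print(rec["publisher"], rec["uri"])
```

Records from publishers A and B resolve to the same URI, so data about that entity can be aggregated; the unmatched record keeps `None` and would need a fuzzier heuristic or manual curation.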
47. platform challenge [2.4]
• Faceted Browsing and Visualization
– builds on the Metadata domain challenge [1.1]
– uses the most popular browser based UI frameworks and libraries
to realize novel exploration and discovery techniques for traversing
large amounts of interrelated data,
– contributing to a growing collection of open source widgets that
make it easy for third parties to create new applications and embed
health data in their content.
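The data side of faceted browsing is small enough to sketch: count the values of a metadata field across the catalog, then narrow the catalog when a user selects a value. The sample records and field names are hypothetical; a browser widget (for example one built on D3) would render these counts.

```python
from collections import Counter

# Hypothetical catalog entries carrying challenge [1.1]-style metadata.
datasets = [
    {"title": "Hospital Compare", "agency": "CMS", "format": "CSV"},
    {"title": "Nursing Home Compare", "agency": "CMS", "format": "XML"},
    {"title": "Health Indicators", "agency": "CDC", "format": "CSV"},
]

def facet_counts(items, field):
    """Count how many items carry each value of the given metadata field."""
    return Counter(item[field] for item in items if field in item)

def drill_down(items, **selected):
    """Keep only the items matching every selected facet value."""
    return [i for i in items if all(i.get(k) == v for k, v in selected.items())]

print(facet_counts(datasets, "agency"))
print([d["title"] for d in drill_down(datasets, agency="CMS", format="CSV")])
```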
48. surfing the domain schemata
no domain knowledge
required to discover
entities and relationships
49. agents construct e/r queries
Siri, which {LA County}
Hospitals have the best
{Heart Attack} stats?
51. platform challenge [2.5]
• Custom API
– also builds on the Metadata domain challenge [1.1]
– makes it possible to tune programmatic access in accordance
with dataset metadata, leveraging an existing ‘Web 3.0’
framework and Linked Data API (LDA) implementation to provide
specialized interfaces
52. a ‘Web 3.0’ API ‘configurator’
• Linked Data API (LDA)
– http://code.google.com/p/linked-data-api/
• open source impl here
– http://code.google.com/p/puelia-php/
• example usage here
– http://reference.data.gov.uk/doc/department
• example api reference docs here
– http://environment.data.gov.uk/lab/doc/api-bwq-reference-v0.2.html
• commercialization example here
– http://kasabi.com/tour
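A rough sketch of how a client might address a Linked Data API list endpoint such as the example usage URL above. It assumes the format-by-extension convention and the reserved `_page` and `_pageSize` parameters described in the LDA specification; whether a given deployment honours them, or still responds at all, is not something the slides confirm.

```python
from urllib.parse import urlencode

def lda_list_url(endpoint, fmt="json", page=0, page_size=10):
    """Build a paged list request, selecting the output format by extension."""
    params = urlencode({"_page": page, "_pageSize": page_size})
    return f"{endpoint}.{fmt}?{params}"

print(lda_list_url("http://reference.data.gov.uk/doc/department", page_size=5))
```

A ‘configurator’ in the sense of challenge [2.5] would generate endpoint definitions like this from dataset metadata instead of having them written by hand.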
53. domain challenge [3.6]
• Correlating HHS – NHS Classifications
– builds on both the Metadata [1.1] and Mapping, Reconciliation and
Correlation [2.3] domain challenges,
– and uses the US and UK health domain specific classification
schemes to exercise the capabilities demonstrated by the
automated solution to [2.3],
– resulting in better international integration of frameworks for
understanding societal outcomes and their corresponding health
statistics.
54. platform challenge [3.7]
• Linked Data API based Data Element Access Services
– builds on the Metadata domain challenge [1.1], and the WebID
based SSO [1.2] and Custom API [2.5] platform challenges
– augmenting WebID based authentication with metadata driven
authorization,
– introducing an innovative security and privacy implementation of
‘data element access services’ (DEAS) as described by the PCAST
Health IT Report,
– resulting in a Custom API configured by domain specific metadata
that governs fine grained access to provide the right data to the
right user.
• ‘secure the data, not just the devices’
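Metadata driven authorization at the data element level can be pictured as a filter applied to each record before the API returns it. This is a minimal sketch under stated assumptions: the policy table, role names, and element names are hypothetical, and a real DEAS implementation would derive the policy from the domain metadata of [1.1] and the role from the WebID authentication of [1.2].

```python
# Hypothetical element-level policy: element name -> roles allowed to read it.
POLICY = {
    "hospital_name": {"public", "researcher", "clinician"},
    "readmission_rate": {"public", "researcher", "clinician"},
    "patient_zip": {"researcher", "clinician"},
    "patient_id": {"clinician"},
}

def authorized_view(record, role, policy=POLICY):
    """Return only the elements the role may read; unlisted elements are denied."""
    return {k: v for k, v in record.items() if role in policy.get(k, set())}

record = {
    "hospital_name": "Example General",
    "readmission_rate": 0.12,
    "patient_zip": "90001",
    "patient_id": "p-001",
    "internal_note": "not covered by any policy entry",
}
print(authorized_view(record, "public"))
print(authorized_view(record, "researcher"))
```

Denying elements that have no policy entry keeps the default closed, which matches the goal of securing the data itself.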