FilteredPush
PI James Hanken
Maureen A. Kelly
David B. Lowery
Paul J. Morris
Robert A. Morris
James A. Macklin
Bertram Ludäscher
Tianhong Song
Sven Koehler

Former Project Participants
Lei Dou
Chinua Iloabachie
Timothy McPhillips
Donna Tremonte
Zhimin Wang

University of California, Davis
Harvard University Herbaria
University of Massachusetts, Boston

NSF: DBI #0960535 (Production, Year 3 of 3)
NSF: DBI #0646266 (Prototype, Complete)
http://wiki.filteredpush.org
Annotate What?
● Curatorial and scientific metadata for an
  estimated 3 bn. specimens in the world's
  natural history collections. 1 bn in U.S.
● Digital record of about 1.5% at most. See
  gbif.org (Also includes non-captured
  observations).
http://www.flickr.com/photos/nhm_beetle_id/5556112706/
● Ongoing programs in U.S., Europe,
  Australia, ... to digitize all.
● In U.S., 10 year, $500 M effort now underway;
  automated; semi-automated;
   – QC issues; correlations to other paper and digital
     resources; intentionally changing data
The Problem: Data Quality
● Collections & occurrence data is
  all over the map
   – … literally (off the map!)
   – … even after “digitization” (sensu stricto)
   – … and after “digitization” (sensu lato),
     a.k.a. “computerization”


● Issues:
   – Lat/Long transposition,
     coordinate & projection issues
   – Data entry/creation, “fuzzy”
     data, naming issues, bit rot,
     data conversions and
     transformations, schema
     mappings, … (you name it)
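
One of the issues listed above, Lat/Long transposition, lends itself to a mechanical first-pass check. The following is a minimal sketch, not FilteredPush code: the function name, the decimal-degree inputs, and the three result labels are assumptions made for illustration. It flags a pair whose latitude is out of range but which would be valid if the two values were swapped.

```python
def classify_coordinates(lat, lon):
    """Classify a (lat, lon) pair given in decimal degrees.

    Returns "ok", "possibly_transposed", or "invalid".
    Hypothetical helper, for illustration only.
    """
    lat_ok = -90.0 <= lat <= 90.0
    lon_ok = -180.0 <= lon <= 180.0
    if lat_ok and lon_ok:
        return "ok"
    # Out of range as given, but the swapped pair would be valid.
    if -90.0 <= lon <= 90.0 and -180.0 <= lat <= 180.0:
        return "possibly_transposed"
    return "invalid"
```

A check like this cannot catch a transposed pair in which both values fall within ±90 degrees; those need other evidence, such as the stated country or locality.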

(1) Kvetch about data
(2) Push to interested parties
(3) Filter
(4) Change data in databases
(5) Store all assertions
http://xkcd.com/386/
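
The five steps above can be sketched as a toy in-memory publish/subscribe loop. This is an illustration under stated assumptions, not the FilteredPush implementation: the `Annotation` and `Network` classes, their fields, and the record identifiers are all hypothetical. Step (4), changing data in a database, is left to each recipient and is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Annotation:
    annotator: str
    target: str      # identifier of the record being annotated (hypothetical)
    assertion: str   # (1) the complaint or proposed correction


@dataclass
class Network:
    """Toy stand-in for an annotation network (hypothetical)."""
    store: List[Annotation] = field(default_factory=list)
    interests: Dict[str, Callable[[Annotation], bool]] = field(default_factory=dict)
    inboxes: Dict[str, List[Annotation]] = field(default_factory=dict)

    def subscribe(self, name, interest):
        # Register a party together with the filter describing its interest.
        self.interests[name] = interest
        self.inboxes[name] = []

    def publish(self, annotation):
        self.store.append(annotation)                   # (5) store all assertions
        for name, interest in self.interests.items():
            if interest(annotation):                    # (3) filter
                self.inboxes[name].append(annotation)   # (2) push


network = Network()
network.subscribe("huh", lambda a: a.target.startswith("huh:"))
network.publish(Annotation("alice", "huh:0001", "lat/long look transposed"))
network.publish(Annotation("bob", "other:42", "collector name misspelled"))
```

Every annotation is retained in `store` whether or not any subscriber's filter matched it, which mirrors the idea that assertions are kept even when nobody acts on them.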
Modeling distributed annotations of mutable, actionable, distributed data
● OA (Open Annotation) is almost adequate
+ clear distinction between annotation model
  and domain models
+ provenance support
+ web document-centric features are optional
+ SpecificResource model fits data annotation
  well
+ Use of CNT (Content in RDF) for machine-opaque resources
+ (RDF is friendly to semantic pub/sub)
Modeling distributed annotations of mutable, actionable, distributed data
● OA is almost adequate
- Model of Evidence for assertions in Body
- Model to convey annotator's Expectation to
  consuming apps
- Model use of queries as Selectors
(-) Model “transcription” as an oa:Motivation
(?) Model domain associations between Target
  and Body
# The target is any occurrence whose latitude lies outside [-90, 90].
:anAnnotation a oa:Annotation ;
  oa:hasTarget :nsptarget1 ;
  oa:hasBody <http://fp.org/rangeViolation> ;
  oa:hasBody :invalidLatitudeText .
:nsptarget1 a oad:AnySuchResource ;
  oa:hasSelector :findInvalidLatitudes .
:findInvalidLatitudes a oad:SparqlQuerySelector ;
  oad:hasQuery """select distinct ?x WHERE {
         ?x a dwcFP:Occurrence .
         ?x dwc:hasLatitude ?lat .
         filter (?lat > 90 || ?lat < -90) . }""" ;
   […]   .
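
The point of using a query as the selector is that the annotation's target is defined by a condition rather than by an enumerated list of records. The sketch below shows that idea in plain Python, assuming the intended condition is a latitude outside the valid range of -90 to 90 degrees. The record layout (dicts with `id` and `latitude` keys) and the function name are assumptions for illustration; no RDF store or SPARQL engine is involved.

```python
def find_invalid_latitudes(occurrences):
    """Return ids of occurrence records whose latitude is outside [-90, 90].

    Hypothetical stand-in for evaluating a query-based selector.
    """
    invalid = []
    for record in occurrences:
        lat = record.get("latitude")
        if lat is None:
            continue  # no latitude asserted, so nothing to flag
        if lat > 90 or lat < -90:
            invalid.append(record["id"])
    return invalid


occurrences = [
    {"id": "occ-1", "latitude": 42.4},
    {"id": "occ-2", "latitude": 142.4},
    {"id": "occ-3", "latitude": -90.5},
    {"id": "occ-4"},
]
flagged = find_invalid_latitudes(occurrences)
```

Because the selector is re-evaluated against whatever data is present, a record added later with an out-of-range latitude would fall under the same annotation without anyone editing the target.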
[Architecture diagram. Labels recovered from the figure, with grouping inferred from the flattened layout:
 Clients: Specify 6, Morphbank, Symbiota, Kepler Kuration; FilteredPush instance (shown twice)
 AnnotationProcessor: Mapper (Specify6 Driver, Specify6-HUH Driver, SQL Driver); ClientLibrary (SparqlRules, OA/OAD, dwcFP); FP-oauth-consumer; FP-oauth-provider
 Client Tools: FP-PHP-Library; ClientLibrary (SparqlRules, OA/OAD, dwcFP); RDF Handler
 FP Network: Triage; FP Access Point; Knowledge (MySQL, Fuseki, Fedora, Mongo); Messaging (JMS-KVP, SPARQL Push); Analysis (Kepler)]
Who asserted photo:owner?
No confusion about provenance of the photo:owner assertion?
END


http://wiki.filteredpush.org

Video there in a few(?) weeks
Specimen record annotation
