Data Curation and Biodiversity Research:
The BiSciCol Project and a look at the “Triplifier Simplifier”


John Deck, University of California, Berkeley
Brian Stucky, University of Colorado, Boulder
Lukasz Ziemba, University of Florida, Gainesville
Nico Cellinese, University of Florida, Gainesville
Rob Guralnick, University of Colorado, Boulder
BiSciCol Team
Reed Beaman, Nico Cellinese, Jonathan Coddington, Neil Davies, John Deck,
Rob Guralnick, Bryan P. Heidorn, Chris Meyer, Tom Orrell, Rich Pyle,
Kate Rachwal, Brian Stucky, Rob Whitton, Lukasz Ziemba
•   BiSciCol is funded by the National Science Foundation, 2010 – 2014
•   Infrastructure to tag & track specimens & derivatives in cyberspace
•   Relies on globally unique identifiers (GUIDs) to track objects
•   Implements a Linked Data approach
•   Provides support for the Global Names Architecture
A Biological Relationship Graph …

   Taxonomic Type Filter

   Class Filter
      [X] Specimens
      [ ] Tissues
      [X] Sequences
Why Linked Data? Why BiSciCol?
Here is Gustav’s Problem

   Generates Lots of Data…
   (Prefers to collect stuff)
Biodiversity Data Challenges

   Data is Distributed
   Rapidly Changing Technologies
   Covers Multiple Domains
Solving Biodiversity Data Challenges with BiSciCol and Linked Data

   Group data into classes.
   Assign identifiers.
   Link identifiers.
   Publish.

   (figure labels: Is a dwc:Event, Is a dwc:Event)

   [ ] Ocean Sampling Day
   [X] Moorea Biocode
   [X] SI MSNGR System
   [+] Add My Data
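
The four steps on this slide (group, assign identifiers, link, publish) can be sketched in plain Python. This is only an illustration under stated assumptions, not the Triplifier's implementation: the row fields, the `ex:Specimen` class, and the `ex:collectedDuring` predicate are hypothetical, and only `dwc:Event` comes from the slide.

```python
import uuid

RDF_TYPE = "rdf:type"


def mint_id():
    """Assign a globally unique identifier (here a random UUID URN)."""
    return uuid.uuid4().urn


def triplify(rows):
    """Turn flat rows into (subject, predicate, object) triples.

    Each row is assumed to describe one specimen and the collecting
    event it came from; the column names are hypothetical.
    """
    triples = []
    event_ids = {}
    for row in rows:
        # Group: rows sharing an event key describe the same dwc:Event.
        key = row["event"]
        if key not in event_ids:
            event_ids[key] = mint_id()
            triples.append((event_ids[key], RDF_TYPE, "dwc:Event"))
        # Assign: every specimen gets its own identifier.
        specimen_id = mint_id()
        triples.append((specimen_id, RDF_TYPE, "ex:Specimen"))
        triples.append((specimen_id, "rdfs:label", row["specimen"]))
        # Link: connect the specimen identifier to the event identifier.
        triples.append((specimen_id, "ex:collectedDuring", event_ids[key]))
    return triples


def publish(triples):
    """Serialize triples as readable lines (not strict N-Triples)."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    rows = [
        {"event": "E1", "specimen": "A"},
        {"event": "E1", "specimen": "B"},
    ]
    print(publish(triplify(rows)))
```

Both rows share one event identifier, so the two specimens end up linked to the same `dwc:Event` node rather than to two copies of it.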
The Triplifier

PART 1: Loading Data

   Darwin Core Archive
   Spreadsheets
   MySQL
   KEMU
The Triplifier

PART 2: Assigning Entities

   (cartoon) From Gary Larson, adapted by Barry Smith in a Referent Tracking
   presentation at the Semantics of Biodiversity Workshop, 2012.
The Triplifier


PART 3: Assign Links
Triplify!: View graph-based data

   Query
   Response
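
The query and response pair on this slide can be pictured as pattern matching over triples. The sketch below is a hypothetical stand-in, not the Triplifier's query interface: the sample identifiers and predicates are invented, and `None` is used as a wildcard.

```python
# Sample graph; all identifiers and predicates are hypothetical.
TRIPLES = [
    ("specimen1", "rdf:type", "ex:Specimen"),
    ("tissue1", "ex:derivesFrom", "specimen1"),
    ("sequence1", "ex:derivesFrom", "tissue1"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    # Query: what derives directly from specimen1?
    for subject, _, _ in match(TRIPLES, p="ex:derivesFrom", o="specimen1"):
        print(subject)
```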
The Triplifier Interface




     Publish
What challenges are we facing now?
(for BiSciCol, Linked Data, and data integration in general)
Identifier Issues

   Persistence
   Solutions:
   • DOIs (http://doi.org/)
   • EZIDs (http://ezid.net/)

   Assignment at the source is difficult
   Solutions:
   • Calculated namespaces (e.g. geo:lat,lng) via PDAs
   • UUIDs (randomly generated, effectively unique)

   (figure: The digestible RFID tag)

   The Semantic Web requires URIs (scheme : string), but many standards
   (including Darwin Core) do not require URIs for identifiers
   Solution:
   • Promote use of URIs for identifiers in all standards.
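
Two of the ideas on this slide, identifiers that can be assigned at the source and the "scheme : string" shape of a URI, can be illustrated with the Python standard library. This is a rough sketch under stated assumptions: the helper names are hypothetical, the scheme check is purely syntactic (it follows the scheme grammar of RFC 3986 and says nothing about persistence or resolvability), and the five-decimal coordinate precision is an arbitrary choice.

```python
import re
import uuid

# RFC 3986 scheme: a letter, then letters, digits, "+", "." or "-",
# followed by ":" and a non-empty remainder.
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:.+")


def new_uuid_urn():
    """Mint a random identifier with no central registry involved."""
    return uuid.uuid4().urn


def geo_id(lat, lng):
    """Calculate an identifier from coordinates, in geo:lat,lng form."""
    return f"geo:{lat:.5f},{lng:.5f}"


def looks_like_uri(identifier):
    """True if the identifier has the 'scheme : string' shape."""
    return bool(SCHEME_RE.match(identifier))


if __name__ == "__main__":
    for candidate in (new_uuid_urn(), geo_id(37.87, -122.26),
                      "http://doi.org/", "12345"):
        print(candidate, looks_like_uri(candidate))
```

A bare catalog number such as `12345` fails the check, which is the situation the slide describes for standards that accept plain strings as identifiers.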
Classification Issues

   Inadequate representational units
   Confusion between representational units

   (figure labels: “Occurrence”; “Sample, Specimen, Individual, Aggregation”)

   Solutions:
   • Continue working on clarity in term definitions
   • Work from upper level ontologies (e.g. Basic Formal Ontology) to derive
      definitions.
Relation Issues
        Nonsensical conclusions are possible!

        Solution:
        • Apply directional links only where appropriate.
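
One way to see how link direction affects conclusions is to traverse the same small graph twice, once following links only from subject to object and once treating every link as two-way. The data and predicate names below are hypothetical, and the traversal is a generic breadth-first search rather than BiSciCol's own algorithm.

```python
from collections import defaultdict, deque

# (subject, predicate, object); identifiers and predicates are hypothetical.
TRIPLES = [
    ("tissue1", "derivesFrom", "specimen1"),
    ("sequence1", "derivesFrom", "tissue1"),
    ("specimen1", "relatedTo", "event1"),
    ("specimen2", "relatedTo", "event1"),
]


def reachable(start, triples, directed):
    """Return every node reachable from start by breadth-first search."""
    graph = defaultdict(set)
    for s, _, o in triples:
        graph[s].add(o)
        if not directed:
            # Treat the link as two-way.
            graph[o].add(s)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


if __name__ == "__main__":
    print(sorted(reachable("sequence1", TRIPLES, directed=True)))
    print(sorted(reachable("sequence1", TRIPLES, directed=False)))
```

With directed links, `sequence1` traces back to `tissue1`, `specimen1`, and `event1`. With two-way links it also reaches `specimen2`, a specimen it was never derived from, which is the kind of nonsensical conclusion the slide warns about.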
Adoption Issues
    Critical mass required for effective utilization
    Solutions:
    • Work with aggregators (GBIF, VertNet, NCBI).
    • View triples as a publishable unit

    Reality is complicated
    Solutions:
    • Work collaboratively (e.g. BioPortal, hackathons, interdisciplinary
       workshops)
The BiSciCol Mission

• BiSciCol tackles biodiversity data challenges:
    •    Tracking and integration of objects across disciplines
    •    Linking derivatives back to their source
• BiSciCol is about community, collaborative practice
    •    Commitment to standards, ontologies
    •    Agreement on permanent, resolvable identifiers
    •    Triplification of data sources to enhance linked data



        http://biscicol.blogspot.com/       http://biscicol.org
