Metadata for Managing Scientific Research Data
NISO/DCMI Webinar: August 22, 2012

Jane Greenberg, Professor and Director of
the SILS Metadata Research Center
janeg@email.unc.edu
Overview
▪   Why should we care?
▪   What is data?
▪   What is metadata's role w.r.t. data?
▪   Selected metadata standards
▪   Challenges, opportunities, and jumping in
▪   Concluding comments
▪   Q&A
Why should we care?
BIG stuff
▪ Digital data deluge (Hey & Trefethen, 2003)
▪ Big data (New York Times)
                                                2008
▪ The fourth paradigm (Jim Gray, 2007)

Just as important
▪ The long tail (Heidorn, 2008)
▪ CODATA/Data-at-Risk Task Group
▪ Scholarly communications, data citation

      Technological affordances for improving and
      advancing science
Cultural shift toward data sharing
▪ National and international policies
  – US NSF and NIH [1, 2]
  – OECD (Organisation for Economic Co-operation and
    Development) [3]
  – INSPIRE (Infrastructure for Spatial Information in the European
    Community), EU Commission [4]
  – UK Medical Research Council [5]

             Dryad "enables scientists to validate
             published findings, explore new analysis
             methodologies, repurpose data for research
             questions unanticipated by the original
             authors, and perform synthetic studies."
             (http://datadryad.org/)
Data
▪ No single agreed-upon definition
▪ One person's data is another person's
  information
▪ Data often implies the "raw" stuff lacking
  context
   – Scholarly context, written assessment
▪ "Essence of science" (Greenberg, et al, 2009)
▪ What is science?
   – The Archaeology Data Service (ADS)
     archaeologydataservice.ac.uk
Data

I know it when I see it

By example: Traditional observations, numbers, and measures stored in
spreadsheets and databases, fossils, phylogenetic trees, and herbarium
samples (White, 2008)

Other disciplines
▪ Bioinformatics: Gene expressions, DNA transcription to RNA translation
▪ Geology, agriculture, surveillance, and historical manuscript research:
  Hyperspectral remote sensing

The Dryad Repository (email w/R. Scherle, July 2012)

quantity   type
3162       Plain Text
476        Microsoft Excel
308        Adobe Portable Document Format
302        Comma-separated values
252        Nexus
153        Microsoft Excel OpenXML
108        Microsoft Word
80         Zip file
62         JPEG image
45         Microsoft Word OpenXML
40         Extensible Markup Language
35         Hypertext Markup Language
21         Rich Text Format
16         FASTA sequence file
15         Tag Image File Format
14         Postscript Files
2          Video Quicktime
2          Mathematica Notebook
1          Microsoft Powerpoint
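To put the Dryad counts in proportion, the tally can be summed and turned into shares. This is only a small sketch: the numbers are transcribed from the table above, and the function name format_shares is a made-up helper, not part of any Dryad tool.

```python
# File-format counts transcribed from the Dryad table (July 2012).
dryad_counts = {
    "Plain Text": 3162,
    "Microsoft Excel": 476,
    "Adobe Portable Document Format": 308,
    "Comma-separated values": 302,
    "Nexus": 252,
    "Microsoft Excel OpenXML": 153,
    "Microsoft Word": 108,
    "Zip file": 80,
    "JPEG image": 62,
    "Microsoft Word OpenXML": 45,
    "Extensible Markup Language": 40,
    "Hypertext Markup Language": 35,
    "Rich Text Format": 21,
    "FASTA sequence file": 16,
    "Tag Image File Format": 15,
    "Postscript Files": 14,
    "Video Quicktime": 2,
    "Mathematica Notebook": 2,
    "Microsoft Powerpoint": 1,
}


def format_shares(counts):
    """Return (total files, {format: percent of total}) for a dict of counts."""
    total = sum(counts.values())
    shares = {name: 100.0 * n / total for name, n in counts.items()}
    return total, shares


total, shares = format_shares(dryad_counts)
print(total)                            # 5094 files in all
print(round(shares["Plain Text"], 1))   # 62.1 percent are plain text
```

Plain text alone accounts for well over half of the deposited files, which fits the "long tail" picture of small, heterogeneous data.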
Metadata defined
… data about data
… information about data

▪ "Metadata or 'data about data' describes the
content, quality, condition, and other
characteristics of data." (FGDC Metadata WG,
1998)

▪ Structured information about an object (data)
that facilitates functions associated with the
object. (Greenberg, 2002, 2003, 2009)
Typical functions

▪ Discover
▪ Manage
▪ Control rights
▪ Identify versions
▪ Certify authenticity
▪ Indicate status
▪ Mark content structure
▪ Situate geospatially
▪ Describe processes
It gets messy really quickly
Metadata for Scientific Research Data

▪ Descriptive
   – General to granular
▪ Value (addressing a topic, "aboutness")
   – Topical (ontologies, subject heading lists/thesauri,
     taxonomies)
▪ Named entities
   – Name authority files (people, organizations,
     geographical jurisdictions, structures, and events)
▪ Geo-spatial (coordinates)
▪ Temporal data (ISO 8601/W3CDTF, or …)
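As one illustration of the temporal item, the sketch below formats dates at a few of the W3CDTF granularities (year, month, day, and a full timestamp with a time zone designator). The function name to_w3cdtf and the granularity labels are assumptions made for this example; only the Python standard library is used, and the time of day in the last call is invented for illustration.

```python
from datetime import date, datetime, timezone


def to_w3cdtf(value, granularity="day"):
    """Format a date or datetime at one W3CDTF granularity (hypothetical helper)."""
    if granularity == "year":
        return f"{value.year:04d}"
    if granularity == "month":
        return f"{value.year:04d}-{value.month:02d}"
    if granularity == "day":
        return f"{value.year:04d}-{value.month:02d}-{value.day:02d}"
    if granularity == "second":
        # W3CDTF requires a time zone designator whenever a time is given.
        if not isinstance(value, datetime) or value.tzinfo is None:
            raise ValueError("a timezone-aware datetime is required")
        return value.isoformat(timespec="seconds").replace("+00:00", "Z")
    raise ValueError(f"unknown granularity: {granularity}")


webinar_day = date(2012, 8, 22)  # the webinar date from the title slide
print(to_w3cdtf(webinar_day, "year"))
print(to_w3cdtf(webinar_day, "day"))
# The time of day below is made up purely to show the timestamp form.
print(to_w3cdtf(datetime(2012, 8, 22, 17, 0, tzinfo=timezone.utc), "second"))
```

Recording dates at the coarsest granularity that is actually known avoids implying precision the data does not have.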
Given the messiness…

"I cannot tell you exactly what metadata
standards, vocabularies, etc. to use…"
Examining metadata schemes

Objectives and principles   Domains        Architectural layout
• Objectives                • Discipline   • Structural design
• Principles                • Genre        • Extent
                            • Format       • Granularity

Metadata Objectives and principles, Domain, and
Architectural Layout (MODAL) framework

(Greenberg, 2005; Willis, et al, JASIST 2012)
Simple schemes [6]

Objectives and principles   Domains                 Architectural layout
• Interoperability          • Multi-disciplinary    • Primarily flat
• Easy to generate, lower   • Any genre or format   • Minimal with means to extend
  barrier to produce                                • General (not granular)

Scheme                                      Objectives and principles   Architectural layout
Dublin Core Metadata Element Set (DCMES)
  ver. 1.1
US MARC bibliographic format                • Need training             • Primarily flat
                                                                        • Extensible
DataCite                                                                • Primarily flat

Dublin Core Application Profile - Dryad [7]

DataCite example, ver. 2.2 [8]: National Institute for Environmental
Studies and Center for Climate System Research Japan

US MARC bibliographic format: World Ocean Circulation Experiment global
data (Moss Landing Marine Labs and the Monterey Bay Aquarium Research
Institute Library) [9]
Simple/moderate schemes

Objectives and principles     Domains                  Architectural layout
• Interoperability balanced   • Greater domain focus   • Primarily flat
  w/specific needs            • Genre diversity        • Extensibility via connecting
• Generation requires more      within a domain          to other schemes
  expertise                                            • Slightly more granular

Scheme                                         Architectural layout
Darwin Core
Access to Biological Collections Data (ABCD)   • Not as flat
Ecological Metadata Language
DCMI Terms                                     • Graph approach

Wieczorek, et al. (2012). Darwin Core: An Evolving Community-
Developed Biodiversity Data Standard.
PLoS One. 2012; 7(1): e29715: doi: 10.1371/journal.pone.0029715.
Access to Biological Collections Data (ABCD) (A minimum record)

<?xml version='1.0' encoding='UTF-8'?>
<DataSets xmlns='http://www.tdwg.org/schemas/abcd/2.06'>
  <DataSet>
    <TechnicalContacts>
      <TechnicalContact>
        <Name>Gerd Müller</Name> <Email>gerd@dfb.de</Email>
      </TechnicalContact>
    </TechnicalContacts>
    <ContentContacts>
      <ContentContact>
        <Name>A Another</Name> <Email>a.another@fake.org</Email>
      </ContentContact>
    </ContentContacts>
    <Metadata>
      <Description>
        <Representation language='en'>
          <Title>PonTaurus collection</Title>
        </Representation>
      </Description>
      <RevisionData>
        <DateModified>2001-03-01T00:00:00</DateModified>
      </RevisionData>
    </Metadata>
    <Units>
      <Unit>
        <SourceInstitutionID>BGBM</SourceInstitutionID>
        <SourceID>PonTaurus</SourceID>
        <UnitID>1136</UnitID>
      </Unit>
    </Units>
  </DataSet>
</DataSets>
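To show how a record like the ABCD minimum record might be read programmatically, here is a minimal sketch using Python's standard xml.etree.ElementTree. The embedded record is a trimmed copy of the slide's example (contacts omitted for brevity), and the function name summarize_abcd is a made-up helper, not part of any ABCD toolkit.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the record itself; the "abcd" prefix is our choice.
NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}

# Trimmed copy of the minimum record (contact blocks left out).
RECORD = """<?xml version='1.0' encoding='UTF-8'?>
<DataSets xmlns='http://www.tdwg.org/schemas/abcd/2.06'>
  <DataSet>
    <Metadata>
      <Description>
        <Representation language='en'>
          <Title>PonTaurus collection</Title>
        </Representation>
      </Description>
    </Metadata>
    <Units>
      <Unit>
        <SourceInstitutionID>BGBM</SourceInstitutionID>
        <SourceID>PonTaurus</SourceID>
        <UnitID>1136</UnitID>
      </Unit>
    </Units>
  </DataSet>
</DataSets>""".encode("utf-8")


def summarize_abcd(xml_bytes):
    """Return one dict per DataSet with its title and unit identifier triples."""
    root = ET.fromstring(xml_bytes)
    summaries = []
    for dataset in root.findall("abcd:DataSet", NS):
        title = dataset.findtext(
            "abcd:Metadata/abcd:Description/abcd:Representation/abcd:Title",
            namespaces=NS,
        )
        units = [
            tuple(
                unit.findtext(f"abcd:{tag}", namespaces=NS)
                for tag in ("SourceInstitutionID", "SourceID", "UnitID")
            )
            for unit in dataset.findall("abcd:Units/abcd:Unit", NS)
        ]
        summaries.append({"title": title, "units": units})
    return summaries


print(summarize_abcd(RECORD))
```

The depth of the element paths is what "not as flat" means in practice: the title sits four levels below the data set, whereas a flat scheme would expose it as a single top-level property.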
Properties in the /terms/ namespace

abstract                educationLevel       modified
accessRights            extent               provenance
accrualMethod           format               publisher
accrualPeriodicity      hasFormat            references
accrualPolicy           hasPart              relation
alternative             hasVersion           replaces
audience                identifier           requires
available               instructionalMethod  rights
bibliographicCitation   isFormatOf           rightsHolder
conformsTo              isPartOf             source
contributor             isReferencedBy       spatial
coverage                isReplacedBy         subject
created                 isRequiredBy         tableOfContents
creator                 issued               temporal
date                    isVersionOf          title
dateAccepted            language             type
dateCopyrighted         license              valid
dateSubmitted           mediator
description             medium
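A flat property list like this lends itself to a simple sanity check on draft records. The sketch below, under the assumption that a record is held as a plain Python dict keyed by property name, flags keys that are not in the /terms/ list. The helper name unknown_properties and the misspelled key are invented for illustration; the title and identifier come from the Dryad example in footnote [7].

```python
# The 55 property names listed on the slide for the /terms/ namespace.
DCTERMS_PROPERTIES = set("""
abstract accessRights accrualMethod accrualPeriodicity accrualPolicy
alternative audience available bibliographicCitation conformsTo contributor
coverage created creator date dateAccepted dateCopyrighted dateSubmitted
description educationLevel extent format hasFormat hasPart hasVersion
identifier instructionalMethod isFormatOf isPartOf isReferencedBy
isReplacedBy isRequiredBy issued isVersionOf language license mediator
medium modified provenance publisher references relation replaces requires
rights rightsHolder source spatial subject tableOfContents temporal title
type valid
""".split())


def unknown_properties(record):
    """Return the sorted keys of `record` that are not /terms/ properties."""
    return sorted(key for key in record if key not in DCTERMS_PROPERTIES)


draft = {
    "title": "Data from: Divergence time estimation using fossils as "
             "terminal taxa and the origins of Lissamphibia",
    "identifier": "doi:10.5061/dryad.8120",
    "creater": "(name omitted)",  # deliberate misspelling of "creator"
}

print(len(DCTERMS_PROPERTIES))
print(unknown_properties(draft))
```

A check like this only validates property names; it says nothing about whether the values follow a vocabulary or encoding scheme.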
Complex schemes

Objectives and principles   Domains              Architectural layout
• Interoperability level    • Genre focus        • Hierarchical
• Generation requires       • Format variation   • Extensive
  greater expertise                              • Granular

Schemes: FGDC, DDI

Content Standard for Digital Geospatial Metadata (CSDGM)/FGDC
1. Identification Information (M)
2. Data Quality Information
3. Spatial Data Organization Information
4. Spatial Reference Information
5. Entity and Attribute Information
6. Distribution Information
7. Metadata Reference Information (M)

Data Documentation Initiative (DDI)
1. Concept
2. Collecting
3. Processing / Archiving
4. Distribution / Archiving
5. Discovery
6. Analysis
7. Repurposing
Summary for descriptive schemes
▪ Simple: Interoperable, easy to generate/low barrier,
  generally multidisciplinary, genre/format agnostic,
  primarily flat, general (not granular), 15-25 properties

▪ Simple/moderate: Interoperability balanced
  w/specific needs, generation requires more expertise,
  greater domain focus, extensible via connecting to
  other schemes, more granular, more properties

▪ Complex: Interoperability level, generation requires
  expertise, genre focus/format variation, hierarchical,
  granular, and extensive (100+ properties)
Challenges and opportunities

Challenges               Opportunities

Workflow/When to         Educate scientists early (Qin, 2009)
generate the metadata?   Integrate into social setting w/Center for
                         Embedded Networked Sensing (CENS)
                         (Borgman, Mayernik, etc., 2009-current;
                         Mayernik's dissertation, 2011)

Methods for generating   Use automatic techniques as much as possible,
metadata (labor          leverage human expertise (Dryad, DataONE Excel
intensive)               project)

Too many standards.      Don't panic, join communities, look for
Which one do I use?      examples. (If you can't find them?)

Do I need to implement   No. Explore and develop a best practice.
my metadata as linked    Pursue a 2-pronged approach (Greenberg, et al,
data?                    2009)
Jumping in…
1. DCMI/NISO Seminars !!
2. DCMI Science and Metadata Community
  (http://wiki.dublincore.org/index.php/DCMI_Science_And_Metadata)

3. Digital Curation Centre (DCC)
  (http://www.dcc.ac.uk/)

4. The Research Data Management
   Training, or MANTRA project
  (http://datalib.edina.ac.uk/mantra/)

5. DataONE workshops and tutorials
  (www.dataone.org/)
Concluding comments
▪ Standards are guidelines; no police
  – Aim for reasonable quality

▪ KISS: Keep it simple, stupid
  – What's vital; what will aid reuse?
▪ Help to move the practice forward
  – Share what you learn

▪ Nothing new/it's all new
  –   Data documentation since ancient times
  –   Silos; let's break them down (Willis, et al, 2012)
  –   Greater connectivity than ever
  –   Cross-disciplinary approaches for problem solving
Q&A
Footnotes
[1] NSF Data Sharing Policy: http://www.nsf.gov/bfa/dias/policy/dmp.jsp.
[2] NIH Data Sharing Policy: http://grants.nih.gov/grants/policy/data_sharing/.
[3] Organisation for Economic Co-operation and Development, Data and
Metadata Reporting and Presentation Handbook: http://www.oecd.org/std/37671574.pdf.
[4] INSPIRE (Infrastructure for Spatial Information in the European Community):
http://inspire.ec.europa.eu/index.cfm/pageid/48. The directive was released 15 May 2007 and
will be implemented in various stages, with full implementation required by 2019; it aims to
create a European Union (EU) spatial data infrastructure.
[5] UK Medical Research Council:
http://www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/datasharing/index.html.
[6] The DCMI Glossary (scroll down for "schema" entry):
http://dublincore.org/documents/usageguide/glossary.shtml#schema.
[7] Dublin Core Example: Data from: Divergence time estimation using fossils as terminal taxa
and the origins of Lissamphibia (Dryad repository):
http://datadryad.org/resource/doi:10.5061/dryad.8120?show=full.
[8] National Institute for Environmental Studies and Center for Climate System Research
Japan, animation data (DataCite):
http://schema.datacite.org/meta/kernel-2.2/example/datacite-metadata-sample-v2.2.xml.
[9] US MARC bibliographic format: World Ocean Circulation Experiment global data (Moss
Landing Marine Labs and the Monterey Bay Aquarium Research Institute Library):
http://mlml.kohalibrary.com/cgi-bin/koha/opac-detail.pl?biblionumber=9282.

Mais conteúdo relacionado

Mais procurados

Data Management Planning at the DCC
Data Management Planning at the DCCData Management Planning at the DCC
Data Management Planning at the DCCMartin Donnelly
 
The Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked DataThe Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked DataRichard Urban
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked DataEUCLID project
 
Microtask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked DataMicrotask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked DataEUCLID project
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Researchadameq
 
SDA2013 Pundit: Creating, Exploring and Consuming Annotations
SDA2013 Pundit: Creating, Exploring and Consuming AnnotationsSDA2013 Pundit: Creating, Exploring and Consuming Annotations
SDA2013 Pundit: Creating, Exploring and Consuming AnnotationsMarco Grassi
 
Better Search With Structured Knowledge
Better Search With Structured KnowledgeBetter Search With Structured Knowledge
Better Search With Structured KnowledgeMichel Dumontier
 
Semantic Web special interest group meeting - IFLA WLIC 2012
Semantic Web special interest group meeting - IFLA WLIC 2012Semantic Web special interest group meeting - IFLA WLIC 2012
Semantic Web special interest group meeting - IFLA WLIC 2012Figoblog
 
Introduction to Metadata
Introduction to MetadataIntroduction to Metadata
Introduction to MetadataJenn Riley
 
Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)Sebastian Ryszard Kruk
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceMarin Dimitrov
 
Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...New York University
 
Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...
Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...
Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...National Institute of Informatics (NII)
 
The Mysteries of Metadata
The Mysteries of MetadataThe Mysteries of Metadata
The Mysteries of MetadataAmit Sheth
 
IFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round tableIFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round tableFigoblog
 
Towards digitizing scholarly communication
Towards digitizing scholarly communicationTowards digitizing scholarly communication
Towards digitizing scholarly communicationSören Auer
 

Mais procurados (20)

Data Management Planning at the DCC
Data Management Planning at the DCCData Management Planning at the DCC
Data Management Planning at the DCC
 
The Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked DataThe Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked Data
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked Data
 
Microtask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked DataMicrotask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked Data
 
Providing Linked Data
Providing Linked DataProviding Linked Data
Providing Linked Data
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Research
 
SDA2013 Pundit: Creating, Exploring and Consuming Annotations
SDA2013 Pundit: Creating, Exploring and Consuming AnnotationsSDA2013 Pundit: Creating, Exploring and Consuming Annotations
SDA2013 Pundit: Creating, Exploring and Consuming Annotations
 
Better Search With Structured Knowledge
Better Search With Structured KnowledgeBetter Search With Structured Knowledge
Better Search With Structured Knowledge
 
Semantic Web special interest group meeting - IFLA WLIC 2012
Semantic Web special interest group meeting - IFLA WLIC 2012Semantic Web special interest group meeting - IFLA WLIC 2012
Semantic Web special interest group meeting - IFLA WLIC 2012
 
Introduction to Metadata
Introduction to MetadataIntroduction to Metadata
Introduction to Metadata
 
Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
 
Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...Mending the Gap between Library's Electronic and Print Collections in ILS and...
Mending the Gap between Library's Electronic and Print Collections in ILS and...
 
Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...
Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...
Research Data-DOI Experiment in Japanese DOI Registration Agency (Japan Link ...
 
Querying Linked Data
Querying Linked DataQuerying Linked Data
Querying Linked Data
 
The Mysteries of Metadata
The Mysteries of MetadataThe Mysteries of Metadata
The Mysteries of Metadata
 
IFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round tableIFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round table
 
Semantic Digital Libraries
Semantic Digital LibrariesSemantic Digital Libraries
Semantic Digital Libraries
 
Towards digitizing scholarly communication
Towards digitizing scholarly communicationTowards digitizing scholarly communication
Towards digitizing scholarly communication
 

Semelhante a Managing Scientific Research Data

Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Jian Qin
 
SEAD Datanet and Sustainability Science
SEAD Datanet and Sustainability Science SEAD Datanet and Sustainability Science
SEAD Datanet and Sustainability Science Robert H. McDonald
 
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...ASIS&T
 
Managing the research life cycle
Managing the research life cycleManaging the research life cycle
Managing the research life cycleSherry Lake
 
LAC Group - Metadata for mere mortals (Choosing standards)
LAC Group - Metadata for mere mortals (Choosing standards)LAC Group - Metadata for mere mortals (Choosing standards)
LAC Group - Metadata for mere mortals (Choosing standards)LAC Group
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...San Diego Supercomputer Center
 
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingFedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingPeter Haase
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?Jian Qin
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?Jian Qin
 
Efficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD VivaEfficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD VivaGezim Sejdiu
 
2012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 12012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 1Dr.-Ing. Thomas Hartmann
 
e-Science, Research Data and Libaries
e-Science, Research Data and Libariese-Science, Research Data and Libaries
e-Science, Research Data and LibariesRob Grim
 
Data science.pptx
Data science.pptxData science.pptx
Data science.pptxHakkinsRaj
 
Authoring Tool of AAT with DADT
Authoring Tool of AAT with DADTAuthoring Tool of AAT with DADT
Authoring Tool of AAT with DADTAAT Taiwan
 
Introduction to Metadata Standards
Introduction to Metadata StandardsIntroduction to Metadata Standards
Introduction to Metadata StandardsDavid Massart
 
Technologies For Appraising and Managing Electronic Records
Technologies For Appraising and Managing Electronic RecordsTechnologies For Appraising and Managing Electronic Records
Technologies For Appraising and Managing Electronic Recordspbajcsy
 
CSU-ACADIS_dataManagement101-20120217
CSU-ACADIS_dataManagement101-20120217CSU-ACADIS_dataManagement101-20120217
CSU-ACADIS_dataManagement101-20120217lyarmey
 

Semelhante a Managing Scientific Research Data (20)

Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...
 
SEAD Datanet and Sustainability Science
SEAD Datanet and Sustainability Science SEAD Datanet and Sustainability Science
SEAD Datanet and Sustainability Science
 
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
 
Managing the research life cycle
Managing the research life cycleManaging the research life cycle
Managing the research life cycle
 
LAC Group - Metadata for mere mortals (Choosing standards)
LAC Group - Metadata for mere mortals (Choosing standards)LAC Group - Metadata for mere mortals (Choosing standards)
LAC Group - Metadata for mere mortals (Choosing standards)
 
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
 
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingFedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?
 
Efficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD VivaEfficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
 
2012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 12012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 1
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
e-Science, Research Data and Libaries
e-Science, Research Data and Libariese-Science, Research Data and Libaries
e-Science, Research Data and Libaries
 
Data science.pptx
Data science.pptxData science.pptx
Data science.pptx
 
Authoring Tool of AAT with DADT
Authoring Tool of AAT with DADTAuthoring Tool of AAT with DADT
Authoring Tool of AAT with DADT
 
Introduction to Metadata Standards
Introduction to Metadata StandardsIntroduction to Metadata Standards
Introduction to Metadata Standards
 
Technologies For Appraising and Managing Electronic Records
Technologies For Appraising and Managing Electronic RecordsTechnologies For Appraising and Managing Electronic Records
Technologies For Appraising and Managing Electronic Records
 
CSU-ACADIS_dataManagement101-20120217
CSU-ACADIS_dataManagement101-20120217CSU-ACADIS_dataManagement101-20120217
CSU-ACADIS_dataManagement101-20120217
 
L07 metadata
L07 metadataL07 metadata
L07 metadata
 

Managing Scientific Research Data

  • 1. Metadata for Managing Scientific Research Data NISO/DCMI Webinar: August 22, 2012 Jane Greenberg, Professor and Director of the SILS Metadata Research Center janeg@email.unc.edu
  • 2. Overview ▪ Why should we care? ▪ What is data? ▪ What is metadata’s role w.r.t. data? ▪ Selected metadata standards ▪ Challenges, opportunities, and jumping in ▪ Concluding comments ▪ Q&A
  • 3. Why should we care? BIG stuff ▪ Digital data deluge (Hey & Trefethen, 2003) ▪ Big data (New York Times) 2008 ▪ The fourth paradigm (Jim Gray, 2007) Just as important ▪ The long tail (Heidorn, 2008) ▪ CODATA/Data-at-Risk Task Group ▪ Scholarly communications, data citation Technological affordances for improving and advancing science
  • 4. Cultural shift toward data sharing ▪ National and international policies – US NSF and NIH [1, 2] – OECD (Organisation for Economic Co-operation and Development) [3] – INSPIRE (Infrastructure for Spatial Information in the European Community), EU Commission [4] – UK Medical Research Council [5] Dryad “enables scientists to validate published findings, explore new analysis methodologies, repurpose data for research questions unanticipated by the original authors, and perform synthetic studies.” (http://datadryad.org/)
  • 6. Data ▪ No single agreed-upon definition ▪ One person’s data is another person’s information ▪ Data often implies the “raw” stuff lacking context – Scholarly context, written assessment ▪ “Essence of science” (Greenberg et al., 2009) ▪ What is science? – The Archaeology Data Service (ADS) archaeologydataservice.ac.uk
  • 7. Data ▪ I know it when I see it ▪ By example: Traditional observations, numbers, and measures stored in spreadsheets and databases, fossils, phylogenetic trees, and herbarium samples (White, 2008) ▪ Other disciplines – Bioinformatics: Gene expressions, DNA transcription to RNA translation – Geology, agriculture, surveillance, and historical manuscript research: Hyperspectral remote sensing ▪ The Dryad Repository, quantity by file type (email w/R. Scherle, July 2012): 3162 Plain Text; 476 Microsoft Excel; 308 Adobe Portable Document Format; 302 Comma-separated values; 252 Nexus; 153 Microsoft Excel OpenXML; 108 Microsoft Word; 80 Zip file; 62 JPEG image; 45 Microsoft Word OpenXML; 40 Extensible Markup Language; 35 Hypertext Markup Language; 21 Rich Text Format; 16 FASTA sequence file; 15 Tag Image File Format; 14 Postscript Files; 2 Video Quicktime; 2 Mathematica Notebook; 1 Microsoft Powerpoint
  • 9. Metadata defined ▪ Data about data; information about data ▪ “Metadata or ‘data about data’ describes the content, quality, condition, and other characteristics of data.” (FGDC Metadata WG, 1998) ▪ Structured information about an object (data) that facilitates functions associated with the object. (Greenberg, 2002, 2003, 2009)
  • 10. Typical functions ▪ Control rights ▪ Discover ▪ Manage ▪ Identify versions ▪ Certify authenticity ▪ Indicate status ▪ Mark content structure ▪ Situate geospatially ▪ Describe processes
  • 12. It gets messy really quickly
  • 13. Metadata for Scientific Research Data: Descriptive – General to granular ▪ Value (addressing a topic, “aboutness”) – Topical (ontologies, subject heading lists/thesauri, taxonomies) ▪ Named entities – Name authority files (people, organizations, geographical jurisdictions, structures, and events) ▪ Geo-spatial (coordinates) ▪ Temporal data (ISO 8601/W3CDTF, or …)
  • 14. Given the messiness… “I cannot tell you exactly what metadata standards, vocabularies, etc. to use…”
  • 15. Examining metadata schemes ▪ Objectives and principles: Objectives; Principles ▪ Domains: Discipline; Genre; Format ▪ Architectural layout: Structural design; Extent; Granularity. Metadata Objectives and principles, Domain, and Architectural Layout (MODAL) framework (Greenberg, 2005; Willis et al., JASIST 2012)
  • 16. Simple schemes [6] ▪ Objectives and principles: Interoperability; Easy to generate, lower barrier to produce ▪ Domains: Multi-disciplinary; Any genre or format ▪ Architectural layout: Primarily flat; Minimal with means to extend; General (not granular) ▪ Examples: Dublin Core Metadata Element Set (DCMES) ver. 1.1; US MARC bibliographic format (need training, primarily flat, extensible); DataCite (primarily flat)
  • 17. Dublin Core Application Profile: Dryad [7]
  • 18. DataCite example, ver.2.2 [8] National Institute for Environmental Studies and Center for Climate System Research Japan
  • 19. US MARC bibliographic format: World Ocean Circulation Experiment global data (Moss Landing Marine Labs and the Monterey Bay Aquarium Research Institute Library) [9]
  • 20. Simple/moderate schemes ▪ Objectives and principles: Interoperability balanced w/specific needs; Generation requires more expertise ▪ Domains: Greater domain focus; Genre diversity within a domain ▪ Architectural layout: Primarily flat; Extensibility via connecting schemes; Slightly more granular ▪ Examples: Darwin Core; Access to Biological Collections Data (ABCD) (not as flat); Ecological Metadata Language; DCMI Terms (graph approach)
  • 21. Wieczorek, et al. (2012). Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS One, 7(1): e29715. doi: 10.1371/journal.pone.0029715.
  • 22. Access to Biological Collections Data (ABCD) (A minimum record) <?xml version='1.0' encoding='UTF-8'?> <DataSets xmlns='http://www.tdwg.org/schemas/abcd/2.06'> <DataSet> <TechnicalContacts> <TechnicalContact> <Name>Gerd Müller</Name> <Email>gerd@dfb.de</Email> </TechnicalContact> </TechnicalContacts> <ContentContacts> <ContentContact> <Name>A Another</Name> <Email>a.another@fake.org</Email> </ContentContact> </ContentContacts> <Metadata> <Description> <Representation language='en'> <Title>PonTaurus collection</Title> </Representation> </Description> <RevisionData> <DateModified>2001-03-01T00:00:00</DateModified> </RevisionData> </Metadata> <Units> <Unit> <SourceInstitutionID>BGBM</SourceInstitutionID> <SourceID>PonTaurus</SourceID> <UnitID>1136</UnitID> </Unit> </Units> </DataSet> </DataSets>
  • 23. Properties in the /terms/ namespace: abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience, available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date, dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format, hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator, medium, modified, provenance, publisher, references, relation, replaces, requires, rights, rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid
  • 24. Complex schemes ▪ Objectives and principles: Interoperability level; Generation requires greater expertise ▪ Domains: Genre focus; Format variation ▪ Architectural layout: Hierarchical; Extensive; Granular ▪ FGDC: Content Standard for Digital Geospatial Metadata (CSDGM)/FGDC – 1. Identification Information (M) – 2. Data Quality Information – 3. Spatial Data Organization Information – 4. Spatial Reference Information – 5. Entity and Attribute Information – 6. Distribution Information – 7. Metadata Reference Information (M) ▪ DDI: Data Documentation Initiative (DDI) – 1. Concept – 2. Collecting – 3. Processing → Archiving – 4. Distribution → Archiving – 5. Discovery – 6. Analysis – 7. Repurposing
  • 25. Summary for descriptive schemes ▪ Simple: Interoperable, easy to generate/low barrier, generally multidisciplinary, genre/format agnostic, primarily flat, general (not granular), 15-25 properties ▪ Simple/moderate: Interoperability balanced w/specific needs, generation requires more expertise, greater domain focus, extensible via connecting to other schemes, more granular, more properties ▪ Complex: Interoperable level, generation requires expertise, genre focus/format variation, hierarchical, granular, and extensive (100+ properties)
  • 28. Challenges and opportunities ▪ Challenge: Workflow/When to generate the metadata? Opportunities: Educate scientists early (Qin, 2009); Integrate into social setting w/Center for Embedded Networked Sensing (CENS) (Borgman, Mayernik, etc., 2009-current; Mayernik’s dissertation, 2011) ▪ Challenge: Methods for generating metadata (labor intensive). Opportunities: Use automatic techniques as much as possible, leverage human expertise (Dryad, DataONE Excel project) ▪ Challenge: Too many standards. Which one do I use? Opportunities: Don’t panic, join communities, look for examples. (If you can’t find them?) ▪ Challenge: Do I need to implement my metadata as linked data? Opportunities: No. Explore and develop a best practice. Pursue a two-pronged approach (Greenberg et al., 2009)
  • 29. Jumping in… 1. DCMI/NISO Seminars! 2. DCMI Science and Metadata Community (http://wiki.dublincore.org/index.php/DCMI_Science_And_Metadata) 3. Digital Curation Centre (DCC) (http://www.dcc.ac.uk/) 4. The Research Data Management Training, or MANTRA project (http://datalib.edina.ac.uk/mantra/) 5. DataONE workshops and tutorials (www.dataone.org/)
  • 31. Concluding comments ▪ Standards are guidelines; no police – Aim for reasonable quality ▪ KISS: Keep it simple, stupid – What’s vital; what will aid reuse? ▪ Help to move the practice forward – Share what you learn ▪ Nothing new/it’s all new – Data documentation since ancient times – SILOS; let’s break them down (Willis et al., 2012) – Greater connectivity than ever – Cross-disciplinary approaches for problem solving
  • 33. Footnotes [1] NSF Data Sharing Policy: http://www.nsf.gov/bfa/dias/policy/dmp.jsp. [2] NIH Data Sharing Policy: http://grants.nih.gov/grants/policy/data_sharing/. [3] Organisation for Economic Co-operation and Development, Data and Metadata Reporting and Presentation Handbook: http://www.oecd.org/std/37671574.pdf. [4] INSPIRE (Infrastructure for Spatial Information in the European Community): http://inspire.ec.europa.eu/index.cfm/pageid/48. The directive, released 15 May 2007, aims to create a European Union (EU) spatial data infrastructure; it will be implemented in various stages, with full implementation required by 2019. [5] UK Medical Research Council: http://www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/datasharing/index.html. [6] The DCMI Glossary (scroll down for “schema” entry): http://dublincore.org/documents/usageguide/glossary.shtml#schema. [7] Dublin Core Example: Data from: Divergence time estimation using fossils as terminal taxa and the origins of Lissamphibia (Dryad repository): http://datadryad.org/resource/doi:10.5061/dryad.8120?show=full. [8] National Institute for Environmental Studies and Center for Climate System Research Japan, animation data (DataCite): http://schema.datacite.org/meta/kernel-2.2/example/datacite-metadata-sample-v2.2.xml. [9] US MARC bibliographic format: World Ocean Circulation Experiment global data (Moss Landing Marine Labs and the Monterey Bay Aquarium Research Institute Library): http://mlml.kohalibrary.com/cgi-bin/koha/opac-detail.pl?biblionumber=9282.
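
As a rough illustration of how a record like the minimum ABCD example on slide 22 might be read programmatically, the sketch below parses a trimmed copy of that record with Python's standard library. It pulls out the title, the unit identifiers, and the ISO 8601 modification date mentioned on slide 13. The function name `summarize_abcd`, the dictionary layout, and the trimming of the contact elements are assumptions made for this example; they are not part of the ABCD standard or of the original slides.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace URI copied from the ABCD record on slide 22.
ABCD_NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}

# Trimmed excerpt of the slide 22 record (contacts and XML declaration
# omitted here for brevity; this trimming is an assumption of the sketch).
RECORD = """
<DataSets xmlns='http://www.tdwg.org/schemas/abcd/2.06'>
  <DataSet>
    <Metadata>
      <Description>
        <Representation language='en'>
          <Title>PonTaurus collection</Title>
        </Representation>
      </Description>
      <RevisionData>
        <DateModified>2001-03-01T00:00:00</DateModified>
      </RevisionData>
    </Metadata>
    <Units>
      <Unit>
        <SourceInstitutionID>BGBM</SourceInstitutionID>
        <SourceID>PonTaurus</SourceID>
        <UnitID>1136</UnitID>
      </Unit>
    </Units>
  </DataSet>
</DataSets>
"""


def summarize_abcd(xml_text):
    """Return one summary dict per DataSet (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    summaries = []
    for dataset in root.findall("abcd:DataSet", ABCD_NS):
        title = dataset.findtext(
            "abcd:Metadata/abcd:Description/abcd:Representation/abcd:Title",
            namespaces=ABCD_NS,
        )
        modified = dataset.findtext(
            "abcd:Metadata/abcd:RevisionData/abcd:DateModified",
            namespaces=ABCD_NS,
        )
        unit_ids = [
            unit.findtext("abcd:UnitID", namespaces=ABCD_NS)
            for unit in dataset.findall("abcd:Units/abcd:Unit", ABCD_NS)
        ]
        summaries.append(
            {
                "title": title,
                # ISO 8601 timestamp parsed into a datetime when present.
                "modified": datetime.fromisoformat(modified) if modified else None,
                "unit_ids": unit_ids,
            }
        )
    return summaries


if __name__ == "__main__":
    for summary in summarize_abcd(RECORD):
        print(summary)
```

Because ABCD is hierarchical, each value is addressed by a namespaced path rather than a flat property name, which is one practical sense in which the slides describe it as "not as flat" as the simple schemes.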