Data Repositories and Services
Xiamen University Library
June 8, 2012

Jian Qin
School of Information Studies
Syracuse University
http://eslib.ischool.syr.edu/jqin/
  
Agenda
•  What is a repository? Repository software?
•  What does it do?
•  How does it work?
•  Case studies:
   –  Dryad: an international repository of data and publications for basic and applied biosciences
   –  Dataverse: a data repository system
  




6/8/12    Data repositories and services    2
  
What is a data repository?

Data Repository is a logical (and sometimes physical) partitioning of data where multiple databases which apply to specific applications or sets of applications reside.
http://www.learn.geekinterview.com/data-warehouse/dw-basics/what-is-data-repository.html

Repository commonly refers to a location for storage, often for safety or preservation.
http://en.wikipedia.org/wiki/Repository
  	
  




  
WHAT CAN WE EXPECT IN A DATA REPOSITORY?
  




  
Technical features
•  Standards
   –  OAI-PMH
   –  Z39.50 protocol
   –  Open source license
•  Hardware
   –  Minimum hardware requirements
   –  SAN support
•  Software
   –  OS
   –  Programming language
   –  Database
   –  Web server
   –  Java servlet engine
   –  Search engine
   –  Other
•  Staff requirements
   –  UNIX systems administrator
   –  Java programmer
   –  PERL programmer
   –  Python programmer
Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  
Features and functions
•  Repository & system administration
   –  User registration, authentication & password administration
   –  Module-level APIs
•  Content submission administration
   –  Define multiple collections within the same instance of the system
   –  Submission stages
   –  Submission support
   –  System-generated usage stats and reports
Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  	
  	
  

  
Functions of repositories
•  Content management
   –  Content import/export
   –  Document/object formats
   –  Metadata
   –  Real-time updating and indexing of accepted content
•  Dissemination
   –  User interface
   –  Search capability
      •  Full text
      •  All descriptive metadata
      •  Selected metadata fields
      •  Browse
      •  Sort search results
   –  Indexed by Google/other search engines
•  Archiving
   –  Persistent document identification
   –  Data preservation report
   –  Object history/version control
•  System maintenance
   –  System support
      •  Documentation/manual
      •  Listserv
      •  Bug track/feature request system
      •  Formal support/help desk
Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  	
  	
  
  
The context of repositories
[Diagram labels: Research community; Institutional repository (publications, presentations, reports, etc.); Data repository (datasets); Disciplines; Standards; Technology]
  
  
Institutional repositories
(Sidebar: institutional repository: publications, presentations, reports, etc.)
•  An institutional repository (IR) consists of formally organized and managed collections of digital content generated by faculty, staff, and students at an institution
•  Types of IRs:
   –  Collection-based digital repositories managed by library professionals
   –  Course management systems and associated file stores
   –  Collection of research data and reports managed by research units (centers, laboratories, etc.)
   –  Student academic portfolio systems
   –  Institutional file storage systems
   –  Digital asset management workflow systems
   –  Web content management systems used by institutions or departments to store and stage web content
EDUCAUSE Evolving Technologies Committee. (2003). Institutional repositories: Enhancing teaching, learning, and research. http://net.educause.edu/ir/library/pdf/DEC0303.pdf
  	
  
  
Data repositories
(Sidebar: data repository: datasets)
•  No one agreed-upon definition
•  Characteristics:
   –  A repository operated by an academic institution/unit or a research organization
   –  A system for storing, managing, preserving, and providing access to data
   –  Centered on a discipline or a research field involving multiple disciplines
   –  Policies governing the intellectual property rights, management, access, sharing, and citation
  

  
Dryad: a repository for data and publications
http://datadryad.org/
•  As a data repository, Dryad provides a platform to associate data with underlying publications.
•  Content acquisition: user submission
•  How to motivate users to submit data?
   •  Make it simple and rewarding
   •  Provide detailed support information about:
      •  Depositing data
      •  Managing data
      •  Intellectual property rights (CC0)
      •  Download data packages
      •  View usage statistics
  
  
Dryad metadata record example
http://datadryad.org/handle/10255/dryad.8085
  




  
Dryad metadata record example (cont'd)
Individual files in the data package. The metadata shows:
•  # of downloads
•  File technical data
•  Copyright type
•  Documentation for the data file
  




  
Dryad Backend
•  Uses core features of DSpace with modifications or complete replacement
•  Uses OAI-PMH to allow metadata harvesting
   –  Metadata formats available for harvesting include
      •  METS/MODS, OAI-DC (Dublin Core), OAI-ORE/Atom, and RDF/DC
•  Uses DOI to identify Dryad data packages and files
http://wiki.datadryad.org/Category:Technical_Documentation
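
To make the harvesting bullet concrete, here is a minimal sketch of how a harvester might build an OAI-PMH ListRecords request. The base URL is a made-up placeholder, not Dryad's actual endpoint, and the function name is hypothetical. The verb and argument names (ListRecords, metadataPrefix, set, resumptionToken) and the oai_dc prefix come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for illustration; not Dryad's actual OAI-PMH address.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build the URL for an OAI-PMH ListRecords request."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument in the protocol:
        # it is sent together with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# First request, then a follow-up request using a token from the previous page.
print(list_records_url(BASE_URL))
print(list_records_url(BASE_URL, resumption_token="abc123"))
```

A harvester would typically repeat the second form until the repository stops returning a resumption token.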
  	
  

  
DOI Examples
•  Data packages
   –  doi:10.5061/dryad.1664
   –  doi:10.5061/dryad.642
   –  doi:10.5061/dryad.1307
•  Data files
   –  doi:10.5061/dryad.1664/1
   –  doi:10.5061/dryad.642/1
   –  doi:10.5061/dryad.1307/1
   –  doi:10.5061/dryad.1307/2
   –  doi:10.5061/dryad.1307/3
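
In the examples above, a file DOI looks like its package DOI followed by a slash and a file number. The sketch below separates the two cases on that assumption. The pattern is inferred only from these examples, so real Dryad identifiers may take other forms, and the function name is hypothetical.

```python
import re

# Pattern inferred from the slide's examples only; real Dryad DOIs may
# take other forms.
DRYAD_DOI = re.compile(r"^doi:(10\.5061/dryad\.\w+)(?:/(\d+))?$")


def classify_doi(doi):
    """Return (kind, package DOI, file number or None) for a Dryad-style DOI."""
    match = DRYAD_DOI.match(doi)
    if match is None:
        raise ValueError(f"not a Dryad-style DOI: {doi!r}")
    package, file_number = match.groups()
    if file_number is None:
        return ("package", "doi:" + package, None)
    return ("file", "doi:" + package, int(file_number))


for doi in ["doi:10.5061/dryad.1307", "doi:10.5061/dryad.1307/2"]:
    print(classify_doi(doi))
```

Grouping file DOIs by the returned package DOI reconstructs which files belong to which data package.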
  
  
DATA REPOSITORY SOFTWARE
  


  
Dataverse metadata editing interface
  




  
Dataverse metadata editing interface (cont'd)
  




  
Standards and tools for repositories
•  Open Archives Initiative (OAI) and its Protocol for Metadata Harvesting (OAI-PMH)
•  Tools (open source):
   –  DSpace (http://www.dspace.org)
   –  Fedora (http://www.fedora-commons.org/)
   –  Dataverse (http://thedata.org/)
   –  EPrints (http://www.eprints.org/)
   –  More: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software
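
As a rough illustration of what a harvester receives over OAI-PMH, the sketch below reads Dublin Core fields out of a ListRecords response using Python's standard library. The sample XML, its identifier, and its field values are invented for this example. The three namespace URIs are the ones defined for OAI-PMH 2.0, oai_dc, and Dublin Core elements.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response, trimmed to the elements used below.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:123</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_records(xml_text):
    """Return one dict per record with its identifier, titles, and creators."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier",
                                          namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    return records


print(extract_records(SAMPLE))
```

Because every OAI-PMH repository is required to support the oai_dc format, the same parsing approach should carry across the tools listed above, though each may expose richer formats as well.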
  	
  



Powerful Google developer tools for immediate impact! (2023-24 C)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Data repositories -- Xiamen University 2012 06-08

  • 1. Data Repositories and Services
      Xiamen University Library, June 8, 2012
      Jian Qin, School of Information Studies, Syracuse University
      http://eslib.ischool.syr.edu/jqin/
  • 2. Agenda
      •  What is a repository? Repository software?
      •  What does it do?
      •  How does it work?
      •  Case studies:
          –  Dryad: an international repository of data and publications for basic and applied biosciences
          –  Dataverse: a data repository system
  • 3. What is a data repository?
      •  "Data Repository is a logical (and sometimes physical) partitioning of data where multiple databases which apply to specific applications or sets of applications reside."
         http://www.learn.geekinterview.com/data-warehouse/dw-basics/what-is-data-repository.html
      •  "Repository commonly refers to a location for storage, often for safety or preservation."
         http://en.wikipedia.org/wiki/Repository
  • 4. WHAT CAN WE EXPECT IN A DATA REPOSITORY?
  • 5. Technical features
      •  Standards
          –  OAI-PMH
          –  Z39.50 protocol
          –  Open source license
      •  Hardware
          –  Minimum hardware requirements
          –  SAN support
      •  Software
          –  OS
          –  Programming language
          –  Database
          –  Web server
          –  Java servlet engine
          –  Search engine
          –  Other
      •  Staff requirements
          –  UNIX systems administrator
          –  Java programmer
          –  PERL programmer
          –  Python programmer
      Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  • 6. Features and functions
      •  Repository & system administration
          –  User registration, authentication & password administration
          –  Module-level APIs
      •  Content submission administration
          –  Define multiple collections within the same instance of the system
          –  Submission stages
          –  Submission support
          –  System-generated usage statistics and reports
      Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  • 7. Functions of repositories
      •  Content management
          –  Content import/export
          –  Document/object formats
          –  Metadata
          –  Real-time updating and indexing of accepted content
      •  Dissemination
          –  User interface
          –  Search capability
              •  Full text
              •  All descriptive metadata
              •  Selected metadata fields
              •  Browse
              •  Sort search results
          –  Indexed by Google/other search engines
      •  Archiving
          –  Persistent document identification
          –  Data preservation report
          –  Object history/version control
      •  System maintenance
          –  System support
              •  Documentation/manual
              •  Listserv
              •  Bug track/feature request system
              •  Formal support/help desk
      Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  • 8. The context of repositories
      [Diagram]
      •  Research community
      •  Institutional repository: publications, presentations, reports, etc.
      •  Data repository: datasets
      •  Disciplines, standards, technology
  • 9. Institutional repositories
      [Diagram: Institutional repository (publications, presentations, reports, etc.)]
      •  An institutional repository (IR) consists of formally organized and managed collections of digital content generated by faculty, staff, and students at an institution
      •  Types of IRs:
          –  Collection-based digital repositories managed by library professionals
          –  Course management systems and associated file stores
          –  Collections of research data and reports managed by research units (centers, laboratories, etc.)
          –  Student academic portfolio systems
          –  Institutional file storage systems
          –  Digital asset management workflow systems
          –  Web content management systems used by institutions or departments to store and stage web content
      EDUCAUSE Evolving Technologies Committee. (2003). Institutional repositories: Enhancing teaching, learning, and research. http://net.educause.edu/ir/library/pdf/DEC0303.pdf
  • 10. Data repositories
      [Diagram: Data repository (datasets)]
      •  No one agreed-upon definition
      •  Characteristics:
          –  A repository operated by an academic institution/unit or a research organization
          –  A system for storing, managing, preserving, and providing access to data
          –  Centered on a discipline or a research field involving multiple disciplines
          –  Policies governing the intellectual property rights, management, access, sharing, and citation
  • 11. Dryad: a repository for data and publications
      http://datadryad.org/
      •  As a data repository, Dryad provides a platform to associate data with underlying publications.
      •  Content acquisition: user submission
      •  How to motivate users to submit data?
          •  Make it simple and rewarding
          •  Provide detailed support information about:
              •  Depositing data
              •  Managing data
              •  Intellectual property rights (CC0)
          •  Download data packages
          •  View usage statistics
  • 12. Dryad metadata record example
      http://datadryad.org/handle/10255/dryad.8085
  • 13. Dryad metadata record example (cont'd)
      Individual files in the data package. The metadata shows:
      •  Number of downloads
      •  File technical data
      •  Copyright type
      •  Documentation for the data file
  • 14. Dryad Backend
      •  Uses core features of DSpace with modifications or complete replacement
      •  Uses OAI-PMH to allow metadata harvesting
          –  Metadata formats available for harvesting include METS/MODS, OAI-DC (Dublin Core), OAI-ORE/Atom, and RDF/DC
      •  Uses DOI to identify Dryad data packages and files
      http://wiki.datadryad.org/Category:Technical_Documentation
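To make the harvesting step concrete, here is a minimal Python sketch of an OAI-PMH `ListRecords` request and of reading Dublin Core titles from the response. The verb, the `metadataPrefix` parameter, and the XML namespaces come from the OAI-PMH specification. The base URL, the function names, and the sample response are hypothetical placeholders, not Dryad's actual endpoint or data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL (base_url is supplied by the caller)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # optional selective-harvesting date
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return every dc:title found inside the records of a response."""
    root = ET.fromstring(xml_text)
    titles = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        for title in record.iterfind(".//dc:title", OAI_NS):
            titles.append(title.text)
    return titles


# Hand-written sample response for illustration only (not real Dryad output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample data package</dc:title>
          <dc:identifier>doi:10.0000/example.1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical endpoint; substitute the repository's documented OAI base URL.
    print(list_records_url("http://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also follow the `resumptionToken` element that OAI-PMH uses to page through large result sets, which this sketch omits.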
  • 15. DOI Examples
      •  Data packages
          –  doi:10.5061/dryad.1664
          –  doi:10.5061/dryad.642
          –  doi:10.5061/dryad.1307
      •  Data files
          –  doi:10.5061/dryad.1664/1
          –  doi:10.5061/dryad.642/1
          –  doi:10.5061/dryad.1307/1
          –  doi:10.5061/dryad.1307/2
          –  doi:10.5061/dryad.1307/3
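The examples above suggest a pattern: a data file DOI is its package DOI plus a `/n` suffix. The Python sketch below splits such an identifier into package and file number and builds a resolver link through the general DOI resolver at https://doi.org/. The regular expression and the function name are assumptions that cover only the numeric form shown on this slide. They are not an official Dryad grammar, and the sketch does not check that any of these DOIs resolve.

```python
import re

# Assumed pattern, inferred only from the slide's examples:
# doi:10.5061/dryad.<package number>[/<file number>]
DRYAD_DOI = re.compile(r"^doi:(10\.5061)/dryad\.(\d+)(?:/(\d+))?$")


def parse_dryad_doi(doi):
    """Split a Dryad-style DOI into its package DOI and optional file number."""
    match = DRYAD_DOI.match(doi)
    if match is None:
        raise ValueError(f"not a recognised Dryad-style DOI: {doi!r}")
    prefix, package_id, file_no = match.groups()
    return {
        "package": f"doi:{prefix}/dryad.{package_id}",
        "file_number": int(file_no) if file_no else None,
        # doi.org is the general DOI resolver; the path is the DOI itself.
        "resolver_url": "https://doi.org/" + doi[len("doi:"):],
    }


if __name__ == "__main__":
    for example in ("doi:10.5061/dryad.1307", "doi:10.5061/dryad.1307/2"):
        print(parse_dryad_doi(example))
```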
  • 16. DATA REPOSITORY SOFTWARE
  • 18. Dataverse metadata editing interface
  • 19. Dataverse metadata editing interface (cont'd)
  • 21. Standards and tools for repositories
      •  Open Archives Initiative (OAI) and its Protocol for Metadata Harvesting (OAI-PMH)
      •  Tools (open source):
          –  DSpace (http://www.dspace.org)
          –  Fedora (http://www.fedora-commons.org/)
          –  Dataverse (http://thedata.org/)
          –  EPrints (http://www.eprints.org/)
          –  More: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software