Now taking submissions…




– revolutionizing data dissemination,
  organization and use.
              Tam Sneddon
             BGI-Hong Kong



        www.gigadb.org
Overview
Introduction          What is           ,
           /          why we want your
                      data and why you
                      should submit to us?


                       Published datasets
Data Publishing

                       New database features
    DOIs

                       Future tools: Galaxy/Cloud
 Reproducibility/Reuse


 Utility/Usability


 Standards/Searchability/Sharing


 Data publishing/DOI

              www.gigasciencejournal.com
                   www.gigadb.org
DataCite goal: “increase acceptance of research data as legitimate, citable contributions to the scholarly record”
Currently: 36 public datasets

Humans
Ancient DNA
- Aboriginal Australian
- Saqqaq Eskimo
Asian individual (YH)
- DNA methylome
- Genome assembly
- Transcriptome
Cancer
- Hepatocellular carcinoma
- Single-cell bladder cancer
Human exome – chronic hepatitis B infection predisposing variants

Vertebrates
Darwin finch
Giant panda
Macaque
- Chinese rhesus
- Crab-eating
Minipig
Mouse methylomes
Naked mole rat
Penguin
- Adelie penguin
- Emperor penguin
Pigeon, domestic
Polar bear
Sheep, domestic
Tibetan antelope

Invertebrates
Ant
- Florida carpenter ant
- Jerdon’s jumping ant
- Leaf-cutter ant
Roundworm
Schistosoma haematobium
Silkworm, domestic and wild

Plants
Chinese cabbage
Cucumber, domestic
Foxtail millet
Pigeonpea
Potato
Sorghum

Microbes
E. coli O104:H4 TY-2482
T2D gut metagenome

Cell-lines
Chinese Hamster Ovary (CHO)
Currently: 36 public datasets
*14TB*
Currently: 36 public datasets
***15 pre-publication***
Currently: 36 public datasets
*5 citations in the references*
- *Transcriptome* (Asian individual (YH))
- *Single-cell bladder cancer*
- *Mouse methylomes*
- *Polar bear*
- *Sorghum*
Currently: 36 public datasets
*5 citations in the references*
*Sorghum*
Complemented by data submitted to INSDC databases:
- Raw data: SRA:SRA046843
- Assemblies of 3 strains: Genbank:AHAO00000000-AHAQ00000000
- SNPs: dbSNP batch ids:1056306-10563068
- CNVs, InDels, SV: dbVAR:nstd63
GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological
and biomedical research as it enters the era of “big-data”… (see more)
http://dx.doi.org/10.5524/100015
                                      http://gigadb.org/100015




Related DOIs:
10.5524/100013 (is supplemented by)
10.5524/100014 (is supplemented by)
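
The links above suggest a simple pattern: the DOI 10.5524/100015 resolves through dx.doi.org, and the same numeric suffix addresses the dataset page on gigadb.org. The following is a minimal sketch of that mapping, not a GigaDB API. It assumes, from this single example only, that the part after the 10.5524/ prefix is always the GigaDB dataset number. The names `GIGADB_PREFIX`, `dataset_links` and `related` are hypothetical and used only for illustration.

```python
GIGADB_PREFIX = "10.5524"  # DOI prefix seen in the slide's examples


def dataset_links(doi: str) -> dict:
    """Build the two URL forms shown on the slide for a GigaDB-style DOI.

    Assumption: the suffix after the prefix is the numeric dataset id
    used in the gigadb.org address (true for the one example shown).
    """
    prefix, sep, suffix = doi.partition("/")
    if prefix != GIGADB_PREFIX or not sep or not suffix.isdigit():
        raise ValueError(f"not a GigaDB-style DOI: {doi!r}")
    return {
        "doi_url": f"http://dx.doi.org/{doi}",
        "gigadb_url": f"http://gigadb.org/{suffix}",
    }


# Related DOIs listed on the slide, with the relation label given there.
related = {
    "10.5524/100013": "is supplemented by",
    "10.5524/100014": "is supplemented by",
}

if __name__ == "__main__":
    print(dataset_links("10.5524/100015"))
    for doi, relation in related.items():
        print(relation, dataset_links(doi)["doi_url"])
```

Keeping the relation label next to each related DOI mirrors how the slide presents them, so a citation can carry both the identifier and how it relates to the main dataset.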
Galaxy for GigaScience

Bioinformatics
Development      Biomedical and bioinformatics research   Publishing
Thanks to:
Laurie Goodman        Shaoguang Liang (BGI-SZ)
Scott Edmunds         Tin-Lap Lee (CUHK)
Alexandra Basford     Qiong Luo (HKUST)
Peter Li              Senghong Wang (HKUST)
Jesse Si Zhe          Yan Zhou (HKUST)
                      Cogini
Contact us:            editorial@gigasciencejournal.com
                       database@gigasciencejournal.com

Follow us:             @gigascience
                       facebook.com/GigaScience
                       blogs.openaccesscentral.com/blogs/gigablog/


              www.gigadb.org

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Tam Sneddon: Revolutionizing data dissemination, organization and use.

  • 1. Now taking submissions… – revolutionizing data dissemination, organization and use. Tam Sneddon BGI-Hong Kong www.gigadb.org
  • 2. Overview Introduction What is , / why we want your data and why you should submit to us? Published datasets Data Publishing New database features DOIs Future tools: Galaxy/Cloud
  • 3.  Reproducibility/Reuse  Utility/Usability  Standards/Searchability/Sharing  Data publishing/DOI www.gigasciencejournal.com www.gigadb.org
  • 4. DataCite goal: “increase acceptance of research as legitimate, citable contributions to the scholarly record”
  • 5.
  • 6. Currently: 36 public datasets. Humans: Ancient DNA (Aboriginal Australian, Saqqaq Eskimo); Asian individual (YH) (DNA methylome, Genome assembly, Transcriptome); Cancer (Hepatocellular carcinoma, Single-cell bladder cancer); Human exome – chronic hepatitis B infection predisposing variants. Vertebrates: Darwin finch; Giant panda; Macaque (Chinese rhesus, Crab-eating); Minipig; Mouse methylomes; Naked mole rat; Penguin (Adelie penguin, Emperor penguin); Pigeon, domestic; Polar bear; Sheep, domestic; Tibetan antelope. Invertebrates: Ant (Florida carpenter ant, Jerdon’s jumping ant, Leaf-cutter ant); Roundworm; Schistosoma haematobium; Silkworm, domestic and wild. Plants: Chinese cabbage; Cucumber, domestic; Foxtail millet; Pigeonpea; Potato; Sorghum. Microbes: E. coli O104:H4 TY-2482; T2D gut metagenome. Cell-lines: Chinese Hamster Ovary.
  • 7. Currently: 36 public datasets (list as on slide 6). Highlighted: Cancer (Hepatocellular carcinoma), 14TB.
  • 8. Currently: 36 public datasets (list as on slide 6). Highlighted: 15 pre-publication.
  • 9. Currently: 36 public datasets (list as on slide 6); 5 citations in the references. Highlighted: Mouse methylomes; Sorghum; Transcriptome; Polar bear; Single-cell bladder cancer.
  • 10. Currently: 36 public datasets (list as on slide 6); 5 citations in the references. Highlighted: Sorghum. Complemented by data submitted to INSDC databases: raw data, SRA:SRA046843; assemblies of 3 strains, Genbank:AHAO00000000-AHAQ00000000; SNPs, dbSNP batch ids:1056306-10563068; CNVs, InDels and SV, dbVAR:nstd63.
  • 11. Currently: 36 public datasets (list as on slide 6); 5 citations in the references. Highlighted: Transcriptome.
  • 12. Currently: 36 public datasets (list as on slide 6); 5 citations in the references. Highlighted: Transcriptome; Polar bear.
  • 13. Currently: 36 public datasets (list as on slide 6); 5 citations in the references. Highlighted: Mouse methylomes; Single-cell bladder cancer.
  • 14. GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data”… (see more)
  • 15. GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data”… (see more)
  • 16.
  • 17. GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data”… (see more)
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27. http://dx.doi.org/10.5524/100015 http://gigadb.org/100015 Related DOIs: 10.5524/100013 (is supplemented by) 10.5524/100014 (is supplemented by)
  • 28.
  • 29. Galaxy for GigaScience Bioinformatics Development Biomedical and bioinformatics research Publishing
  • 30. Thanks to: Laurie Goodman, Scott Edmunds, Alexandra Basford, Peter Li, Jesse Si Zhe; Shaoguang Liang (BGI-SZ), Tin-Lap Lee (CUHK), Qiong Luo (HKUST), Senghong Wang (HKUST), Yan Zhou (HKUST), Cogini. Contact us: editorial@gigasciencejournal.com, database@gigasciencejournal.com. Follow us: @gigascience, facebook.com/GigaScience, blogs.openaccesscentral.com/blogs/gigablog/. www.gigadb.org

Editor's Notes

  1. I would like to thank the organizers for the opportunity to present the Giga database this evening. I realize I’m all that stands between you and the alcohol outside this room, so I’ll try not to run over time!
  2. I would first like to give a brief introduction to GigaDB and GigaScience, then I’ll describe GigaDB in more detail, say why we want your data, and hopefully give you convincing reasons WHY you should submit to us! I’ll then mention DataCite and what it means for a dataset to be assigned a DOI, then I’ll give you examples of some of the datasets in GigaDB and how they are cited and acknowledged, describe the features of the new GigaDB website (expected next month), and finally I’ll sum up with tools our team are working on and hope to integrate with GigaDB in the upcoming months.
  3. Basically, GigaScience aims to be a home for large-scale biological and biomedical studies by providing a place for data hosting, and by giving authors additional credit for making their data available by assigning a DOI to each published dataset. The GigaScience journal is open-access and published online by BioMed Central in collaboration with BGI. Scott Edmunds, the Editor, is in the audience. GigaScience officially launched in July this year, with GigaDB as the associated database built to host the supplementary files, images, software and any other data from the GigaScience article. The criteria and focus of both the GigaScience journal and the database include the following. Reproducibility/Reuse: because the GigaScience datasets are all open-access and assigned a DOI, they are stable and permanent, so results can be tested and reproduced and the data reused for reanalysis or comparison of new analyses. Utility/Usability: the new GigaDB website will have integrated tools such as Galaxy and MyExperiment (which I’ll mention briefly at the end of my talk) to promote more widespread access, viewing and analysis of data, and integration of the BGI Cloud Computing resources for handling and analyzing large-scale data will allow any researcher to access and analyze the data no matter how large or small their institution’s IT infrastructure. Standards/Searchability/Sharing: we support BioSharing and the use of ISA-Tab to aid and promote best practice in metadata reporting and sharing, so the data can be portable across other platforms. We mandate that all supporting data must be publicly available, and we encourage MIBBI (Minimum Information for Biological and Biomedical Investigations) compliance and use of community reporting checklists. Data publishing/DOI: finally, as mentioned, we register all datasets and DOIs with DataCite so that they are citable, and we hope this will promote rapid release of data and encourage researchers to release their data pre-publication.
  4. So, a little bit about DOIs, or Digital Object Identifiers. DOIs are unique identifiers that are also resolvable to a webpage and have been used in the journal world for a long time to provide permanent identifiers and links to journal articles. We register our DOIs with DataCite, which was set up specifically for datasets and to provide incentives and credit to the data producers. Their goal is to “increase acceptance of research as legitimate, citable contributions to the scholarly record”. We automatically generate the metadata XML from GigaDB and provide as much as possible within the DataCite schema to aid discovery of the datasets via a central metadata repository (with an open API) and other metadata harvesters, including the upcoming Data Citation Index by Thomson Reuters. For example, if you search DataCite for ‘GigaDB’ there are 35 records returned, corresponding to the 35 published datasets in GigaDB. The 10.5524 prefix is unique to the GigaScience dataset project, and our datasets start with Genomic Data from E. coli, the first DOI we released pre-publication, at 100001 and then go up sequentially. The first 5 datasets listed here just happen to be Genomic, but we currently have Transcriptomic, Epigenomic and Metagenomic datasets, with Proteomic datasets in the pipeline and plans to extend to include the likes of biomedical imaging and environmental studies.
  5. If we randomly select DOI:10.5524/100015 you can view the metadata associated with the Genome Sequence of the YH individual. The citation includes the authors, year of publication, title, publisher and resolvable DOI. The DOI URL takes you to the GigaDB landing page for this study, so even if the landing page URL changes we can update the metadata and the DOI will still resolve to the webpage. We have then registered the abstract, resource type, a subject tag of ‘Genomic’, the CC0 license, size of the dataset and related identifiers. In this case the DOI is referenced by the Nature article and is supplemented by the GigaScience datasets 100013 and 100014, which are the supplementary transcriptome and methylome datasets of the YH individual, respectively.
  6. As you saw with the DOI search, to date we have issued DOIs to 36 datasets including human, vertebrates, invertebrates, plants, microbes and cell-lines.
  7. We have the capacity to store very large datasets at BGI, which is exemplified by the Asian Cancer Research Group’s Hepatocellular carcinoma dataset, which is 14 terabytes in size. By providing tools and integration with the BGI Cloud we hope to make this important dataset available for anyone to access and analyze. Many of the datasets in GigaDB are also part of larger collaborations and projects such as the Genome 10K, which includes our most recent release of the Darwin finch genome assembly and annotation. With the new GigaDB interface you can search specifically for datasets from these projects.
  8. Many of these datasets were made public and the DOI released prior to publication and, I would like to stress, this DID NOT prevent subsequent publication.
  9. Indeed, five subsequent publications cite the respective GigaScience DOI in the references: the transcriptome from the YH lymphoblastoid cell line; the single-cell whole exome sequence from an individual bladder cancer; the MEDUSA computational pipeline used to identify differentially methylated regions in mouse; the polar bear genome; and the sorghum genome. Publications are in the pipeline for several of the remaining datasets on the list.
  10. The first of these was the Sorghum genome and analyses, published in Genome Biology last year. As noted, reference 62 cites the dataset DOI. I would also like to stress that the DOI is a complement to, and not a replacement for, deposition of relevant data in appropriate INSDC databases at EBI, NCBI or DDBJ, and it is a requirement prior to submission to GigaDB that data be deposited in such repositories. In the case of Sorghum we also worked with the authors to help them submit the SNP and structural variants to dbSNP and dbVar respectively.
  11. A GigaScience dataset citation is also included in the YH Transcriptome paper published in Nature Biotechnology in February this year. As you can see, the dataset was published in 2011 but this did not prevent subsequent publication of the analysis paper.
  12. In the case of the polar bear, a group different from the one that produced the original dataset published in Science, citing the GigaScience dataset.
  13. Finally, there are two citations from the GigaScience journal in the last couple of months since its official launch. One is the Mouse Methylome computational pipeline and the other is the Single Cell Bladder Cancer genome. I would like to highlight that the dataset for the Mouse methylome paper includes not only the raw fastq and alignment files, which were submitted to the SRA and GEO repositories, but also the MEDUSA software and bigwig methylation files, all of which are represented in ISA-TAB format. So, I hope I have convinced you that making your data public prior to publication is not just in the best interests of science but also increases your publication and citation list to aid in grant applications and career advancement!
  14. And now that you all want to submit to GigaDB, how do you do that, how will people search and find your data and, other than citing your DOI, what will they be able to do with the data? We have redesigned the underlying Giga database and we’re working on the front end, which we hope to make public early next month, so the following slides are a mix of screenshots from the development site overlaid with tweaks made in PowerPoint to illustrate features you can hope to see when we go live. These include: a home page image slider for browsing datasets; a text box search, which I will demonstrate shortly;
  15. and an advanced search option…
  16. …which, if you click it, gives you detailed instructions on the syntax used by the Sphinx search engine.
  17. Here I would like to mention the login system where a user can save searches, sign up for email alerts and submit Excel submission files.
  18. This is my profile page. I am logged in and have two saved searches. If new GigaScience datasets are released that match my search criteria I will be emailed a notification with links to the datasets so I don’t have to keep checking GigaDB for new content that I may be interested in.
  19. Since I am logged in I also have the option to submit to GigaDB.
  20. An Excel template file is provided for download, along with 2 completed example files for guidance.
  21. There are also help pages with more detailed instructions on using the website and submitting data to GigaDB. Once I confirm that I have read the GigaDB Terms and Conditions, I can upload my Excel submission file and a member of the GigaDB team should contact me within 3-5 working days. We welcome feedback on the submission system, so please do let us know of any improvements to the Excel submission file that would ease the process.
  22. Now, if we move on to the search facility: as an example, if we search for the YH individual in the search box we get 3 datasets returned, the original YH Genome and the supplementary methylome and transcriptome datasets from the same individual. If you have many results you can use the Filters to narrow down your search, restricting by organism, dataset type, project, publication date or modification date.
  23. You can also hover over a dataset to read the abstract before clicking through to a DOI landing page.
  24. Alternatively, if you are looking for files to download across datasets, you can click on the File tab and use the Filters to further refine your file search, here narrowing it down by filtering on file type, file format, file size or release date.
  25. Incidentally, all the hover-over ‘I’ icons you see provide information, in this example describing what the different file formats are.
  26. This download function is still being worked on but will also allow you to select multiple files for download or for direct upload to Galaxy and other tools in development which I’ll touch on at the end of my talk.
  27. This is an example landing page for DOI 10.5524/100015, the YH genome dataset. It will be accessible both from the GigaDB URL and the DOI URL. These pages are still in development, but what you will see is the dataset metadata, including: date released; dataset type; title; abstract; how the dataset should be cited; links to related manuscripts, datasets, additional information, genome browsers, accessions and projects; and sample details.
  28. And finally, at the bottom, file descriptions and options (not shown in this illustration) to download the files (or upload them to tools such as Galaxy).
  29. Leading on from that, current and future plans include collaborating with Tin-Lap Lee at the Chinese University of Hong Kong to integrate an instance of the Galaxy bioinformatics platform with GigaDB, so users can make full use of the data in GigaDB by linking it to other resources and we can incorporate fully executable papers. One such submission is a new SOAPdenovo pipeline. The SOAP tools have been wrapped in Galaxy, the workflow defined in MyExperiment, and the data will be issued with a DOI and accessible via GigaDB. Utilizing the BGI cloud if necessary, users will then be able to reproduce all the steps described in the GigaScience paper to test, reanalyze, compare results, etc. Since we would like GigaDB to be a host for data types that have no other home, such as imaging data, we are investigating adding other tools, such as an image viewer and the like, to support accessibility to and usability of the data. So, if you have a large-scale biological or biomedical dataset and/or a pipeline or software that you would like to submit to GigaScience, we would love to hear from you, so please come and talk to Scott or myself.
  30. That just leaves me to thank the GigaScience team: Laurie, Scott, Alexandra, Peter and Jesse; BGI for their support, specifically Shaoguang for IT and bioinformatics support; our collaborators on the database, website and tools: Tin-Lap, Qiong, Senhong, Yan and the Cogini web design team; DataCite for providing the DOI service; and the ISA Commons team for their support and advocacy for best practice in metadata reporting and sharing. Thank you for listening.
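
As a rough illustration of the DOI scheme described in notes 4 and 5 and on slide 27, the sketch below builds and checks GigaDB-style DOIs. It assumes only what the talk states: the 10.5524 prefix, dataset numbers that start at 100001 and go up sequentially, and the two address forms shown on slide 27. The function names are hypothetical and are not part of any GigaDB or DataCite API.

```python
import re

# Values quoted in the talk; everything else here is an illustrative assumption.
GIGADB_PREFIX = "10.5524"   # DOI prefix for the GigaScience dataset project
FIRST_DATASET_ID = 100001   # number of the first dataset DOI released

# Generic "prefix/numeric-suffix" shape, with an optional leading "doi:".
_DOI_PATTERN = re.compile(r"^(?:doi:)?(10\.\d{4,9})/(\d+)$", re.IGNORECASE)


def build_doi(dataset_id: int) -> str:
    """Return the DOI string for a dataset number (hypothetical helper)."""
    if dataset_id < FIRST_DATASET_ID:
        raise ValueError(f"dataset numbers start at {FIRST_DATASET_ID}")
    return f"{GIGADB_PREFIX}/{dataset_id}"


def parse_gigadb_doi(text: str) -> int:
    """Extract the dataset number, rejecting DOIs with a different prefix."""
    match = _DOI_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a DOI of the expected shape: {text!r}")
    prefix, suffix = match.groups()
    if prefix != GIGADB_PREFIX:
        raise ValueError(f"unexpected prefix {prefix!r}, expected {GIGADB_PREFIX!r}")
    dataset_id = int(suffix)
    build_doi(dataset_id)  # reuses the range check above
    return dataset_id


def landing_urls(dataset_id: int) -> dict:
    """Return the two address forms shown on slide 27 for a dataset."""
    return {
        "doi": f"http://dx.doi.org/{build_doi(dataset_id)}",
        "gigadb": f"http://gigadb.org/{dataset_id}",
    }
```

Calling `landing_urls(100015)` gives the two addresses listed on slide 27 for the YH genome dataset, while a mistyped prefix such as `10.55224/100015` is rejected by `parse_gigadb_doi` rather than silently accepted.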