Tam Sneddon: Revolutionizing data dissemination, organization and use.

GigaDB is a new database integrated with the GigaScience journal to meet the needs of biological and biomedical research in the era of big data. It currently contains 36 public datasets across various domains including humans, plants, microbes, vertebrates and invertebrates. The database aims to improve reproducibility, usability, standards, and sharing of data.

Now taking submissions…

GigaDB: revolutionizing data dissemination, organization and use.
Tam Sneddon
BGI-Hong Kong

www.gigadb.org

Overview
Introduction: What is GigaScience/GigaDB, why we want your data and why you should submit to us?

Published datasets
Data Publishing

New database features
DOIs

Future tools: Galaxy/Cloud

- Reproducibility/Reuse
- Utility/Usability
- Standards/Searchability/Sharing
- Data publishing/DOI

www.gigasciencejournal.com
www.gigadb.org

DataCite goal: “increase acceptance of research as legitimate, citable contributions to the scholarly record”

Currently: 36 public datasets

Humans
- Ancient DNA: Aboriginal Australian; Saqqaq Eskimo
- Asian individual (YH): DNA methylome; Genome assembly; Transcriptome
- Cancer: Hepatocellular carcinoma; Single-cell bladder cancer
- Human exome – chronic hepatitis B infection predisposing variants

Vertebrates
- Darwin finch
- Giant panda
- Macaque: Chinese rhesus; Crab-eating
- Minipig
- Mouse methylomes
- Naked mole rat
- Penguin: Adelie penguin; Emperor penguin
- Pigeon, domestic
- Polar bear
- Sheep, domestic
- Tibetan antelope

Invertebrates
- Ant: Florida carpenter ant; Jerdon’s jumping ant; Leaf-cutter ant
- Roundworm
- Schistosoma haematobium
- Silkworm, domestic and wild

Plants
- Chinese cabbage
- Cucumber, domestic
- Foxtail millet
- Pigeonpea
- Potato
- Sorghum

Microbes
- E. coli O104:H4 TY-2482
- T2D gut metagenome

Cell-lines
- Chinese Hamster Ovary

Currently: 36 public datasets
***15 pre-publication***

Currently: 36 public datasets
*5 citations in the references*
Highlighted: Mouse methylomes; Sorghum; Transcriptome (Asian individual, YH); Polar bear; Single-cell bladder cancer

GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data”…

http://dx.doi.org/10.5524/100015
http://gigadb.org/100015

Related DOIs:
10.5524/100013 (is supplemented by)
10.5524/100014 (is supplemented by)
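
The two URLs above suggest a simple pattern: a GigaDB DOI under the 10.5524 prefix resolves through dx.doi.org, and its suffix doubles as the dataset identifier on gigadb.org. The sketch below illustrates that mapping. It assumes the pattern holds for every dataset, and the function names are hypothetical helpers, not part of any GigaDB or DataCite API.

```python
# Sketch only: URL patterns are inferred from the two example links on the slide.
GIGADB_PREFIX = "10.5524"  # DOI prefix shown for GigaDB datasets


def doi_resolver_url(doi: str) -> str:
    """Build the DOI resolver link for a DOI such as '10.5524/100015'."""
    return f"http://dx.doi.org/{doi}"


def gigadb_url(doi: str) -> str:
    """Build the direct gigadb.org link from the DOI suffix (assumed pattern)."""
    prefix, _, suffix = doi.partition("/")
    if prefix != GIGADB_PREFIX or not suffix:
        raise ValueError(f"not a GigaDB DOI: {doi!r}")
    return f"http://gigadb.org/{suffix}"


if __name__ == "__main__":
    # The dataset DOI and its two related DOIs listed on the slide.
    for doi in ("10.5524/100015", "10.5524/100013", "10.5524/100014"):
        print(doi_resolver_url(doi), gigadb_url(doi))
```

Both links lead to the same dataset page; the DOI form is the one intended for citation.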

Galaxy for GigaScience

Bioinformatics development
Biomedical and bioinformatics research
Publishing

Thanks to:
Laurie Goodman
Scott Edmunds
Alexandra Basford
Peter Li
Jesse Si Zhe
Shaoguang Liang (BGI-SZ)
Tin-Lap Lee (CUHK)
Qiong Luo (HKUST)
Senghong Wang (HKUST)
Yan Zhou (HKUST)
Cogini

Contact us: editorial@gigasciencejournal.com, database@gigasciencejournal.com

Follow us: @gigascience, facebook.com/GigaScience, blogs.openaccesscentral.com/blogs/gigablog/

www.gigadb.org

Slide 7: Currently: 36 public datasets. Highlighted: Cancer, *14TB*.

10. Currently: 36 public datasets *5 citations in the references* Humans - Crab-eating Plants Ancient DNA Minipig Chinese cabbage - Aboriginal Australian Mouse methylomes Cucumber, domestic - Saqqaq Eskimo Naked mole rat Foxtail millet Asian individual (YH) Penguin Pigeonpea - DNA methylome - Adelie penguin Potato - Genome assembly - Emperor penguin *Sorghum* - Transcriptome Pigeon, domestic Cancer Polar bear Microbes - Hepatocellular carcinoma Sheep, domestic E. Coli O104:H4 TY-2482 - Single-cell bladder Tibetan antelope Human exome – chronic Cell-lines hepatitis B infection Invertebrates Chinese Hamster Ovary predisposing variants Ant (CHO) - Florida carpenter ant Complemented by data submitted to INSDC databases: Vertebrates - Jerdon’s jumping ant - Raw data SRA:SRA046843 Darwin finch - Leaf-cutter ant - Assemblies of 3 strains Genbank:AHAO00000000-AHAQ00000000 Giant panda - SNPs Roundworm dbSNP batch ids:1056306-10563068 - Macaque Schistosoma haematobium - CNVs - - Chinese rhesus InDels SV } Silkworm, domestic and wild dbVAR:nstd63

11. Currently: 36 public datasets *5 citations in the references* Humans - Crab-eating Plants Ancient DNA Minipig Chinese cabbage - Aboriginal Australian Mouse methylomes Cucumber, domestic - Saqqaq Eskimo Naked mole rat Foxtail millet Asian individual (YH) Penguin Pigeonpea - DNA methylome - Adelie penguin Potato - Genome assembly - Emperor penguin Sorghum - *Transcriptome* Pigeon, domestic Cancer Polar bear Microbes - Hepatocellular carcinoma Sheep, domestic E. Coli O104:H4 TY-2482 - Single-cell bladder Tibetan antelope Human exome – chronic Cell-lines hepatitis B infection Invertebrates Chinese Hamster Ovary predisposing variants Ant (CHO) - Florida carpenter ant Vertebrates - Jerdon’s jumping ant Darwin finch - Leaf-cutter ant Giant panda Roundworm Macaque Schistosoma haematobium - Chinese rhesus Silkworm, domestic and wild

12. Currently: 36 public datasets *5 citations in the references* Humans - Crab-eating Plants Ancient DNA Minipig Chinese cabbage - Aboriginal Australian Mouse methylomes Cucumber, domestic - Saqqaq Eskimo Naked mole rat Foxtail millet Asian individual (YH) Penguin Pigeonpea - DNA methylome - Adelie penguin Potato - Genome assembly - Emperor penguin Sorghum - *Transcriptome* Pigeon, domestic Cancer *Polar bear* Microbes - Hepatocellular carcinoma Sheep, domestic E. Coli O104:H4 TY-2482 - Single-cell bladder Tibetan antelope Human exome – chronic Cell-lines hepatitis B infection Invertebrates Chinese Hamster Ovary predisposing variants Ant (CHO) - Florida carpenter ant Vertebrates - Jerdon’s jumping ant Darwin finch - Leaf-cutter ant Giant panda Roundworm Macaque Schistosoma haematobium - Chinese rhesus Silkworm, domestic and wild

13. Currently: 36 public datasets *5 citations in the references* Humans - Crab-eating Plants Ancient DNA Minipig Chinese cabbage - Aboriginal Australian *Mouse methylomes* Cucumber, domestic - Saqqaq Eskimo Naked mole rat Foxtail millet Asian individual (YH) Penguin Pigeonpea - DNA methylome - Adelie penguin Potato - Genome assembly - Emperor penguin Sorghum - Transcriptome Pigeon, domestic Cancer Polar bear Microbes - Hepatocellular carcinoma Sheep, domestic E. Coli O104:H4 TY-2482 - *Single-cell bladder cancer* Tibetan antelope Human exome – chronic Cell-lines hepatitis B infection Invertebrates Chinese Hamster Ovary predisposing variants Ant (CHO) - Florida carpenter ant Vertebrates - Jerdon’s jumping ant Darwin finch - Leaf-cutter ant Giant panda Roundworm Macaque Schistosoma haematobium - Chinese rhesus Silkworm, domestic and wild

27. http://dx.doi.org/10.5524/100015 http://gigadb.org/100015 Related DOIs: 10.5524/100013 (is supplemented by) 10.5524/100014 (is supplemented by)

29. Galaxy for GigaScience Bioinformatics Development Biomedical and bioinformatics research Publishing

30. Thanks to: Laurie Goodman, Scott Edmunds, Alexandra Basford, Peter Li, Jesse Si Zhe; Shaoguang Liang (BGI-SZ), Tin-Lap Lee (CUHK), Qiong Luo (HKUST), Senghong Wang (HKUST), Yan Zhou (HKUST), Cogini
Contact us: editorial@gigasciencejournal.com, database@gigasciencejournal.com
Follow us: @gigascience, facebook.com/GigaScience, blogs.openaccesscentral.com/blogs/gigablog/
www.gigadb.org

Editor's Notes

I would like to thank the organizers for the opportunity to present the Giga database this evening. I realize I’m all that stands between you and the alcohol outside this room, so I’ll try not to run over time!
I would first like to give a brief introduction to GigaDB and GigaScience. Then I’ll describe GigaDB in more detail, say why we want your data, and hopefully give you convincing reasons WHY you should submit to us! I’ll then mention DataCite and what it means for a dataset to be assigned a DOI, give you examples of some of the datasets in GigaDB and how they are cited and acknowledged, and describe the features of the new GigaDB website (expected next month). Finally, I’ll sum up with tools our team is working on and hopes to integrate with GigaDB in the upcoming months.
Basically, GigaScience aims to be a home for large-scale biological and biomedical studies by providing a place for data hosting, and by giving additional credit to authors for making their data available by assigning a DOI to each published dataset. The GigaScience journal is open-access and published online by BioMed Central in collaboration with BGI. Scott Edmunds, the Editor, is in the audience. GigaScience officially launched in July this year, with GigaDB as the associated database built to host the supplementary files, images, software and any other data from the GigaScience article. The criteria and focus of both the GigaScience journal and the database include:
Reproducibility/Reuse: Because the GigaScience datasets are all open-access and assigned a DOI, they are stable and permanent, so results can be tested and reproduced and the data reused for reanalysis or for comparison with new analyses.
Utility/Usability: The new GigaDB website will have integrated tools such as Galaxy and MyExperiment (which I’ll mention briefly at the end of my talk) to promote more widespread access, viewing and analysis of data. Integration of the BGI Cloud Computing resources for handling and analyzing large-scale data will allow any researcher to access and analyze the data, no matter how large or small their institution’s IT infrastructure.
Standards/Searchability/Sharing: We support the use of BioSharing and ISA-Tab to aid and promote best practice in metadata reporting and sharing, so the data can be portable across other platforms. We mandate that all supporting data must be publicly available. And we encourage MIBBI (Minimum Information for Biological and Biomedical Investigations) compliance and the use of community reporting checklists.
Data publishing/DOI: Finally, as mentioned, we register all datasets and DOIs with DataCite, which makes them citable, and we hope this will promote rapid release of data and encourage researchers to release their data pre-publication.
So, a little bit about DOIs, or Digital Object Identifiers. DOIs are unique identifiers that are also resolvable to a webpage, and they have been used in the journal world for a long time to provide permanent identifiers and links to journal articles. We register our DOIs with DataCite, which was set up specifically for datasets, to provide incentives and credit to data producers. Their goal is to “increase acceptance of research as legitimate, citable contributions to the scholarly record”. We automatically generate the metadata XML from GigaDB and provide as much as possible within the DataCite schema to aid discovery of the datasets via a central metadata repository (with an open API) and other metadata harvesters, including the upcoming Data Citation Index by Thomson Reuters. For example, if you search DataCite for ‘GigaDB’ there are 35 records returned, corresponding to the 35 published datasets in GigaDB. The 10.5524 prefix is unique to the GigaScience dataset project. Our datasets start at 100001 with Genomic Data from E. coli, the first DOI we released pre-publication, and then go up sequentially. The first 5 datasets listed here just happen to be Genomic, but we currently have Transcriptomic, Epigenomic and Metagenomic datasets, with Proteomic datasets in the pipeline and plans to extend to the likes of biomedical imaging and environmental studies.
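The numbering scheme described in this note can be sketched in a few lines of Python. This is only an illustration: the prefix, the starting number and the resolver address come from the talk and slides, while the function names and the assumption that numbers are strictly consecutive are mine, not GigaDB's actual implementation.

```python
GIGADB_PREFIX = "10.5524"       # DOI prefix quoted in the talk
FIRST_DATASET_NUMBER = 100001   # first dataset number quoted in the talk


def dataset_doi(sequence_index: int) -> str:
    """Return the DOI of the n-th dataset (1-based).

    Assumes numbers are issued strictly sequentially, as the talk describes.
    """
    if sequence_index < 1:
        raise ValueError("sequence_index must be 1 or greater")
    return f"{GIGADB_PREFIX}/{FIRST_DATASET_NUMBER + sequence_index - 1}"


def resolver_url(doi: str) -> str:
    """Return the resolver URL in the form shown on the slides."""
    return f"http://dx.doi.org/{doi}"
```

Under these assumptions, `resolver_url(dataset_doi(15))` gives the address shown on the slide for the YH genome dataset.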
If we randomly select DOI:10.5524/100015, you can view the metadata associated with the Genome Sequence of the YH individual. The citation includes the authors, year of publication, title, publisher and resolvable DOI. The DOI URL takes you to the GigaDB landing page for this study, so even if the landing page URL changes we can update the metadata and the DOI will still resolve to the right page. We have also registered the abstract, resource type, a subject tag of ‘Genomic’, the CC0 license, the size of the dataset and related identifiers. In this case the DOI is referenced by the Nature article and is supplemented by the GigaScience datasets 100013 and 100014, which are the supplementary transcriptome and methylome datasets of the YH individual, respectively.
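As a rough sketch of how such a record and its related identifiers might be held in code, the Python below models only the fields named in this note. The class and field names are hypothetical and do not reproduce the DataCite schema or GigaDB's data model; the DOIs, subject tag, license and relation wording are taken from the talk and slide 27.

```python
from dataclasses import dataclass, field


@dataclass
class RelatedIdentifier:
    doi: str
    relation: str  # worded as on the slide, e.g. "is supplemented by"


@dataclass
class DatasetRecord:
    doi: str
    subject: str
    license: str
    related: list[RelatedIdentifier] = field(default_factory=list)

    def landing_urls(self) -> tuple[str, str]:
        """Return the DOI resolver URL and the GigaDB URL shown on the slide."""
        suffix = self.doi.split("/", 1)[1]
        return (f"http://dx.doi.org/{self.doi}", f"http://gigadb.org/{suffix}")

    def supplements(self) -> list[str]:
        """Return the DOIs of datasets that supplement this one."""
        return [r.doi for r in self.related if r.relation == "is supplemented by"]


# Values taken from the talk; authors, title and abstract are omitted on purpose.
yh_genome = DatasetRecord(
    doi="10.5524/100015",
    subject="Genomic",
    license="CC0",
    related=[
        RelatedIdentifier("10.5524/100013", "is supplemented by"),
        RelatedIdentifier("10.5524/100014", "is supplemented by"),
    ],
)
```

Keeping the relation as plain text mirrors the slide wording; a real record would presumably use whatever controlled vocabulary the metadata schema defines.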
As you saw with the DOI search, to date we have issued DOIs to 36 datasets including human, vertebrates, invertebrates, plants, microbes and cell-lines.
We have the capacity to store very large datasets at BGI, which is exemplified by the Asian Cancer Research Group’s Hepatocellular carcinoma dataset, which is 14 terabytes in size. By providing tools and integration with the BGI Cloud we hope to make this important dataset available for anyone to access and analyze. Many of the datasets in GigaDB are also part of larger collaborations and projects such as the Genome 10K, which includes our most recent release, the Darwin finch genome assembly and annotation. With the new GigaDB interface you can search specifically for datasets from these projects.
Many of these datasets were made public and the DOI released prior to publication, and, I would like to stress, this DID NOT prevent subsequent publication.
Indeed, five subsequent publications cite the respective GigaScience DOI in the references:
- the transcriptome from the YH lymphoblastoid cell line
- the single-cell whole exome sequence from an individual bladder cancer
- the MEDUSA computational pipeline used to identify differentially methylated regions in mouse
- the polar bear genome
- the sorghum genome
Publications are in the pipeline for several of the remaining datasets on the list.
The first of these citations was for the Sorghum genome and analyses, published in Genome Biology last year. As noted, reference 62 cites the dataset DOI. I would also like to stress that the DOI is a complement to, and not a replacement for, deposition of relevant data in the appropriate INSDC databases at EBI, NCBI or DDBJ, and it is a requirement that data be deposited in such repositories prior to submission to GigaDB. In the case of Sorghum we also worked with the authors to help them submit the SNPs and structural variants to dbSNP and dbVar respectively.
A GigaScience dataset citation is also included in the YH Transcriptome paper published in Nature Biotechnology in February this year. As you can see, the dataset was published in 2011, but this did not prevent subsequent publication of the analysis paper.
In the case of the polar bear, a different group from the one that produced the original dataset published in Science, citing the GigaScience dataset.
Finally, there are two citations from the GigaScience journal in the last couple of months since its official launch. One is the Mouse Methylome computational pipeline and the other is the Single Cell Bladder Cancer genome. I would like to highlight that the dataset for the Mouse methylome paper includes not only the raw fastq and alignment files, which were submitted to the SRA and GEO repositories, but also the MEDUSA software and bigwig methylation files, all of which are represented in ISA-TAB format. So, I hope I have convinced you that making your data public prior to publication is not just in the best interests of science but also increases your publication and citation list to aid in grant applications and career advancement!
And now that you all want to submit to GigaDB, how do you do that, how will people search for and find your data and, other than citing your DOI, what will they be able to do with the data? We have redesigned the underlying Giga database and we’re working on the front end, which we hope to make public early next month. The following slides are therefore a mix of screenshots from the development site overlaid with tweaks made in PowerPoint to illustrate features you can hope to see when we go live. These include:
- a home page image slider for browsing datasets
- a text box search, which I will demonstrate shortly
and an advanced search option…
…which, if you click it, gives you detailed instructions on the syntax used by the Sphinx search engine.
Here I would like to mention the login system where a user can save searches, sign up for email alerts and submit Excel submission files.
This is my profile page. I am logged in and have two saved searches. If new GigaScience datasets are released that match my search criteria I will be emailed a notification with links to the datasets so I don’t have to keep checking GigaDB for new content that I may be interested in.
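One way to picture the saved-search alert described here is the Python sketch below. It is purely illustrative: the class names, the title-contains-keyword matching rule and the sample records are all hypothetical, and the real site presumably matches through its search engine rather than a substring test.

```python
from dataclasses import dataclass


@dataclass
class SavedSearch:
    owner_email: str
    keyword: str


@dataclass
class Dataset:
    doi: str
    title: str


def alerts_for_release(searches, new_datasets):
    """Map each owner email to the DOIs of new datasets whose title contains the saved keyword."""
    alerts = {}
    for search in searches:
        needle = search.keyword.lower()
        hits = [d.doi for d in new_datasets if needle in d.title.lower()]
        if hits:
            alerts.setdefault(search.owner_email, []).extend(hits)
    return alerts


# Hypothetical records, for illustration only (not real GigaDB entries).
saved = [SavedSearch("user@example.org", "penguin")]
released = [
    Dataset("example/1", "Example penguin genome"),
    Dataset("example/2", "Example sorghum genome"),
]
```

Calling `alerts_for_release(saved, released)` returns a dictionary that a mailer could turn into one notification per user.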
Since I am logged in I also have the option to submit to GigaDB.
An Excel template file is provided for download, along with 2 completed example files for guidance.
There are also help pages with more detailed instructions on using the website and submitting data to GigaDB. Once I confirm that I have read the GigaDB Terms and Conditions, I can upload my Excel submission file, and a member of the GigaDB team should contact me within 3-5 working days. We welcome feedback on the submission system, so please do let us know of any improvements to the Excel submission file that would ease the process.
Now, if we move on to the search facility: as an example, if we search for the YH individual in the search box we get 3 datasets returned, the original YH Genome and the supplementary methylome and transcriptome datasets from the same individual. If you have many results you can use the Filters to narrow down your search, restricting by organism, dataset type, project, publication date or modification date.
You can also hover over a dataset to read the abstract before clicking through to a DOI landing page.
Alternatively, if you are looking for files to download across datasets, you can click on the file tab and use the Filters to further refine your file search. Here you narrow down your search by filtering on file type, file format, file size or release date.
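The file filters named above (file type, file format, file size, release date) can be sketched as a simple in-memory filter. The Python below is an assumption-laden illustration: the field names, the choice of a maximum-size and an on-or-after date criterion, and the sample entries are hypothetical and say nothing about how GigaDB actually stores or queries its files.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class FileEntry:
    name: str
    file_type: str
    file_format: str
    size_bytes: int
    released: date


def filter_files(
    files: Iterable[FileEntry],
    file_type: Optional[str] = None,
    file_format: Optional[str] = None,
    max_size_bytes: Optional[int] = None,
    released_on_or_after: Optional[date] = None,
) -> List[FileEntry]:
    """Keep entries that satisfy every filter that was supplied (None means no filter)."""
    selected = []
    for entry in files:
        if file_type is not None and entry.file_type != file_type:
            continue
        if file_format is not None and entry.file_format != file_format:
            continue
        if max_size_bytes is not None and entry.size_bytes > max_size_bytes:
            continue
        if released_on_or_after is not None and entry.released < released_on_or_after:
            continue
        selected.append(entry)
    return selected


# Hypothetical entries, for illustration only.
example_files = [
    FileEntry("reads.fastq", "Sequence reads", "FASTQ", 5_000_000_000, date(2012, 1, 10)),
    FileEntry("assembly.fa", "Genome assembly", "FASTA", 900_000_000, date(2012, 3, 2)),
    FileEntry("notes.txt", "Readme", "TEXT", 2_000, date(2012, 3, 2)),
]
```

Each filter is optional and the filters combine with a logical AND, which matches the idea of progressively narrowing a result list.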
Incidentally, all the hover-over ‘i’ icons you see provide information, in this example describing what the different file formats are.
This download function is still being worked on but will also allow you to select multiple files for download or for direct upload to Galaxy and other tools in development which I’ll touch on at the end of my talk.
This is an example landing page for DOI 10.5524/100015, the YH genome dataset. It will be accessible from both the GigaDB URL and the DOI URL. These pages are still in development, but what you will see is the dataset metadata, including:
- date released
- dataset type
- title
- abstract
- how the dataset should be cited
- links to related manuscripts, datasets, additional information, genome browsers, accessions and projects
- sample details
And finally, at the bottom, file descriptions and options (not shown in this illustration) to download the files (or upload them to tools such as Galaxy).
Leading on from that, current and future plans include collaborating with Tin-Lap Lee at the Chinese University of Hong Kong to integrate an instance of the Galaxy bioinformatics platform with GigaDB, so users can make full use of the data in GigaDB by linking it to other resources and we can incorporate fully executable papers. One such submission is a new SOAPdenovo pipeline. The SOAP tools have been wrapped in Galaxy, the workflow is defined in MyExperiment, and the data will be issued with a DOI and be accessible via GigaDB. Utilizing the BGI cloud if necessary, users will then be able to reproduce all the steps described in the GigaScience paper to test, reanalyze, compare results and so on. Since we would like GigaDB to be a host for data types that have no other home, such as imaging data, we are investigating adding other tools, such as an image viewer, to support the accessibility and usability of the data. So, if you have a large-scale biological or biomedical dataset and/or a pipeline or software that you would like to submit to GigaScience, we would love to hear from you, so please come and talk to Scott or me.
That just leaves me to thank the GigaScience team (Laurie, Scott, Alexandra, Peter and Jesse); BGI for their support, specifically Shaoguang for IT and bioinformatics support; our collaborators on the database, website and tools (Tin-Lap, Qiong, Senhong, Yan and the Cogini web design team); DataCite for providing the DOI service; and the isacommons team for their support and advocacy for best practice in metadata reporting and sharing. Thank you for listening.
