We are always looking for data
Finding & Accessing
Human Genomic
Data for research
BioSB 2017
Tweets welcome
#dataeureka
@repositiveio
Genomic data is important for research
Pre-clinical
drug discovery
Diagnostics and treatments
of genetic diseases
“Consensus among researchers, clinicians,
politicians & the public that
genomics will transform biomedical
research, healthcare and lifestyle choices”
Stephan Beck, UCL
OPPORTUNITY
Genome Technology Evolution
2001: 1 human genome
2005: Personal Genome Project
Human Genome Diversity Project
HapMap
2008: 1000 Genomes (1092 genomes, since increased to ~2500)
Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE)
2011: H3Africa
2012: International Cancer Genome Consortium
2016: 2M AstraZeneca - HLI
Large amounts of data, but not accessible
[Chart: x-axis 2006 to 2015]
≈ 0.5 PB sequence available
80+ PB sequenced every year
WGS data available in public repos
Exponential growth rate
Under-utilised data has huge potential for medical research
How many data sources?
How many sources of human
genomic data do you know about?
Hundreds of data sources
…but they aren’t easy to find!
First 30 data sources listed here: http://tinyurl.com/plos-biology-repositive
Data Sources Identified (chart): Jan-15: 10, Mar-15: 25, Jun-15: 33, Sep-15: 35, Dec-15: 102, Mar-16: 174, Jun-16: 239, Sep-16: 314, Dec-16: 506, Mar-17: 582
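As a rough illustration of how quickly the catalogue grew, the ten counts on the "Data Sources Identified" chart can be tabulated against the ten date labels. This is only a sketch: pairing each count with a date in the order they appear on the slide is an assumption, and the function names are made up for this example.

```python
# Counts from the "Data Sources Identified" chart.
# Assumption: the i-th value belongs to the i-th date label on the axis.
counts = [
    ("Jan-15", 10), ("Mar-15", 25), ("Jun-15", 33), ("Sep-15", 35),
    ("Dec-15", 102), ("Mar-16", 174), ("Jun-16", 239), ("Sep-16", 314),
    ("Dec-16", 506), ("Mar-17", 582),
]


def growth_table(rows):
    """Return (label, count, increase since the previous point) per row."""
    table = []
    previous = None
    for label, count in rows:
        delta = 0 if previous is None else count - previous
        table.append((label, count, delta))
        previous = count
    return table


def overall_factor(rows):
    """Ratio of the last count to the first count."""
    return rows[-1][1] / rows[0][1]


if __name__ == "__main__":
    for label, count, delta in growth_table(counts):
        print(f"{label}: {count:4d} (+{delta})")
    print(f"Overall growth factor: {overall_factor(counts):.1f}x")
```

Under that pairing, the number of identified sources grows from 10 to 582 over the period shown.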
The researchers’ pain points
FRAGMENTED
[Figure: bar charts of data source counts by country (GB, FI, NL, FR, DE, CH, EE, BE, DK, ES, SI, IE, SE) and by US state (CA, MD, MA, WA, NY, TX, AZ, DC, NJ, NC, PA, UT, TN, CO, IN, FL, LA, VA, IL, ME, OH, MO, MI, SC, OR)]
Data sources across the globe
Geolocation of 278 data sources analysed,
found by tracking the IP address of each source.
These include:
• Public Repositories
• Universities
• Companies
• BioBanks
• Research consortiums
The researchers’ pain points
CONFUSING
Public Repositories
• Required by funders
• Cannot publish unless accession number given
• Specialised for genomics
  • ArrayExpress
  • EGA
  • dbGaP
  • GEO…
• Generalist
  • Dryad
  • Figshare…
See http://discover.repositive.io for more
The researchers’ pain points
FRAGMENTED
No holistic approach
to discover new data
HIDDEN
The researchers’ pain points
FRAGMENTED
No holistic approach
to discover new data
ADMIN
BURDEN
GOVERNANCE Models
Open Access
• E.g. PGP, CC0
• Bermuda Accord
Managed (Restricted or Controlled Access)
• Data Access Committee
• No effective agreement (policy vacuum)
Data accessibility
Can download the
data straight away
or after logging in.
Need to apply for
access to the data.
Has both Open and Restricted
access data within one repository.
Access to Restricted Data
Benefits:
• Strict governance
• Individuals are protected
• Review of consent
• Applicant signs for full
responsibility for governance
Disadvantages:
• No control of data once access
is given
• High barrier for access – too
high?
Often a long process
Bottlenecks:
• Finding relevant and usable
data
• Getting authorisation to
access data
• Formatting data
• Storing and moving data
We studied the problem with
qualitative interviews followed
by a survey of researchers in
human genetics
T. A. van Schaik et al.,
The need to redefine genomic data sharing: a focus on data accessibility,
Applied & Translational Genomics, 2014
http://tinyurl.com/schaik-dnadigest
dbGaP application process (flowchart)
Decision points: NIH / eRA Commons login? Organisation registered with eRA? Organisation has DUNS number?
Steps: Write research proposal → Submit proposal → Access granted → Find/Download/Decrypt data → Science…
Delays shown on the chart: + 2-3 days, + 1-2 weeks, + 1 week, + 1-2 days, + 1-4 weeks, + 1-2 days
PRO Tip: If you use human genomic data, apply for the GRU (General Research Use) datasets in dbGaP: one application gives access to all the GRU datasets.
Blog Post:
http://blog.repositive.io/how-to-successfully-apply-for-access-to-dbgap/
EGA application process (flowchart)
Decision point: Sanger eDAM Account?
Steps: Write research proposal → Submit proposal → Access granted → Find/Download/Decrypt data → Science…
Delays shown on the chart: + 1 hour, + 1-2 days, + 2-7 days, + 1-2 days
Blog Post:
http://blog.repositive.io/how-to-successfully-apply-for-access-to-ega/
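To put the two application processes side by side, the delay labels on the dbGaP and EGA flowcharts can be summed into a rough range. This is a sketch under a strong assumption: that every delay shown on a chart is incurred one after another. Applicants who already have the required logins would skip some steps, so the totals are closer to an upper bound than a typical case. The conversions (1 week = 7 days, 1 hour = 1/24 day) and all names below are illustrative.

```python
from fractions import Fraction

# (min_days, max_days) for each delay label, in the order shown on the slides.
# dbGaP: + 2-3 days, + 1-2 weeks, + 1 week, + 1-2 days, + 1-4 weeks, + 1-2 days
DBGAP_DELAYS = [(2, 3), (7, 14), (7, 7), (1, 2), (7, 28), (1, 2)]
# EGA: + 1 hour, + 1-2 days, + 2-7 days, + 1-2 days
ONE_HOUR = Fraction(1, 24)
EGA_DELAYS = [(ONE_HOUR, ONE_HOUR), (1, 2), (2, 7), (1, 2)]


def total_range(delays):
    """Sum the lower and upper bounds, assuming every delay applies in sequence."""
    low = sum(d[0] for d in delays)
    high = sum(d[1] for d in delays)
    return low, high


if __name__ == "__main__":
    for name, delays in (("dbGaP", DBGAP_DELAYS), ("EGA", EGA_DELAYS)):
        low, high = total_range(delays)
        print(f"{name}: {float(low):.1f} to {float(high):.1f} days")
```

Under these assumptions the dbGaP chart adds up to between 25 and 56 days, while the EGA chart adds up to roughly 4 to 11 days.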
Where Repositive came from…
Fiona Nielsen
FOUNDER & CEO
@repositiveio
We are enabling best practices
MAKE DATA
DISCOVERABLE
SIMPLIFY
WORKFLOWS
CONTRIBUTE TO
COMMUNITY
A platform to make human genomic data accessible for research
1-click access to human genomic data,
to make finding data as easy as finding a book
on Amazon or booking a hotel on Expedia!
Repositive
Simpler workflow
for data access
Our expertise is data search platforms
Discover and
access
Search, see
related results
Find colleagues &
their data interests
Co-annotate data &
community feedback
Connecting the world of genomic data
http://discover.repositive.io
charlotte@repositive.io
Biosb2017_Repositive

Biosb2017_Repositive

  • 1. We are always looking for data Finding & Accessing Human Genomic Data for research BioSB 2017 Tweets welcome #dataeureka @repositiveio
  • 2. Genomic data is important for research Pre-clinical drug discovery Diagnostics and treatments of genetic diseases
  • 3. “Consensus among researchers, clinicians, politicians & the public that genomics will transform biomedical research, healthcare and lifestyle choices” Stephan Beck, UCL OPPORTUNITY
  • 4. Genome Technology Evolution 2001: 1 human genome 2005: Personal Genome Project Human Genome Diversity Project HapMap 2016: 2M AstraZeneca - HLI 2008: 1000 Genomes (1092 genomes, since increased to ~2500) Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) 2011: H3Africa 2012: International Cancer Genome Consortium
  • 5. 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 Large amounts of data, but not accessible ≈ .5PB Sequence available 80+PB Sequenced every year WGS data available in public repos Exponential growth rate Under-utilised data has huge potential for medical research
  • 6. How many data sources? How many sources of human genomic data do you know about?
  • 7. Hundreds of data sources …but they aren’t easy to find! http://tinyurl.com/plos-biology-repositiveFirst 30 data sources listed here: 10 25 33 35 102 174 239 314 506 582 0 100 200 300 400 500 600 700 Jan-15 Mar-15 Jun-15 Sep-15 Dec-15 Mar-16 Jun-16 Sep-16 Dec-16 Mar-17 Data Sources Identified
  • 8. The researchers’ pain points FRAGMENTED
  • 9. 11 155 2 2 4 4 7 780 0 5 10 15 20 25 30 35 40 45 GB FI NL FR DE CH EE BE DK ES SI IE SE 0 5 10 15 20 25 30 35 CA MD MA WA NY TX AZ DC NJ NC PA UT TN CO IN FL LA VA IL ME OH MO MI SC OR 1 1 1 1 1 1 Data sources across the globe GEO location of 278 data sources analysed. Found by tracking IP address of the source. These include:  Public Repositories  Universities  Companies  BioBanks  Research consortiums
  • 10. The researchers’ pain points CONFUSING
  • 11. • Required by funders • Cannot publish unless accession number given • Specialised for genomics • ArrayExpress • EGA • dbGaP • GEO… • Generalist • Dryad • Figshare… See http://discover.repositive.io for more Public Repositories
  • 12. The researchers’ pain points FRAGMENTED No holistic approach to discover new data HIDDEN
  • 13. The researchers’ pain points FRAGMENTED No holistic approach to discover new data ADMIN BURDEN
  • 14. GOVERNANCE Models. Open Access • E.g. PGP, CC0 • Bermuda Accord. Managed (Restricted or Controlled Access) • Data Access Committee • No effective agreement (policy vacuum)
  • 15. Data accessibility Can download the data straight away or after logging in. Need to apply for access to the data. Has both Open and Restricted access data within one repository.
  • 16. Access to Restricted Data Benefits: • Strict governance • Individuals are protected • Review of consent • Applicant signs for full responsibility for governance Disadvantages: • No control of data once access is given • High barrier for access – too high?
  • 17. Often a long process. Bottlenecks: • Finding relevant and usable data • Getting authorisation to access data • Formatting data • Storing and moving data. We studied the problem with qualitative interviews followed by a survey of researchers in human genetics: T. A. van Schaik et al., “The need to redefine genomic data sharing: a focus on data accessibility”, Applied & Translational Genomics, 2014, http://tinyurl.com/schaik-dnadigest
  • 18. dbGaP application process. Decision points (Yes/No): NIH / eRA Commons login; organisation registered with eRA; organisation has DUNS number. Steps: Write research proposal → Submit proposal → Access granted → Find/Download/Decrypt data → Science… Delays shown along the flowchart: + 2-3 days, + 1-2 weeks, + 1 week, + 1-2 days, + 1-4 weeks, + 1-2 days. PRO tip: if you use human genomic data, apply for the GRU datasets in dbGaP; one application gives access to all the GRU datasets. Blog post: http://blog.repositive.io/how-to-successfully-apply-for-access-to-dbgap/
  • 19. EGA application process. Decision point (Yes/No): Sanger eDAM Account. Steps: Write research proposal → Submit proposal → Access granted → Find/Download/Decrypt data → Science… Delays shown along the flowchart: + 1 hour, + 1-2 days, + 2-7 days, + 1-2 days. Blog post: http://blog.repositive.io/how-to-successfully-apply-for-access-to-ega/
  • 20. Where Repositive came from… Fiona Nielsen FOUNDER & CEO
  • 22. We are enabling best practices MAKE DATA DISCOVERABLE SIMPLIFY WORKFLOWS CONTRIBUTE TO COMMUNITY A platform to make human genomic data accessible for research
  • 23. Repositive: 1-click access to human genomic data, to make finding data as easy as finding a book on Amazon or booking a hotel on Expedia!
  • 24. Simpler workflow for data access Our expertise is data search platforms Discover and access Search, see related results Find colleagues & their data interests Co-annotate data & community feedback
  • 25. Connecting the world of genomic data

Editor's Notes

  1. Genomics data is needed for research and drug discovery. It enables researchers to develop diagnostics and treatments for genetic diseases.
  2. Because interpretation requires LOTS of data. And although data exists around the world, it is siloed, and even if available, it is not accessible. This is Jenn, a genetic researcher (our target customer), seeking to interpret data from genetic diseases and cancer. She needs data from other patients to compare and interpret Mabel's DNA. She also has data available in her own lab, but she cannot share it because of concerns about how to deal with secure access to sensitive data and data governance, e.g. vetting of users.
  3. Population-scale genome sequencing projects have been launched all over the world. More than 80 PB of human genomic data is being sequenced every year, BUT to date only around 0.5 PB of data is available in public repositories.
  4. Data is fragmented in unconnected silos – makes it very difficult to discover data
  5. There are many public repositories, but it can be hugely confusing to know where to look for the right kind of data
  6. Data privacy is a concern and controlled access is a requirement for many clinical datasets
  7. Accessing data is a time-consuming and bureaucratic exercise
  8. Because interpretation requires LOTS of data. And although data exists around the world, it is siloed, and even if available, it is not accessible. This is Jenn, a genetic researcher (our target customer), seeking to interpret data from genetic diseases and cancer. She needs data from other patients to compare and interpret Mabel's DNA. She also has data available in her own lab, but she cannot share it because of concerns about how to deal with secure access to sensitive data and data governance, e.g. vetting of users.
  9. Just like Liz, a researcher struggling to get hold of the genomics data she needed for her research. So… she quit her job at Illumina and decided to try to do something about that problem.
  10. Our mission is to speed up research and diagnostics for genetic diseases by enabling efficient and ethical access to genomic research data
  11. FAIR data: https://www.force11.org/group/fairgroup/fairprinciples
  12. Our vision is to make genomic data access as easy as finding a book on Amazon or booking a hotel on Expedia
  13. KEY POINTS: Repositive builds tools for genomics data search & access. We’re really good at it. We have the expertise in-house. It’s what we do. Aside from building a highly functional tool, we’ve taken the time to prioritise User Experience, streamlining of user workflows & presentation. Within a month of our formal platform launch we have over 600 registered users. The Repositive platform is an online community and marketplace connecting data consumers with data providers. On Repositive, Jenn has: easy, interactive search; a faster data access workflow; easy access to new data collaborators; and the benefit of reading feedback on data from the community and colleagues, to assess data quality and utility. The Repositive platform and technology will remove barriers to data sharing and will incentivise users to explore, contribute and collaborate in alignment with best practices.
  14. DNA.land, OpenSNP, Personal Genome Project; direct-to-consumer genetic tests & microbiome
  15. Our mission is to speed up research and diagnostics for genetic diseases by enabling efficient and ethical access to genomic research data
  16. Because interpretation requires LOTS of data. And although data exists around the world, it is siloed, and even if available, it is not accessible. This is Jenn, a genetic researcher (our target customer), seeking to interpret data from genetic diseases and cancer. She needs data from other patients to compare and interpret Mabel's DNA. She also has data available in her own lab, but she cannot share it because of concerns about how to deal with secure access to sensitive data and vetting of users.