Scott Edmunds talk at The Genomic Open: Where are we now? session on "Views of the genome data wars from the field". A genomic history via rice. 18th November, UCSC.
1. Where are we now? Views of the genome data wars from the field.
0000-0001-6444-1436
@SCEdmunds
scott@gigasciencejournal.com
2. Circa 2002: Genome Wars pt. II
Rice was a key battle between the Bermuda & Fort Lauderdale meetings.
Commercial (Syngenta) v. academic research community.
Like the Celera paper, Science was again willing to publish a genome without the data in the public domain.
https://www.newscientist.com/article/dn2061-fears-over-rice-genome-access/
3. Genome Wars: the Empire Strikes Back
“A maximum of 15 Kb of DNA or 15 K amino acids can be submitted in a FASTA format, and appropriate BLAST searches will be performed by SBI. Alignment results of the search will be sent via e-mail to the requestor. Rice contigs identified by these alignments can be requested for further analysis using the sequence submission/contig request form. Up to 100 Kb of sequence information may be downloaded per week under your account.”
“TMRI will make its sequence assembly of the whole rice genome available on a CD-ROM under the terms of the Free Public Access Agreement for TMRI Whole Genome Sequence.”
https://web.archive.org/web/20021009130336/http://portal.tmri.org/rice/RiceAccess.html
5. “Science congratulates Chinese scientists”
Back-to-back publication, April 2002: Yu et al. (BGI) & Goff et al. (Syngenta/Myriad), Science 296, 79
BGI data public [AAAA00000000]
Circa 2002: Genome Wars
5 April, 2002
Beijing
http://www.agbioforum.org/v8n23/v8n23a07-pray.htm
6. Syngenta closed the TMRI database; the data became part of the IRGSP consortium paper published in 2005.
Fort Lauderdale, January 2003.
NAS: “UPSIDE: the Uniform Principle for Sharing Integral Data and materials Expeditiously”.
AAAS: “All data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science”.
Circa 2003: The aftermath
7. Rice v. wheat: consequences of publicly available genome data.
[Chart: number of papers (axis 0 to 700), rice compared with wheat]
http://www.tandfonline.com/doi/abs/10.1080/08109028.2011.631275
Circa 2003-date: The Legacy
8. IRRI GALAXY
Rice 3K project: 3,000 rice genomes, 13.4 TB public data
Circa 2014: Big Data
http://www.gigasciencejournal.com/content/3/1/7
9. IRRI GALAXY
Rice 3K project: 3,000 rice genomes, 120 TB public data
Circa 2015: Bigger Data
https://aws.amazon.com/public-data-sets/3000-rice-genome/
12. Compute publishing: consequences?
• It cost us $1000 in AWS credits to review one paper. Scalable?
• Is the era of free open data over?
• Are we happy with the AWSification of research? Research-as-a-Service?
• If not, who will pay?