Biodiversity Informatics of the Cyperaceae: Where we stand and where we’re heading
Andrew Hipp, Marlene Hahn, Ed Baker, Vince Smith and The Cariceae Working Group

A set of tools for Cariceae informatics

[Workflow diagram: Identify gaps in our knowledge and sampling; Formulate sampling plan; New collections; DNA sequences; DNA matrices; Multiple alignments; Species tree estimates; Revised classification; A central database for specimen-level data]

What tools do we need?
• An easily-updated hierarchical checklist to visualize sampling progress across labs, extractions, sequences;
• A specimen-level phylogenetics pipeline that we can use to harvest existing data from NCBI as well as generate ongoing phylogenetic snapshots;
• A way to automate mapping from specimen data, so that we can visualize (and assess our visualizations of) species distributions in geographic and ecological space; and
• A platform for collaboration – a virtual research environment to bring together researchers worldwide

I. A hierarchical checklist and sampling progress reports

In 2011
• A flat checklist exported from WCM
• A set of spreadsheets from collaborating labs inventorying their DNA and sequence collections
• A vague idea of what trips are needed

Today
• A hierarchical checklist by subgenus, section
• A synthesis of what materials and sequences collaborators have on hand, and what taxa are unsampled
• A concrete sampling plan with trips and taxa identified*
* Okay, we’re working on this one!

[Diagram: Taxonomy; Specimen(s); DNA extraction(s); Sequence(s); Trace file(s) / contig(s)]
We are aiming toward a database in which the taxonomy, specimen data, DNA extractions, raw sequencing data and DNA matrices all live together and can be curated and worked on jointly by the community.

Spring 2012: Hierarchical checklist

[Diagram: Metadata flow: Specimen Record; Tissue; Extraction; DNA seq. (×3)]

A centralized workflow
• Spreadsheets imported into a single Excel file
• Names cleaned (variable)
• DNA data summary formula created for each spreadsheet (ca. 5 mins per user)
• Names matched to our Scratchpads checklist
• All files exported to CSV
• Sample sheets and SP checklist imported to R
• DNA records added to checklist as nodes that are children to their taxa.
• Hierarchical checklist exported in text format, with unsampled taxa marked for searching

[Checklist excerpt, annotated: ← Section name; ← Sampled taxon with its DNA vouchers and summaries; ← Unsampled taxon]

Because Kew has coded geography using TDWG standards, we can export geographic hit-lists

II. A specimen-level phylogenetic pipeline

NCBI is a morass of data.

Geneious
• Query nucleotide database (NCBI) for Organism contains: “Carex”, “Uncinia”, “Schoenoxiphium”, “Kobresia”, “Vesicarex”, or “Cymophyllus”
• Export as
– FASTA
– TAB-Delim
– XML
• Only export that maintains all information in NCBI.
• Necessary to obtain data that can be used to connect sequence to a specimen.

Hinchliff and Roalson. 2013. Systematic Biology 62: 205–219.

A workflow for specimen-level multigene datasets from NCBI
• Download from NCBI [we used Geneious, but any bulk download is fine]
• Parse out collector name, collector number, isolate number, geography
• Manually clean collector names (3 days for >6500 records)
• Identify specimens by unique combinations of collector name, collector number, isolate
• Toss out “accessions” having more than one scientific name
• Clean gene region names so that names are not duplicated (30 minutes for >6500 records)
• Export datasets to MUSCLE and align; export log file
• Manually check alignments and code logfile (D, RC; variable)
• Rerun MUSCLE and export RAxML batchfile
• Analyze
• Screen for non-monophyly; concatenate and continue!

6692 sequence records in Cariceae

Tab-delimited metadata from NCBI / Geneious is handy, but it lacks almost all the information that could be used as voucher IDs. No way to link sequences to specimens! However, some NCBI records do contain this data. How do we access it?

[Figure: NCBI Specimen Record]
The FEATURES/Qualifier1 section has information that allows us to connect sequences to a specific specimen (for example, some records contain the qualifier specimen_voucher).
To get this additional information, we need to export the data as an XML file and parse it into a usable tab-delimited file.
Other good information to export
We parsed the NCBI XML and embedded fields within <qualifiers1> to get voucher, DNA isolate, population variants, country, geographic coordinates, collection date, collector name, and other fields… many informative about the identity of the plants sequenced.
	
  
To make clean voucher IDs, we used last name, collection number, and DNA isolate (used by some labs). For this analysis, sequences that could not be assigned to a single-species voucher were discarded.
6692 sequence records → 3004 individuals, 54 genes, 5846 sequences

ITS, ETS, matK, trnL-trnF
3,370 DNA sequences
2,196 individuals
723 spp
397 spp > 1 individual
31.7% of those spp monophyletic

III. Generating maps from specimen data

[Figure: Carex macloviana D’Urv, GBIF map, 2013-07-06]

Mapping GBIF Data
• Generate species list to extract GBIF data. (i.e. accepted names in World Checklist)
• Download GBIF data using a wrapper to dismo::gbif (R), allowing us to capture and log errors and missing data.
	
  
Clean up downloaded GBIF data
• Flag duplicate specimen datasets
– Flags specimens within the same species that have identical coordinates.
– This should be expanded to include specimens that have identical locality descriptions.
• Flag imprecise location data
– Flags specimens in which the latitude is precise only to the degree or to a tenth of a degree.
– This threshold could be adjusted, but is tailored to the WorldClim database we are using (2.5 arc minutes).
• Create a delimited file for each species containing specimen data with flagged columns (reference file of which data are utilized or excluded in the mapping step). This file becomes part of our analysis archive, so that we can always go back and edit or evaluate old data.
[Figure: Example of a file generated from clean_gbif]

Mapping "cleaned-up" dataset (Map_gbif_jpeg_imprecise)
• Maps need to be manually checked for accuracy and completeness
• We export the maps as images to a Scratchpads media gallery that can be queried or filtered by taxon
• Map reviewing is conducted in a dedicated SP2 forum

There are bugs to work out, though
Some taxa are missing data.
Example: Carex humilis
• Map of 2331 specimen records from R code download
• Website individual species download
– Filtered for specimens with coordinate data (= 7209 records)
– Missing records include some from France, Japan, & South Korea
	
  
	
  
Some maps will need adjustments: in next iterations, it should be possible to automate some of this
Carex alata specimen is missing a “-” in longitude column
Carex lanceolata has specimens where the latitude and longitude are switched.
In the end, integrating clean coordinate data with WorldClim climatic data allows us to correlate climatic niche evolution with morphological and lineage diversification*.
* See Thursday talk for exciting findings in subgenus Vignea!

We’ve been writing these tools in R, for the simple reason that that’s what we know. Bits could easily be ported to PHP for integration into Scratchpads, or Python for web implementation.
Code is available at: https://mor-systematics.googlecode.com/svn/trunk/cariceae

If there is time, I’ll take questions!

Mais conteúdo relacionado

Mais procurados

Quality Assessment of Biomedical Metadata using Topic Modeling
Quality Assessment of Biomedical Metadata using Topic ModelingQuality Assessment of Biomedical Metadata using Topic Modeling
Quality Assessment of Biomedical Metadata using Topic ModelingStuti Nayak
 
Computational biology bls 303
Computational biology bls 303Computational biology bls 303
Computational biology bls 303Bruno Mmassy
 
Gene bank by kk sahu
Gene bank by kk sahuGene bank by kk sahu
Gene bank by kk sahuKAUSHAL SAHU
 
UNL UCARE Summer Symposium Poster
UNL UCARE Summer Symposium PosterUNL UCARE Summer Symposium Poster
UNL UCARE Summer Symposium PosterNichole Leacock
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...Syed Ahmad Chan Bukhari, PhD
 
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseDevelopment of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Databasenist-spin
 
Sequencedatabases
SequencedatabasesSequencedatabases
SequencedatabasesAbhik Seal
 
NIST Microbial Genomic RM BERM14 2015-10-15
NIST Microbial Genomic RM BERM14 2015-10-15NIST Microbial Genomic RM BERM14 2015-10-15
NIST Microbial Genomic RM BERM14 2015-10-15Nathan Olson
 
Giab ashg webinar 160224
Giab ashg webinar 160224Giab ashg webinar 160224
Giab ashg webinar 160224GenomeInABottle
 
Bioinformatics Final Report
Bioinformatics Final ReportBioinformatics Final Report
Bioinformatics Final ReportShruthi Choudary
 
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...taxonbytes
 
Advanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchAdvanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchEuropean Bioinformatics Institute
 
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoning
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoningFranz ludaescher tdwg 2016 an update on taxonomic concept reasoning
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoningtaxonbytes
 

Mais procurados (20)

Quality Assessment of Biomedical Metadata using Topic Modeling
Quality Assessment of Biomedical Metadata using Topic ModelingQuality Assessment of Biomedical Metadata using Topic Modeling
Quality Assessment of Biomedical Metadata using Topic Modeling
 
Computational biology bls 303
Computational biology bls 303Computational biology bls 303
Computational biology bls 303
 
NCBI
NCBINCBI
NCBI
 
Gene bank by kk sahu
Gene bank by kk sahuGene bank by kk sahu
Gene bank by kk sahu
 
RML NCBI Resources
RML NCBI ResourcesRML NCBI Resources
RML NCBI Resources
 
UNL UCARE Summer Symposium Poster
UNL UCARE Summer Symposium PosterUNL UCARE Summer Symposium Poster
UNL UCARE Summer Symposium Poster
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 
Major biological nucleotide databases
Major biological nucleotide databasesMajor biological nucleotide databases
Major biological nucleotide databases
 
Ncbi
NcbiNcbi
Ncbi
 
The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...
The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...
The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Eli...
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
 
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseDevelopment of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
 
Sequencedatabases
SequencedatabasesSequencedatabases
Sequencedatabases
 
NIST Microbial Genomic RM BERM14 2015-10-15
NIST Microbial Genomic RM BERM14 2015-10-15NIST Microbial Genomic RM BERM14 2015-10-15
NIST Microbial Genomic RM BERM14 2015-10-15
 
Rishi
RishiRishi
Rishi
 
Giab ashg webinar 160224
Giab ashg webinar 160224Giab ashg webinar 160224
Giab ashg webinar 160224
 
Bioinformatics Final Report
Bioinformatics Final ReportBioinformatics Final Report
Bioinformatics Final Report
 
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
Franz sterner tdwg 2016 new power balance needed for trustworthy biodiversity...
 
Advanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchAdvanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven Research
 
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoning
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoningFranz ludaescher tdwg 2016 an update on taxonomic concept reasoning
Franz ludaescher tdwg 2016 an update on taxonomic concept reasoning
 

Destaque

Towards automated monitoring of Orthoptera (and some other noisy stuff)
Towards automated monitoring of Orthoptera (and some other noisy stuff)Towards automated monitoring of Orthoptera (and some other noisy stuff)
Towards automated monitoring of Orthoptera (and some other noisy stuff)Edward Baker
 
The Great Pretenders 4
The Great Pretenders 4The Great Pretenders 4
The Great Pretenders 4Edward Baker
 
Scratchpads & Citizen Science
Scratchpads & Citizen ScienceScratchpads & Citizen Science
Scratchpads & Citizen ScienceEdward Baker
 
Cockroaches: from the beginning
Cockroaches: from the beginningCockroaches: from the beginning
Cockroaches: from the beginningEdward Baker
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeEdward Baker
 
Biodiversity Informatics at the Natural History Museum
Biodiversity Informatics at the Natural History MuseumBiodiversity Informatics at the Natural History Museum
Biodiversity Informatics at the Natural History MuseumEdward Baker
 
Nature Live! Cockroaches from the beginning (31/07/2011)
Nature Live! Cockroaches from the beginning (31/07/2011)Nature Live! Cockroaches from the beginning (31/07/2011)
Nature Live! Cockroaches from the beginning (31/07/2011)Edward Baker
 
Java 7 JUG Summer Camp
Java 7 JUG Summer CampJava 7 JUG Summer Camp
Java 7 JUG Summer Campjulien.ponge
 
Java 7 at SoftShake 2011
Java 7 at SoftShake 2011Java 7 at SoftShake 2011
Java 7 at SoftShake 2011julien.ponge
 

Destaque (9)

Towards automated monitoring of Orthoptera (and some other noisy stuff)
Towards automated monitoring of Orthoptera (and some other noisy stuff)Towards automated monitoring of Orthoptera (and some other noisy stuff)
Towards automated monitoring of Orthoptera (and some other noisy stuff)
 
The Great Pretenders 4
The Great Pretenders 4The Great Pretenders 4
The Great Pretenders 4
 
Scratchpads & Citizen Science
Scratchpads & Citizen ScienceScratchpads & Citizen Science
Scratchpads & Citizen Science
 
Cockroaches: from the beginning
Cockroaches: from the beginningCockroaches: from the beginning
Cockroaches: from the beginning
 
NHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-LifeNHM Data Portal: first steps toward the Graph-of-Life
NHM Data Portal: first steps toward the Graph-of-Life
 
Biodiversity Informatics at the Natural History Museum
Biodiversity Informatics at the Natural History MuseumBiodiversity Informatics at the Natural History Museum
Biodiversity Informatics at the Natural History Museum
 
Nature Live! Cockroaches from the beginning (31/07/2011)
Nature Live! Cockroaches from the beginning (31/07/2011)Nature Live! Cockroaches from the beginning (31/07/2011)
Nature Live! Cockroaches from the beginning (31/07/2011)
 
Java 7 JUG Summer Camp
Java 7 JUG Summer CampJava 7 JUG Summer Camp
Java 7 JUG Summer Camp
 
Java 7 at SoftShake 2011
Java 7 at SoftShake 2011Java 7 at SoftShake 2011
Java 7 at SoftShake 2011
 

Semelhante a Biodiversity Informatics of the Cyperaceae: Where we stand and where we’re heading

Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global communityExternalEvents
 
The UCSC genome browser: A Neuroscience focused overview
The UCSC genome browser: A Neuroscience focused overviewThe UCSC genome browser: A Neuroscience focused overview
The UCSC genome browser: A Neuroscience focused overviewVictoria Perreau
 
ICAR 2015 Poster - Araport
ICAR 2015 Poster - AraportICAR 2015 Poster - Araport
ICAR 2015 Poster - AraportAraport
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...GenomeInABottle
 
Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128GenomeInABottle
 
Genome resource databases in horticutural crops
Genome resource databases in horticutural cropsGenome resource databases in horticutural crops
Genome resource databases in horticutural cropsPulipati Gangadhara Rao
 
The server of the Spanish Population Variability
The server of the Spanish Population VariabilityThe server of the Spanish Population Variability
The server of the Spanish Population VariabilityJoaquin Dopazo
 
Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015Kim D. Pruitt
 
Curation Introduction - Apollo Workshop
Curation Introduction - Apollo WorkshopCuration Introduction - Apollo Workshop
Curation Introduction - Apollo WorkshopMonica Munoz-Torres
 
Group 5 DNA Tech - Ecology & Envt
Group 5 DNA Tech - Ecology & EnvtGroup 5 DNA Tech - Ecology & Envt
Group 5 DNA Tech - Ecology & EnvtJessica Kabigting
 
Giab poster structural variants ashg 2018
Giab poster structural variants ashg 2018Giab poster structural variants ashg 2018
Giab poster structural variants ashg 2018GenomeInABottle
 
Primary Databases.pptx
Primary Databases.pptxPrimary Databases.pptx
Primary Databases.pptxSwarup Malakar
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchAnshika Bansal
 
wings2014 Workshop 1 Design, sequence, align, count, visualize
wings2014 Workshop 1 Design, sequence, align, count, visualizewings2014 Workshop 1 Design, sequence, align, count, visualize
wings2014 Workshop 1 Design, sequence, align, count, visualizeAnn Loraine
 

Semelhante a Biodiversity Informatics of the Cyperaceae: Where we stand and where we’re heading (20)

Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global community
 
Introduction to Apollo for i5k
Introduction to Apollo for i5kIntroduction to Apollo for i5k
Introduction to Apollo for i5k
 
The UCSC genome browser: A Neuroscience focused overview
The UCSC genome browser: A Neuroscience focused overviewThe UCSC genome browser: A Neuroscience focused overview
The UCSC genome browser: A Neuroscience focused overview
 
ICAR 2015 Poster - Araport
ICAR 2015 Poster - AraportICAR 2015 Poster - Araport
ICAR 2015 Poster - Araport
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
 
Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128Giab jan2016 intro and update 160128
Giab jan2016 intro and update 160128
 
Data base in detail
Data base in detailData base in detail
Data base in detail
 
Genome resource databases in horticutural crops
Genome resource databases in horticutural cropsGenome resource databases in horticutural crops
Genome resource databases in horticutural crops
 
The server of the Spanish Population Variability
The server of the Spanish Population VariabilityThe server of the Spanish Population Variability
The server of the Spanish Population Variability
 
Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015
 
Curation Introduction - Apollo Workshop
Curation Introduction - Apollo WorkshopCuration Introduction - Apollo Workshop
Curation Introduction - Apollo Workshop
 
NCBI
NCBINCBI
NCBI
 
Group 5 DNA Tech - Ecology & Envt
Group 5 DNA Tech - Ecology & EnvtGroup 5 DNA Tech - Ecology & Envt
Group 5 DNA Tech - Ecology & Envt
 
Giab poster structural variants ashg 2018
Giab poster structural variants ashg 2018Giab poster structural variants ashg 2018
Giab poster structural variants ashg 2018
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 
Primary Databases.pptx
Primary Databases.pptxPrimary Databases.pptx
Primary Databases.pptx
 
171017 giab for giab grc workshop
171017 giab for giab grc workshop171017 giab for giab grc workshop
171017 giab for giab grc workshop
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences research
 
wings2014 Workshop 1 Design, sequence, align, count, visualize
wings2014 Workshop 1 Design, sequence, align, count, visualizewings2014 Workshop 1 Design, sequence, align, count, visualize
wings2014 Workshop 1 Design, sequence, align, count, visualize
 

Mais de Edward Baker

Data Sharing in Ecoacoustics
Data Sharing in EcoacousticsData Sharing in Ecoacoustics
Data Sharing in EcoacousticsEdward Baker
 
Ecoacoustic Challenges: UKAN Soundscapes Workshop
Ecoacoustic Challenges: UKAN Soundscapes WorkshopEcoacoustic Challenges: UKAN Soundscapes Workshop
Ecoacoustic Challenges: UKAN Soundscapes WorkshopEdward Baker
 
BioAcoustica: an online repository and analysis platform for wildlife sound
BioAcoustica: an online repository and analysis platform for wildlife soundBioAcoustica: an online repository and analysis platform for wildlife sound
BioAcoustica: an online repository and analysis platform for wildlife soundEdward Baker
 
Phasmids as Pests of Agriculture and Forestry
Phasmids as Pests of Agriculture and ForestryPhasmids as Pests of Agriculture and Forestry
Phasmids as Pests of Agriculture and ForestryEdward Baker
 
Phasmid Study Group: Name changes talk (Summer Meeting 2014)
Phasmid Study Group: Name changes talk (Summer Meeting 2014)Phasmid Study Group: Name changes talk (Summer Meeting 2014)
Phasmid Study Group: Name changes talk (Summer Meeting 2014)Edward Baker
 
NHM MSc: Automated Acoustic Identification
Biodiversity Informatics of the Cyperaceae: Where we stand and where we’re heading

  • 1. Biodiversity Informatics of the Cyperaceae: Where we stand and where we’re heading. Andrew Hipp, Marlene Hahn, Ed Baker, Vince Smith and The Cariceae Working Group
  • 2. A set of tools for Cariceae informatics. Andrew Hipp, Marlene Hahn, Ed Baker, Vince Smith and The Cariceae Working Group
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. Identify gaps in our knowledge and sampling; Formulate sampling plan; New collections; DNA sequences; DNA matrices; Multiple alignments; Species tree estimates; Revised classification; A central database for specimen-level data
  • 8. What tools do we need? • An easily updated hierarchical checklist to visualize sampling progress across labs, extractions, sequences; • A specimen-level phylogenetics pipeline that we can use to harvest existing data from NCBI as well as generate ongoing phylogenetic snapshots; • A way to automate mapping from specimen data, so that we can visualize (and assess our visualizations of) species distributions in geographic and ecological space; and • A platform for collaboration – a virtual research environment to bring together researchers worldwide
  • 9. I. A hierarchical checklist and sampling progress reports
  • 10. In 2011 • A flat checklist exported from WCM • A set of spreadsheets from collaborating labs inventorying their DNA and sequence collections • A vague idea of what trips are needed. Today • A hierarchical checklist by subgenus, section • A synthesis of what materials and sequences collaborators have on hand, and what taxa are unsampled • A concrete sampling plan with trips and taxa identified* (* Okay, we’re working on this one!)
  • 11. Taxonomy; Specimen(s); DNA extraction(s); Sequence(s); Trace file(s) / contig(s). We are aiming toward a database in which the taxonomy, specimen data, DNA extractions, raw sequencing data and DNA matrices all live together and can be curated and worked on jointly by the community.
  • 12. Taxonomy; Specimen(s); DNA extraction(s); Sequence(s); Trace file(s) / contig(s)
  • 13. Spring 2012: Hierarchical checklist. Taxonomy; Specimen(s); DNA extraction(s); Sequence(s); Trace file(s) / contig(s) !
  • 14. Taxonomy; Specimen(s); DNA extraction(s); Sequence(s); Trace file(s) / contig(s) !
  • 15. Specimen Record; Tissue; Extraction; DNA seq.; Metadata flow; DNA seq.; DNA seq.
  • 16. A centralized workflow • Spreadsheets imported into a single Excel file • Names cleaned (variable) • DNA data summary formula created for each spreadsheet (ca. 5 mins per user) • Names matched to our Scratchpads checklist • All files exported to CSV • Sample sheets and SP checklist imported to R • DNA records added to checklist as nodes that are children to their taxa • Hierarchical checklist exported in text format, with unsampled taxa marked for searching
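The last two steps of the centralized workflow (attaching DNA records as child nodes of their taxa, then exporting an indented text checklist with unsampled taxa marked) can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the section names, taxon names, vouchers, column layout and the `UNSAMPLED` marker are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical checklist rows: (section, accepted taxon name).
CHECKLIST = [
    ("A", "Carex sp1"),
    ("A", "Carex sp2"),
    ("B", "Carex sp3"),
]

# Hypothetical DNA records from collaborators' sheets: (taxon, voucher, lab).
DNA_RECORDS = [
    ("Carex sp1", "Smith 101", "Lab A"),
    ("Carex sp1", "Jones 7", "Lab B"),
    ("Carex sp3", "Lee 55", "Lab A"),
]


def build_report(checklist, dna_records, flag="UNSAMPLED"):
    """Return the checklist as indented text, with DNA records as child lines."""
    by_taxon = defaultdict(list)
    for taxon, voucher, lab in dna_records:
        by_taxon[taxon].append(f"{voucher} ({lab})")

    by_section = defaultdict(list)
    for section, taxon in checklist:
        by_section[section].append(taxon)

    lines = []
    for section in sorted(by_section):
        lines.append(f"sect. {section}")
        for taxon in sorted(by_section[section]):
            records = by_taxon.get(taxon, [])
            if records:
                lines.append(f"  {taxon} [{len(records)} DNA record(s)]")
                lines.extend(f"    {r}" for r in records)
            else:
                # A fixed marker lets unsampled taxa be found with a text search.
                lines.append(f"  {taxon} [{flag}]")
    return "\n".join(lines)


report = build_report(CHECKLIST, DNA_RECORDS)
print(report)
```

Searching the exported text for the marker then lists every taxon that still needs material.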
  • 17. ← Section name; ← Sampled taxon with its DNA vouchers and summaries; ← Unsampled taxon
  • 18. Because Kew has coded geography using TDWG standards, we can export geographic hit-lists
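One way to read "geographic hit-list" is a per-region list of taxa that are still unsampled. The Python sketch below assumes a table mapping each taxon to region codes; the codes `R01` to `R03` are placeholders rather than real TDWG codes, and the data structure is an assumption, not the format of the Kew export.

```python
from collections import defaultdict

# Hypothetical distribution table: taxon -> region codes (placeholder codes).
DISTRIBUTION = {
    "Carex sp1": {"R01", "R02"},
    "Carex sp2": {"R02"},
    "Carex sp3": {"R03"},
}
# Hypothetical set of taxa that already have DNA records.
SAMPLED = {"Carex sp1"}


def hit_lists(distribution, sampled):
    """Map each region code to the sorted unsampled taxa recorded there."""
    regions = defaultdict(list)
    for taxon, codes in distribution.items():
        if taxon in sampled:
            continue
        for code in codes:
            regions[code].append(taxon)
    return {code: sorted(taxa) for code, taxa in sorted(regions.items())}


hits = hit_lists(DISTRIBUTION, SAMPLED)
print(hits)
```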
  • 19.
  • 20. Taxonomy; Specimen(s); DNA extraction(s); Sequence(s); Trace file(s) / contig(s) ! ! ! ?
  • 21. II. A specimen-level phylogenetic pipeline
  • 22.
  • 23. NCBI is a morass of data. Geneious • Query nucleotide database (NCBI) for Organism contains: “Carex”, “Uncinia”, “Schoenoxiphium”, “Kobresia”, “Vesicarex”, or “Cymophyllus” • Export as FASTA, tab-delimited, or XML • XML is the only export that maintains all information in NCBI • It is necessary to obtain data that can be used to connect a sequence to a specimen
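The same six-genus search can be written as a single query string for a direct NCBI nucleotide search. The sketch below uses Entrez's `[Organism]` field tag, which is an assumption about how one would reproduce the Geneious "Organism contains" query outside Geneious; it only builds the string and makes no network call.

```python
# Genera named on the slide.
GENERA = ["Carex", "Uncinia", "Schoenoxiphium", "Kobresia", "Vesicarex", "Cymophyllus"]


def organism_query(genera):
    """Join genus names into one Entrez-style query using the Organism field."""
    return " OR ".join(f"{genus}[Organism]" for genus in genera)


query = organism_query(GENERA)
print(query)
```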
  • 24. Hinchliff and Roalson. 2013. Systematic Biology 62: 205–219.
  • 25. Hinchliff and Roalson. 2013. Systematic Biology 62: 205–219.
  • 26. A workflow for specimen-level multigene datasets from NCBI • Download from NCBI [we used Geneious, but any bulk download is fine] • Parse out collector name, collector number, isolate number, geography • Manually clean collector names (3 days for >6500 records) • Identify specimens by unique combinations of collector name, collector number, isolate • Toss out “accessions” having more than one scientific name • Clean gene region names so that names are not duplicated (30 minutes for >6500 records) • Export datasets to MUSCLE and align; export log file • Manually check alignments and code logfile (D, RC; variable) • Rerun MUSCLE and export RAxML batchfile • Analyze • Screen for non-monophyly; concatenate and continue!
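Two of these steps, identifying specimens by the combination of collector name, collector number and isolate, and tossing specimens whose sequences carry more than one scientific name, lend themselves to a short sketch. The Python below is illustrative only: the record fields and sample values are hypothetical, and the lowercase/strip normalization is a stand-in for the manual name cleaning described on the slide.

```python
from collections import defaultdict

# Hypothetical parsed sequence records.
RECORDS = [
    {"accession": "X1", "collector": "Smith", "number": "101", "isolate": "", "species": "Carex sp1"},
    {"accession": "X2", "collector": "smith ", "number": "101", "isolate": "", "species": "Carex sp1"},
    {"accession": "X3", "collector": "Jones", "number": "7", "isolate": "iso3", "species": "Carex sp2"},
    {"accession": "X4", "collector": "Jones", "number": "7", "isolate": "iso3", "species": "Carex sp3"},
]


def voucher_id(rec):
    """Build a specimen key from collector name, collector number and isolate."""
    parts = (rec["collector"].strip().lower(), rec["number"].strip(), rec["isolate"].strip())
    return "|".join(parts)


def group_by_specimen(records):
    """Group records by voucher ID; set aside IDs tied to more than one name."""
    groups = defaultdict(list)
    for rec in records:
        groups[voucher_id(rec)].append(rec)

    kept, discarded = {}, {}
    for vid, recs in groups.items():
        names = {r["species"] for r in recs}
        target = kept if len(names) == 1 else discarded
        target[vid] = recs
    return kept, discarded


kept, discarded = group_by_specimen(RECORDS)
print(sorted(kept), sorted(discarded))
```

Keeping the discarded groups, rather than dropping them silently, leaves a list that can be checked by hand later.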
  • 27. 6692 sequence records in Cariceae
  • 28. Tab-delimited metadata from NCBI / Geneious is handy, but it lacks almost all the information that could be used as voucher IDs. No way to link sequences to specimens! However, some NCBI records do contain this data. How do we access it?
  • 29. NCBI Specimen Record. The FEATURES/Qualifier1 section has information that allows us to connect sequences to a specific specimen (for example, some records contain the qualifier specimen_voucher). To get this additional information, we need to export the data as an XML file, and parse the data out into a usable tab-delimited file. Other good information to export
  • 30. We parsed the NCBI XML and embedded fields within <qualifiers1> to get voucher, DNA isolate, population variants, country, geographic coordinates, collection date, collector name, and other fields… many informative about the identity of the plants sequenced. To make clean voucher IDs, we used last name, collection number, and DNA isolate (used by some labs). For this analysis, sequences that could not be assigned to a single-species voucher were discarded.
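A rough Python sketch of this kind of parsing is given below. It assumes NCBI's own INSDSeq XML layout (`INSDQualifier_name` / `INSDQualifier_value` pairs), not the Geneious export with `<qualifiers1>` that the slides describe, so the element names would need adapting for that file. The sample record, accession and voucher are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample in NCBI INSDSeq XML layout (assumption: not the Geneious export).
SAMPLE_XML = """<INSDSet>
  <INSDSeq>
    <INSDSeq_accession-version>XX000001.1</INSDSeq_accession-version>
    <INSDSeq_feature-table>
      <INSDFeature>
        <INSDFeature_key>source</INSDFeature_key>
        <INSDFeature_quals>
          <INSDQualifier>
            <INSDQualifier_name>organism</INSDQualifier_name>
            <INSDQualifier_value>Carex sp1</INSDQualifier_value>
          </INSDQualifier>
          <INSDQualifier>
            <INSDQualifier_name>specimen_voucher</INSDQualifier_name>
            <INSDQualifier_value>Smith 101 (HERB)</INSDQualifier_value>
          </INSDQualifier>
          <INSDQualifier>
            <INSDQualifier_name>lat_lon</INSDQualifier_name>
            <INSDQualifier_value>45.12 N 93.50 W</INSDQualifier_value>
          </INSDQualifier>
        </INSDFeature_quals>
      </INSDFeature>
    </INSDSeq_feature-table>
  </INSDSeq>
</INSDSet>"""

# Qualifiers worth carrying into the tab-delimited file.
WANTED = ("organism", "specimen_voucher", "isolate", "country",
          "lat_lon", "collection_date", "collected_by")


def parse_records(xml_text):
    """Return one dict per sequence record with the wanted qualifiers."""
    rows = []
    for seq in ET.fromstring(xml_text).iter("INSDSeq"):
        row = {"accession": seq.findtext("INSDSeq_accession-version", default="")}
        for qual in seq.iter("INSDQualifier"):
            name = qual.findtext("INSDQualifier_name")
            if name in WANTED:
                row[name] = qual.findtext("INSDQualifier_value", default="")
        rows.append(row)
    return rows


def to_tsv(rows):
    """Render rows as tab-delimited text; missing qualifiers become empty cells."""
    header = ("accession",) + WANTED
    lines = ["\t".join(header)]
    for row in rows:
        lines.append("\t".join(row.get(col, "") for col in header))
    return "\n".join(lines)


rows = parse_records(SAMPLE_XML)
tsv = to_tsv(rows)
print(tsv)
```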
  • 31. 6692 sequence records → 3004 individuals, 54 genes, 5846 sequences
  • 32. ITS, ETS, matK, trnL-trnF: 3,370 DNA sequences; 2,196 individuals; 723 spp; 397 spp with > 1 individual; 31.7% of those spp monophyletic
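Screening species for monophyly amounts to asking, for each species with more than one individual, whether its tips form exactly one clade. The Python sketch below assumes a rooted tree written as nested tuples with tip labels of the form `species|individual`; both the representation and the toy tree are hypothetical, and real analyses would read the tree from the phylogenetics software's output instead.

```python
def tips(node):
    """Return the set of tip labels below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def clades(node):
    """Yield the tip set of every node in the tree, including tips."""
    yield tips(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def monophyly_report(tree):
    """Map each species with more than one tip to True if its tips form a clade."""
    all_clades = set(clades(tree))
    by_species = {}
    for tip in tips(tree):
        by_species.setdefault(tip.split("|")[0], set()).add(tip)
    return {
        species: frozenset(members) in all_clades
        for species, members in by_species.items()
        if len(members) > 1
    }


# Toy rooted tree: sp1 forms a clade, sp2 and sp3 do not.
TREE = ((("sp1|a", "sp1|b"), "sp2|a"), (("sp3|a", "sp2|b"), "sp3|b"))
result = monophyly_report(TREE)
share = sum(result.values()) / len(result)
print(result, f"{share:.1%} monophyletic")
```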
  • 33.
  • 34. Identify gaps in our knowledge and sampling; Formulate sampling plan; New collections; DNA sequences; DNA matrices; Multiple alignments; Species tree estimates; Revised classification; A central database for specimen-level data
  • 38. III. Generating maps from specimen data
  • 39. Carex macloviana D’Urv; GBIF map, 2013-07-06
  • 40. Mapping GBIF Data • Generate species list to extract GBIF data (i.e. accepted names in World Checklist) • Download GBIF data using a wrapper to dismo::gbif (R), allowing us to capture and log errors and missing data
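The point of the wrapper is that one failed or empty download should be logged and skipped rather than stop the whole run. A minimal Python sketch of that pattern follows; it does not call GBIF or `dismo::gbif`, and the `fetch` argument and `fake_fetch` stand-in are hypothetical placeholders for whatever download function is actually used.

```python
import logging

logger = logging.getLogger("gbif_download")


def download_all(species_list, fetch):
    """Call fetch(name) per species; record errors and empty results separately."""
    results, failures = {}, {}
    for name in species_list:
        try:
            records = fetch(name)
        except Exception as err:  # keep going: one bad name should not end the run
            failures[name] = f"error: {err}"
            logger.warning("download failed for %s: %s", name, err)
            continue
        if not records:
            failures[name] = "no records"
            logger.warning("no records returned for %s", name)
            continue
        results[name] = records
    return results, failures


def fake_fetch(name):
    """Hypothetical stand-in for a real download function."""
    if name == "Carex sp2":
        raise RuntimeError("timeout")
    if name == "Carex sp3":
        return []
    return [{"species": name, "lat": "45.1234", "lon": "-93.5000"}]


results, failures = download_all(["Carex sp1", "Carex sp2", "Carex sp3"], fake_fetch)
print(sorted(results), failures)
```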
  • 41. Clean up downloaded GBIF data • Flag duplicate specimen datasets – Flags specimens within the same species that have identical coordinates. – This should be expanded to include specimens that have identical locality descriptions. • Flag imprecise location data – Flags specimens in which the latitude is precise only to the degree or to a tenth of a degree. – This threshold could be adjusted, but is tailored to the WorldClim database we are using (2.5 arc minutes). • Create a delimited file for each species containing specimen data with flagged columns (a reference file showing which data are utilized or excluded in the mapping step). This file becomes part of our analysis archive, so that we can always go back and edit or evaluate old data.
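The two flags can be sketched in a few lines of Python. A 2.5 arc-minute grid cell is about 0.042 degrees across, while a tenth of a degree is 6 arc-minutes, so latitudes with fewer than two decimal places are coarser than the grid. The sketch assumes coordinates are kept as text so that the number of decimal places survives; the field names, the `min_decimals` parameter and the sample records are hypothetical, and this is not the authors' `clean_gbif` code.

```python
from collections import Counter


def decimal_places(text):
    """Count digits after the decimal point in a coordinate string."""
    text = text.strip()
    return len(text.split(".")[1]) if "." in text else 0


def flag_records(records, min_decimals=2):
    """Add 'duplicate' and 'imprecise' flags to each record without dropping any."""
    def key(rec):
        return (rec["species"], rec["lat"].strip(), rec["lon"].strip())

    counts = Counter(key(rec) for rec in records)
    flagged = []
    for rec in records:
        flagged.append({
            **rec,
            # Every member of a same-species, same-coordinate group is flagged.
            "duplicate": counts[key(rec)] > 1,
            # Latitude given only to the degree or a tenth of a degree.
            "imprecise": decimal_places(rec["lat"]) < min_decimals,
        })
    return flagged


SAMPLE = [
    {"species": "Carex sp1", "lat": "45.1234", "lon": "-93.5000"},
    {"species": "Carex sp1", "lat": "45.1234", "lon": "-93.5000"},
    {"species": "Carex sp1", "lat": "45.1", "lon": "-93.2"},
    {"species": "Carex sp2", "lat": "45", "lon": "-93"},
]
flagged = flag_records(SAMPLE)
for row in flagged:
    print(row)
```

Flagging rather than deleting matches the slide's intent of keeping a reference file that records what was used and what was excluded.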
  • 42. Example of a file generated from clean_gbif
  • 43. Mapping "cleaned-up" dataset (Map_gbif_jpeg_imprecise) • Maps need to be manually checked for accuracy and completeness • We export the maps as images to a Scratchpads media gallery that can be queried or filtered by taxon • Map reviewing is conducted in a dedicated SP2 forum
  • 44.
  • 45. There are bugs to work out, though. Some taxa are missing data. Example: Carex humilis • Map of 2331 specimen records from R code download • Website individual species download – Filtered for specimens with coordinate data (= 7209 records) – Missing records include some from France, Japan, & South Korea
  • 46. Some maps will need adjustments: in next iterations, it should be possible to automate some of this. Carex alata specimen is missing a “-” in longitude column. Carex lanceolata has specimens where the latitude and longitude are switched.
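One possible way to automate part of this is to test each out-of-range point against the species' expected extent and see whether negating the longitude or swapping the two values would bring it back inside. The Python sketch below assumes a simple bounding box per species; the box values and function names are hypothetical, and boxes that cross the antimeridian are not handled.

```python
def in_box(lat, lon, box):
    """True if the point lies inside box = (south, north, west, east)."""
    south, north, west, east = box
    return south <= lat <= north and west <= lon <= east


def suggest_fix(lat, lon, box):
    """Suggest a correction for a point that falls outside the expected box."""
    if in_box(lat, lon, box):
        return "ok"
    if in_box(lat, -lon, box):
        return "negate longitude"
    if in_box(lon, lat, box):
        return "swap latitude and longitude"
    return "check manually"


# Hypothetical expected extent for one species: (south, north, west, east).
EXPECTED_BOX = (25.0, 50.0, -100.0, -65.0)

print(suggest_fix(40.0, -80.0, EXPECTED_BOX))
print(suggest_fix(40.0, 80.0, EXPECTED_BOX))
print(suggest_fix(-80.0, 40.0, EXPECTED_BOX))
```

The output is only a suggestion; the manual map review described earlier would still decide whether to apply it.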
  • 47. In the end, integrating clean coordinate data with WorldClim climatic data allows us to correlate climatic niche evolution with morphological and lineage diversification*. * See Thursday talk for exciting findings in subgenus Vignea!
  • 48. We’ve been writing these tools in R, for the simple reason that that’s what we know. Bits could easily be ported to PHP for integration into Scratchpads, or Python for web implementation. Code is available at: https://mor-systematics.googlecode.com/svn/trunk/cariceae
  • 49. Identify gaps in our knowledge and sampling; Formulate sampling plan; New collections; DNA sequences; DNA matrices; Multiple alignments; Species tree estimates; Revised classification; A central database for specimen-level data
  • 50.
  • 51.
  • 52. If there is time, I’ll take questions!