1. The BARCODE
Data Standard
David E. Schindel, Executive Secretary
National Museum of Natural History
Smithsonian Institution
SchindelD@si.edu; http://www.barcoding.si.edu
202/633-0812; fax 202/633-2938
3. BARCODE Data Standard is:
A set of required elements for a reserved
Keyword (‘BARCODE’) in GenBank
A set of sequence quality requirements
Required or recommended formats for
data interoperability with:
– Voucher specimens in biorepositories
– Georeferenced data
– Taxonomic literature
4. An Internal ID System for All Animals
The Mitochondrial Genome
[Figure: mitochondrial DNA (mtDNA) in a typical animal cell. Labels: mitochondrion; H-strand and L-strand; D-loop; small ribosomal RNA; Cytochrome b; ND1, ND2, ND3, ND4, ND4L, ND5, ND6; COI, COII, COIII; ATPase subunits 6 and 8]
5. Non-COI regions for other taxa
Land plants:
– Chloroplast matK and rbcL approved Nov 09
– 70-75% resolving ability, higher in angiosperms
– Non-coding plastid and nuclear regions being
explored
Fungi:
– CBOL Working Group met this week in Amsterdam
– Agreed to recommend ITS; 72% effective
Protists:
– CBOL Working Group July meeting, Berlin
6. BARCODE Record Flow Chart
[Flow chart: BARCODE records moving between the user, private records, and GenBank. Key: mirroring; update channel]
9. Required Elements for BARCODE
Taxonomic identification to species
Voucher specimen ID in standard format
Name of barcode region
Length, quality, 2 trace files
Forward/reverse primer sequences, names
Country/Ocean/Sea of origin
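The required elements listed above can be read as a completeness check on a record. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical and simply mirror the bullet list, and because the slide gives no numeric thresholds for length or quality, the sketch only checks that each element is present and that at least two trace files are attached.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BarcodeRecord:
    # Hypothetical field names that mirror the slide's required elements.
    species: str = ""
    voucher_id: str = ""
    barcode_region: str = ""
    sequence: str = ""
    trace_files: List[str] = field(default_factory=list)
    forward_primer: str = ""
    reverse_primer: str = ""
    country_or_ocean: str = ""


# Text fields that must be non-empty for the record to be complete.
REQUIRED_TEXT_FIELDS = (
    "species",
    "voucher_id",
    "barcode_region",
    "sequence",
    "forward_primer",
    "reverse_primer",
    "country_or_ocean",
)


def missing_elements(record: BarcodeRecord) -> List[str]:
    """Return the names of required elements that are absent from the record."""
    missing = [
        name for name in REQUIRED_TEXT_FIELDS
        if not getattr(record, name).strip()
    ]
    # The slide asks for two trace files; no quality threshold is checked here.
    if len(record.trace_files) < 2:
        missing.append("trace_files")
    return missing
```

An empty list from `missing_elements` means every listed element is present; it says nothing about whether the values themselves are correct.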
11. Traditional Taxonomy | GSC Minimum Standards (MI*) | Traditional GenBank
Voucher specimen ID: XXX XXX
Species ID: XXX X X
Identified by: XXX
DNA sequence: XXX XXX
Gene region: XXX
Geographic origin (country, ocean): XXX X
Latitude/Longitude: XXX XXX
Collection date, collector name: XXX XXX
Trace files: XXX XX
Primer information: X XX
12. BARCODE Records in INSDC
[Diagram: a BARCODE record and the data it links to. Specimen metadata: voucher specimen, georeference, habitat, character sets, images, behavior, other genes. Barcode sequence: trace files, primers. Species name: indices (Catalogue of Life, GBIF/ECAT), nomenclators (Zoo Record, IPNI, NameBank), publication links (new species, provisional sp.). Literature citation. Record in databases: BOLD]
13. Compliance with Standard (1)
1.37 million records in BOLD
514,390 BARCODE records in INSDC
395,774 have ordinal name plus Barcode
Index Number for taxonomic ID
– Rapid data release versus time for annotation
– Exposure to data theft, risk of misidentification
– Added value of Linnean name
– Incidence of misidentifications in GenBank
– Danger of circular reasoning
14. Taxonomic Identification
The genus and species combination that
can be found in:
– a taxonomic index such as Catalogue of Life,
Zoological Record or IPNI;
– a taxonomic treatment of a previously
published species name; or
– a published description of the species
Or a provisional label for a potential new
species
16. Taxonomic Content in iBOL Data
iBOL:
– ‘Phase 1’: Org name: Order + BIN; Tentative Name: blank
– ‘Phase 2’: Org name: Order + BIN; Tentative Name: blank
GenBank:
– ‘Phase 0’: Tentative name is in BOLD, unreleased
– ‘Phase 1’: Org name = Order + BIN plus Tentative name
– ‘Phase 2’: Org name = sp. name
17. Unique identifier for the
voucher specimen
In standardized format based on Darwin Core:
– Institutional acronym:Collection code:Specimen number
– Institutional acronym:Specimen number
– personal:Collection code:Specimen number
GTI/CBOL/iBOL Workshop, 7 November 2009
18. Compliance with Standard (2)
514,390 BARCODE records in INSDC
– Traces, primers, length, country, and presence
of voucherID checked by GenBank
99.9% have entry for /specimen_voucher
13,151 have formatted voucher from 38
institutions
– 20 confirmed in biorepositories
– 11 unconfirmed
– 7 unlisted
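The figures on this slide can be put in proportion with a few lines of arithmetic. This is only a sketch using the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted on the slide.
barcode_records = 514_390      # BARCODE records in INSDC
formatted_vouchers = 13_151    # records with a formatted voucher ID
institutions = {"confirmed": 20, "unconfirmed": 11, "unlisted": 7}

# Share of BARCODE records that carry a formatted voucher ID.
formatted_share = formatted_vouchers / barcode_records

# The three institution categories should add up to the 38 institutions cited.
institution_total = sum(institutions.values())

print(f"Formatted vouchers: {formatted_share:.2%} of BARCODE records")
print(f"Institutions accounted for: {institution_total}")
```

The contrast is between the 99.9% of records that have some entry for /specimen_voucher and the much smaller share whose entry follows the structured format.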
19. Darwin Core Triplet
Structured Link to Vouchers
Institutional Acronym : Collection Code : Catalog ID
NHM : LEP : 123456
personal : DHJanzen : SRNP12345
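A colon-separated voucher identifier like the ones above is straightforward to split into its parts. The sketch below is illustrative only: `VoucherID` and `parse_voucher` are hypothetical names, and it assumes that colons never occur inside a component. It also accepts the two-part form (Institutional acronym:Specimen number) listed among the standardized formats.

```python
from typing import NamedTuple, Optional


class VoucherID(NamedTuple):
    institution: str            # institutional acronym, or "personal"
    collection: Optional[str]   # collection code; None for the two-part form
    catalog_id: str             # catalog ID / specimen number


def parse_voucher(text: str) -> VoucherID:
    """Split a colon-separated voucher identifier into its components."""
    parts = [part.strip() for part in text.split(":")]
    if any(not part for part in parts):
        raise ValueError(f"empty component in voucher ID: {text!r}")
    if len(parts) == 3:
        return VoucherID(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return VoucherID(parts[0], None, parts[1])
    raise ValueError(f"expected 2 or 3 components, got {len(parts)}: {text!r}")
```

For example, `parse_voucher("NHM : LEP : 123456")` yields the institution `NHM`, the collection `LEP`, and the catalog ID `123456`; whitespace around the colons is ignored.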
20. Acronym | Institution | City | Country
AMNH | Icelandic Institute of Natural History, Akureyri Division | Akureyri | Iceland
AMNH | American Museum of Natural History | New York | USA
UNL | Universidad Autónoma de Nuevo León | Monterrey, Nuevo León | Mexico
UNL | University of Nebraska State Museum | Lincoln, Nebraska | USA
UNL | Centro de Estratigrafia e Paleobiologia da Universidade Nova de Lisboa | Monte de Caparica | Portugal
ZMK | Zoological Museum, Kristiania | Oslo | Norway
ZMK | Zoologisches Museum der Universität Kiel | Kiel | Germany
ZMK | Zoological Museum, Copenhagen | Copenhagen | Denmark
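The rows above show one acronym shared by several institutions, which is why an acronym alone cannot identify a biorepository. As a rough sketch, grouping the listed rows by acronym makes the collisions explicit; the data is copied from the slide, and the function name is hypothetical.

```python
from collections import defaultdict

# (acronym, institution, country) rows as listed on the slide.
INSTITUTIONS = [
    ("AMNH", "Icelandic Institute of Natural History, Akureyri Division", "Iceland"),
    ("AMNH", "American Museum of Natural History", "USA"),
    ("UNL", "Universidad Autónoma de Nuevo León", "Mexico"),
    ("UNL", "University of Nebraska State Museum", "USA"),
    ("UNL", "Centro de Estratigrafia e Paleobiologia da Universidade Nova de Lisboa", "Portugal"),
    ("ZMK", "Zoological Museum, Kristiania", "Norway"),
    ("ZMK", "Zoologisches Museum der Universität Kiel", "Germany"),
    ("ZMK", "Zoological Museum, Copenhagen", "Denmark"),
]


def ambiguous_acronyms(rows):
    """Return the acronyms claimed by more than one institution."""
    by_acronym = defaultdict(list)
    for acronym, institution, country in rows:
        by_acronym[acronym].append((institution, country))
    return {acr: insts for acr, insts in by_acronym.items() if len(insts) > 1}


collisions = ambiguous_acronyms(INSTITUTIONS)
for acronym, insts in collisions.items():
    print(acronym, "is used by", len(insts), "institutions")
```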