This document discusses a schema for describing and exchanging the content of taxonomic publications in a way that allows both human and machine access. It proposes using semantic markup like XML to tag elements in publications like names, descriptions, and references in a way that links related data across sources. This would allow content to be more accessible for tasks like data mining while maintaining context. The schema is part of ongoing work by Plazi to apply semantic markup to digitize existing publications and structure new ones for improved dissemination and reuse of taxonomic knowledge.
20110725 ibc xml
1. A Schema for Description and Exchange of Taxonomic Publications' Content. Donat Agosti, Terry Catapano, Lyubomir Penev & Guido Sautter. Plazi, Bern, Switzerland. 25 July 2011, IBC, Melbourne
5. “JSTOR's the one that should be in prison, man, for locking up knowledge.” HuffPost Politics, July 19, 2011. http://www.huffingtonpost.com/2011/07/19/huffpost-hill----gang-vio_n_904027.html
11. Extracted graph of 30,000+ relationships and 5,500 genes and proteins: “protein-protein interaction networks”. John Wilbanks, Neurocommons
12. In a semantic Web environment (where machines talk to each other and do most of our work), data need to be able to talk to each other. Figure labels: 27,266 papers; 128,437 papers; 41,985 papers; 4,563 papers; 10,365 papers. “protein-protein interaction networks”. John Wilbanks, Neurocommons
16. It is about digesting millions of pages: >>100 M pages of taxonomic literature; 25M scientific publications / year; 25K journals, >2K with zoological taxonomic descriptions; 18K descriptions of new species / year
22.
<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie Bihn &amp; Verhaagh, new species
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus described below from paratypes.) Median clypeus ....</tax:p>
  </tax:div>
</tax:treatment>
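Once the description paragraph is tagged, a machine can pull the measurements out of it. The following is only a minimal sketch of that idea, not part of the schema or of Plazi's tooling: it assumes the text of the `tax:p` element has already been extracted as a string, and it treats any two- or three-letter uppercase code followed by a number as a measurement. The function name `extract_measurements` and the pattern are hypothetical.

```python
import re

# Text taken from the holotype description on the slide (truncated there as well).
DESCRIPTION = (
    "HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, "
    "SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving "
    "to a sharp apical tooth"
)

# Assumption: a measurement is an uppercase code of 2-3 letters,
# whitespace, then an integer or decimal number.
MEASUREMENT = re.compile(r"\b([A-Z]{2,3})\s+(\d+(?:\.\d+)?)")


def extract_measurements(text):
    """Return a dict mapping each measurement code to its value as a float."""
    return {code: float(value) for code, value in MEASUREMENT.findall(text)}


measurements = extract_measurements(DESCRIPTION)
for code, value in measurements.items():
    print(code, value)
```

The codes are kept exactly as printed in the treatment; mapping them to defined characters would need a vocabulary that the slide does not show.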
25. Azteca instabilis would then read like:
<tax:name>
  <tax:xid source="LSID" identifier="urn:lsid:biosci.ohio-state.edu.osuc_concetps:13452"/> <!-- Link to external database -->
  <tax:xmldata> <!-- Normalization of data -->
    <dc:Genus>Azteca</dc:Genus>
    <dc:Species>instabilis</dc:Species>
  </tax:xmldata>
  Azteca instabilis
</tax:name>
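To illustrate how such a tagged name gives a machine three things at once (the external identifier, the normalized name parts, and the name as printed), here is a small sketch using Python's standard `xml.etree.ElementTree`. The namespace URIs are placeholders, since the slides show only the `tax` and `dc` prefixes, and `read_name` is a hypothetical helper rather than anything defined by the schema.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the slides show only the prefixes, not the URIs.
NS = {
    "tax": "http://example.org/ns/tax",
    "dc": "http://example.org/ns/dc",
}

SNIPPET = """
<tax:name xmlns:tax="http://example.org/ns/tax" xmlns:dc="http://example.org/ns/dc">
  <tax:xid source="LSID" identifier="urn:lsid:biosci.ohio-state.edu.osuc_concetps:13452"/>
  <tax:xmldata>
    <dc:Genus>Azteca</dc:Genus>
    <dc:Species>instabilis</dc:Species>
  </tax:xmldata>
  Azteca instabilis
</tax:name>
"""


def read_name(xml_text):
    """Extract the identifier, normalized parts, and verbatim text of a tax:name."""
    name = ET.fromstring(xml_text)
    xid = name.find("tax:xid", NS)
    # The verbatim name is the text sitting directly inside tax:name,
    # i.e. outside the tax:xid and tax:xmldata child elements.
    direct_text = [name.text or ""] + [child.tail or "" for child in name]
    return {
        "source": xid.get("source"),
        "identifier": xid.get("identifier"),
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "verbatim": " ".join(" ".join(direct_text).split()),
    }


info = read_name(SNIPPET)
print(info)
```

Keeping the verbatim string separate from the `dc:` elements reflects the slide's point: the printed text stays in context while the normalized data and the link to an external database ride alongside it.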