3. EOL Today
Key Milestones in 2013
1.1 million species pages
240+ content providers
3.3 million unique annual visitors from 235 countries
4. [Bar chart: number of text objects (axis 0 to 800,000) by subject of text object. Subjects shown: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]
5. Text mining, crowdsourcing, standardizing
see http://eol.org/info/fellows
Co-occurrence, term extraction & linked data: Thessen & Devries
EnvO habitat terms: Pafilis et al.
Altitude Specificity of Flower Coloration: Wright
Morphological impacts of extinction risk in fish: Chang
Butterfly-hostplant associations: Ferrer-Paris et al.
Species Interactions: Poelen & Mungall et al.
6. 14 datasets containing 25k taxa, 422k interactions, for 3k locations
alpha version of ingestion, normalization, aggregation
alpha version of web API
alpha version of data exports
Dr. Katy Börner led Information Visualization MOOC
GloBI http://globalbioticinteractions.wordpress.com/
7. EOL TraitBank
Funded: Marine focus
Virtuoso triple store, re-using URIs where possible
5 datasets 128,050 data points for 20,896 taxa
Harvest and display on data tab
Downloads, fancy searching
Machine access
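As a rough illustration of what storing trait data as triples could look like, the sketch below models each data point as a (subject, predicate, object) tuple in plain Python. The URIs and the `triples_for` helper are hypothetical placeholders chosen for this example; they are not TraitBank's actual vocabulary, and nothing here uses the Virtuoso API.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the taxon the statement is about
    predicate: str  # URI of the trait, ideally re-used from an existing vocabulary
    obj: str        # the value: a literal or another URI


# Hypothetical example URIs, for illustration only.
TAXON = "http://example.org/taxon/1"
HABITAT = "http://example.org/terms/habitat"
BODY_LENGTH = "http://example.org/terms/bodyLength"

store = [
    Triple(TAXON, HABITAT, "http://example.org/terms/marine"),
    Triple(TAXON, BODY_LENGTH, "12.5 cm"),
]


def triples_for(store, subject=None, predicate=None):
    """Return triples matching the given subject and/or predicate (None = wildcard)."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
    ]


habitats = triples_for(store, subject=TAXON, predicate=HABITAT)
print(habitats[0].obj)
```

Re-using an existing URI for a predicate, rather than minting a new one, is what would let data points from different datasets be queried together.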
10. Uploads & harvests will be by spreadsheet
and Darwin Core Archive
Support for annotation and curation
Please contact me to be part of the private beta
11. Easy access to analyzable trait data
“Are blue organisms more common in high altitudes?”
“Does the evolution of mammalian bacula appear to be
related to the pattern of promiscuous mating?”
“What organisms should I collect to fill in gaps in genome
quality tissue collections?”
• Look for trait, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O’Meara)
http://reolblog.wordpress.com/
• Find more specialized data repositories
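The "look for a trait, download for all taxa" workflow above could feed a question like the altitude one directly. The following is a minimal sketch assuming a long-format download with `taxon`, `trait`, and `value` columns; that layout, the trait names, and every row of data are invented for illustration and do not reflect the real export format or real measurements.

```python
import csv
import io

# Made-up rows standing in for a downloaded trait file; not real EOL data.
DOWNLOAD = """taxon,trait,value
Taxon A,flower color,blue
Taxon A,altitude (m),2400
Taxon B,flower color,red
Taxon B,altitude (m),300
Taxon C,flower color,blue
Taxon C,altitude (m),1900
"""


def load_traits(text):
    """Pivot long-format rows into {taxon: {trait: value}}."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table.setdefault(row["taxon"], {})[row["trait"]] = row["value"]
    return table


def mean_altitude(table, color):
    """Mean altitude of taxa whose flower color equals `color`, or None if no taxa match."""
    alts = [
        float(traits["altitude (m)"])
        for traits in table.values()
        if traits.get("flower color") == color and "altitude (m)" in traits
    ]
    return sum(alts) / len(alts) if alts else None


traits = load_traits(DOWNLOAD)
blue_mean = mean_altitude(traits, "blue")
red_mean = mean_altitude(traits, "red")
print(blue_mean, red_mean)
```

A real analysis would need many more taxa and a proper statistical test; the point is only that one tabular download per trait makes this kind of comparison a few lines of code.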
13. Thanks
Funding & other contributions
Sloan Foundation
Smithsonian Institution
David Rubenstein
Marine Biological Laboratory
Harvard University
Our content partners
Thousands of individual
contributors, and hundreds of
volunteer curators
Image credits
Jenny from Taipei
Cynthia Parr
Chief Scientist @eol
@cydparr parrc@si.edu
Alexandria Archive: Sarah Kansa, Eric Kansa, 34 other zooarchaeologists
GloBI: Jorrit Poelen (lead/software), Chris Mungall (ontologies), James Simons (biologist) and Robert Reiz (software). Datasets shared by: Peter D. Roopnarine, Rachel Hertog, Carlos García-Robledo, James Simons, Jenny L. Wrast, C. Barnes, International Council for the Exploration of the Sea (ICES), Jose R. Ferrer Paris, Senol Akin, Malcolm Storey (BioInfo.org.uk), Ivy E. Baremore, Joel Sachs (SPIRE), Colt W. Cook, David A. Blewett
14. Quick math
In Phenoscape
57 publications had 565,158 anatomical trait
descriptions for 2,527 kinds of organisms
≈ 224 traits/organism
In ZFIN
38,189 trait descriptions for 4,727 genes for zebrafish
1.9 million species on the planet
= LOTS OF TRAITS
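The arithmetic behind this slide can be checked in a few lines. The sketch below uses only the figures quoted above; the final extrapolation to all species is a naive assumption added here to show the order of magnitude, not a number from the slide.

```python
# Figures quoted on the slide.
phenoscape_descriptions = 565_158
phenoscape_taxa = 2_527
zfin_descriptions = 38_189
zfin_genes = 4_727
species_on_planet = 1_900_000

# Average anatomical trait descriptions per kind of organism in Phenoscape.
traits_per_organism = phenoscape_descriptions / phenoscape_taxa

# Average trait descriptions per gene in ZFIN (a single species).
descriptions_per_gene = zfin_descriptions / zfin_genes

# Naive extrapolation (an assumption): every species described as densely
# as the Phenoscape sample.
extrapolated = traits_per_organism * species_on_planet

print(f"Phenoscape: {traits_per_organism:.1f} descriptions per organism")
print(f"ZFIN: {descriptions_per_gene:.1f} descriptions per gene")
print(f"Extrapolated: {extrapolated / 1e6:.0f} million descriptions")
```

The Phenoscape ratio works out to roughly 223.6 descriptions per organism, and the extrapolation lands in the hundreds of millions, which is the sense in which the slide says "lots of traits".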
15. Anatolia Zooarchaeology Case Study led by
Alexandria Archive Institute
1. 14 different sites
2. 34+ zooarchaeologists
3. Decoding, cleanup, metadata documentation
4. 220,000+ specimens
5. 450 entities linked to 143 EOL taxon concepts
6. Anatomical entities linked to Uberon.org
7. Biometrics linked to measurement ontology
8. Collaborative analysis
http://opencontext.org/
Editor's Notes
We have a working infrastructure as well as more than 200 partners. We harvest and sort text and multimedia by topic and by species and put it on our pages. Curation and user-added content from the crowd are added to the mix. This is fed back to providers, giving them traffic, quality control on their own content, and new content for them to use. And we are already seeing spinoff products. We make it easy for developers, and everything is either public domain or CC-licensed so it can be re-used.
We now have over a million pages with content, some of it in other languages such as Arabic, Spanish, and Chinese. We are getting traffic mostly from the general public, from all over the world.
Most of our 5.4 million content objects are text blobs and here are the subjects of that text. Most often, our text objects are about distribution. But there are many other subjects involved including essays that include multiple subjects.
Except for the first project; links for that one are available on request.
In the Information Visualization MOOC (Massive Open Online Course) led by Dr. Katy Börner of Indiana University, students Twy Bethard (United States), Andrew Miles (United Kingdom), Edward Kok (Netherlands) and Mattia Della Libera (Italy) used GloBI data to create an insightful visualization of spatial marine food webs in the Gulf of Mexico.
Starting with marine data. In the most simplistic view, we’ll be storing triples. This data will be organized on a data tab, sorted into the 35 or so “topics” that we currently have text chapters for, and we will also allow powerful downloading and searching capability. Finally, we’ll be setting up ways for other applications to grab the data and do interesting things with it. We already have a tool for making field guides. The approach here builds on our innovations for EOL and adds some proven technology called the “semantic web” to our domain. The next step takes this chain of innovation even further.
Drawing data from the literature, from online databases, and from published datasets as in Dryad, summarizing collections databases
Everyone wants to know the attributes of organisms. People exploring the world find something and want to be able to search on characteristics they can see. Teachers want their students to become adept at analyzing data, and how better than to work with real numerical information about the size of organisms, their behavior, or their sensitivity to temperature and what might happen in the face of climate change? So while scientists were saying they needed us to provide data they could analyze, we heard the same thing from our educators, too.
Phenoscape is a database of anatomical traits in fishes. From just 57 publications, they have more than 500K descriptions for about 2,500 kinds of organisms. ZFIN is a model organism database for zebrafish, a common model organism for developmental biologists. In just this one species they have captured nearly 40,000 trait descriptions, and that is just for ONE very well-studied SPECIES.