Reasoning over phenotype diversity, character change, and evolutionary descent
1. Reasoning over phenotype diversity, character change, and evolutionary descent
Hilmar Lapp
National Evolutionary Synthesis Center (NESCent)
Seminar at University of Florida, March 1, 2011
2. Regier et al (2010)
Parfrey et al (2010), Parfrey & Katz
Life has evolved a stunning diversity of phenotypes
Images: Web Tree of Life (http://tolweb.org)
7. As complex free text, phenotypes are resistant to computing
(Lundberg and Akama 2005)
8. Finding similar information in free-text is difficult
“lacrymal bone...flat” Mayden 1989
“lacrimal...small, flat” Grande and Poyato-Ariza 1999
“lacrimal...triangular” Royero 1999
“first infraorbital (lachrimal) shape...flattened” Kailola 2004
“fourth infraorbital...anterior and posterior margins...in parallel” Zanata and Vari 2005
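The difficulty can be illustrated with a minimal Python sketch that searches the quoted descriptions above. The exact-match term and the regular expression are illustrative choices, not something the talk prescribes; the point is only that plain string matching misses the spelling variants, and that no spelling pattern can recognize "fourth infraorbital" as a related structure.

```python
import re

# Character descriptions as quoted on the slide (abbreviated there too).
descriptions = {
    "Mayden 1989": "lacrymal bone...flat",
    "Grande and Poyato-Ariza 1999": "lacrimal...small, flat",
    "Royero 1999": "lacrimal...triangular",
    "Kailola 2004": "first infraorbital (lachrimal) shape...flattened",
    "Zanata and Vari 2005": "fourth infraorbital...anterior and posterior margins...in parallel",
}


def exact_hits(term):
    """Sources whose text contains the term verbatim."""
    return sorted(src for src, text in descriptions.items() if term in text)


def variant_hits(pattern):
    """Sources whose text matches a hand-written spelling pattern."""
    rx = re.compile(pattern)
    return sorted(src for src, text in descriptions.items() if rx.search(text))


# Verbatim search finds only the sources that spell the word the same way.
exact = exact_hits("lacrimal")

# A hand-tuned pattern (an assumption of this sketch) recovers the
# lacrymal / lacrimal / lachrimal spellings, but still knows nothing about
# which infraorbital bone is meant.
variants = variant_hits(r"lac(?:r[iy]|hri)mal")

print(exact)
print(variants)
```

Even the hand-tuned pattern only handles spelling; relating these descriptions by meaning needs shared, well-defined terms.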
9. Meaning of words depends on context
Mole:
• Burrowing insectivorous mammals in the family Talpidae
• A spy buried secretly within an organization or country
• The SI unit used in chemistry for the amount of a substance
• A small, sometimes raised area of skin, usually with darker pigment
• A Mexican sauce made from chili peppers and other spices, including chocolate
• A massive structure, usually of stone, used as a pier, jetty, or breakwater between places separated by water
10. What is an ontology?
• An ontology is a type of vocabulary with well-defined terms and the logical relationships that hold between them.
• An ontology represents the knowledge about its subject domain.
11. Ontologies support reasoning
• Relationships (“assertions”) induce a hierarchical structure.
• Ontologies can be processed by machines to make inferences.
12. The same principles apply to anatomy
[Figure: ontology graph with the terms pharyngeal arch, ventral hyoid arch, cartilage, replacement bone, basihyal element, basihyal cartilage, and basihyal bone, connected by is_a, part_of, and develops_from relationships]
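How a machine derives new statements from such a graph can be sketched in a few lines of Python. The term names come from the figure, but the specific edges in `ASSERTED` are assumptions made for this sketch (the extracted slide does not preserve which term connects to which), and the two rules used here (transitivity of is_a and part_of, and propagation of part_of across is_a) are a deliberately small subset of what a real ontology reasoner applies.

```python
from itertools import product

# Hypothetical asserted edges; term names are from the figure, the
# connections between them are assumed for illustration.
ASSERTED = {
    ("basihyal bone", "is_a", "basihyal element"),
    ("basihyal bone", "is_a", "replacement bone"),
    ("basihyal bone", "develops_from", "basihyal cartilage"),
    ("basihyal cartilage", "is_a", "basihyal element"),
    ("basihyal cartilage", "is_a", "cartilage"),
    ("basihyal element", "part_of", "ventral hyoid arch"),
    ("ventral hyoid arch", "part_of", "pharyngeal arch"),
}


def infer(triples):
    """Apply two rules until nothing new appears:
    1. is_a and part_of are each transitive;
    2. part_of propagates across is_a in either order.
    develops_from is left untouched by this sketch."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, r1, b), (c, r2, d) in product(list(closed), repeat=2):
            if b != c:
                continue
            if r1 == r2 and r1 in ("is_a", "part_of"):
                new = (a, r1, d)
            elif {r1, r2} == {"is_a", "part_of"}:
                new = (a, "part_of", d)
            else:
                continue
            if new not in closed:
                closed.add(new)
                changed = True
    return closed


# Only the statements that were derived, not asserted.
inferred = infer(ASSERTED) - ASSERTED
for triple in sorted(inferred):
    print(triple)
```

Under these assumed edges, the sketch derives, for example, that the basihyal bone is part of the pharyngeal arch, although nobody asserted that directly.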
16. Computing example: Search by Similarity
Fig. 3, Washington et al (2009)
Trogloglanis pattersoni - a blind catfish
http://tolweb.org/Trogloglanis/69910
Fig. 1, Washington et al (2009)
18. Knowledge mining & hypothesis generation
Model organism: Mutagenesis; Mutant or missing protein at specific developmental stage; Phenotype change(s) to wildtype
Non-model organisms: Mutation, selection, drift, gene flow; Altered expression or function of protein; Phenotype changes between evolutionary lineages
[Images: Order Siluriformes, Pimelodus maculatus; Order Characiformes, Catoprion mento; labels: anterior nuchal plate, middle nuchal plate, spinelet, predorsal spine, abdominal scutes; scale 2 cm; Laue et al (2008)]
19. Phenoscape
• Collaboration between P. Mabee (PI, U. South Dakota), M. Westerfield (ZFIN), and Todd Vision (UNC, NESCent)
• Aim: Foster devo-evo synthesis by
  • Prototyping a database of curated, machine-interpretable evolutionary phenotypes.
  • Integrating these with mutant phenotypes from model organisms.
  • Enabling data-mining and discovery for candidate genes of evolutionary phenotype transitions.
• Informatics for the project is developed and hosted at NESCent
21. Entity-Quality Model for Evolutionary Phenotypes
Character State (Entity / Attribute / Value): ectopterygoid / shape / rectangular
implies
Phenotype (Entity (TAO) / Quality (PATO)): ectopterygoid / rectangular
24. Taxon phenotype assertion
Batrochoglanis raninus (Taxon ontology term) exhibits some (rectangular (Phenotypic Quality ontology term) inheres_in some ectopterygoid (Anatomy ontology term))
• exhibits: links a taxon to a phenotype
• inheres_in: links a quality to the entity that is its bearer
• Supporting metadata: Evidence Code, Curator, Specimen, Publication
26. Gene phenotype assertion
edn1tf216b/tf216b (Genotype) influences some (curvature (Phenotypic Quality ontology term) inheres_in some maxilla (Anatomy ontology term))
• influences: links a genotype to a phenotype
• inheres_in: links a quality to the entity that is its bearer
• Supporting metadata: Publication
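The Entity-Quality model and the two assertion types can be captured in a small data structure. The following Python sketch is only an illustration: the class names `Phenotype` and `Assertion`, the helper `from_character_state`, and the text rendering are hypothetical, not the Phenoscape implementation. The rendering borrows the `quality^inheres_in(entity)` notation that appears on the reasoning slides of this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """An EQ phenotype: a quality that inheres in an anatomical entity."""
    quality: str  # Phenotypic Quality (PATO) term
    entity: str   # anatomy (TAO or ZFA) term

    def expression(self):
        return f"{self.quality}^inheres_in({self.entity})"


@dataclass(frozen=True)
class Assertion:
    """Links a taxon (exhibits) or a genotype (influences) to a phenotype."""
    subject: str
    relation: str
    phenotype: Phenotype

    def expression(self):
        return f"{self.subject} {self.relation} some {self.phenotype.expression()}"


def from_character_state(entity, attribute, value):
    """Map a character state to an EQ phenotype. In the slide's example the
    value 'rectangular' already implies the attribute 'shape', so the
    attribute is not stored separately (an assumption of this sketch)."""
    return Phenotype(quality=value, entity=entity)


taxon_assertion = Assertion(
    "Batrochoglanis raninus", "exhibits",
    from_character_state("ectopterygoid", "shape", "rectangular"),
)
gene_assertion = Assertion(
    "edn1tf216b/tf216b", "influences",
    Phenotype(quality="curvature", entity="maxilla"),
)

print(taxon_assertion.expression())
print(gene_assertion.expression())
```

Because both kinds of assertion share the same phenotype structure, evolutionary and mutant phenotypes become directly comparable.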
36. Major taxonomic groups have similar distribution of entities among phenotypes
25 July 2008
37. Some notable differences for skeletal characters
[Bar chart: % Postcranial axial skeleton, % Paired fins, and % Cranium for Clupeiformes, Gonorynchiformes, Gymnotiformes, Characiformes, Cypriniformes, and Siluriformes; axis from 0 to 70. Image from Sabaj-Perez]
38. Substantial overlap between model organism and evolutionary phenotypes
• 4,217 zebrafish phenotypes
• 3,405 evolutionary characters
[Bar chart: Evolutionary characters and Zebrafish phenotypes per anatomical system (hematopoietic, reproductive, musculature, liver and biliary, respiratory, renal, endocrine, immune, digestive, cardiovascular, skeletal, sensory, nervous); axis from 0 to 2500]
40. Wet lab test
(Work by Richard Edmunds)
Ictalurus punctatus
eda expression is lacking in the epidermis
41. Hypothesis generation: Genetic basis for absence of the basihyal bone in Siluriformes
Mutation of brpf1 gene in Danio: Laue et al (2008)
Ictalurus punctatus:
42. Wet lab test
(Work by Richard Edmunds)
Ictalurus punctatus
78 hpf 86 hpf
brpf1 lacks expression in the basihyal
43. The parts to make this work
• Ontologies that capture the knowledge domains
• Efficient data curation workflow
• Expressive and scalable inference engine
44.
• Teleost Anatomy (seeded from Zebrafish)
• Teleost Taxonomy Ontology (based on Eschmeyer’s Catalog of Fishes)
• Phenotypic Quality (PATO)
Dahdul et al (2009), Cover art: K. Luckenbill
46. Getting ontologies right is a challenge
• What is the right axis of classification?
  • Structure versus function
• Relational vs monadic qualities
  • PATO: shape and size vs natural language “Interopercle shape: expanded posteroventrally”
• Different ways to observe or generate a phenotypic quality
  • Color as color hue (radiation quality) or pigmentation (structural quality)
• Relative sizes don’t have a universal reference
47. Curation
Dahdul et al., 2010 PLoS ONE
1. Students: gather publications (scan hard copies, produce OCR PDFs)
2. Students: Manual entry of free text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex
3. Character annotation by experts: Entry of phenotypes and homology assertions using Phenex
4. Consistency checks, upload of data to public view of Phenoscape KB
Curators: Wasila Dahdul, Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton, John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj Pérez
48. KB is based on OBD (Ontology-Based Database)
(C. Mungall, LBL)
[Figure: OBD relational schema diagram with the tables Node, Link, Link_Cardinality, Alias, Description, TagVal, Node_Xref, Description_Xref, Inference_Evidence, and Inference_Evidence_Support. A Link has a subject, predicate, and object node, carries flags such as is_inferred, is_instantiation, is_negated, is_obsolete, and is_metadata, and can be reified as a node]
49.
[Figure 3: Knowledgebase data model. TTO:taxon PHENO:exhibits Phenotype (class expression); Genotype (uid = ZFIN ID) OBO_REL:influences Phenotype and is OBO_REL:variant_of Gene (uid = ZFIN ID). A Phenotype is_a PATO:quality that OBO_REL:inheres_in (and optionally OBO_REL:towards) a {TAO,ZFA}:entity, and may carry a Measurement (value/max/min, unit). Assertions are OBO_REL:posited_by a CDAO:CharacterStateDatum or a ZFIN:Publication and have ECO:evidence and curator(s) (dc:creator). Character data are modeled with CDAO (CharacterStateDataMatrix, Character, CharacterStateDomain, TU), with links to Specimen (dwc:catalogID, dwc:collectionID), PHENO:Publication (dc:abstract, dc:bibliographicCitation, dc:date), and free-text comments]
53. How does reasoning work?
[Figure, built up step by step over slides 53 to 59: graph of asserted and inferred links.
Ontology terms: TAO:dermal bone, TAO:palatoquadrate arch, TAO:maxilla, TAO:ectopterygoid, ZFA:maxilla, PATO:shape, PATO:rectangular, PATO:curvature.
Taxa: TTO:Batrochoglanis raninus (TAXRANK:species) is_a TTO:Batrochoglanis (TAXRANK:genus) is_a TTO:Pseudopimelodidae (TAXRANK:family).
Assertions: TTO:Batrochoglanis raninus exhibits PATO:rectangular^inheres_in(TAO:ectopterygoid); ZFIN:edn1tf216b/tf216b (variant_of ZFIN:edn1) influences PATO:curvature^inheres_in(ZFA:maxilla).
Inferred phenotype classes added in later builds: PATO:shape^inheres_in(TAO:dermal bone) and PATO:shape^inheres_in_part_of(TAO:palatoquadrate arch), each linked by is_a to the more specific phenotypes it subsumes.
Edge types: is_a, part_of, inheres_in, inheres_in_part_of, exhibits, influences, variant_of, has_rank]
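The payoff of this reasoning is that a query for a general phenotype retrieves annotations made with more specific terms. The Python sketch below illustrates that with subsumption over is_a only. The edges in `IS_A` are read from the diagram as far as the extracted text allows and should be treated as assumptions; the function names are hypothetical, and the real knowledgebase precomputes such inferences in the OBD reasoner rather than walking the graph at query time.

```python
# is_a edges, assumed from the diagram (child -> parents).
IS_A = {
    "PATO:rectangular": {"PATO:shape"},
    "PATO:curvature": {"PATO:shape"},
    "TAO:ectopterygoid": {"TAO:dermal bone"},
    "ZFA:maxilla": {"TAO:maxilla"},
    "TAO:maxilla": {"TAO:dermal bone"},
}

# The two asserted annotations from the slide: (subject, relation, Q, E).
ANNOTATIONS = [
    ("TTO:Batrochoglanis raninus", "exhibits", "PATO:rectangular", "TAO:ectopterygoid"),
    ("ZFIN:edn1tf216b/tf216b", "influences", "PATO:curvature", "ZFA:maxilla"),
]


def ancestors(term):
    """The term itself plus everything reachable through is_a."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subjects_with(quality, entity):
    """Subjects annotated with a phenotype subsumed by quality^inheres_in(entity)."""
    return [
        subject
        for subject, _relation, q, e in ANNOTATIONS
        if quality in ancestors(q) and entity in ancestors(e)
    ]


general = subjects_with("PATO:shape", "TAO:dermal bone")
specific = subjects_with("PATO:rectangular", "TAO:dermal bone")
print(general)
print(specific)
```

Under these assumed edges, the general query returns both the catfish taxon and the zebrafish genotype, which is the kind of link between an evolutionary phenotype and a candidate gene that the talk is after.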
60. Future Directions: 1. Reasoning over homology
Entity 1 | Taxon 1 | Relationship | Entity 2 | Taxon 2 | Evidence | Reference(s)
scaphium | Otophysi | homologous_to | neural arch 1 | Teleostei | IDS, IMS, IPS | (Fink and Fink, 1981; Rosen and Greenwood, 1970)
intercalarium | Otophysi | homologous_to | neural arch 2 (ventral portion) | Teleostei | IDS, IMS, IPS | (Rosen and Greenwood, 1970)
intercalarium | Otophysi | homologous_to | neural arch 2 | Teleostei | NAS | (Fink and Fink, 1981)
intercalarium | Otophysi | homologous_to | neural arch 2 | Teleostei | IMS | (Hora, 1922)
intercalarium | Otophysi | homologous_to | rib of vertebra 2 | Teleostei | TAS | (Hora 1922)
tripus | Otophysi | homologous_to | parapophysis + rib of vertebra 3 | Teleostei | IDS, IMS, IPS | (Fink and Fink, 1981; Rosen and Greenwood, 1970)
image by Kyle Luckenbill, ANSP
61. What next?
• Modeling and reasoning over homology
• Efficient searching and scoring of semantic similarity
• Reducing the bottlenecks in data curation
62. Opening descriptive biological data to computing can enable new science
Descriptive biology: Phenotypes, Traits, Function, Behavior, Habitat, Life Cycle, Reproduction, Conservation Threats
[Diagram: descriptive biology shown alongside Taxonomy, Species ID; Conservation Biology; Biodiversity (Specimens, Occurrence records); Ecology; Physiology; Genetics; Genomics, Genetic variation, Gene expression]
63. Acknowledgements
• Phenoscape Personnel & PIs: P. Mabee, M. Westerfield, T. Vision, J. Balhoff, C. Kothari, W. Dahdul, P. Midford
• Phenoscape curators & workshop participants
• Berkeley Bioinformatics & Ontologies Project (BBOP): C. Mungall, S. Lewis
• National Evolutionary Synthesis Center (NESCent)
• NSF (DBI 0641025)
65. Phenotypic similarity matches taxa to candidate genes
Similarity (IC) | Taxon (subsuming taxon with variable phenotypes in subsumed taxa) | Taxon phenotype (one of two or more variable phenotypes) | Candidate Gene(s) | Gene phenotype (zebrafish) | Subsuming phenotype
15.16 | Danio | Danio rerio: epural separated from urostyle | trpm7 | epural, composition | epural, structure
14.45 | Otophysi | Siluriformes: scales, absent | eda | scales, absent | scales, count
13.25 | Siluriformes | Siluriformes: basihyal cartilage, absent | brpf1, disc1 and 10 more | basihyal cartilage, absent | basihyal cartilage, count
10.0 | Tachysurus | process of Meckel’s cartilage, adjacent to coronoid process | edn1, foxd3 and 22 more | Meckel’s cartilage, mislocalized posteriorly | Meckel’s cartilage, position
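The slide does not define how the IC score is computed. A common definition of information content, assumed here, is the negative logarithm of the fraction of annotations that fall under the subsuming phenotype, so that rarer and more specific matches score higher. The Python sketch below uses base-2 logarithms and made-up counts; it does not attempt to reproduce the scores in the table.

```python
import math


def information_content(annotated, total):
    """IC of a term, assuming IC = -log2(annotated / total), where
    `annotated` counts annotations subsumed by the term and `total`
    counts all annotations. Both counts are hypothetical inputs."""
    if not 0 < annotated <= total:
        raise ValueError("need 0 < annotated <= total")
    return -math.log2(annotated / total)


# A subsuming phenotype covering 1 in 8 annotations is more informative
# than one that covers every annotation.
rare = information_content(1, 8)
universal = information_content(8, 8)
print(rare, universal)
```

With this definition, a match on a very general subsuming phenotype contributes almost nothing, which is why specific matches rank at the top of such a table.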
66. Mapping EQs back to characters is a challenge
• Properties of “good” phylogenetic characters:
  • Exclusivity of states
  • Distinguishability of states
  • Independence of characters
• Finding exclusive states requires incompatible phenotypes. How to determine incompatibility?
  • Two phenotypes are incompatible iff they cannot both inhere in the same specimen.
  • Two qualities are incompatible iff an entity cannot bear both.
67. Which EQs and qualities are incompatible?
• Incompatible Qs
  • present vs. absent
  • triangular vs. round
  • absent vs. any other quality
• Compatible Qs
  • present vs. any other quality (except absent)
  • serrated vs. round
  • some colors
• Incompatible EQs
  • (Q inheres_in bone E) vs (cartilage E absent)
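The quality-level rules listed above can be written as a small check. This Python sketch is a hedged illustration: the function name and the `EXCLUSIVE_PAIRS` table are hypothetical, and the table holds only the one mutually exclusive shape pair named on the slide. A real implementation would need to derive such pairs from the quality ontology.

```python
# Pairs of qualities that one entity cannot bear at once. Only the pair
# named on the slide is listed; anything else is treated as compatible.
EXCLUSIVE_PAIRS = {frozenset(("triangular", "round"))}


def qualities_incompatible(q1, q2, exclusive_pairs=EXCLUSIVE_PAIRS):
    """True if an entity cannot bear both qualities, per the slide's rules:
    'absent' conflicts with every other quality (including 'present'),
    and explicitly listed pairs conflict with each other."""
    if q1 == q2:
        return False
    if "absent" in (q1, q2):
        return True
    return frozenset((q1, q2)) in exclusive_pairs


print(qualities_incompatible("present", "absent"))
print(qualities_incompatible("triangular", "round"))
print(qualities_incompatible("serrated", "round"))
print(qualities_incompatible("present", "serrated"))
```

The sketch covers qualities only; the EQ-level case on the slide additionally depends on how the two entities are related, for example a bone and the cartilage element it corresponds to.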
69. System architecture
[Diagram, top to bottom:
• Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript); External web sites and client applications
• Knowledgebase Data Services API (REST)
• OBD Programming API; OBD Reasoner (Java)
• Knowledgebase (OBD) (PostgreSQL)
• Inputs from the OBO Library: Teleost Taxonomy Ontology (TTO); Anatomy Ontologies (ZFA, TAO); Phenotypic Quality Ontology (PATO)
• Inputs from the Zebrafish Model Organism Database: Genes & genotypes; Mutant EQ phenotypes
• Inputs through annotation (NeXML): Homology assertions; Evolutionary EQ Phenotypes
• Phenex (Evolutionary EQ annotation), fed by Skeletal Character Data (from phylogenetic treatments in literature)]
70. Formalizing homology relationships
• Formal pattern is ternary:
E1 in_taxon T1 homologous_to E2 in_taxon T2 as E3 in_taxon T3
• Classifying homology relationships
• 1-1 homology (phylogenetic homology)
• serial homology
• A iso_homologous_to B as C
all A derived_by_descent_from some
(C and has_derived_by_descendent some B)
and
all B derived_by_descent_from some
(C and has_derived_by_descendent some A)
• shares_ancestor_with as a relation chain:
derived_by_descent_from o has_derived_by_descendent
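The relation chain in the last bullet is a composition of two relations. The following Python sketch shows the composition on a toy example. It assumes that has_derived_by_descendent is the inverse of derived_by_descent_from (suggested by the names, not stated on the slide), and the ancestral structure named in the data is a hypothetical placeholder; the two derived structures are taken from the homology table earlier in the talk.

```python
def compose(r, s):
    """Relation chain r o s: (a, c) holds when some b has (a, b) in r
    and (b, c) in s."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}


def inverse(r):
    return {(b, a) for a, b in r}


# Toy data: both structures descend from one hypothetical ancestral
# structure (the ancestor's name is an assumption of this sketch).
derived_by_descent_from = {
    ("scaphium", "ancestral neural arch 1"),
    ("neural arch 1", "ancestral neural arch 1"),
}

# Assumed to be the inverse relation.
has_derived_by_descendent = inverse(derived_by_descent_from)

shares_ancestor_with = compose(derived_by_descent_from, has_derived_by_descendent)
for pair in sorted(shares_ancestor_with):
    print(pair)
```

The composed relation comes out symmetric, and it also relates each structure to itself, which a stricter homology definition might want to exclude.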