1. Book club
Andreas Wagner,
The Origins of Evolutionary Innovations
Chapter 4
Book club presented by G. M. Dall'Olio,
Pompeu Fabra, IBE-CEXS
2. Reminder:
Genotype network
A genotype network is a set of genotypes that have the same
phenotype and are connected to one another by single mutations
AAAAA AAAAC AAAAG AAAAT AAATT
AAACA AAACC AAACG AAACT AAATC
AACCA AACCC AACCG AACCT …..
ACCCA ACCCC ACCCG ACCCT …..
CCCCA CCCCC CCCCG CCCCT …..
….. ….. ….. ….. …..
Yellow = same phenotype = a genotype network
Note: genotype network == neutral network
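A minimal sketch of the definition above: starting from one genotype, follow single-letter changes and keep every genotype that has the same phenotype. The phenotype rule used here (`toy_phenotype`) is a made-up stand-in chosen only for illustration; it is not from the book.

```python
from collections import deque

ALPHABET = "ACGT"


def neighbors(genotype, alphabet=ALPHABET):
    """Yield every sequence that differs from `genotype` at exactly one position."""
    for i, current in enumerate(genotype):
        for letter in alphabet:
            if letter != current:
                yield genotype[:i] + letter + genotype[i + 1:]


def genotype_network(start, phenotype):
    """Breadth-first search over single mutants that share the phenotype of `start`."""
    target = phenotype(start)
    seen = {start}
    queue = deque([start])
    while queue:
        genotype = queue.popleft()
        for mutant in neighbors(genotype):
            if mutant not in seen and phenotype(mutant) == target:
                seen.add(mutant)
                queue.append(mutant)
    return seen


def toy_phenotype(genotype):
    """Hypothetical rule: 'yellow' when the sequence has at least three A's."""
    return "yellow" if genotype.count("A") >= 3 else "other"


network = genotype_network("AAAAA", toy_phenotype)
print(len(network), "genotypes in the network of AAAAA")
```

Any function that maps a sequence to a phenotype label can be passed in place of `toy_phenotype`.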
3. Genotype Networks
better representation!
The Genotype Space can be represented as a Hamming Graph
https://bitbucket.org/dalloliogm/genotype_space
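As a rough illustration of the Hamming graph idea (not the code in the repository linked above), the sketch below enumerates all sequences of a given length and links each one to the sequences one substitution away. The function name `hamming_graph` is an assumption made for this example.

```python
from itertools import product


def hamming_graph(length, alphabet="ACGT"):
    """Return {sequence: [sequences at Hamming distance 1]} for the whole space."""
    graph = {}
    for letters in product(alphabet, repeat=length):
        node = "".join(letters)
        graph[node] = [
            node[:i] + letter + node[i + 1:]
            for i in range(length)
            for letter in alphabet
            if letter != node[i]
        ]
    return graph


graph = hamming_graph(3)
n_edges = sum(len(adjacent) for adjacent in graph.values()) // 2
print(len(graph), "nodes,", n_edges, "edges")
```

Each node has length × (alphabet size − 1) neighbors, so the space grows quickly and full enumeration is only practical for short sequences.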
4. Chapter 4:
Novel Molecules
This chapter describes the relationship between
protein/RNA sequence and tertiary structure
Many RNAs and proteins share the same fold but have
different sequences
5. Novel Molecules,
definitions (1)
Genotype:
def 1: the amino acid sequence of a protein
(or its pattern of hydrophobic/hydrophilic residues)
def 2: the nucleotide sequence of an RNA
7. A genotype space of
sequences (simplified)
O = any hydrophobic amino acid
Y = any hydrophilic amino acid
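The simplification above can be sketched as a mapping from a protein sequence to an O/Y string. Which residues count as hydrophobic varies between classifications; the set `HYDROPHOBIC` below is an assumption for this example, not the one used in the book.

```python
# Assumed hydrophobic residues (one-letter codes); other schemes differ.
HYDROPHOBIC = set("AVLIMFWC")


def to_oy_pattern(protein):
    """Reduce a protein sequence to O (hydrophobic) / Y (hydrophilic)."""
    return "".join("O" if residue in HYDROPHOBIC else "Y" for residue in protein.upper())


pattern = to_oy_pattern("MKVLAAGE")  # made-up peptide
print(pattern)
```

With only two letters, a sequence of length n has 2**n possible genotypes instead of 20**n, which is what makes the simplified space small enough to enumerate.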
8. Novel Molecules
definitions (2)
Phenotype:
The fold of a protein sequence
The secondary structure of a RNA molecule
9. Protein Structures
It is also possible to
predict the fold of a
protein
But it is difficult, so
here we focus on
“lattice models”
In a lattice model, we
only distinguish
hydrophobic from
hydrophilic amino acids
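To give an idea of how a lattice model scores a fold, here is a sketch of a generic two-letter contact energy on a 2D square lattice: each pair of hydrophobic residues on adjacent lattice sites, but not adjacent in the chain, contributes -1. The 2D lattice and the -1 score are assumptions for illustration and not necessarily the exact model used in the studies cited in these slides.

```python
def is_self_avoiding_walk(coords):
    """True if no site is visited twice and consecutive sites are lattice neighbors."""
    if len(set(coords)) != len(coords):
        return False
    return all(
        abs(x1 - x2) + abs(y1 - y2) == 1
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )


def contact_energy(sequence, coords):
    """Energy of an O/Y sequence placed on the lattice sites listed in `coords`."""
    if len(sequence) != len(coords) or not is_self_avoiding_walk(coords):
        raise ValueError("coords must be a self-avoiding walk matching the sequence")
    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "O":
            continue
        # Look only right and up so that each adjacent pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "O":
                energy -= 1
    return energy


folded = contact_energy("OYYO", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = contact_energy("OYYO", [(0, 0), (1, 0), (2, 0), (3, 0)])
print(folded, extended)
```

In this kind of model, the phenotype of a sequence is the conformation with the lowest energy among all self-avoiding walks.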
11. More sequences than folds
Li et al, 1996: study on lattice protein models:
There are many more protein sequences than folds
Some phenotypes are formed by more sequences
than others
Sequences that produce the same fold can be very
different
Rost, 1997: study on 272 proteins with similar
folds. They shared only 8.5% of their amino acid sequence
12. There are many more
protein sequences than
protein folds
Globins are a very common protein domain
Most globins have different sequence, but the same
fold
Among some hemoglobins, only 12.4% of aa
residues are identical
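Percentages like the ones above are sequence identities between aligned proteins. A minimal sketch of one common way to compute them follows; the choice to ignore gapped columns is an assumption (other definitions use a different denominator), and the two sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over alignment columns without gaps ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


identity = percent_identity("MVLSPADK", "MGLSDGEW")  # made-up sequences
print(identity)
```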
13. Do globins have a common
origin?
Bailly, X., Chabasse, C.,
Hourdez, S., Dewilde, S., Martial,
S., Moens, L. and Zal, F. (2007),
Globin gene family evolution and
functional diversification in
annelids. FEBS Journal, 274:
2641–2652. doi: 10.1111/j.1742-4658.2007.05799.x
Goodman M, Pedwaydon J,
Czelusniak J, Suzuki T, Gotoh T,
Moens L, Shishikura F, Walz D,
Vinogradov S. An evolutionary
tree for invertebrate globin
sequences. J Mol Evol.
1988;27(3):236-49. PubMed
PMID: 3138426.
14. Some folds are more
common than others
Some folds can be obtained by a higher number of
sequences than others
Number of protein sequences per structure (Ferrada,
Wagner 2010):
Ferrada, E. & Wagner, A., 2010. Evolutionary innovations and the organization of protein functions in genotype space. PloS one, 5(11), p.e14172. Available at:
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2994758&tool=pmcentrez&rendertype=abstract
15. The 10 most structurally
promiscuous functions
Promiscuity of a function: when the function can be
obtained by different structures/sequences
Ferrada, E. & Wagner, A., 2010.
Evolutionary innovations and the
organization of protein functions in
genotype space. PloS one, 5(11),
p.e14172. Available at:
http://www.pubmedcentral.nih.gov/article
16. Genotype networks of
protein sequences
Sequences that have
the same fold tend to
be connected in a
genotype network
(from Li et al, 1996)
This resembles figure
1 (above) more than
figure 2 (below)
17. RNA structures
RNA secondary structures can be predicted in silico
http://rna.ucsc.edu/rnacenter/ribosome_images.html
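As a toy illustration of in silico prediction, the sketch below implements a Nussinov-style dynamic program that maximizes the number of base pairs. This is a deliberate simplification: real prediction tools minimize free energy instead, and the minimum loop size of 3 and the allowed G-U pair are assumptions made for this example.

```python
PAIRS = ({"A", "U"}, {"G", "C"}, {"G", "U"})


def can_pair(a, b):
    return {a, b} in PAIRS


def nussinov(seq, min_loop=3):
    """Return a dot-bracket structure with the maximum number of nested pairs."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: base j stays unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: rebuild one optimal structure from the score table.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


print(nussinov("GGGAAAUCC"))
```

The dot-bracket string plays the role of the phenotype: two sequences with the same string have the same secondary structure under this simplified model.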
18. RNA structure videogame
There is even a
videogame on
predicting RNA
structure:
http://eterna.cmu.edu/
So, predicting RNA
structures is
(relatively) easy
19. Innovations in RNA folds
All the observations made for protein sequences are
also valid for RNA, on a larger scale:
On average, 400 million RNA seqs per fold
Very long RNA sequences tend to have similar folds
20. There are many more RNA
sequences than RNA folds
Size rank of genotype set by frequency
Wagner, A., 2008. Robustness and evolvability: a paradox resolved. Proceedings. Biological sciences / The Royal Society, 275(1630), pp.91-100. Available at:
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2562401&tool=pmcentrez&rendertype=abstract
21. Frequent RNA structures
def. frequent RNA structure: an RNA structure that
can be obtained by > 5000 sequences
Only 10% of RNA structures are frequent
93% of RNA sequences belong to frequent RNA
structures
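The two percentages above are different quantities: one counts structures, the other counts sequences. A small sketch with made-up counts (the dictionary below is purely hypothetical and does not reproduce the data behind the slide) shows how both are computed from a table of sequences per structure.

```python
def frequency_summary(sequences_per_structure, threshold=5000):
    """Return (fraction of structures that are frequent,
    fraction of sequences folding into frequent structures)."""
    frequent = {s: n for s, n in sequences_per_structure.items() if n > threshold}
    structure_fraction = len(frequent) / len(sequences_per_structure)
    sequence_fraction = sum(frequent.values()) / sum(sequences_per_structure.values())
    return structure_fraction, sequence_fraction


# Hypothetical counts, for illustration only.
toy_counts = {"((..))": 12000, "(...)": 6000, "(....)": 1500, "......": 500}
structure_fraction, sequence_fraction = frequency_summary(toy_counts)
print(structure_fraction, sequence_fraction)
```

Because a few structures collect most of the sequences, the second fraction can be much larger than the first.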
22. RNA sequences can
withstand a lot of changes,
without modifying the fold
Maximal genotype distance in an RNA genotype network:
A. Wagner, The Origins of Evolutionary Innovations. Figure 4.6
23. RNA sequences can
withstand a lot of changes,
without modifying the fold
Different sequence, same fold:
http://eterna.cmu.edu/
24. Neighbors of points in the
genotype network
Most neighbors of sequences in the space have the
same fold
A. Wagner, The Origins of Evolutionary Innovations. Figure 4.7
25. Neighbors of points in the
genotype network
Most neighbors of sequences in the space have the same
fold
This means that the genotype network of an RNA fold is
usually dense
An RNA genotype network is more likely to resemble Fig 1 than Fig 2:
Fig 1 Fig 2
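One way to quantify how dense a genotype network is around a given sequence is the fraction of its single mutants that keep the same phenotype. The sketch below computes this fraction; the phenotype rule `toy_phenotype` is a hypothetical stand-in, and a real analysis would plug in a structure prediction function instead.

```python
def single_mutants(seq, alphabet="ACGU"):
    """Yield every sequence that differs from `seq` at exactly one position."""
    for i, current in enumerate(seq):
        for letter in alphabet:
            if letter != current:
                yield seq[:i] + letter + seq[i + 1:]


def neutral_fraction(seq, phenotype, alphabet="ACGU"):
    """Fraction of single mutants of `seq` with the same phenotype as `seq`."""
    target = phenotype(seq)
    mutants = list(single_mutants(seq, alphabet))
    same = sum(phenotype(mutant) == target for mutant in mutants)
    return same / len(mutants)


def toy_phenotype(seq):
    """Hypothetical rule: 'paired' if the two end bases form a Watson-Crick pair."""
    ends = {seq[0], seq[-1]}
    return "paired" if ends in ({"A", "U"}, {"G", "C"}) else "open"


fraction = neutral_fraction("GAAAC", toy_phenotype)
print(fraction)
```

A value close to 1 means that most neighbors stay on the same genotype network, which corresponds to the dense picture described above.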
26. Neighbors of genotypes in a
genotype network
Two sequences on a
genotype network
have, by definition,
the same fold.
But what about their
neighbors?
A. Wagner, The Origins of Evolutionary Innovations. Figure 2.6
27. Phenotype of neighbors of
genotype network
Neighbors of genotypes
on the same network can
have very different phenotypes
28. Novel RNA phenotypes
Schultes and Bartel:
designed a new
ribozyme from two
existing ones
Existing enzymes had
<25% sequence
similarity and no
common structure
Few mutations needed
to obtain the hybrid
Schultes, E.A. & Bartel, D.P., 2000. One sequence, two ribozymes: implications
for the emergence of new ribozyme folds. Science (New York, N.Y.), 289(5478),
pp.448-52. Available at: http://www.ncbi.nlm.nih.gov/pubmed/10903205
29. Take Home messages
There are many more sequences than protein/RNA
folds
Some folds correspond to more sequences than
others
Sequences that produce the same fold can be very
different
New folds can be reached by changing only a few bases