The document investigates the evolutionary history of the walrus (Odobenus rosmarus) through analysis of mitochondrial DNA sequences. It finds that the relationships between walruses, eared seals, and earless seals are complex, with some evidence supporting a monophyletic origin and other evidence supporting a diphyletic origin. Analysis of cytochrome B and cytochrome C sequences provides conflicting evidence on the phylogenetic relationships. The analysis also finds little genetic divergence between Pacific and Laptev Sea walrus samples, supporting the classification of these as one walrus subspecies rather than distinct species.
Phylogeny of Walrus Investigated Using Mitochondrial DNA
Investigation into the phylogeny of Odobenus rosmarus
A report for Nello Cristianini for the unit EMATM0004
Computational Genomics and Bioinformatics Algorithms
By Samuel R Neaves SN0550
November 2011
Introduction
This project investigates the evolutionary history of Odobenus rosmarus (the walrus). The evolution
of the Pinnipedia (Odobenidae: walruses; Otariidae: eared seals, including sea lions and fur seals;
Phocidae: earless seals) is said to be enigmatic, with the exact relationships between these families
in dispute. The majority of authors support a monophyletic origin of the pinnipeds from a caniform
ancestor; however, others suggest a diphyletic origin, with the Phocidae being related to the
mustelids (themselves a disputed family) (Arnason et al, 1995).
A further dispute is that some authors divide the walrus into three subspecies of Odobenus
rosmarus (rosmarus, divergens and laptevi); however, recent work by Lindqvist et al (2009)
concludes that laptevi is not distinct from divergens. The aim of this investigation is to
gather evidence for the true phylogeny.
Data Description
The primary species for this investigation is Odobenus rosmarus rosmarus. The GenBank accession
number of its complete mitochondrial DNA is NC_004029 (version NC_004029.2). Its phylogeny will be
computed in relation to Erignathus barbatus (bearded seal, representing Phocidae), Zalophus
californianus (California sea lion, representing Otariidae), Ursus maritimus (polar bear,
representing Caniformia) and Gulo gulo (wolverine, representing mustelids). Homo sapiens is used as
an outgroup to root the phylogenetic trees. For the full table of accession numbers see Appendix A.
Sequence statistics.
The Odobenus rosmarus rosmarus mitochondrial DNA was analyzed statistically, with the following
results:
The size of the genome is 16565 base pairs.
The number and frequency of each base:
Base        A       C       G       T
Count       5401    4310    2414    4440
Frequency   0.3260  0.2602  0.1457  0.2680
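The frequencies follow directly from the counts. As a minimal sketch (the function name `base_composition` is illustrative and not taken from the report), the same statistics can be computed for any sequence string, and the reported counts can be checked against the reported frequencies:

```python
from collections import Counter

def base_composition(seq):
    """Return (counts, frequencies) for the bases A, C, G and T in seq."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    freqs = {b: counts[b] / total for b in "ACGT"}
    return {b: counts[b] for b in "ACGT"}, freqs

# Check the figures reported above using the stated counts.
reported_counts = {"A": 5401, "C": 4310, "G": 2414, "T": 4440}
genome_size = sum(reported_counts.values())
reported_freqs = {b: round(n / genome_size, 4) for b, n in reported_counts.items()}
```

Here `genome_size` sums to the stated genome length, and the rounded frequencies agree with the table.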
This shows that there are more than twice as many A's as G's, with roughly the same number of C's
and T's over the whole genome. This is an interesting break from the expected pattern of A and T
content being similar and G and C content being similar. To investigate further, and to consider
local fluctuations in the nucleotide frequencies, we employ sliding windows of size 5000, 2000 and
500 and plot the frequencies.
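The sliding-window calculation can be sketched as follows. The report does not state the step between windows or how positions are assigned, so the `step` parameter and the use of the window midpoint are assumptions made for illustration.

```python
def sliding_window_frequencies(seq, window, step=1):
    """Base frequencies in each window of the given size along seq.

    Returns a list of (midpoint, {base: frequency}) pairs.
    The step size and midpoint convention are assumptions, not from the report.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        freqs = {b: chunk.count(b) / window for b in "ACGT"}
        results.append((start + window // 2, freqs))
    return results
```

For example, `sliding_window_frequencies("AAAACCCC", window=4, step=4)` gives two windows, the first entirely A and the second entirely C. A smaller window follows local composition more closely but is noisier.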
[Figure: nucleotide density (A, C, G, T) and A-T / C-G density plotted against genome position (0 to 18000) for sliding windows of size 5000, 2000 and 500.]
A sliding window of size 5000 does not show a great deal of variation in composition; however, the
smaller windows clearly show peaks and troughs. This indicates that the nucleotides are not drawn
from an independent and identically distributed probability distribution, as the distribution
changes along the genome.
The A-T and C-G content is also plotted, with the caveat that the aggregate frequencies appear to
violate the usual parity of A with T and of C with G noted above. At the smallest window size this
seems to show six distinct waves of variation in both A-T and C-G content.
Next we employ an ab initio method to find protein-encoding genes. The single-nucleotide
permutation test assesses the significance of open reading frames (ORFs). With the threshold set so
that an ORF must be longer than all ORFs in a random sequence, it finds 1 gene. If we instead set α
to 5%, it finds 12 genes. We are careful to use the correct genetic code for vertebrate
mitochondria. We translate these genes into amino acid sequences and identify cytochrome B and
cytochrome C by running BLAST on them. Once identified, we run further protein BLAST searches using
both cytochromes to identify the nearest other species.
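A minimal sketch of such a permutation test is given below. It is not the implementation used in the report, and it makes three assumptions:

- An ORF is any stop-free stretch in one of the three forward reading frames.
- The null distribution pools ORF lengths from shuffled copies of the sequence.
- The α threshold is the (1 - α) quantile of that pool.

The stop codons are those of the vertebrate mitochondrial genetic code (TAA, TAG, AGA, AGG). All function names are illustrative.

```python
import random

# Stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2).
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}

def orf_lengths(seq):
    """Lengths (in codons) of all stop-free stretches in the three forward frames."""
    lengths = []
    for frame in range(3):
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if run:
                    lengths.append(run)
                run = 0
            else:
                run += 1
        if run:
            lengths.append(run)
    return lengths

def permutation_thresholds(seq, alpha=0.05, n_perm=100, seed=0):
    """Return (strict, alpha_level) length thresholds from shuffled copies of seq.

    strict is the longest ORF seen in any shuffled copy; alpha_level is the
    (1 - alpha) quantile of all shuffled ORF lengths (an assumed definition).
    """
    rng = random.Random(seed)
    bases = list(seq)
    null_lengths = []
    for _ in range(n_perm):
        rng.shuffle(bases)  # single-nucleotide permutation keeps base composition
        null_lengths.extend(orf_lengths("".join(bases)))
    if not null_lengths:
        raise ValueError("no ORFs found in shuffled sequences")
    null_lengths.sort()
    idx = min(len(null_lengths) - 1, int((1 - alpha) * len(null_lengths)))
    return null_lengths[-1], null_lengths[idx]

def significant_orfs(seq, threshold):
    """ORF lengths in seq that exceed the given threshold."""
    return [n for n in orf_lengths(seq) if n > threshold]
```

Because the strict threshold is never lower than the α-level threshold, it can only report the same number of ORFs or fewer, which matches the pattern of 1 gene versus 12 genes described above. A fuller version would also scan the reverse complement and require a start codon.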
Results of CYTB blast:
Rank  Latin name                 Common name               Total score (max 760)
1     Halichoerus grypus         Grey seal                 681
2     Gulo gulo                  Wolverine                 680
3     Phoca vitulina stejnegeri  Harbour seal              679
4     Erignathus barbatus        Bearded seal              679
5     Ictonyx libyca             Saharan striped polecat   679
These results are interesting because they do not include any Otariidae, suggesting that the
Pinnipedia have a diphyletic origin from the ancient caniform, with the Odobenidae, Phocidae and
mustelids on one branch and the Otariidae on another.
Results of CYTC blast:
Rank  Latin name                 Common name               Total score
1     Tremarctos ornatus         Spectacled bear           447
2     Otaria byronia             South American sea lion   446
3     Arctocephalus townsendi    Guadalupe fur seal        445
4     Neophoca cinerea           Australian sea lion       444
5     Callorhinus ursinus        Northern fur seal         444
These results contrast with the CYTB blast results, this time with many Otariidae, no Phocidae, a
high-ranking caniform (the spectacled bear) and no mustelids. The data appear to support either the
monophyletic origin hypothesis or a diphyletic origin with the Odobenidae on the Otariidae branch.
To further the investigation we add the initially selected organisms to the data set and compute
the genetic distance between each pair. We use the Jukes-Cantor correction to account for multiple
substitutions that have occurred at the same site.
$$K = -\frac{3}{4} \ln\left(1 - \frac{4}{3} d\right)$$
This states that the number of substitutions per site between two sequences (K) can be estimated
from the observed fraction of sites that differ (d).
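As a short sketch of this calculation, the standard Jukes-Cantor formula for nucleotide sequences, K = -(3/4) ln(1 - (4/3) d), can be applied to a pair of aligned sequences. The function names, and the choice to skip alignment columns containing a gap, are assumptions for illustration.

```python
import math

def p_distance(a, b):
    """Observed fraction of differing sites (d) between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(d):
    """Estimated substitutions per site (K) from the observed fraction d."""
    if d == 0:
        return 0.0
    if d >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for d >= 0.75")
    return -0.75 * math.log(1 - 4 * d / 3)
```

For small d the correction is slight: d = 0.1 gives K of about 0.107. It grows rapidly as d approaches 0.75, the level of difference expected between unrelated sequences under this model.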
Applying this to cytochrome B, it is clear that the polar bear is very distantly related compared
with the other species. Interestingly, the data suggest that the spectacled bear is a closer
relation to the pinnipeds than the polar bear.
If we remove the polar bear to allow us to zoom in, we can see five distinct groups. The data show
that the walrus is about equally distant from the Otariidae and the Phocidae, with the Otariidae
closer to the mustelids and as far from the Phocidae as they are from the spectacled bear.
Performing the same procedure for cytochrome C we get similar results; however, this time the
Phocidae are clearly grouped with the mustelids along with the polar bear. The spectacled bear is
once again on its own, slightly closer to the Phocidae than to the Otariidae. This again leaves the
walrus as an outlier, at roughly equal distances from the two major clusters.
Four phylogenetic trees were built, one for each cytochrome from both amino acid and nucleotide
sequences. To build the cytochrome C nucleotide tree, the sequences of a number of animals,
including Odobenus rosmarus, had to be obtained by transforming amino acids back to nucleotides
because nucleotide sequence data were unavailable. As this is not a one-to-one relation, it
introduces some random substitutions, which may affect the accuracy of this tree.
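The report does not name the tree-building algorithm. Purely as an illustration of how a tree can be derived from a matrix of pairwise distances, the sketch below uses UPGMA, a simple average-linkage clustering method; this choice and the function name `upgma` are assumptions and not the author's method.

```python
def upgma(names, matrix):
    """Build a tree (nested tuples of names) from a symmetric distance matrix.

    UPGMA repeatedly merges the two closest clusters and averages their
    distances to every other cluster, weighted by cluster size.
    """
    trees = {i: names[i] for i in range(len(names))}
    sizes = {i: 1 for i in trees}
    # Distances are keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j] for i in trees for j in trees if i < j}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        new = next_id
        next_id += 1
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            total = sizes[i] + sizes[j]
            dist[(k, new)] = (sizes[i] * d_ik + sizes[j] * d_jk) / total
        # Drop every distance that involves a merged cluster.
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        trees[new] = (trees.pop(i), trees.pop(j))
        sizes[new] = sizes.pop(i) + sizes.pop(j)
    return next(iter(trees.values()))
```

With four taxa where A and B are close, C and D are close, and all other distances are large, the result is `(("A", "B"), ("C", "D"))`. UPGMA assumes a constant rate of evolution across lineages, so methods such as neighbour joining are often preferred when that assumption is doubtful.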
The results present a confused picture, with many contradictions between the four trees. However,
if we discount the cytochrome C nucleotide tree there appears to be some consensus: the Otariidae
and Phocidae are consistently grouped together, and Odobenus rosmarus is seen to split first from
the common ancestor of both the Otariidae and Phocidae, which then diverged at a later date. This
stands in contrast to the results in Arnason et al (1995), which show the Phocidae splitting first,
with a later split between the Otariidae and Odobenus rosmarus. However, Lento et al (1995) do offer
some evidence for Odobenus rosmarus being an early divergence from the common pinniped ancestor,
which would be consistent with these results. There are major differences in the placing of Ursus
maritimus, Tremarctos ornatus and Gulo gulo between the cytochrome B and C trees. The cytochrome C
tree puts the mustelids, Ursus maritimus and Tremarctos ornatus on the same branch as the Phocidae,
whereas the cytochrome B tree has the mustelids and Tremarctos ornatus close to the Otariidae, with
Ursus maritimus being a distant relation. Castresana (2001) presents evidence that cytochrome B is
more reliable for constructing trees at the genus and family level, and therefore this tree may be
taken as a more reliable indicator of the true phylogeny.
The NCBI Taxonomy Browser lists the Odobenidae, Phocidae and Otariidae as three distinct families
within the suborder Caniformia and does not place any one group as an ancestor of another.
Multiple alignments
To build multiple alignments and identify polymorphic sites, the heuristic CLUSTALW tool was used
to align both the cytochrome B and cytochrome C protein sequences. It was set to use the BLOSUM
protein weight matrix with the gap open penalty set to 10, the gap extension penalty set to 0.20,
gap distances set to 5 and No End Gaps set to 'No'. To see the full alignments refer to Appendix B.
Both alignments are clearly very good, apart from the outgroup and the polar bear in cytochrome B.
The majority of polymorphic sites in cytochrome B are consistent with the groupings of Odobenidae,
Phocidae and Otariidae. They include both indels and point mutations. The sites are fairly sporadic
across the sequences, in contrast to the polymorphic sites in cytochrome C, which mostly lie
between the 50th and 100th amino acid, with the extremities remaining constant.
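Once an alignment is available, polymorphic sites can be listed by scanning its columns. The sketch below assumes the alignment has already been produced (for example by CLUSTALW) and is held as a dictionary from a label to an aligned sequence of equal length; the function name and this data layout are illustrative.

```python
def polymorphic_sites(alignment):
    """Return 1-based positions of alignment columns with more than one state.

    alignment maps a label to an aligned sequence; a gap character ('-')
    counts as a state, so indels are reported alongside point mutations.
    """
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for pos in range(len(seqs[0])):
        column = {s[pos] for s in seqs}
        if len(column) > 1:
            sites.append(pos + 1)
    return sites
```

For a toy alignment of `MKT-L`, `MKS-L` and `MRTAL` this reports positions 2, 3 and 4: two point mutations and one indel. Plotting these positions along the sequence shows whether the polymorphic sites are sporadic or clustered.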
To address the question of how many subspecies of Odobenus rosmarus there are, we use a selection
of walrus samples from the Lindqvist et al (2009) study. These sequences are ATL25 tRNA-Trp and
tRNA-Pro genes from the mtDNA region of the genomes. We follow the same procedure as earlier,
computing the genetic distance between the samples using the Jukes-Cantor correction and plotting
these on a graph. We use this computation to build an unrooted phylogenetic tree. Both the tree and
the distance plot conform with the conclusion of Lindqvist et al (2009) that the walruses sampled
from the Laptev Sea are just a subgroup of the Pacific walrus, because they sit in a sub-branch of
Odobenus rosmarus divergens and their genetic distances are mixed amongst those of the Pacific
samples. This data and analysis therefore do not justify labeling them as a separate subspecies.
A further point of note is that the Atlantic walrus genetic data show signs of a genetic
bottleneck, given the lack of diversity compared with the Pacific walrus. This fits the historical
fact that the Atlantic walrus was almost hunted to extinction by the 1950s, with numbers beginning
to recover since then, whereas the more remote locations inhabited by the Pacific walrus protected
it from human hunting. This allowed its numbers to remain much higher throughout the 20th century,
which would account for the greater genetic diversity shown in the samples. If larger samples are
collected and more detailed analyses show the same results, then it may be time to change the
current NCBI Taxonomy Browser to show only two subspecies of Odobenus rosmarus.
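One simple way to put a number on "lack of diversity" is the mean pairwise difference between the aligned sequences of a population. The report does not say which measure was used, so the sketch below, including its function name, is an assumption for illustration only.

```python
from itertools import combinations

def mean_pairwise_difference(seqs):
    """Average fraction of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = 0.0
    for a, b in pairs:
        if len(a) != len(b):
            raise ValueError("sequences must be aligned to the same length")
        total += sum(x != y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)
```

A population that has passed through a bottleneck would be expected to show a lower value than one that has not. Computing this separately for the Atlantic and Pacific samples listed in Appendix A would allow the two to be compared on the same scale.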
[Figure legend: Atlantic, Pacific, Laptev]
Conclusion
The analysis that we have performed presents results that stand in contrast to the two papers
Arnason et al (1995) and Lento et al (1995), showing that the question of pinniped evolution
remains open, with a variety of hypotheses still in contention. The examination of whether there
are two or three walrus subspecies came to the same conclusion as Lindqvist et al (2009) despite
using different techniques and methods. It must be said, however, that the same data were used for
this study and for that of Lindqvist et al (2009). Taken together with the low number of samples,
the use of amplicons, and the inherent difficulty of sampling Odobenus rosmarus (which could lead
to sampling errors such as close relatives being sampled), this leaves the hypothesis very much
open to refutation.
While the evolution of the pinnipeds remains inconclusive, there is a need for further, more
in-depth studies to allow reliable conclusions to be drawn, so that wise actions can be taken to
protect this charismatic and vulnerable Arctic creature from the threats of hunting and habitat
destruction that continue to push many creatures to extinction.
A pair of curious Walruses (image from http://www.free-extras.com/images/walrus-8927.htm)
Appendix A
Accession numbers
                                                                      Proteins                          Nucleotides
Latin name                  Common name                               Cytochrome B    Cytochrome C      Cytochrome B   Cytochrome C
Odobenus rosmarus rosmarus  Atlantic walrus                           CAD21718        NP_659340.3       NC_004029.2    NA
Zalophus californianus      California sea lion (Otariidae)           YP_778707.1     YP_778698.1       D26524.1       AJ616896.1
Erignathus barbatus         Bearded seal (Phocidae)                   YP_778837.1     YP_778828.1       AY140982.1     FJ839388.1
Ursus maritimus             Polar bear (Caniformia)                   AAF71578.1      NP_597984.1       NC_003428.1    NA
Gulo gulo                   Wolverine (mustelids)                     YP_001382271.1  YP_001382262.1    L77960.2       EU544598.1
Homo sapiens                Human (outgroup)                          AAA31851.1      NP_061820.1       S88250.1       NM_018947.5
Halichoerus grypus          Grey seal                                 ACZ28998.1      NP_007072.1       GU167293.1     GU733706.1
Phoca vitulina stejnegeri   Harbour seal                              BAI60013.1      NP_006931.1       AB510422.1     NA
Ictonyx libyca              Saharan striped polecat                   ABV57060.1      NA                EF987739.1     NA
Tremarctos ornatus          Spectacled bear                           AAB50570.1      YP_001542732.1    U23554.1       NA
Otaria byronia              South American sea lion                   AAQ95107.1      AAR00312.1        AY713034.1     AJ891144.1
Arctocephalus townsendi     Guadalupe fur seal                        YP_778759.1     YP_778750.1       AF380897.1     NA
Neophoca cinerea            Australian sea lion                       YP_778746.1     YP_778737.1       AF380915.1     NA
Callorhinus ursinus         Northern fur seal                         YP_778694.1     YP_778685.1       HQ895717.1     HM171421.1
The source table header also contains an mtDna column with no separately recoverable entries; NC_004029.2 is the complete mitochondrial genome of Odobenus rosmarus rosmarus.
Odobenus rosmarus samples.
Laptev     Accession    Pacific    Accession    Atlantic    Accession    Atlantic    Accession
Lap 1      EU728526     Pac 1      EU728531     Atlan 1     EU728561     Atlan 13    EU728548
Lap 2      EU728527     Pac 2      EU728532     Atlan 2     EU728565     Atlan 14    EU728549
Lap 3      EU728529     Pac 3      EU728533     Atlan 3     EU728566     Atlan 15    EU728550
Lap 4      EU728530     Pac 4      EU728534     Atlan 4     EU728567     Atlan 16    EU728551
Lap 5      EU728525     Pac 5      EU728535     Atlan 5     EU728568     Atlan 17    EU728552
                        Pac 6      EU728536     Atlan 6     EU728569     Atlan 18    EU728553
                        Pac 7      EU728537     Atlan 7     EU728570     Atlan 19    EU728554
                        Pac 8      EU728538     Atlan 8     EU728571     Atlan 20    EU728555
                        Pac 9      EU728539     Atlan 9     EU728572     Atlan 21    EU728556
                        Pac 12     EU728542     Atlan 10    EU728573     Atlan 22    EU728557
                        Pac 13     EU728543     Atlan 11    EU728546     Atlan 23    EU728558
                        Pac 14     EU728562     Atlan 12    EU728547
                        Pac 15     EU728563
                        Pac 16     EU728564
Appendix C
Bibliography
Andersen et al. (1998). Population Structure and Gene Flow of the Atlantic Walrus (Odobenus
rosmarus rosmarus) in the Eastern Atlantic Arctic Based on Mitochondrial DNA and
Microsatellite Variation. Molecular Ecology(7), 1323-1336.
Arnason, U., Bodin, K., Gullberg, A., Ledje, C., & Mouchaty, S. (1995). A Molecular View of Pinniped
Relationships with Particular Emphasis on the True Seals. Journal of Molecular Evolution(40),
78-85.
Castresana J. (2001). Cytochrome b Phylogeny and the Taxonomy of Great Apes and Mammals.
Molecular Biology and Evolution(18), 465-471.
Lento et al. (1995). Use of Spectral Analysis to Test Hypotheses on the Origin of Pinnipeds. Molecular
Biology and Evolution(12), 28-52.
Lindqvist et al. (2009). The Laptev Sea Walrus Odobenus rosmarus laptevi: an Enigma Revisited.
Zoologica Scripta(38), 113-127.