Gutell 089.book bioinfomaticsdictionary.2004
1. Dictionary of Bioinformatics (JM Hancock and MJ Zvelebil, Editors)
Entries by Jamie J. Cannone and Robin R. Gutell
Page 1 of 23
RNA: General Categories (RNA)
Ribonucleic acid (RNA) is one of the two major forms of nucleic acids. Each individual
element, or nucleotide, of RNA is composed of three parts: a ribose sugar, a phosphate,
and a cyclic base. Four primary bases are found in RNA: adenine, guanine, cytosine,
and uracil. Other bases and modified forms of these bases sometimes appear; the
McCloskey Lab at the University of Utah has prepared a database of these modified
nucleotides.
The major structural difference between RNA and DNA, the other major form of nucleic
acid, is the sugar (deoxyribose in DNA, ribose in RNA). The 2'-hydroxyl group of the
ribose sugar makes RNA more susceptible to degradation than DNA. Thus, RNA is best
suited to (relatively) short-term uses, while DNA is sufficiently stable to be a good
medium for genetic inheritance.
While DNA’s primary functions are the storage of genetic information and the
production of RNA, cellular RNA has several distinct functions. Three major forms of
RNA are commonly discussed: messenger RNA, ribosomal RNA, and transfer RNA.
(Ribosomal and transfer RNA are discussed separately.) Messenger RNA (mRNA)
codes for proteins. mRNA nucleotide sequence is translated to amino acid sequence
based on a specific mapping (the “genetic code” for an organism) between sets of three
mRNA nucleotides (codons) and amino acids. mRNA is produced from DNA during
transcription. mRNAs sometimes contain untranslated regions (UTRs) at their 5' and 3'
ends that play several key roles in gene regulation and expression. mRNA is both
quickly synthesized and degraded as part of the regulation of protein synthesis.
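The codon-to-amino-acid mapping described above can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code (some organisms and organelles use variant codes); the function name `translate` and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered U, C, A, G at each position.
# '*' marks a stop codon. Assumption: the standard code applies.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate an mRNA coding sequence (5' to 3') up to the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical example sequence: AUG UUU GGC UAA
print(translate("AUGUUUGGCUAA"))  # MFG
```

The sketch ignores untranslated regions and start-codon selection; it simply reads codons from the first nucleotide.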
In addition to the three major RNAs, other RNAs with different properties have
been identified. Some RNAs form ribonucleoprotein complexes, others have catalytic
activities, and, in some viruses, RNA rather than DNA carries the genetic information
for the organism. Included among these other RNAs are intron RNAs, which must be
excised from the transcripts of other genes so that those gene products can function,
RNase P, and the U RNAs. Many of these interesting RNAs have been characterized,
and several examples appear below.
Noncoding RNAs (also referred to as small RNAs) are RNAs that do not code for
proteins. (Technically, both ribosomal and transfer RNAs belong to this category.)
Certain noncoding RNAs, such as the microRNAs (miRNAs) that have been isolated
from plants and animals, have been characterized and implicated in regulatory roles.
Other noncoding RNAs have been implicated in the destruction of mRNA via RNA
interference (RNAi). Some additional noncoding RNAs that have been characterized
are described below.
The bacterial tmRNA (also known as 10Sa RNA or SsrA) is a chimeric molecule with
both tRNA-like and mRNA-like characteristics. An incomplete mRNA with a truncated
3’ end will cause the ribosome to “stall” with an incomplete protein attached. tmRNA
“rescues” the ribosome by binding its tRNA-like portion to the ribosome. This binding
positions the mRNA-like portion of tmRNA so that the ribosome can resume
translation, using the tmRNA as its template. The mRNA-like portion codes for a
peptide tag that is recognized by bacterial proteases, ensuring that the partially-
produced protein will be degraded.
Small nucleolar RNAs (snoRNAs) are typically 60-300 nucleotide RNAs that are
abundant in the nucleolus of a broad variety of eukaryotes. The snoRNAs associate
with proteins to form small nucleolar ribonucleoproteins (snoRNPs). snoRNAs come in
two major structural forms, one containing the box C and D motifs and the second
containing box H and ACA elements. Most snoRNAs are involved in the nucleotide
modification process, in either 2’-O-methylation (box C/D snoRNAs) or
pseudouridylation (box H/ACA snoRNAs), for a wide range of RNAs by hybridizing
to the region of the RNA that needs to be modified. Other snoRNAs are essential for
nucleolytic cleavage of precursor RNAs. Two different mechanisms for the synthesis of
snoRNAs have been observed; in vertebrates, snoRNAs are processed from previously-
excised pre-mRNA introns, while in yeast and plants, the sources of snoRNAs are
polycistronic snoRNA transcripts. Vertebrate telomerase is a box H/ACA snoRNP.
The signal recognition particle (SRP) RNA is involved in the transport of newly-translated
secretory proteins out of the cytosol. The SRP binds to a signal at the N-terminus of a
protein as the protein is synthesized. The SRP then binds to a receptor that is bound to
the membrane of the endoplasmic reticulum. The protein begins to traverse the
membrane en route to the lumen of the endoplasmic reticulum, and protein synthesis
continues, with the SRP recycled to assist with another protein-ribosome complex.
Guide RNAs (gRNAs) are a novel class of small noncoding RNA molecules that are
transcribed from the maxicircles and minicircles of trypanosome mitochondria. gRNAs
contain the necessary information for proper editing (insertion or deletion of uridines)
of mitochondrial precursor RNAs in the trypanosomes, resulting in functional RNAs.
The 5’ end of a gRNA is complementary to its mRNA target sequence that is 3’ of the
modification site, serving as an “anchor” for the gRNA. The central portion of a gRNA
is complementary to the mature, edited mRNA and thus serves as the editing template.
The 3’ end of a gRNA is a posttranscriptionally-added oligo[U] tail; the function of this
tail is not presently certain.
Websites
Modified RNA Nucleotides: http://medlib.med.utah.edu/RNAmods/
Noncoding RNAs: http://biobases.ibch.poznan.pl/ncRNA/
Noncoding RNAs in plants: http://www.prl.msu.edu/PLANTncRNAs/
The RNA World: http://www.imb-jena.de/RNA.html
RNABase: The RNA Structure Database: http://www.rnabase.org/
RNAi Database: http://formaggio.cshl.org/%7Emarco/fabio/index.html
RNase P Database: http://jwbrown.mbio.ncsu.edu/RNaseP/home.html
snoRNA Database: http://rna.wustl.edu/snoRNAdb/
SRP Database (Christian Zwieb): http://psyche.uthct.edu/dbs/SRPDB/SRPDB.html
tmRNA (Kelly Williams): http://www.indiana.edu/~tmrna/
tmRNA (Christian Zwieb): http://psyche.uthct.edu/dbs/tmRDB/tmRDB.html
Uridine Insertion/Deletion RNA Editing (gRNA):
http://www.rna.ucla.edu/trypanosome/
UTR Database: http://bighost.area.ba.cnr.it/BIG/UTRHome/
Yeast snoRNA Database:
http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html
Further Reading
Bachellerie JP & Cavaillé J (1998). Small Nucleolar RNAs Guide the Ribose
Methylations of Eukaryotic rRNAs. In: Modification and Editing of RNA. Grosjean H
& Benne R, editors. ASM Press, Washington, DC.
Estévez AM & Simpson L (1999). Uridine insertion/deletion RNA editing in
trypanosome mitochondria – a review. Gene 240:247-260.
Guthrie C & Patterson B (1988). Spliceosomal snRNAs. Annual Review of Genetics
22:387-419.
Hutvagner G & Zamore PD (2002). RNAi: nature abhors a double-strand. Current
Opinion in Genetics and Development 12:225-232.
Kiss T (2002). Small Nucleolar RNAs: An Abundant Group of Noncoding RNAs with
Diverse Cellular Functions. Cell 109:145-148.
Mattick JS (2001). Non-coding RNAs: the architects of eukaryotic complexity. EMBO
Reports 2:986-991.
Reinhart BJ, Weinstein EG, Rhoades MW, Bartel B, & Bartel DP (2002). MicroRNAs in
plants. Genes & Development 16:1616-1626.
Samarsky DA & Fournier MJ (1999). A comprehensive database for the small nucleolar
RNAs from Saccharomyces cerevisiae. Nucleic Acids Research 27:161-164.
Ribosomal RNA (RNA)
Ribosomal RNAs (rRNAs) are specialized RNAs that provide the structural and
catalytic core of the ribosome, the cellular structure that is the site for protein synthesis.
The three major forms of rRNA are the 5S, small subunit (SSU, 16S, or 16S-like), and
large subunit (LSU, 23S, or 23S-like) rRNAs. rRNA may comprise up to 90% of a cell’s
RNA.
Ribosomal RNAs are organized into operons in the genome. In nuclear rRNA operons,
the genes are separated by internal spacers, which are often used for phylogenetic
analyses. In most (higher) eukaryotes, the large subunit rRNA is divided into two
pieces, the 5.8S rRNA and a larger rRNA (varying in size between 25S and 28S), with a
spacer between them in the genomic sequence. Certain organisms extend this theme by
dividing their rRNA molecules into fragments that must be assembled correctly to
produce a viable rRNA. The sizes of the rRNAs vary over a wide range; the Escherichia
coli rRNAs, which were the first to be sequenced and serve as the reference sequences
for comparative analysis of rRNA, are 120, 1542, and 2904 nucleotides long for the 5S,
SSU, and LSU rRNAs, respectively.
Size Variation in Ribosomal RNA: Approximate ranges of size (in nucleotides) for
complete sequences are shown in the table. Unusual sequences (of vastly different
length) are excluded.

Phylogenetic Domain/Cell Location    rRNA Molecule
                                     5S         SSU          LSU
Bacteria                             105-128    1470-1600    2750-3200
Archaea                              120-135    1320-1530    2900-3100
Eukaryota Nuclear                    115-125    1130-3725    2475-5450
Eukaryota Chloroplast                115-125    1425-1630    2675-3200
Eukaryota Mitochondria               115-125    685-2025     940-4500
Overall                              105-135    685-3725     940-5450
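As a small worked check of the table above, the following Python sketch stores the ranges, recomputes the "Overall" row, and confirms that the Escherichia coli lengths quoted earlier fall within the bacterial ranges. The names `SIZE_RANGES`, `overall_range`, and `in_range` are hypothetical helpers introduced for illustration.

```python
# Approximate size ranges (in nucleotides) copied from the table.
SIZE_RANGES = {
    "Bacteria":               {"5S": (105, 128), "SSU": (1470, 1600), "LSU": (2750, 3200)},
    "Archaea":                {"5S": (120, 135), "SSU": (1320, 1530), "LSU": (2900, 3100)},
    "Eukaryota Nuclear":      {"5S": (115, 125), "SSU": (1130, 3725), "LSU": (2475, 5450)},
    "Eukaryota Chloroplast":  {"5S": (115, 125), "SSU": (1425, 1630), "LSU": (2675, 3200)},
    "Eukaryota Mitochondria": {"5S": (115, 125), "SSU": (685, 2025),  "LSU": (940, 4500)},
}

def overall_range(molecule):
    """Smallest lower bound and largest upper bound across all categories."""
    lows = [ranges[molecule][0] for ranges in SIZE_RANGES.values()]
    highs = [ranges[molecule][1] for ranges in SIZE_RANGES.values()]
    return min(lows), max(highs)

def in_range(category, molecule, length):
    """True if a sequence length falls inside the tabulated range."""
    low, high = SIZE_RANGES[category][molecule]
    return low <= length <= high

# E. coli rRNA lengths as given in the text.
E_COLI = {"5S": 120, "SSU": 1542, "LSU": 2904}
for molecule, length in E_COLI.items():
    print(molecule, length, in_range("Bacteria", molecule, length), overall_range(molecule))
```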
Structure models for the ribosomal RNAs have been proposed using comparative
sequence analysis methods. The majority of the base-pairs predicted by these methods
are canonical (G:C, A:U, and G:U) base-pairs that are consecutive and antiparallel with
each other, forming nested secondary structure helices. Many tertiary structure
interactions were also proposed with comparative analysis. These interactions include
base triples, non-canonical base-pairs, pseudoknots, and many RNA motifs (see “Motifs
in RNA Tertiary Structure”). The most recent versions of the models for the Escherichia
coli 16S and 23S rRNAs are shown below (Figure XX).
More than twenty years after the first 16S and 23S rRNA comparative structure models
were proposed, the most recent structure models (Figure XX) were evaluated against
the high-resolution crystal structures of both the small (Wimberly et al. 2000) and large
(Ban et al. 2000) ribosomal subunits that were solved in 2000. The results were
affirmative; approximately 97-98% of the 16S and 23S rRNA base-pairs, including
nearly all of the tertiary structure base-pairs, predicted with covariation analysis were
present in these crystal structures. In addition, many new motifs have been proposed
and characterized based upon the crystal structures.
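The kind of comparison described above, counting how many predicted base-pairs are present in a crystal structure, can be sketched as follows. This is a simplified illustration, not the evaluation procedure used in the cited studies; the function names and the toy base-pair lists are hypothetical and are not taken from any real structure.

```python
def normalize(pairs):
    """Store each base-pair as (lower position, higher position)."""
    return {(min(i, j), max(i, j)) for i, j in pairs}

def fraction_confirmed(predicted, observed):
    """Fraction of predicted base-pairs that also occur in the observed set."""
    predicted, observed = normalize(predicted), normalize(observed)
    if not predicted:
        raise ValueError("no predicted base-pairs")
    return len(predicted & observed) / len(predicted)

# Hypothetical toy data for illustration only.
predicted_pairs = [(1, 20), (2, 19), (3, 18), (4, 17)]
crystal_pairs = [(20, 1), (2, 19), (3, 18), (5, 16), (6, 15)]
print(f"{fraction_confirmed(predicted_pairs, crystal_pairs):.0%}")  # 75%
```

Note that this measures only how many predicted pairs were confirmed; base-pairs present in the crystal structure but absent from the model would need a separate count.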
Figure XX. Comparative secondary structure diagrams for the Escherichia coli 16S and
23S rRNAs. A, 16S rRNA; B, 23S rRNA, 5’ half; C, 23S rRNA, 3’ half. Base-pair
symbols: line, canonical (G:C or A:U); small filled circle, wobble (G:U); large open
circle, G:A; large closed circle, all other non-canonical base-pairs. Colors of base-pair
symbols indicate confidence based upon comparative analysis, with red representing
high confidence, green less confidence, and black indicating that comparative analysis
does not strongly support or argue against the base-pair. Grey base-pairs are (nearly)
invariant, and blue base-pairs were predicted with comparative analysis but could not
be scored using the current system. See the Comparative RNA Web Site for more
information.
[Figure XX diagrams not reproduced. Panel A: 16S rRNA (domains I-III). Panel B: 23S
rRNA, 5' half (domains I-III). Panel C: 23S rRNA, 3' half (domains IV-VI). Nucleotide
positions are numbered along each diagram from the 5' end to the 3' end.]
Websites
The Comparative RNA Web (CRW) Site (all rRNAs):
http://www.rna.icmb.utexas.edu/
European rRNA Database (SSU and LSU rRNAs):
http://oberon.rug.ac.be:8080/rRNA/
5S rRNA Database: http://biobases.ibch.poznan.pl/5SData/
Ribosomal Database Project II (RDP-II): http://rdp.cme.msu.edu/html/
Ribosomal Internal Spacer Sequence Collection (RISSC): http://ulises.umh.es/RISSC/
RNABase Ribosomal RNA Entries: http://www.rnabase.org/listing/?cat=rrna
Further Reading
Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000). The complete atomic structure
of the large ribosomal subunit at 2.4 Å resolution. Science 289:905-920.
Gutell RR, Lee JC, & Cannone JJ (2002). The Accuracy of Ribosomal RNA Comparative
Structure Models. Current Opinion in Structural Biology 12:301-310.
Harms J, Schluenzen F, Zarivach R, Bashan A, Gat S, Agmon I, Bartels H, Franceschi F,
& Yonath A (2001). High resolution structure of the large ribosomal subunit from a
mesophilic eubacterium. Cell 107:679-688.
Schluenzen F, Tocilj A, Zarivach R, Harms J, Gluehmann M, Janell D, Bashan A, Bartels
H, Agmon I, Franceschi F, & Yonath A (2000). Structure of functionally activated
small ribosomal subunit at 3.3 Å resolution. Cell 102:615-623.
Wimberly BT, Brodersen DE, Clemons WM Jr, Morgan-Warren RJ, Carter AP, Vonhein
C, Hartsch T, Ramakrishnan V (2000). Structure of the 30 S ribosomal subunit. Nature
407:327-339.
Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, Cate JHD, & Noller
HF (2001). Crystal structure of the ribosome at 5.5 Å resolution. Science 292:883–896.
Transfer RNA (RNA)
Transfer RNAs (tRNAs) are typically 70-90 nt in length in nuclear and chloroplast
genomes and are directly involved in protein synthesis. The carboxyl-terminus of an
amino acid is specifically attached to the 3’ end of the tRNA (aminoacylated). These
aminoacylated tRNAs are substrates in translation, interacting with a specific mRNA
codon to position the attached amino acid for catalytic transfer to a growing
polypeptide chain. Thus, tRNAs decode (or translate) the nucleotide sequence during
protein synthesis.
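The codon-anticodon interaction described above rests on antiparallel base complementarity. A minimal Python sketch, assuming strict Watson-Crick pairing at all three positions (real tRNAs also use wobble pairing and modified nucleotides, which this ignores); the function name `anticodon_for` is hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the Watson-Crick anticodon, written 5' to 3', for a codon written 5' to 3'."""
    # The two strands pair antiparallel, so the complement is read in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(codon))

# The phenylalanine codon UUC pairs with the anticodon GAA.
print(anticodon_for("UUC"))  # GAA
```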
tRNAs have a characteristic “cloverleaf” structure that was initially determined with
comparative analysis. Crystal structures of tRNA substantiated this secondary
structure and revealed that different tRNAs formed very similar tertiary structures,
underscoring the key underlying principle of comparative analysis: RNAs with the same
function adopt similar structures even when their sequences differ. The “variable
loop” of tRNA is primarily responsible for length variation among tRNAs; some of the
mitochondrial tRNAs are smaller than the typical tRNA, shortening or deleting the D or
TΨC helices.
Figure YY. tRNA secondary structure (Saccharomyces cerevisiae phenylalanine tRNA).
Structural features are labeled.
[Figure YY diagram not reproduced. Labeled features: acceptor stem, D stem, D loop,
anticodon stem, anticodon loop, variable loop, TΨC stem, and TΨC loop. Nucleotide
positions are numbered from the 5' end to the 3' end.]
Non-mitochondrial tRNAs come in two types: type 1 and type 2. Structurally, the
major difference between the two types is the addition of a stem-loop structure in the
variable loop of type 2 tRNAs. The tRNA types do not correlate with the two classes of
aminoacyl-tRNA synthetases, where class 1 synthetases attach amino acids to the 2’-OH
and class 2 synthetases attach amino acids to the 3’-OH of the terminal nucleotide of the
tRNA.
Over fifty modified nucleotides have been observed in different tRNA molecules.
Websites
Aminoacyl-tRNA Synthetases Database: http://rose.man.poznan.pl/aars/index.html
Genomic tRNA: http://lowelab.ucsc.edu/GtRNAdb/
Mattias Sprinzl’s tRNA compilation:
http://www.uni-bayreuth.de/departments/biochemie/sprinzl/trna/
Modified RNA Nucleotides: http://medlib.med.utah.edu/RNAmods/
Further Reading
Holley RW, Apgar J, Everett GA, Madison JT, Marquisee M, Merrill SH, Penswick JR, &
Zamir A (1965). Structure of a ribonucleic acid. Science 147:1462-1465.
Kim SH (1979). Crystal structure of yeast tRNAphe and general structural features of
other tRNAs. In Transfer RNA: Structure, Properties, and Recognition (Schimmel PR,
Soll D & Abelson JN, eds), pp. 83-100, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, New York.
Kim SH, Suddath FL, Quigley GJ, McPherson A, Sussman JL, Wang AH, Seeman NC, &
Rich A (1974). Three-dimensional tertiary structure of yeast phenylalanine transfer
RNA. Science 185:435-440.
Levitt M (1969). Detailed molecular model for transfer ribonucleic acid. Nature 224:759-
763.
Marck C & Grosjean H (2002). tRNomics: Analysis of tRNA genes from 50 genomes of
Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-
specific features. RNA 8:1189-1232.
Quigley GJ & Rich A (1976). Structural domains of transfer RNA molecules. Science
194:796-806.
Robertus JD, Ladner JE, Finch JT, Rhodes D, Brown RS, Clark BF, & Klug A (1974).
Structure of yeast phenylalanine tRNA at 3Å resolution. Nature 250:546-551.
RNA Secondary and Tertiary Structure (RNA Structural Elements)
For this discussion, we define secondary structure as two or more consecutive and
canonical base-pairs that are nested and antiparallel with one another. All other
structure, including non-canonical or non-nested base-pairs and RNA motifs, is
considered to be tertiary structure. Paired nucleotides are involved in interactions in
the structure model; all other nucleotides are considered to be unpaired.
Base-Pairs (Canonical and Non-Canonical) (RNA Structural Elements)
A base-pair is formed when two nucleotides hydrogen bond to each other, typically
between the bases of the nucleotides. Most RNA base-pairs observed to date are
oriented with the backbones of the two nucleotides in an antiparallel configuration.
The canonical base-pairs, G:C and A:U, were originally proposed by Watson and Crick;
the G:U ("wobble") base-pair was proposed later by Crick. All other base-pairs are
considered to be non-canonical. While "non-canonical" connotes an unusual or unlikely
combination of two nucleotides, a significant number of non-canonical RNA base-pairs
have been proposed in rRNA comparative structure models and substantiated by the
ribosomal subunit crystal structures. An even larger number of non-canonical base-pairs
that were not predicted with comparative analysis are present in the crystal structures.
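The distinction between canonical, wobble, and non-canonical base-pairs can be expressed as a simple lookup. The sketch below classifies a pair by nucleotide identity only; it makes no statement about base-pair conformation, which also matters for structural motifs. The function name `classify_pair` is hypothetical.

```python
CANONICAL = {frozenset("GC"), frozenset("AU")}
WOBBLE = frozenset("GU")

def classify_pair(base1, base2):
    """Classify a base-pair as canonical, wobble, or non-canonical by identity."""
    pair = frozenset((base1.upper(), base2.upper()))
    if pair in CANONICAL:
        return "canonical"
    if pair == WOBBLE:
        return "wobble"
    return "non-canonical"

for first, second in [("G", "C"), ("U", "A"), ("G", "U"), ("A", "A"), ("G", "A")]:
    print(f"{first}:{second}", classify_pair(first, second))
```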
Stems (RNA Structural Elements)
A stem is a set of base-pairs that are arranged adjacent to and antiparallel with one
another. While an RNA helix is a collection of smaller stems connected by loops and
bulges, the terms “stem” and “helix” are often used interchangeably.
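Grouping a list of base-pairs into stems follows directly from this definition: a base-pair i:j extends a stem when the pair (i-1):(j+1) is also present. The Python sketch below is one way to do this; the helper names are hypothetical, and the example assumes the conventional tRNA cloverleaf numbering (acceptor stem 1-7/66-72, D stem 10-13/22-25, anticodon stem 27-31/39-43, TΨC stem 49-53/61-65).

```python
def group_into_stems(pairs):
    """Group base-pairs (i, j) with i < j into runs of adjacent, antiparallel pairs."""
    stems = []
    for i, j in sorted(pairs):
        # Extend the current stem only if this pair stacks directly on the last one.
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))
        else:
            stems.append([(i, j)])
    return stems

def helix(start5, start3, length):
    """Hypothetical helper: build `length` stacked pairs starting at start5:start3."""
    return [(start5 + k, start3 - k) for k in range(length)]

# Assumed cloverleaf numbering for a tRNA secondary structure.
trna_pairs = helix(1, 72, 7) + helix(10, 25, 4) + helix(27, 43, 5) + helix(49, 65, 5)
for stem in group_into_stems(trna_pairs):
    print(stem[0], "to", stem[-1], "length", len(stem))
```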
The base-pairs that comprise a stem are nested; that is, drawn graphically, each base-
pair either contains or is contained within its neighbors. For two nested base-pairs, a:a’
and b:b’, where a < a’, b < b’, and a < b in the 5’ to 3’ numbering system for a given
RNA molecule, the statement a < b < b’ < a’ is true. Figure ZZ shows nesting for tRNA
in two different formats that represent the global arrangement of base-pairs. Nesting
arrangements can be far more complicated in a larger RNA molecule. Most base-pairs
are nested, and most helices are also nested. Base-pairs that are not nested are
pseudoknots (see Pseudoknots).
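The nesting condition can be checked directly from the positions of two base-pairs. In the sketch below, the function name `relation` and the "side by side" label (for two pairs where neither contains the other and they do not cross) are assumptions introduced for illustration.

```python
def relation(pair1, pair2):
    """Describe how two base-pairs are arranged in the 5' to 3' numbering."""
    # Order each pair as (5' position, 3' position), then order the two pairs.
    (a, a2), (b, b2) = sorted([tuple(sorted(pair1)), tuple(sorted(pair2))])
    if a < b < b2 < a2:
        return "nested"        # b:b' lies inside a:a'
    if a < a2 < b < b2:
        return "side by side"  # a:a' closes before b:b' opens
    if a < b < a2 < b2:
        return "crossing"      # not nested: a pseudoknot arrangement
    raise ValueError("base-pairs share a position")

print(relation((1, 72), (2, 71)))    # nested
print(relation((10, 25), (27, 43)))  # side by side
print(relation((1, 10), (5, 15)))    # crossing
```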
Loops (RNA Structural Elements)
Unpaired nucleotides in a secondary structure model are commonly referred to as
loops. Many of these loops close one end of an RNA stem and are called "hairpin
loops;" phrased differently, the nucleotides in a hairpin loop are flanked by a single
stem. Loops that are flanked by two stems come in several forms. A "bulge loop"
occurs only in one strand in a stem; the second strand's nucleotides are all forming base-
pairs. An "internal loop" is formed by parallel bulges on opposing strands, interrupting
the continuous base-pairing of the stem. Finally, a "multi-stem" loop forms when three
or more stems intersect. Figure AA is a schematic RNA that shows each of these types
of loop. The sizes of each type of loop can vary; certain combinations of loop size and
nucleotide composition have been shown to be more energetically stable.
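The four loop types can be identified programmatically from a nested secondary structure. The sketch below assumes the structure is written in dot-bracket notation (matched parentheses for base-pairs, dots for unpaired nucleotides), a common convention that is not used in this text; the function names and the example structure are hypothetical. Unpaired nucleotides outside all base-pairs are not classified.

```python
def parse_dot_bracket(structure):
    """Map each paired position to its partner (0-based) for a nested structure."""
    partner, stack = {}, []
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            opening = stack.pop()
            partner[opening], partner[pos] = pos, opening
        elif char != ".":
            raise ValueError(f"unexpected character {char!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return partner

def classify_loops(structure):
    """Return (loop type, i, j) for each loop closed by the base-pair i:j."""
    partner = parse_dot_bracket(structure)
    loops = []
    for i in sorted(p for p in partner if p < partner[p]):
        j = partner[i]
        # Walk the region enclosed by i:j, jumping over any enclosed stems.
        children, unpaired, pos = [], 0, i + 1
        while pos < j:
            if pos in partner:
                children.append((pos, partner[pos]))
                pos = partner[pos] + 1
            else:
                unpaired += 1
                pos += 1
        if not children:
            loops.append(("hairpin", i, j))      # flanked by a single stem
        elif len(children) >= 2:
            loops.append(("multi-stem", i, j))   # three or more stems meet
        elif unpaired:
            k, l = children[0]
            # Unpaired nucleotides on one strand only: bulge; on both: internal.
            kind = "bulge" if k == i + 1 or l == j - 1 else "internal"
            loops.append((kind, i, j))
        # One enclosed pair and no unpaired nucleotides: stacked pairs, no loop.
    return loops

example = (
    "((" "."            # outer stem, then the multi-stem loop begins
    "(.((...)))" "."    # branch with a bulge loop and a hairpin loop
    "(..((...)).)" "."  # branch with an internal loop and a hairpin loop
    "))"
)
for kind, i, j in classify_loops(example):
    print(kind, i, j)
```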
Figure ZZ. Nested and non-nested base-pairs. The comparative model of the
secondary and tertiary structure of tRNA is shown in two formats that highlight the
helical and nesting relationships. In both panels, the secondary structure base-pairs are
shown in blue, tertiary base-pairs in red, and base triples in green. Some tertiary base-
pairs are nested (red lines do not cross blue lines); red lines representing pseudoknot
base-pairs do cross. A. Histogram format, with the tRNA sequence shown as a
“baseline” from left to right (5’ to 3’). Secondary structure elements are shown above
the baseline and tertiary structure elements are shown below the baseline. The distance
from the baseline to the interaction line is proportional to the distance between the two
interacting positions within the RNA sequence. B. Circular format, with the sequence
drawn clockwise (5' to 3') in a circle, starting at the top and base-base interactions
shown as lines traversing the circle. The tRNA structural elements are labeled.
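[Figure ZZ diagrams not reproduced. Panel A: histogram format, titled "The Structure of
tRNA", with sequence positions 10-70 marked along the baseline. Panel B: circular
format, with the acceptor stem, D loop, anticodon loop, variable loop, and TΨC loop
labeled and the 5' and 3' ends marked.]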
Figure AA. Loop types. This schematic RNA contains one example of each of the four
loop types. Colors and labels: B, bulge loop (red); H, hairpin loop (purple); I, internal
loop (blue); M, multi-stem loop (green). Stems are shown in gray.
[Figure AA diagram not reproduced. Loop labels in the schematic: H, B, I, and M.]
Pseudoknots (RNA Structural Elements)
A pseudoknot is an arrangement of helices and loops where the helices are not nested
with respect to each other. Pseudoknots are so named due to the optical illusion of
knotting evoked by secondary and tertiary structure representations. A simple
pseudoknot is represented in Figure BB-A. For two non-nested (pseudoknot) base-
pairs, a:a’ and b:b’, where a < a’ and b < b’ in the 5’ to 3’ numbering system for a given
RNA molecule, the following statement will be true: a < b < a’ < b’. Contrast this
situation with a set of nested base-pairs (see Stems), where a < b < b’ < a’. Another
descriptive explanation (Figure BB-B) of pseudoknot formation is when the hairpin
loop nucleotides from a stem-loop structure form a helix with nucleotides outside the
stem-loop. Figure ZZ-B shows pseudoknot interactions in red; note how these lines
cross the blue lines that represent nested helices.
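The crossing condition a < b < a’ < b’ gives a direct test for pseudoknots in a list of base-pairs. The sketch below is a straightforward all-against-all comparison, assuming each base-pair is given as a tuple of positions; the function name `crossing_pairs` and the toy structure (a hairpin whose loop nucleotides pair with downstream nucleotides) are hypothetical.

```python
from itertools import combinations

def crossing_pairs(pairs):
    """Return every pair of base-pairs (a:a', b:b') with a < b < a' < b'."""
    ordered = sorted(tuple(sorted(pair)) for pair in pairs)
    return [
        ((a, a2), (b, b2))
        for (a, a2), (b, b2) in combinations(ordered, 2)
        if a < b < a2 < b2
    ]

# Hypothetical simple pseudoknot: stem 1 closes a hairpin loop (positions 4-9),
# and two loop nucleotides pair with positions outside the stem-loop (stem 2).
stem1 = [(1, 12), (2, 11), (3, 10)]
stem2 = [(5, 17), (6, 16)]

print(crossing_pairs(stem1))               # [] : a stem alone is fully nested
print(len(crossing_pairs(stem1 + stem2)))  # 6 : every stem 1 pair crosses every stem 2 pair
```

Comparing all pairs takes time proportional to the square of the number of base-pairs, which is acceptable for a sketch but may be slow for large rRNA structure models.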
Figure BB. Schematic drawings of a simple pseudoknot. Stems are shown in red and
green; loops are shown in blue, orange, and purple. A. Standard format. B. Hairpin
loop format (after Hilbers et al. 1998).
[Figure BB labels: panels A and B each show Stem 1, Stem 2, Loop 1, Loop 2, and Loop 3,
with the 5' and 3' ends marked.]
Websites
Base-Pair Directory (IMB Jena): http://www.imb-jena.de/IMAGE_BPDIR.html
The Comparative RNA Web (CRW) Site (descriptions and images of structural
elements): http://www.rna.icmb.utexas.edu/
Non-Canonical Base-Pair Database: http://prion.bchs.uh.edu/bp_type/
Pseudobase: http://wwwbio.leidenuniv.nl/~Batenburg/PKB.html
Further Reading
Chastain M & Tinoco I Jr (1994). Structural Elements in RNA. Progress in Nucleic Acid
Research and Molecular Biology 44:131-177.
Hilbers CW, Michiels PJ, & Heus HA (1998). New developments in structure
determination of pseudoknots. Biopolymers 48:137-153.
Pleij CWA (1994). RNA pseudoknots. Current Opinion in Structural Biology 4:337-344.
ten Dam E, Pleij K, & Draper D (1992). Structural and functional aspects of RNA
pseudoknots. Biochemistry 31:11665-11676.
Motifs in RNA Tertiary Structure (RNA Structural Elements)
Underlying the complex and elaborate secondary and tertiary structures for different
RNA molecules is a collection of different RNA building blocks, or structural motifs.
Beyond the abundant G:C, A:U, and G:U base-pairs in the standard Watson-Crick
conformation that are arranged into regular secondary structure helices, structural
motifs are usually composed of non-canonical base-pairs (e.g. A:A) with non-standard
base-pair conformations that are usually not consecutive and antiparallel with one
another. Approximately twenty-seven RNA structural motifs have been identified by
different research groups with differing methods and criteria; these motifs are listed in
alphabetical order below.
• 2’-OH-Mediated Helical Interactions: extensive hydrogen bonding between the
backbone of one strand and the minor groove of another in tightly-packed RNAs.
• A Story: unpaired adenosines in the covariation-based structure models.
• A-Minor: the minor groove faces of adenosines insert into the minor groove of
another helix, forming hydrogen bonds with the 2’-OH groups of C:G base-pairs.
• AA.AG@helix.ends: A:A and A:G oppositions exchange at ends of helices.
• Adenosine Platform: two consecutive adenosines form a pseudo-base-pair that
allows for additional stacking of bases.
• Base Triple: a base-pair interacts with a third nucleotide.
• Bulge-Helix-Bulge: Archaeal internal loop motif that is a target for splicing.
• Bulged-G: links a cross-strand A stack to an A-form helix.
• Coaxial Stacking of Helices: two neighboring helices are stacked end-to-end.
• Cross-Strand Purine Stack: consecutive adenosines from opposite strands of a helix
are stacked.
• Dominant G:U Base-Pair: G:U is the dominant base-pair (50% or greater),
exchanging with canonical base-pairs over a phylogenetic group, in particular
structural locations.
• E Loop/S Turn: an asymmetric internal or multi-stem loop (with consensus
sequence 5’-AGUA/RAA-3’) forms three non-canonical base-pairs.
• E-like Loop: a symmetric internal loop that resembles an E Loop (with consensus
sequence 5'-GHA/GAA-3') forms three non-canonical base-pairs.
• Kink-Turn: named for its kink in the RNA backbone; two helices joined by an
internal loop interact via the A-minor motif.
• Kissing Hairpin Loop: two hairpin loops interact to form a pseudocontinuous,
coaxially stacked three-stem helix.
• Lone Pair: a base-pair that has no consecutive, adjacent base-pair neighbors.
• Lonepair Triloop: a base-pair with no consecutive, adjacent base-pair neighbors
encloses a three-nucleotide hairpin loop.
• Metal-Binding: guanosine and uracil residues can bind metal ions in the major
groove of RNA.
• Metal-Core: specific nucleotide bases are exposed to the exterior to bind specific
metal ions.
• Pseudoknots (see Pseudoknots)
• Ribose Zipper: as two helices dock, the ribose sugars from two RNA strands
become interlaced.
• Tandem G:A Opposition: two consecutive G:A oppositions occur in an internal
loop.
• Tetraloop: four-nucleotide hairpin loops with specific sequences.
• Tetraloop Receptor: a structural element with a propensity to interact with
tetraloops, often involving another structural motif.
• Triplexes: stable “triple helix” observed only in model RNAs.
• tRNA D-Loop:T-Loop: conserved tertiary base pairs between the D and T loops of
tRNA.
• U-turn: a loop with the sequences UNR or GNRA contains a sharp turn in its
backbone, often followed immediately by other tertiary interactions.
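Several of the motifs above are described by consensus sequences (for example, UNR and GNRA for the U-turn). A minimal Python sketch of how such a single-strand consensus might be located in a sequence follows. It assumes the standard IUPAC nucleotide codes (N = any base, R = A or G, H = A, C, or U); the function name find_consensus is illustrative. A sequence match is only a candidate: it does not by itself establish that the motif forms in the folded RNA.

```python
import re

# IUPAC codes needed for the consensus sequences quoted in the list above.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "N": "[ACGU]", "R": "[AG]", "H": "[ACU]"}

def find_consensus(seq, consensus):
    """Return (1-based start, matched text) for every occurrence, including overlaps."""
    pattern = "".join(IUPAC[code] for code in consensus)
    # A lookahead with a capture group lets overlapping matches be reported.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq)]

# Hypothetical sequences, chosen only to exercise the function.
print(find_consensus("CCGAAAGG", "GNRA"))  # [(3, 'GAAA')]
print(find_consensus("GGUUAGCC", "UNR"))   # [(3, 'UUA'), (4, 'UAG')]
```

Two-strand consensus sequences such as 5’-AGUA/RAA-3’ would need each strand searched separately and the hits paired by structural context, which this sketch does not attempt.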
Websites
The Comparative RNA Web (CRW) Site (descriptions and images of structural
elements; motif-related publications): http://www.rna.icmb.utexas.edu/
Distribution of RNA Motifs in Natural Sequences:
http://www.centrcn.umontreal.ca/~bourdeav/Ribonomics/
Pseudobase: http://wwwbio.leidenuniv.nl/~Batenburg/PKB.html
RNABase: The RNA Structure Database: http://www.rnabase.org/
SCOR (Structural Classification of RNA): http://scor.lbl.gov/domain_tert.html
Further Reading (Motifs)
Batey RT, Rambo RP, & Doudna JA (1999). Tertiary Motifs in RNA Structure and
Folding. Angewandte Chemie (International ed. in English) 38:2326-2343. [review
discussing multiple motifs: Adenosine Platform; Base Triple; Coaxial Stacking of
Helices; Kissing Hairpin Loops; Pseudoknot; Tetraloop; Tetraloop Receptor; tRNA
D-Loop:T-Loop]
Blake RD, Massoulié J, & Fresco JR (1967). Polynucleotides. 8. A spectral approach
to the equilibria between polyriboadenylate and polyribouridylate and their
complexes. Journal of Molecular Biology 30:291-308. [Triplexes]
Cate JH & Doudna JA (1996). Metal-binding sites in the major groove of a large
ribozyme domain. Structure 4:1221–1229.
Cate JH, Gooding AR, Podell E, Zhou K, Golden BL, Kundrot CE, Cech TR, & Doudna
JA (1996). Crystal structure of a group I ribozyme domain: principles of RNA
packing. Science 273:1678-1685. [Metal-Core; Tetraloop Receptor]
Cate JH, Gooding AR, Podell E, Zhou K, Golden BL, Szewczak AA, Kundrot CE, Cech
TR, & Doudna JA (1996). RNA tertiary structure mediation by adenosine platforms.
Science 273:1696-1699. [Adenosine Platform; Tetraloop Receptor]
Cate JH, Hanna RL, & Doudna JA (1997). A magnesium ion core at the heart of a
ribozyme domain. Nature Structural Biology 4:553-558. [Metal-Core; Tetraloop
Receptor]
Chang KY & Tinoco I Jr (1997). The structure of an RNA "kissing" hairpin complex of
the HIV TAR hairpin loop and its complement. Journal of Molecular Biology 269:52-66.
[Kissing Hairpin Loops]
Correll CC, Freeborn B, Moore PB, & Steitz TA (1997). Metals, motifs, and recognition
in the crystal structure of a 5S rRNA domain. Cell 91:705-712. [Cross-Strand Purine
Stack; example of Metal-Binding]
Costa M & Michel F (1995). Frequent use of the same tertiary motif by self-folding
RNAs. The EMBO Journal 14:1276-1285. [Tetraloop Receptor]
Costa M & Michel F (1997). Rules for RNA recognition of GNRA tetraloops deduced by
in vitro selection: comparison with in vivo evolution. The EMBO Journal 16:3289-3302.
[Tetraloop Receptor]
Diener JL & Moore PB (1998). Solution structure of a substrate for the archaeal pre-
tRNA splicing endonucleases: the bulge-helix-bulge motif. Molecular Cell 1:883–894.
[Bulge-Helix-Bulge]
Dirheimer G, Keith G, Dumas P, & Westhof E (1995). Primary, secondary, and tertiary
structures of tRNAs. In: tRNA: Structure, Biosynthesis, and Function (Söll D &
RajBhandary U, editors). American Society for Microbiology, Washington, DC, pp.
93-126. [tRNA D-Loop:T-Loop]
Doherty EA, Batey RT, Masquida B, & Doudna JA (2001). A universal mode of helix
packing in RNA. Nature Structural Biology 8:339-343. [A-Minor]
Elgavish T, Cannone JJ, Lee JC, Harvey SC, & Gutell RR (2001). AA.AG@Helix.Ends:
A:A and A:G Base-pairs at the Ends of 16 S and 23 S rRNA Helices. Journal of
Molecular Biology 310:735-753. [AA.AG@helix.ends]
Gautheret D, Damberger SH, & Gutell RR (1995). Identification of base-triples in RNA
using comparative sequence analysis. Journal of Molecular Biology 248:27-43. [Base
Triples]
Gautheret D, Konings D, & Gutell RR (1994). A major family of motifs involving G-A
mismatches in ribosomal RNA. Journal of Molecular Biology 242:1-8. [Tandem G:A
Oppositions]
Gautheret D, Konings D, & Gutell RR (1995). GU base pairing motifs in ribosomal
RNAs. RNA 1:807-814. [Dominant G:U Base-Pair]
Gutell RR, Cannone JJ, Shang Z, Du Y, & Serra MJ (2000). A Story: Unpaired Adenosine
Bases in Ribosomal RNA. Journal of Molecular Biology 304:335-354. [A Story;
Adenosine Platform; E Loop/S Turn; E-like Loop]
Gutell RR, Cannone JJ, Konings D, & Gautheret D (2000). Predicting U-turns in
Ribosomal RNA with Comparative Sequence Analysis. Journal of Molecular Biology
300:791-803. [U Turn]
Gutell RR, Larsen N, & Woese CR (1994). Lessons from an evolving ribosomal RNA:
16S and 23S rRNA structure from a comparative perspective. Microbiological Reviews
58:10-26. [review of several motifs: Base Triple; Coaxial Stacking of Helices;
Dominant G:U Base-Pair; Lone Pair; Pseudoknot; Tetraloop]
Ippolito JA & Steitz TA (1998). A 1.3-Å resolution crystal structure of the HIV-1 trans-
activation response region RNA stem reveals a metal ion-dependent bulge
conformation. Proceedings of the National Academy of Sciences USA 95:9819-9824.
[Metal-Core; Tetraloop Receptor]
Jaeger L, Michel F & Westhof E (1994). Involvement of a GNRA tetraloop in long-range
RNA tertiary interactions. Journal of Molecular Biology 236:1271-1276. [Tetraloop
Receptor]
Klein DJ, Schmeing TM, Moore PB, & Steitz TA (2001). The kink-turn: a new RNA
secondary structure motif. The EMBO Journal 20:4214-4221. [Kink-Turn]
Lee JC, Cannone JJ, & Gutell RR (2003). The Lonepair Triloop: A New Motif in RNA
Structure. Journal of Molecular Biology 325:65-83. [Lonepair Triloop]
Leonard GA, McAuley-Hecht KE, Ebel S, Lough DM, Brown T, & Hunter WN (1994).
Crystal and molecular structure of r(CGCGAAUUAGCG): an RNA duplex containing
two G(anti).A(anti) base pairs. Structure 2:483-494. [2’-OH-Mediated Helical
Interactions]
Leontis NB & Westhof E (1998). A common motif organizes the structure of multi-helix
loops in 16 S and 23 S ribosomal RNAs. Journal of Molecular Biology 283:571-583. [E
Loop/S Turn]
Lietzke SE, Barnes CL, Berglund JA, & Kundrot CE (1996). The structure of an RNA
dodecamer shows how tandem U-U base pairs increase the range of stable RNA
structures and the diversity of recognition sites. Structure 4:917-930. [2’-OH-
Mediated Helical Interactions]
Massoulié J (1968). [Associations of poly A and poly U in acid media. Irreversible
phenomenon] (French). European Journal of Biochemistry 3:439-447. [Triplexes]
Moore PB (1999). Structural motifs in RNA. Annual Review of Biochemistry 68:287-300.
[review of several motifs: Adenosine Platform; Bulge-Helix-Bulge; Bulged-G;
Cross-Strand Purine Stack; Metal-Binding; Ribose Zipper; Tetraloop; Tetraloop
Receptor; U-Turn]
Nissen P, Ippolito JA, Ban N, Moore PB, & Steitz TA (2001). RNA tertiary interactions in
the large ribosomal subunit: the A-minor motif. Proceedings of the National Academy of
Sciences USA 98:4899-4903. [A-Minor]
Pleij CWA (1994). RNA pseudoknots. Current Opinion in Structural Biology 4:337-344.
[Pseudoknot]
SantaLucia J Jr, Kierzek R, & Turner DH (1990). Effects of GA mismatches on the
structure and thermodynamics of RNA internal loops. Biochemistry 29:8813-8819.
[Tandem G:A Oppositions]
Tamura M & Holbrook SR (2002). Sequence and structural conservation in RNA ribose
zippers. Journal of Molecular Biology 320:455-474. [Ribose Zipper]
Traub W & Sussman JL (1982). Adenine-guanine base pairing in ribosomal RNA. Nucleic
Acids Research 10:2701-2708. [AA.AG@helix.ends]
Wimberly B (1994). A common RNA loop motif as a docking module and its function
in the hammerhead ribozyme. Nature Structural Biology 1:820-827. [E Loop/S Turn]
Wimberly B, Varani G, & Tinoco I Jr (1993). The conformation of loop E of eukaryotic
5S ribosomal RNA. Biochemistry 32:1078–1087. [Bulged-G]
Woese CR, Gutell R, Gupta R, & Noller HF (1983). Detailed analysis of the higher-order
structure of 16S-like ribosomal ribonucleic acids. Microbiological Reviews 47:621-669.
[AA.AG@helix.ends]
Woese CR, Winker S, & Gutell RR (1990). Architecture of ribosomal RNA: constraints
on the sequence of "tetra-loops." Proceedings of the National Academy of Sciences USA
87:8467-8471. [Tetraloops]
Comparative Sequence Analysis (RNA Structure Prediction)
Comparative Sequence Analysis rests on two simple principles that have profoundly
influenced the prediction and understanding of RNA structure: 1) different RNA
sequences can fold into the same secondary and tertiary structure, and 2) the specific
structure and function of select RNA molecules are maintained during the evolutionary
process of genetic mutation and natural selection. Typically for RNA structure
prediction, homologous base-pairs that occur at the same positions in all of the
sequences in the data set are identified with covariation analysis (see “Covariation
Analysis”), resulting in a minimal structure model. Structural motifs (see “Motifs in
RNA Tertiary Structure”) whose characteristic sequences at specific structural
elements are sufficiently conserved in the sequence data set are then identified,
culminating in a final comparative structure model.
The comparative method has been used for a variety of RNA molecules, including
tRNA, the three rRNAs (5S, 16S, and 23S), ITS and IVS rRNAs, group I and II introns,
RNase P, tmRNA, SRP RNA, and telomerase RNA. These methods will be more
accurate and the predicted structure will have more detail for any one type of RNA
when the number of sequences is large and the diversity among these sequences is high.
Covariation Analysis (RNA Structure Prediction)
While comparative sequence analysis is based on the simple proposition that
molecules with the same function will have similar secondary and tertiary structures,
covariation analysis, a subset of comparative sequence analysis, identifies base-pairs
that occur at the same positions in the RNA sequence in all of the RNA sequences in the
data set. Covariation analysis searches for positions that have the same pattern of
variation in an alignment of sequences (see Figure CC). The most recent
implementation of this method usually base-pairs any two positions with the same
pattern of variation, regardless of the types of base-pairs. While most of the base-pair
types that are identified exchange between G:C, A:U, and G:U, covariation analysis has
also identified exchanges between non-canonical base-pairs.
Figure CC. Examples of covariation. A. Schematic alignment. Five sequences are
shown from 5’ at left to 3’ at right. Black and red lines above the alignment show base-
pairing. Nucleotide position numbers appear in blue at the bottom of the alignment. B.
Summary of covariations from the alignment in panel A. Position numbers are
followed by the observed base-pair types for the seven base-pairs.
A
Sequence 1:   UAGCGAA nnnnnnn AUCGCUU
Sequence 2:   UAACAAG nnnnnnn GUUGUUU
Sequence 3:   CAGCAGG nnnnnnn GCUGCUC
Sequence 4:   CAGGAGG nnnnnnn GCUCCUC
Sequence 5:   UAAGAAA nnnnnnn AUUCUUU
Position                11111 1111122
numbers:      1234567 8901234 5678901

B
Positions   Observed base-pair types
1:21        U:U, C:C
2:20        A:U
3:19        G:C, A:U
4:18        C:G, G:C
5:17        G:C, A:U
6:16        A:U, G:C
7:15        A:A, G:G
For example, the following four sets of base-pair exchanges show covariation: 1)
A:U <-> U:A <-> G:C <-> C:G, 2) G:U <-> A:C, 3) U:U <-> C:C, 4) A:A <-> G:G. By
searching for these coordinated positional variations in a well-aligned collection of
sequences, key elements of an RNA molecule’s core structure can be elucidated. Earlier
covariation methods searched specifically for helices composed of canonical base-pairs.
Improvements in covariation algorithms and an ever-growing collection of sequences
make comprehensive searches that consider all base-pairing types in a context-
independent manner possible. Due to the requirement that the two base-paired
positions have the same pattern of variation, covariation analysis will only identify a
subset of the total number of base-pairs that are in common to different sequences;
other comparative methods must be employed to detect the remainder.
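The summary in Figure CC, panel B can be reproduced with a short script. The following Python sketch tabulates the base-pair types observed at two alignment columns and applies one simple reading of "same pattern of variation": both columns vary, and each nucleotide at one position always accompanies the same nucleotide at the other. That criterion and the function names are assumptions made for illustration; published covariation methods use more elaborate scoring.

```python
from collections import Counter

# Alignment from Figure CC, panel A ('n' marks the unspecified loop positions).
ALIGNMENT = [
    "UAGCGAAnnnnnnnAUCGCUU",
    "UAACAAGnnnnnnnGUUGUUU",
    "CAGCAGGnnnnnnnGCUGCUC",
    "CAGGAGGnnnnnnnGCUCCUC",
    "UAAGAAAnnnnnnnAUUCUUU",
]

def pair_types(alignment, i, j):
    """Count the base-pair types observed at 1-based columns i and j."""
    return Counter(f"{seq[i - 1]}:{seq[j - 1]}" for seq in alignment)

def covaries(alignment, i, j):
    """True if both columns vary and their nucleotides map one-to-one."""
    pairs = {(seq[i - 1], seq[j - 1]) for seq in alignment}
    left = {a for a, _ in pairs}
    right = {b for _, b in pairs}
    return len(pairs) > 1 and len(left) == len(pairs) == len(right)

# The seven proposed pairs are i:(22 - i) for i = 1..7.
for i in range(1, 8):
    j = 22 - i
    print(f"{i}:{j}", dict(pair_types(ALIGNMENT, i, j)), covaries(ALIGNMENT, i, j))
```

Position 2:20 is A:U in every sequence, so it shows no covariation even though it is drawn as a base-pair. This is an instance of the limitation noted above: an invariant pair must be supported by other comparative evidence. With only five sequences, unrelated columns can also share a pattern by chance, which is why large and diverse alignments matter.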
Our confidence in a base-pair predicted with covariation analysis increases with the
dependence between the two 'paired' positions. Positions that vary independently of
one another are less likely to form a base-pair that can be predicted with covariation
analysis. In contrast, a greater extent of simultaneous variation at the two 'paired'
positions suggests that the two positions depend on one another, and thus we are more
confident in base-pairs predicted from them. One of the family of methods that measure
the dependence between two positions proposed to be base-paired is the chi-square
statistic, which gauges the types of base-pairs and their frequencies (see
“Phylogenetic Events”). The accuracy of base-pair predictions with covariation
analysis is very high: approximately 97-98% of the 16S and 23S rRNA base-pairs
predicted with covariation analysis are present in the high-resolution crystal
structures.
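As an illustration of how a chi-square statistic can gauge dependence between two alignment columns, the Python sketch below computes a Pearson chi-square value from the nucleotide counts at each position and the counts of their observed combinations. The choice of the Pearson form and the function name chi_square are assumptions; the exact statistic used in practice may differ, and with only five sequences the value is illustrative rather than statistically meaningful.

```python
from collections import Counter

# Alignment from Figure CC, panel A ('n' marks the unspecified loop positions).
ALIGNMENT = [
    "UAGCGAAnnnnnnnAUCGCUU",
    "UAACAAGnnnnnnnGUUGUUU",
    "CAGCAGGnnnnnnnGCUGCUC",
    "CAGGAGGnnnnnnnGCUCCUC",
    "UAAGAAAnnnnnnnAUUCUUU",
]

def chi_square(alignment, i, j):
    """Pearson chi-square for independence of 1-based columns i and j."""
    n = len(alignment)
    observed = Counter((seq[i - 1], seq[j - 1]) for seq in alignment)
    row_totals = Counter(seq[i - 1] for seq in alignment)
    col_totals = Counter(seq[j - 1] for seq in alignment)
    stat = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            # Expected count if the two positions varied independently.
            expected = row_count * col_count / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat

print(chi_square(ALIGNMENT, 1, 21))  # covarying pair: high value
print(chi_square(ALIGNMENT, 1, 4))   # unrelated columns: low value
print(chi_square(ALIGNMENT, 2, 20))  # invariant pair: zero
```

Columns 1 and 21 change together in every sequence and score highest; columns 1 and 4 vary largely independently and score near zero; the invariant 2:20 pair scores exactly zero because there is no variation to test.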
An important facet of covariation analysis is that current covariation methods
compare all positions in a sequence alignment, independently of previous
predictions, structural context, and principles of RNA structure. In practice, the
majority of covariations involve two single nucleotides exchanging to maintain a
canonical base-pair and are arranged into standard secondary structure helices. Thus,
covariation analyses have independently determined the two most fundamental
principles in RNA structure: the Watson–Crick base-pairing relationship and the
formation of helices from the antiparallel and consecutive arrangement of these base-
pairs. In addition to these achievements, a significant number of examples of both
covariation between canonical (A:U, G:C, and G:U) and non-canonical base-pairs and
covariation between non-canonical base-pair types have been predicted for the rRNAs
and proven correct by the ribosomal subunit crystal structures. Likewise, examples of
tertiary base-pairs that are not part of larger helices, both short- and long-range
interactions, have been predicted and shown to exist.
Phylogenetic Events (RNA Structure Prediction)
For covariation analysis, a more complete measure of the dependence or independence
between two positions with the same pattern of variation is gauged by 1) the number of
base-pair types that covary with one another (e.g., A:U, G:C), 2) the frequency of
these base-pair types, and 3) phylogenetic events, the number of times that a base-pair
type arose during the evolution of that base-pair. The first two gauges can be
measured with a chi-square statistic (see “Covariation Analysis”). The third gauge
requires an understanding of the phylogenetic relationships between the sequences
that are in the data set.
These three gauges are exemplified by three different base-pairs in 16S rRNA, 9:25,
502:543, and 245:283. The 9:25 base-pair has approximately 67% G:C and 33% C:G in
the nuclear-encoded rRNA genes in the three primary forms of life, the Eucarya,
Archaea, and the Bacteria. The minimal number of times these base-pairs evolved
(phylogenetic events) on the phylogenetic tree is about 4. In contrast, the 502:543 base-
pair has 27% G:C, 30% C:G, and 42% A:U, with a minimum of 75 phylogenetic events.
Last, the 245:283 base-pair has 38% C:C and 62% U:U in the same set of 16S rRNA
sequences, with approximately 25 phylogenetic events.
The phylogenetic event counting method can be used to augment the results from the
analysis of base-pair types and base-pair frequencies. It can add or subtract support for
a base-pair predicted with covariation analysis. In some situations, this form of analysis
can suggest a base-pair that would not have been predicted based on base-pair type and
frequencies alone.
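The procedure for counting phylogenetic events is not spelled out here. As one possible illustration, the Python sketch below assumes a rooted binary tree written as nested tuples, with the base-pair type of each sequence at its leaf, and uses Fitch parsimony to find the minimum number of base-pair changes the tree requires. Both the tree and the use of Fitch parsimony are assumptions made for this example, not a description of the published method.

```python
def fitch_events(tree):
    """Return (possible ancestral states, minimum number of changes) for a tree.

    A leaf is a base-pair string such as "G:C"; an internal node is a
    2-tuple (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_changes = fitch_events(left)
    right_states, right_changes = fitch_events(right)
    shared = left_states & right_states
    if shared:
        # The two subtrees can agree on a state: no additional change.
        return shared, left_changes + right_changes
    # No shared state: at least one change occurred on a branch below this node.
    return left_states | right_states, left_changes + right_changes + 1

# Hypothetical tree of seven sequences carrying G:C or C:G at one base-pair.
tree = ((("G:C", "G:C"), ("C:G", "C:G")), ("G:C", ("C:G", "G:C")))
states, events = fitch_events(tree)
print(states, events)  # {'G:C'} 2
```

In this toy tree the C:G type accounts for three of seven sequences but requires only two changes. This illustrates why event counts carry information that base-pair frequencies alone do not.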
Websites
Gutell Lab Comparative RNA Web Site:
http://www.rna.icmb.utexas.edu/METHODS/
Further Reading
Gutell RR, Larsen N & Woese CR (1994). Lessons from an evolving rRNA: 16 S and 23 S
rRNA structures from a comparative perspective. Microbiological Reviews 58:10-26.
Gutell RR, Lee JC & Cannone JJ (2002). The accuracy of ribosomal RNA comparative
structure models. Current Opinion in Structural Biology 12:301–310.
Michel F, Costa M, Massire I & Westhof E (2000). Modeling RNA tertiary structure
from patterns of sequence variation. Methods in Enzymology 317:491-510.
Woese CR & Pace NR (1993). Probing RNA structure, function, and history by
comparative analysis. In The RNA World (Gesteland RF & Atkins JF, editors), pp. 91-
117, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
RNA Folding (RNA Structure Prediction)
The structure and function of any given RNA are dependent on each other. Thus, an
understanding of how an RNA folds into its functional form can provide insights on
that function. Insight into any potential for fluidity or movement in an RNA structure
can also be indicative of functional properties.
The “RNA Folding Problem” can be summarized as the challenge of folding an RNA’s
primary structure into its active secondary and tertiary structure. Currently, no
complete answer to this question is available, although several approaches have been
able to provide insight. RNA folding algorithms search for secondary structure helices
composed of consecutive, antiparallel canonical base-pairs. Another set of constraints
comes from thermodynamics, where RNA is expected to fold into its most energetically
stable structure. The kinetics of the folding process will also have an impact on the final
result.
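The helix search described above can be sketched as follows. This minimal Python example enumerates maximal runs of consecutive, antiparallel canonical base-pairs (G:C, A:U, and G:U) in a single sequence. The minimum helix length of three base-pairs, the minimum hairpin loop of three nucleotides, and the function name find_helices are assumptions chosen for illustration. Real folding programs go on to choose among such candidate helices, which this sketch does not do.

```python
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def find_helices(seq, min_len=3, min_loop=3):
    """Return (5' start, 3' end, length) for each maximal candidate helix (1-based)."""
    n = len(seq)
    helices = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in CANONICAL:
                continue
            # Skip pairs that merely continue a helix starting one step further out.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in CANONICAL:
                continue
            length = 0
            # Extend inward while the pairs stay canonical and the loop stays large enough.
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in CANONICAL):
                length += 1
            if length >= min_len:
                helices.append((i + 1, j + 1, length))
    return helices

# Sequence 1 from Figure CC; 'n' positions cannot pair.
print(find_helices("UAGCGAAnnnnnnnAUCGCUU"))  # [(2, 20, 5)]
```

The search recovers only the five canonical pairs 2:20 through 6:16. The U:U and A:A oppositions at 1:21 and 7:15, which covariation analysis supports in Figure CC, lie outside what a canonical-pair search can find.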
Energy Minimization (RNA Structure Prediction)
Traditionally, molecular biologists search for the most thermodynamically stable
structures, using energy minimization techniques. Thermodynamic energy values have
been experimentally determined for consecutive base-pairs and a few other simple
structural elements. The assumption behind energy minimization in RNA folding is
that the folding process for an RNA molecule can be determined by summing up the
totals of the energy values for its simpler structural elements.
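
The additivity assumption can be shown with a toy calculation. In the Python sketch below, the energy values are hypothetical placeholders chosen only to make the arithmetic visible; they are not experimentally determined parameters, and the names PLACEHOLDER_ENERGIES and total_energy are illustrative.

```python
# Hypothetical free-energy contributions in kcal/mol. These are illustrative
# placeholders, NOT experimentally determined parameters.
PLACEHOLDER_ENERGIES = {
    "stack": -2.0,        # two consecutive base-pairs stacked in a helix
    "hairpin_loop": 4.0,  # loop closing a helix
}

def total_energy(elements, energies=PLACEHOLDER_ENERGIES):
    """Sum the contributions of a structure's elements (additivity assumption)."""
    return sum(energies[element] for element in elements)

# A five-base-pair helix contains four stacks; here one hairpin loop closes it.
stem_loop = ["stack"] * 4 + ["hairpin_loop"]
print(total_energy(stem_loop))  # -4.0
```

An energy minimization algorithm would compare such totals across alternative structures and report the one with the lowest value.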
The present set of thermodynamic folding algorithms does not always predict a
complete and correct secondary structure for an RNA molecule. This may indicate that
either our understanding of all of the thermodynamic parameters is incomplete or that
the process is based upon a flawed assumption. These algorithms also are unable to
predict tertiary structure base-pairs.
Websites
Michael Zuker’s Home Page (includes mfold and links to his current research):
http://www.bioinfo.rpi.edu/~zukerm/
Turner Group Home Page: http://rna.chem.rochester.edu/
From IMB-JENA (both available from the above link but less direct):
Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical
Guide: http://www.bioinfo.rpi.edu/~zukerm/seqanal/
Free Energy and Enthalpy Tables for RNA Folding From the Turner Group:
http://www.bioinfo.rpi.edu/~zukerm/rna/energy/
Further Reading
Burkard ME, Turner DH, & Tinoco I Jr (1998). The Interactions that Shape RNA. In: The
RNA World, 2nd Edition (Gesteland RF, Atkins JF, & Cech TR, editors). Cold Spring
Harbor Press, pp. 233-264.
Mathews DH, Sabina J, Zuker M, & Turner DH (1999). Expanded Sequence
Dependence of Thermodynamic Parameters Provides Robust Prediction of RNA
Secondary Structure. Journal of Molecular Biology 288:911-940.
Mathews DH, Turner DH, & Zuker M (2000). RNA Secondary Structure Prediction. In:
Current Protocols in Nucleic Acid Chemistry (Beaucage S, Bergstrom DE, Glick GD, &
Jones RA, editors), John Wiley & Sons, New York, 11.2.1-11.2.10.
Nagel JHA & Pleij CWA (2002). Self-induced structural switches in RNA. Biochimie
84:913-923.
Zuker M (1989). On finding all suboptimal foldings of an RNA molecule. Science
244:48-52.
Zuker M (2000). Calculating nucleic acid secondary structure. Current Opinion in
Structural Biology 10:303-310.