2. What is a Human Genome?
The human genome is the complete set
of genetic information for humans,
encoded as DNA sequences within the
23 chromosome pairs in cell nuclei and
in a small DNA molecule found within
individual mitochondria.
It includes both protein-coding DNA
genes and noncoding DNA.
3. Haploid human genomes consist of
three billion DNA base pairs,
while diploid genomes have twice the
DNA content.
While there are significant differences
among the genomes of human
individuals, these are considerably
smaller than the differences between
humans and their closest living relatives,
the chimpanzees and bonobos.
4. Noncoding DNA
In genomics and related
disciplines, noncoding DNA sequences are
components of an organism's DNA that do
not encode protein sequences.
Some noncoding DNA is transcribed into
functional noncoding RNA molecules, while
other sequences are not transcribed or give rise to RNA
transcripts of unknown function.
5. Many noncoding DNA sequences have
important biological functions as
indicated by comparative
genomics studies that report some
regions of noncoding DNA that are
highly conserved, sometimes on timescales
representing hundreds of
millions of years, implying that these
noncoding regions are under
strong evolutionary pressure
and purifying selection.
6. For example, in the genomes
of humans and mice, which diverged
from a common ancestor 65–75 million
years ago, protein-coding DNA
sequences account for only about 20%
of conserved DNA, with the remaining
80% of conserved DNA represented in
noncoding regions.
7. Linkage mapping often identifies
chromosomal regions associated with
a disease with no evidence of
functional coding variants of genes
within the region, suggesting that
disease-causing genetic variants lie in
the noncoding DNA.
8. Noncoding functional RNA
Noncoding RNA molecules play many essential
roles in cells, especially in the many reactions
of protein synthesis and RNA processing.
Non-coding RNA genes include highly abundant
and functionally important RNAs such as transfer
RNA (tRNA) and ribosomal RNA (rRNA), as well as
RNAs such as snoRNAs, microRNAs, siRNAs, snRNAs,
exRNAs, and piRNAs, and the long ncRNAs, which include
examples such as Xist and HOTAIR.
A non-coding RNA (ncRNA) is a functional RNA molecule that is
not translated into a protein.
9. Cis- and Trans-regulatory elements
Cis-regulatory elements are sequences that
control the transcription of a nearby gene.
Cis-elements may be located
in 5' or 3' untranslated regions or
within introns. Trans-regulatory
elements control the transcription of a distant
gene.
Promoters facilitate the transcription of a
particular gene and are typically upstream of
the coding region. Enhancer sequences
may also exert very distant effects on the
transcription levels of genes.
10. Introns
Introns are non-coding sections of a gene,
transcribed into the precursor
mRNA sequence, but ultimately removed
by RNA splicing during the processing to
mature messenger RNA. Many introns
appear to be mobile genetic elements.
An intron is any nucleotide sequence within
a gene that is removed by RNA
splicing while the final mature RNA product
of a gene is being generated.
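The removal of introns by splicing can be sketched in a few lines of Python. This is a toy illustration only: the sequence, the intron coordinates, and the function name `splice` are assumptions made for the example, and real splicing depends on splice-site recognition that is not modeled here.

```python
def splice(pre_mrna, introns):
    """Return the mature transcript after removing each intron.

    introns: list of (start, end) pairs, 0-based and end-exclusive,
    assumed not to overlap (hypothetical coordinates for illustration).
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end > len(pre_mrna) or start >= end:
            raise ValueError(f"invalid intron coordinates: {(start, end)}")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Toy precursor: exons in upper case, one intron in lower case.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
mature = splice(pre_mrna, [(6, 18)])
print(mature)  # AUGGCCUUUUAA
```

The exons are joined in their original order, so the mature transcript keeps only the sequence that flanked the intron.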
11. Studies of group I
introns from Tetrahymena protozoans
indicate that some introns appear to be
selfish genetic elements, neutral to the
host because they remove themselves
from flanking exons during RNA
processing and do not produce an
expression bias between alleles with
and without the intron.
12. Some introns appear to have
significant biological function, possibly
through ribozyme functionality that may
regulate tRNA and rRNA activity as
well as protein-coding gene
expression, evident in hosts that have
become dependent on such introns
over long periods of time.
13. For example, the trnL-intron is found in
all green plants and appears to have
been vertically inherited for several
billion years, including more than a
billion years within chloroplasts and an
additional 2–3 billion years prior in
the cyanobacterial ancestors of
chloroplasts.
14. Pseudogenes
Pseudogenes are DNA sequences,
related to known genes, that have lost
their protein-coding ability or are
otherwise no longer expressed in the
cell.
15. Pseudogenes arise from
retrotransposition or genomic
duplication of functional genes, and
become "genomic fossils" that are
nonfunctional due to mutations that
prevent the transcription of the gene,
such as within the gene promoter
region, or fatally alter the translation of
the gene, such as premature stop
codons or frameshifts.
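The two translation-disrupting mutations named above, premature stop codons and frameshifts, can be illustrated with a short Python sketch. The sequences are invented toy examples, and the helper names `codons` and `first_stop_index` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def first_stop_index(seq):
    """Return the index of the first in-frame stop codon, or None if absent."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


functional = "ATGGCTCAAGGTTGGTAA"                # toy gene, stop is the last codon
nonsense = functional[:6] + "T" + functional[7:]  # C->T turns CAA into TAA
frameshift = functional[:3] + functional[4:]      # deleting one base shifts the frame

print(first_stop_index(functional))  # 5
print(first_stop_index(nonsense))    # 2
print(first_stop_index(frameshift))  # None
```

In the toy frameshift, every codon after the deletion is read differently and the original stop codon falls out of frame.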
16. Repeat sequences, transposons and
viral elements
Transposons and retrotransposons are
mobile genetic elements.
Retrotransposon repeated sequences,
which include long interspersed
nuclear elements (LINEs) and short
interspersed nuclear
elements (SINEs), account for a large
proportion of the genomic sequences
in many species.
17. Telomeres
A telomere is a region of
repetitive nucleotide sequences at each end
of a chromatid, which protects the end of the
chromosome from deterioration or from
fusion with neighboring chromosomes.
Telomere regions deter the degradation
of genes near the ends of chromosomes by
allowing chromosome ends to shorten,
which necessarily occurs
during chromosome replication.
18. Without telomeres, the genomes would
progressively lose information and be
truncated after cell division because
the synthesis of Okazaki
fragments requires RNA primers
attaching ahead on the lagging strand.
Over time, due to each cell division,
the telomere ends become shorter.
19. During cell division, enzymes that duplicate
DNA cannot continue their duplication all the
way to the end of chromosomes. If cells
divided without telomeres, they would lose
the ends of their chromosomes, and the
necessary information they contain.
Telomeres are disposable buffers
capping the ends of the chromosomes; they are
consumed during cell division and are
replenished by an enzyme, telomerase
reverse transcriptase.
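The buffer idea can be sketched as a simple counting model in Python. All numbers below (starting length, loss per division, critical length, telomerase gain) are illustrative assumptions rather than measured values, and the function name is hypothetical.

```python
def divisions_until_critical(telomere_bp, loss_per_division_bp, critical_bp,
                             telomerase_gain_bp=0):
    """Count cell divisions until the telomere shrinks to the critical length.

    Returns None when telomerase replenishment offsets the loss entirely,
    so that no net shortening occurs in this simplified model.
    """
    net_loss = loss_per_division_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None
    divisions = 0
    while telomere_bp > critical_bp:
        telomere_bp -= net_loss  # each division consumes part of the buffer
        divisions += 1
    return divisions


# Assumed toy values: 10,000 bp telomere, 100 bp lost per division,
# 4,000 bp treated as the critical length.
print(divisions_until_critical(10_000, 100, 4_000))                          # 60
print(divisions_until_critical(10_000, 100, 4_000, telomerase_gain_bp=50))   # 120
print(divisions_until_critical(10_000, 100, 4_000, telomerase_gain_bp=100))  # None
```

The model treats shortening as a fixed amount per division, which is a deliberate simplification.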
20. Coding sequences (protein-coding
genes)
Protein coding sequences are DNA
sequences that are transcribed into mRNA
and in which the corresponding mRNA
molecules are translated into a polypeptide
chain.
Every three nucleotides, termed a codon, in
a protein coding sequence encodes one amino
acid in the polypeptide chain. In some
cases, different chassis (host organisms) may
either map a given codon to a different amino
acid or may use different codons more or less
frequently.
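The codon-to-amino-acid mapping can be sketched in Python as follows. The table is the standard genetic code written in the conventional TCAG order; the function name `translate` and the example sequences are assumptions for illustration. The override at the end shows one known case of an alternative mapping (TGA read as tryptophan in vertebrate mitochondria).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, table=CODON_TABLE):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = table[cds[i:i + 3]]  # raises KeyError on non-ACGT input
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCTCAAGGTTGGTAATAA"))  # MAQGW

# A different mapping for one codon, as in vertebrate mitochondria.
mito_table = dict(CODON_TABLE, TGA="W")
print(translate("ATGTGAGCTTAA"))              # M
print(translate("ATGTGAGCTTAA", mito_table))  # MWA
```

Passing the table as a parameter keeps the translation logic separate from the choice of genetic code.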
21. In the Registry, protein coding sequences
begin with a start codon (usually ATG) and
end with a stop codon (usually with a double
stop codon TAA TAA). Protein coding
sequences are often abbreviated with the
acronym CDS.
Although protein coding sequences are
often considered to be basic parts, in fact
protein coding sequences can themselves
be composed of one or more regions, called
protein domains. Thus, a protein coding
sequence could either be entered as a basic
part or as a composite part of two or more
protein domains.
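The start and stop codon convention described above can be checked with a small Python sketch. The function names `check_cds` and `has_double_stop` are hypothetical, and the checks cover only the rules stated on this slide plus the multiple-of-three length implied by codon structure; this is not the Registry's own validation.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(sequence):
    """Return a list of problems found in a candidate coding sequence."""
    sequence = sequence.upper()
    problems = []
    if len(sequence) % 3 != 0:
        problems.append("length is not a multiple of three")
    if not sequence.startswith("ATG"):
        problems.append("does not begin with the ATG start codon")
    if sequence[-3:] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    return problems


def has_double_stop(sequence):
    """Report whether the sequence ends with the double stop codon TAA TAA."""
    return sequence.upper().endswith("TAATAA")


good = "ATGGCTCAAGGTTGGTAATAA"  # toy sequence for illustration
print(check_cds(good))          # []
print(has_double_stop(good))    # True
print(check_cds("GCTCAAGG"))    # three problems reported
```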
22. The N-terminal domain of a protein coding
sequence is special in a number of ways.
First, it always contains a start codon,
spaced at an appropriate distance from a
ribosomal binding site. Second, many
coding regions have special features at the
N terminus, such as protein export tags and
lipoprotein cleavage and attachment tags.
These occur at the beginning of a coding
region, and therefore are termed Head
domains.
23. A protein domain is a sequence of amino
acids which fold relatively independently
and which are evolutionarily shuffled as a
unit among different protein coding regions.
The DNA sequence of such domains must
maintain in-frame translation, and thus is a
multiple of three bases. Since these protein
domains are within a protein coding
sequence, they are called Internal domains.
Certain Internal domains have particular
functions in protein cleavage or splicing and
are termed Special Internal domains.
24. Similarly, the C-terminal domain of a
protein is special, containing at least a
stop codon. Other special features,
such as degradation tags, are also
required to be at the extreme C-
terminus. Again, these domains cannot
function when internal to a coding
region, and are termed Tail domains.
28. Human Genetics Disorders
Some genetic disorders only cause
disease in combination with the
appropriate environmental factors
(such as diet).
With these caveats, genetic disorders
may be described as clinically defined
diseases caused by genomic DNA
sequence variation.
29. In the most straightforward cases, the
disorder can be associated with
variation in a single gene. For
example, cystic fibrosis is caused by
mutations in the CFTR gene, and is
the most common recessive disorder
in Caucasian populations, with over
1,300 different mutations known.[52]
30. Disease-causing mutations in specific
genes are usually severe in terms of
gene function, and are fortunately rare;
thus genetic disorders are similarly
individually rare.
However, since there are many genes
that can vary to cause genetic
disorders, in aggregate they constitute
a significant component of known
medical conditions, especially in
pediatric medicine.
31. Molecularly characterized genetic
disorders are those for which the
underlying causal gene has been
identified. Currently there are
approximately 2,200 such disorders
annotated in the OMIM database.
32. Disorder | Prevalence | Chromosome or Gene Involved

Chromosomal Conditions
Down Syndrome | 1:600 | Chromosome 21
Klinefelter Syndrome | 1:500–1000 males | Additional X Chromosome
Turner Syndrome | 1:2000 females | Loss of X Chromosome
Sickle cell anemia | 1 in 50 births in parts of Africa; rarer elsewhere[53] | β-globin

Cancers
Breast/Ovarian Cancer (susceptibility) | ~5% of cases of these cancer types | BRCA1, BRCA2
FAP (familial adenomatous polyposis) | 1:3500 | APC
Lynch syndrome | 5–10% of all cases of bowel cancer | MLH1, MSH2, MSH6, PMS2

Neurological Conditions
Huntington disease | 1:20000 | Huntingtin
Alzheimer disease - early onset | 1:2500 | PS1, PS2, APP

Other Conditions
Cystic fibrosis | 1:2500 | CFTR
Muscular dystrophy – Duchenne type | 1:3500 boys | Dystrophin