1. Artificial neural networks (ANNs) are being used as a bioinformatics approach for gene prediction and genetic diversity analysis. ANNs consist of interconnected layers that learn a mapping from input to output.
2. For gene prediction, a neural network is constructed with input, hidden, and output layers. The input is a gene sequence and the output is the exon probability. Weights between layers are adjusted during training to recognize patterns.
3. ANNs have advantages over traditional statistical methods as they can model more complex data relationships without requiring detailed system information. Different ANN types exist for various applications in bioinformatics.
2. Gene prediction
• Gene prediction, or gene finding, refers to the
process of identifying the regions of genomic
DNA that encode genes.
• This includes protein-coding genes as well as
RNA genes, but may also include prediction of
other functional elements such as
regulatory regions.
• Gene prediction is one of the key steps in
genome annotation.
3. GeneMark
• GeneMark is a generic name for a family of
ab initio gene prediction programs developed
at the Georgia Institute of Technology in
Atlanta, first developed in 1993.
• GeneMark was used in 1995 as a primary gene
prediction tool for the first completely
sequenced bacterial genome of
Haemophilus influenzae.
• Shortcoming: inability to find exact gene
boundaries.
4. Definition
• It is a prerequisite for detailed functional
annotation of genes and genomes.
• It can detect the location of ORFs (open reading
frames) and the structure of introns and exons.
• It aims to describe all genes computationally with
near 100% accuracy.
• It can reduce the amount of experimental
verification work required.
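As a rough illustration of what "detecting the location of an ORF" means, the sketch below scans the three forward reading frames for an ATG followed in frame by a stop codon. The function name find_orfs and the min_codons cutoff are hypothetical choices for this example, and real gene finders also scan the reverse strand and use statistical models rather than this simple rule.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. Illustrative only.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

For example, find_orfs("CCATGAAATTTTAGGG") reports one ORF starting at position 2 in frame 2.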
6. Prokaryotic gene prediction
• The GeneMark.hmm algorithm (1998) was
designed to improve gene prediction accuracy
in finding short genes and gene starts.
• The idea was to integrate the Markov chain
models used in GeneMark into a
hidden Markov model framework.
• GeneMarkS has been in active use by the
genomics community for gene identification in
new prokaryotic genomic sequences.
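To make the Markov chain idea concrete, here is a minimal sketch that trains one fixed-order chain on coding sequences and one on non-coding sequences, then scores a new sequence by the log-odds of the two. It is not the GeneMark implementation: the names train_markov and log_odds, the order of 2, and the add-one smoothing are assumptions for this example, and a single homogeneous chain per class is used for brevity.

```python
import math
from collections import defaultdict

def train_markov(seqs, order=2, alphabet="ACGT"):
    """Estimate P(next base | previous `order` bases) with add-one smoothing."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1

    def prob(context, base):
        seen = counts.get(context, {})
        total = sum(seen.values())
        return (seen.get(base, 0) + 1) / (total + len(alphabet))

    return prob

def log_odds(seq, coding, noncoding, order=2):
    """Sum of log(P_coding / P_noncoding) over the sequence; > 0 favors coding."""
    score = 0.0
    for i in range(order, len(seq)):
        context, base = seq[i - order:i], seq[i]
        score += math.log(coding(context, base) / noncoding(context, base))
    return score
```

A positive score means the sequence looks more like the coding training set than the non-coding one under these toy models.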
7. Glimmer
• Glimmer is used to find genes in prokaryotic DNA. It is
effective at finding genes in bacteria, archaea, and viruses, typically
finding 98-99% of all relatively long protein-coding genes.
• Maintained by Steven Salzberg and Art Delcher at the University
of Maryland.
• First gene finder to use IMMs (interpolated Markov models).
• Predictions are based on variable context (oligomers of variable
lengths).
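The "variable context" idea can be sketched as follows: estimates from short and long oligomer contexts are blended, with longer contexts trusted more when they were seen often in training. This is a simplified illustration and not Glimmer's actual weighting scheme. The weight count / (count + pseudo), the pseudo value, and the function names are all assumptions made for this example.

```python
from collections import defaultdict

def train_counts(seqs, max_order=3):
    """Count next-base occurrences for every context length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(len(s)):
            for k in range(min(i, max_order) + 1):
                counts[s[i - k:i]][s[i]] += 1
    return counts

def interpolated_prob(counts, context, base, max_order=3, pseudo=4.0):
    """Blend estimates from the shortest to the longest available context."""
    prob = 0.25  # uniform prior over A, C, G, T
    for k in range(min(len(context), max_order) + 1):
        ctx = context[len(context) - k:]  # last k bases of the context
        seen = counts.get(ctx, {})
        total = sum(seen.values())
        if total == 0:
            break  # an unseen context implies longer ones are unseen too
        weight = total / (total + pseudo)  # hypothetical confidence weight
        prob = weight * seen.get(base, 0) / total + (1 - weight) * prob
    return prob
```

Because each step is a weighted average of two probability distributions, the blended values over A, C, G, and T still sum to 1.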
8. Three versions:
Glimmer 1 (1997)
Glimmer 2 (1999)
Glimmer 3 (2007)*
Ribosome binding site (RBS) signals can be used to find
the true start site position. GLIMMER results are passed
as input to the RBSfinder program to predict
ribosome binding sites.
GLIMMER 3.0 integrates the RBSfinder program into
the gene-prediction function itself.
9. Eukaryotic gene prediction
• In eukaryotes, a gene is a combination of
coding segments (exons) interrupted by
non-coding segments (introns).
• Genes in prokaryotes are continuous, so
computational gene prediction is easier in
prokaryotes than in eukaryotes.
• Exons are interrupted by introns, which are
typically flanked by GT and AG.
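A minimal sketch of how the canonical intron boundaries (GT at the 5' end, AG at the 3' end) can be used to list candidate introns. The function name candidate_introns and the min_len cutoff are hypothetical. Because GT and AG occur very often by chance, real predictors score these sites with statistical models instead of accepting every pair.

```python
def candidate_introns(seq, min_len=6):
    """Return (start, end) spans that begin with GT and end with AG.

    `end` is exclusive. Every donor/acceptor pairing is listed, so the
    result is a set of candidates, not a prediction.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(d, a + 2) for d in donors for a in acceptors
            if a + 2 - d >= min_len]
```

For the toy sequence "ATGGTAAGCAGTTT" the only candidate of length 6 or more is the span GTAAGCAG.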
10. Tools
• GeneMark
• GeneMark.hmm: pre-trained model parameters
for many species are ready and available
through GeneMark.hmm.
• GeneMark-ES has a special mode for
analyzing fungal genomes.
15. Introduction - ANN
• Bioinformatics refers to the application of computational and mathematical
techniques in biological analysis.
• Objective: to evaluate the multivariate bioinformatics approach called
artificial neural network (ANN) as a strategy for genetic diversity analysis.
• Information that flows through the network affects the structure of the ANN because
a neural network changes, or learns, in a sense, based on that input and output.
• ANNs have three interconnected layers. The first layer consists of input
neurons. Those neurons send data on to the second layer, which in turn sends its
output to the third layer of output neurons.
• Used in various fields: gene discovery, drug design, horticulture, agriculture,
forestry, medicine, etc.
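The three-layer flow described above can be sketched as a forward pass: input values go to a hidden layer, whose outputs go to the output layer. The sigmoid activation, the bias terms, and the function names below are assumptions for this example, since the slides do not specify them.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)
```

With two inputs, two hidden neurons, and one output neuron, hidden_w is a 2x2 list of lists and output_w is a 1x2 list of lists.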
17. The artificial neuron
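• Modeled on the electrochemical biological neuron.
• Has many inputs and one output.
• Has two modes:
• 1. Training mode
• 2. Using mode
• In training mode, the neuron is trained to fire (or not) for particular input
patterns.
• In using mode, when a taught input pattern is detected at the
input, its associated output becomes the current output.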
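The two modes can be illustrated with a single threshold neuron trained by the classic perceptron rule. This is a sketch under assumptions: the slides do not name a learning rule, and the function names, learning rate, and epoch count are chosen only for this example.

```python
def use(weights, threshold, inputs):
    """Using mode: fire (1) if the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def train(samples, n_inputs, rate=1.0, epochs=20):
    """Training mode: nudge weights and threshold after every wrong answer."""
    weights, threshold = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - use(weights, threshold, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            threshold -= rate * error
    return weights, threshold

# Example: teach the neuron the logical AND pattern.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
```

After train(AND_SAMPLES, 2), calling use with the returned weights and threshold fires only for the input (1, 1).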
18. Gene prediction
• A neural network is constructed with multiple
layers: input, output, and hidden layers.
• The input is the gene sequence with intron
and exon signals. The output is the probability
of an exon structure.
• Between the input and output, there may be
one or several hidden layers where the
machine learning takes place. The machine
learning process starts by feeding the model
with a sequence of known gene structure.
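Before a sequence can be fed to the input layer it has to be turned into numbers. One common way, assumed here because the slides do not say which encoding is used, is one-hot encoding: four input values per base, read through a sliding window. The names encode_window and windows are hypothetical.

```python
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

def encode_window(seq):
    """Flatten a DNA window into 4 input values per base.

    Bases outside A, C, G, T (for example N) become four zeros.
    """
    vector = []
    for base in seq.upper():
        vector.extend(ONE_HOT.get(base, [0, 0, 0, 0]))
    return vector

def windows(seq, size, step=1):
    """Yield (start, encoded_window) pairs while sliding along the sequence."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, encode_window(seq[start:start + size])
```

Each encoded window would be one input vector for the network, and the network's output for that window would be read as an exon probability.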
19. The gene structure information is separated
into several classes of features, such as
hexamer frequencies, splice sites, and GC
composition, during training.
• The weight functions in the hidden layers are
adjusted during this process to recognize the
nucleotide patterns and their relationship with
known structures.
• After training, the algorithm predicts the structure of
unknown sequences.
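Two of the feature classes named above, GC composition and hexamer frequencies, are simple to compute. The sketch below shows one plausible way to do so. The function names are hypothetical, and how a particular gene predictor normalizes or selects these features is not stated in the slides.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def hexamer_frequencies(seq):
    """Relative frequency of every overlapping 6-mer in the sequence."""
    seq = seq.upper()
    total = len(seq) - 5  # number of overlapping hexamers
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + 6] for i in range(total))
    return {hexamer: n / total for hexamer, n in counts.items()}
```

Values like these, computed on sequences of known gene structure, would form part of the training input.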
20. Why ANN
• ANNs can capture more complex features of the data, which is not
always possible with traditional statistical techniques.
• The greatest advantage of ANNs over conventional methods is that
they do not require detailed information about the physical processes of
the system to be modelled.
21. Distinguishing the biological
neuron from the artificial neuron
Comparative schemes of biological and artificial neural systems. X =
input variable; W = weight of an input; θ = internal threshold value.
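Using the symbols from the caption, the output of a conventional threshold neuron can be written as below. The output symbol y, the number of inputs n, and the step activation f are assumptions, since the figure itself is not reproduced here.

```latex
y = f\!\left(\sum_{i=1}^{n} W_i X_i - \theta\right),
\qquad
f(z) =
\begin{cases}
1 & \text{if } z \ge 0 \\
0 & \text{if } z < 0
\end{cases}
```

In words: the neuron fires when the weighted sum of its inputs reaches the internal threshold θ.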
23. Types of artificial neural networks
There are many types of artificial neural
networks:
1. Feed-forward neural network
2. Radial basis function (RBF) network
3. Kohonen self-organizing network
4. Learning vector quantization
5. Recurrent neural network
6. Modular neural networks
7. Physical neural network
8. Other types of networks
(e.g. holographic associative memory)
25. Conclusion
• Neural networks are regularly used to model parts of
living organisms and to investigate the internal
mechanisms of the brain
• It was observed that the neural network was not
influenced by the scale of the input data: the classification
obtained from the original data was the same as that
obtained from standardized data.
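For readers unfamiliar with the term, "standardized data" usually means z-scores: each variable is shifted to mean 0 and scaled to standard deviation 1. The sketch below assumes that meaning and uses the population standard deviation. The slides do not state which standardization was applied.

```python
import math

def standardize(column):
    """Z-score a list of numbers: (value - mean) / standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    if sd == 0:
        return [0.0] * len(column)  # a constant column carries no information
    return [(v - mean) / sd for v in column]
```

Standardizing changes the scale of each variable but not the ordering of the values within it.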