By
Harpreet Singh Kalsi
   Hans Raj College
BIOINFORMATICS
  Bioinformatics is an emerging field of science which uses computer
  technology for storage, retrieval, manipulation and distribution of
  information related to biological data specifically for DNA, RNA and
  proteins.

DATABASE
  Databases are repositories in which biological data is stored in
  computer-readable form. They are classified on various bases, such
  as data type, data source and organism.

TOOLS
  Tools are software developed to perform various tasks over the
  stored data such as searches, analysis, submission, annotation, etc.

RESIDUE
  A residue is a single building block of a macromolecule stored in a
  database: a nucleotide for DNA and RNA, an amino acid for proteins.

On basis of Data Type
  Genome Databases
  Sequence Databases
  Structure Databases
  Microarray Databases
  Chemical Databases
  Metabolic Databases
  Enzyme Databases
  Disease Databases
  Literature Databases
  Taxonomy Database

On basis of Data Source
  Primary Databases
  Secondary Databases

Special Categories
  Integrated Database
  Composite Database

IMPORTANT DATABASES
   NCBI (integrated database)
   EMBL (nucleotide database)
   DDBJ (nucleotide database)
   GenBank (nucleotide database)
   SWISS-PROT (protein database)
   OMIM (disease database)
   PDB (structure database)
   KEGG (metabolic database)
   PubMed (literature database)
   Enzymes (enzyme database)
   PANDIT (protein domain alignments with inferred phylogenetic trees)
   ArrayExpress (microarray database)

IMPORTANT TOOLS
   BLAST (search and homology tool)
   FASTA (search and homology tool)
   BankIt (submission tool)
   Sequin (submission tool)
   ORF Finder (analysis tool)
   TXSearch (retrieval tool for taxonomy database)
   SAKURA (submission tool in DDBJ)
   ClustalW (multiple sequence alignment)
   MSDFold (protein secondary structure comparison tool)

   BLAST stands for Basic Local Alignment Search Tool.

   BLAST is a program that uses scoring matrices (such as PAM or
    BLOSUM) to perform sequence-similarity searches against a variety
    of sequence databases and to report high-scoring local alignments
    between related sequences. The original algorithm reported
    ungapped segments; most current BLAST programs also allow gaps.

   Complex: requires multiple steps and many parameters.

   The BLAST algorithm is fast, accurate, and web-accessible.

   It is relatively fast compared with other sequence-similarity
    search tools.

   It offers several programs for different types of analysis.

   Program   Input     Database   Searches per query
   blastn    DNA       DNA        1
   blastp    protein   protein    1
   blastx    DNA       protein    6
   tblastn   protein   DNA        6
   tblastx   DNA       DNA        36

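The program table can also be read as a small lookup. The sketch below is only an illustration: the names PROGRAMS, choose_program, searches_per_query and the translated flag are hypothetical and not part of any BLAST distribution, and the counts are simply those listed in the table.

```python
# Illustrative lookup built from the program table; all names here are
# hypothetical and not part of any BLAST distribution.
PROGRAMS = {
    # program: (query type, database type, searches per query)
    "blastn": ("DNA", "DNA", 1),
    "blastp": ("protein", "protein", 1),
    "blastx": ("DNA", "protein", 6),
    "tblastn": ("protein", "DNA", 6),
    "tblastx": ("DNA", "DNA", 36),
}


def choose_program(query_type, database_type, translated=False):
    """Pick a program name for the given query and database types.

    `translated` only matters for DNA against DNA: False gives blastn,
    True gives tblastx (both sides compared as translated protein).
    """
    if query_type == "DNA" and database_type == "DNA":
        return "tblastx" if translated else "blastn"
    for name, (q_type, d_type, _) in PROGRAMS.items():
        if (q_type, d_type) == (query_type, database_type):
            return name
    raise ValueError(f"no program for {query_type} vs {database_type}")


def searches_per_query(program):
    """Number of searches per query, as listed in the table."""
    return PROGRAMS[program][2]
```

For example, choose_program("DNA", "protein") returns "blastx", the program that translates a DNA query before searching a protein database.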
blastn    compares a DNA query sequence against a DNA
          database, allowing for gaps
blastp    compares a protein query sequence against a
          protein database, allowing for gaps
blastx    compares a DNA query sequence translated into
          six reading frames against a protein database,
          allowing for gaps
tblastn   compares a protein query sequence against a
          DNA database translated into six reading frames,
          allowing for gaps
tblastx   compares a DNA query sequence translated into
          six reading frames against a DNA database
          translated into six reading frames. tblastx doesn’t
          allow for gaps.
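
The "six reading frames" mentioned for blastx, tblastn and tblastx are the three forward frames plus the three frames of the reverse complement. A minimal sketch of that translation, assuming the standard genetic code and a DNA string containing only A, C, G and T (the function names are illustrative, not BLAST's own):

```python
# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate codon by codon. A trailing partial codon is dropped,
    stop codons become '*', and unrecognised codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return the six translations keyed '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames
```

With six frames on the query side and six on the database side, tblastx has 6 x 6 = 36 frame combinations to compare, which is where the 36 in the program table comes from.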
   MEGABLAST         - for comparison of large sets of long DNA
                      sequences

   RPS-BLAST         - Conserved Domain Detection

   BLAST 2 Sequences - for performing pair-wise alignments for
                     2 chosen sequences

   Genomic BLAST     - for alignments against select human,
                      microbial or malarial genomes

   PSI-BLAST         - construct a multiple alignment from
                      matches

   PHI-BLAST         - specify a pattern that hits must match
   Make specific primers with Primer-BLAST
   Search trace archives
   Find conserved domains in your sequence (cds)
   Find sequences with similar conserved domain architecture
    (cdart)
   Search sequences that have gene expression profiles (GEO)
   Search immunoglobulins (IgBLAST)
   Search using SNP flanks
   Screen sequence for vector contamination (vecscreen)
   Align two (or more) sequences using BLAST (bl2seq)
   Search protein or nucleotide targets in PubChem BioAssay
   Search SRA transcript and genomic libraries
   Constraint Based Protein Multiple Alignment Tool
   Needleman-Wunsch Global Sequence Alignment Tool
   Search RefSeqGene
   Source: http://blast.ncbi.nlm.nih.gov/Blast.cgi
The full details of how BLAST works are fairly involved; in brief, it
proceeds in two steps:

1. BLAST first breaks the query into short "words" (substrings) of a
   given length (W) and searches the database ("target sequences")
   for words that score at least T when compared with a query word,
   using a substitution matrix.

2. For every pair of sequences (query and target) that share one or
   more such word hits, BLAST extends the alignment in both
   directions to find alignments that score above a certain
   threshold (S). These alignments are called high-scoring segment
   pairs (HSPs); the maximal-scoring HSPs are called maximal segment
   pairs (MSPs).
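
The two steps can be sketched in a few lines. This is a deliberately simplified toy, not the real BLAST implementation: it seeds on exact word matches only (rather than all words scoring at least T under a substitution matrix), scores with an assumed +1 match / -1 mismatch scheme, extends without gaps, and stops an extension with a hypothetical x_drop cut-off. All function and parameter names are illustrative.

```python
def find_seeds(query, target, w):
    """Step 1 (simplified): positions (query_pos, target_pos) where a
    word of length w occurs identically in both sequences."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    return [
        (i, j)
        for j in range(len(target) - w + 1)
        for i in index.get(target[j:j + w], [])
    ]


def extend(query, target, qi, tj, step, x_drop, match, mismatch):
    """Walk from (qi, tj) in direction step (+1 or -1) without gaps.
    Returns (best score gain, number of residues kept)."""
    gain = best_gain = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= tj < len(target):
        gain += match if query[qi] == target[tj] else mismatch
        length += 1
        if gain > best_gain:
            best_gain, best_len = gain, length
        elif best_gain - gain > x_drop:
            break  # score has fallen too far below the best seen so far
        qi += step
        tj += step
    return best_gain, best_len


def find_hsps(query, target, w, s, x_drop=2, match=1, mismatch=-1):
    """Step 2 (simplified): extend every seed in both directions and
    keep alignments scoring at least s.
    Each HSP is (score, query_start, query_end, target_start)."""
    hsps = set()
    for qi, tj in find_seeds(query, target, w):
        right_gain, right_len = extend(
            query, target, qi + w, tj + w, 1, x_drop, match, mismatch)
        left_gain, left_len = extend(
            query, target, qi - 1, tj - 1, -1, x_drop, match, mismatch)
        score = w * match + left_gain + right_gain
        if score >= s:
            hsps.add((score, qi - left_len, qi + w + right_len, tj - left_len))
    return sorted(hsps, reverse=True)
```

Overlapping seeds often extend to the same alignment, which is why the results are collected in a set before being sorted by score.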
[Figure: the BLAST search steps. The query sequence is split into
"words" (subsequences of the query sequence); query words are compared
to the database (target sequences) and exact matches identified; for
each word match, the alignment is extended in both directions to find
alignments that score greater than some threshold (maximal segment
pairs, or MSPs). (Schneider and La Rota 2000)]
BLAST can help answer many questions that commonly arise in the
research laboratory. Some of the most common are:
   Which bacterial species have a protein that is related to a
    protein whose amino-acid sequence I know?
   Where does the DNA I've sequenced come from?
   What other genes encode proteins that exhibit structures
    similar to the one I've just determined?
   What does the protein structure look like?
   What is the function of the gene or the protein that I've
    sequenced? (If it's not known, then you have some work to do.)
   What are the probable functions of the sequence I have?

To answer such questions, we use BLAST to search a database and then
analyse the results it produces. The following example illustrates
this.
    We have the following protein sequence from our experiments
    with Mycobacterium tuberculosis.
    Sequence:




    To see whether this protein is similar to proteins in other
    organisms, and so understand its importance, we run a BLAST
    search at the following URL:
      http://blast.ncbi.nlm.nih.gov/

   After running BLAST against one chosen database or against all
    of them, we analyse the result.
   A chosen entry is shown below.

    This entry shows that the sequence we submitted to BLAST has a hit
    in the database searched (here Swiss-Prot) with 88% identity to
    the entry "Full=Single-stranded DNA-binding protein", accession
    number P46390.2.

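The identity figure in a hit is the number of identical aligned positions divided by the alignment length, gaps included. A minimal sketch of that calculation, assuming two already-aligned strings of equal length with "-" marking gaps (the function name and the example strings in the note below are made up for illustration; they are not the sequences from this search):

```python
def percent_identity(aligned_query, aligned_subject):
    """Percentage of alignment columns in which both sequences carry
    the same residue. Gap columns count towards the alignment length
    but never as identities."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for a, b in zip(aligned_query, aligned_subject)
        if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_query)
```

For instance, two made-up 8-residue strings that differ at one position give 7/8, or 87.5% identity.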
The entry also shows a score that describes the quality of the match
between the database entry and the query sequence from our experiment.

With the accession number obtained from the BLAST search, we can
easily look up further information about the entry, including:
                 • The organism from which it came
                 • Function of the protein
                 • Region of DNA encoding the gene
                 • Length of the sequence
                 • Taxonomy of the organism
                 • FASTA sequence of the protein
                 • Links to the 3D structure, if it has been determined
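
Once a record has been downloaded in FASTA format, reading it takes only a few lines. The sketch below is a minimal parser under simple assumptions (header lines start with ">", sequence lines follow, blank lines are ignored); parse_fasta and split_accession are illustrative names, and treating the ".2" in P46390.2 as a version suffix is an assumption about the identifier format.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_accession(versioned):
    """Split an identifier such as 'P46390.2' into ('P46390', 2).
    The version is None when there is no '.' suffix."""
    accession, dot, version = versioned.partition(".")
    return accession, (int(version) if dot else None)
```

The length of each sequence is then simply len(sequence) for each parsed record.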


Similarly, we can see whether our sequence is similar, and therefore
possibly homologous, to any sequence in the database we searched. As
mentioned, we can search any database of interest to check the
function of the sequence or of similar structures.
   BLAST is arguably the most important program in bioinformatics
    (maybe in all of biology).
   BLAST is based on sound statistical principles (key to its
    speed and sensitivity).
   A basic understanding of its principles is key to using and
    interpreting BLAST output.
   BLAST can play an essential role in helping us propose the
    following:
                Structure of a protein
                Function of a sequence
                Relation with an organism
   Use blastn or MEGABLAST for DNA.
   Use PSI-BLAST for protein searches.