Analysis of Gene Expression Data

Jhoirene B. Clemente
Algorithms and Complexity Lab
University of the Philippines Diliman
Overview

● Definitions
● Clustering of Gene Expression Data
● Visualizations of Gene Expression Data
Definitions
Gene
Basic unit of heredity in a living organism.
It is normally a stretch of DNA that codes
for a type of protein or for an RNA chain
that has a function in the organism.

Gene Expression Data
Expression levels of genes in an individual,
measured with a microarray.
Gene Expression Data

    Gene    Gene Expression
    a
    b
    c
    ...
    n
Gene Expression Data: 1 sample, n genes

    Gene    Gene Expression
    a
    b
    c
    ...
    n
(n x m) Data Matrix: n genes (rows), m samples (columns)

    Gene    Sample 1    Sample 2    ...    Sample m
    a
    b
    c
    ...
    n
Clustering
Clustering is the unsupervised classification of
patterns, including observations, data sets and
feature vectors, into groups called clusters,
such that objects in the same cluster are similar to
each other while objects in different clusters are
as dissimilar as possible.
Cluster Analysis
1. Preprocessing
    ● Filtering
    ● Normalization
2. Clustering
3. Analysis
Clustering
Partitional
●   K-means Algorithm
●   X-means Algorithm



Hierarchical
Clustering
Given the (n x m) data matrix, we can

●   Cluster the set of genes
●   Cluster the set of samples
●   Cluster the set of genes and samples
    simultaneously.
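
A minimal sketch of the first two options, assuming the data matrix is stored as a list of rows with one row per gene. The values and the helper name `transpose` are made up for illustration. Clustering genes treats each row as a data point; clustering samples treats each column as a data point, which amounts to transposing the matrix first. Clustering genes and samples simultaneously is not shown here.

```python
# Hypothetical 3 x 4 data matrix: 3 genes (rows) measured in 4 samples (columns).
data = [
    [0.1, 0.4, -0.2, 0.9],   # gene a
    [0.0, 0.5, -0.1, 0.8],   # gene b
    [-0.7, 0.2, 0.3, -0.4],  # gene c
]


def transpose(matrix):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


gene_points = data               # n points, each of dimension m
sample_points = transpose(data)  # m points, each of dimension n
```

Either list of points can then be passed to the same clustering routine.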
Data Set
The data set is time-series gene expression data from
a synchronized population of yeast.
Preprocessing
Filtering
 ● Removed genes not involved in cell cycle regulation
 ● Removed genes belonging to more than one group

Normalization
 ● All gene expression values range from -1.0 to 1.0.
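
The slides do not say how the values were brought into the range -1.0 to 1.0. One simple possibility, shown here purely as an assumption, is to divide each gene's row by its largest absolute value. The function name `scale_to_unit_range` and the sample numbers are hypothetical.

```python
def scale_to_unit_range(row):
    """Divide a gene's expression values by the largest absolute value in the row."""
    peak = max(abs(value) for value in row)
    if peak == 0:
        return list(row)  # an all-zero row is already within [-1, 1]
    return [value / peak for value in row]


# Made-up raw expression values for two genes across three samples.
raw = [
    [2.0, -4.0, 1.0],
    [0.5, 0.25, -0.5],
]
normalized = [scale_to_unit_range(row) for row in raw]
```

Other schemes (for example, scaling by a global maximum) would also satisfy the stated range; this sketch only illustrates the idea.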
Data Set
Data matrix (384 genes and 17 samples) with 5
classifications.
Groupings are based on cell cycle phase activation.
Group 1: Resting Phase
Group 2: First Growth Phase
Group 3: Synthesis Phase
Group 4: Second Growth Phase
Group 5: Cell Division
Clustering of genes
K-means Algorithm

Given n data points in R^d
1. Choose k initial centers for the k clusters
2. Assign each data point to the nearest cluster center
   (Euclidean distance, Manhattan distance, etc.)
3. Recompute each of the k centers as the mean of its assigned points
4. Repeat steps 2 and 3 until convergence
k = 5, since we want to approximate the 5 classifications of the data set.
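
The four steps can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used for the results in these slides: the function names `euclidean` and `kmeans` are hypothetical, Euclidean distance is assumed, convergence is taken to mean that the assignments stop changing, and the toy example uses k = 2 on four made-up points rather than k = 5 on the yeast data.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, centers, max_iter=100):
    """Run k-means from the given initial centers; return (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Step 4: assignments no longer change, so stop.
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy example: step 1 uses the first two points as initial centers.
points = [[0, 0], [0, 1], [5, 5], [6, 5]]
labels, centers = kmeans(points, centers=points[:2])
```

In this toy run the two left-hand points end up in one cluster and the two right-hand points in the other.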
Clustering of genes
Initialization

Options for choosing the k initial centers:
1. Choose the k centers that maximize the
   distance between the clusters
2. Sort the distances between all the data points
   and then choose the k initial points at constant
   intervals from the sorted list
3. Use the first k points in the data set as the k
   initial centers
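
Two of these options can be sketched briefly. The function `first_k` follows option 3 directly. The function `farthest_first` is only one possible reading of option 1: a greedy heuristic that repeatedly picks the point farthest from the centers chosen so far, starting arbitrarily from the first point. Both names are hypothetical. Option 2 is not sketched because the slide does not specify which distances are sorted.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def first_k(points, k):
    """Option 3: use the first k points of the data set as initial centers."""
    return [list(p) for p in points[:k]]


def farthest_first(points, k):
    """Greedy heuristic for option 1 (an assumption, not the slides' method)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Pick the point whose distance to its nearest chosen center is largest.
        best = max(points, key=lambda p: min(euclidean(p, c) for c in centers))
        centers.append(list(best))
    return centers


# Made-up points for illustration.
points = [[0, 0], [0, 1], [5, 5], [6, 5], [10, 0]]
spread_centers = farthest_first(points, 3)
simple_centers = first_k(points, 3)
```

Because k-means only converges to a local optimum, different initializations can lead to different final clusters.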
Clustering of genes
Using k-means clustering with k = 5
Clustering of genes
●   Clustering may suggest possible roles for genes
    with unknown functions.
●   Clustering the samples or experiments may shed
    light on new subtypes of diseases.
●   Clustering may help identify which type of treatment
    is suited for a specific type of cancer.
●   Clustering can support building genetic networks.
Visualization
●   Vector Fusion
●   Non-metric Multidimensional Scaling (nMDS)
●   Principal Components Analysis (PCA)
Vector Fusion
A visualization technique that uses the single point
broken line parallel algorithm.
nMDS visualization
Input: dissimilarity matrix $\Delta = [\delta_{ij}]$ (actual distances)
 ● In nMDS, only the rank order of the entries is
   assumed to contain the significant information.
 ● Thus, the purpose of the non-metric MDS
   algorithm is to find a configuration of points
   whose distances reflect as closely as possible
   the rank order of the data.
 ● The transformation uses a non-parametric
   monotone function f (monotone regression):

   $\hat{d}_{ij} = f(\delta_{ij})$ (pseudo-distance)
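
The monotone regression step can be illustrated with the pool-adjacent-violators procedure, a standard way to compute a least-squares non-decreasing fit. The sketch below assumes the pairs (i, j) have been flattened into two parallel lists: the input dissimilarities and the distances between the corresponding points in the current configuration. The function names and the numbers are hypothetical, and this shows only the pseudo-distance step, not a full nMDS algorithm.

```python
def monotone_regression(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def pseudo_distances(dissimilarities, distances):
    """Fit values that respect the rank order of the dissimilarities."""
    order = sorted(range(len(dissimilarities)), key=lambda i: dissimilarities[i])
    fitted = monotone_regression([distances[i] for i in order])
    result = [0.0] * len(distances)
    for rank, i in enumerate(order):
        result[i] = fitted[rank]
    return result


# Made-up example: the second and third pairs violate the rank order.
dissimilarities = [1.0, 2.0, 3.0, 4.0]
distances = [1.0, 3.0, 2.0, 4.0]
d_hat = pseudo_distances(dissimilarities, distances)
```

The two out-of-order distances are replaced by their average, so the fitted values are non-decreasing in the rank order of the dissimilarities.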
PCA
[Figures: vector fusion visualization; nMDS visualizations]
References
Clemente J., Salido J.A. (2010). "Non-Metric Multidimensional Scaling and Vector
Fusion Visualization of Cell Cycle Independent Gene
Expressions for Gene Function Analysis". Published in the conference
proceedings of the National Conference on Information
Technology for Education (NCITE) 2010 and in the Philippine IT
Journal, February 2011 issue.

Clemente J. (2010). "Cluster Analysis for Identifying Genes Highly
Correlated with a Phenotype". Undergraduate thesis,
Department of Computer Science,
University of the Philippines Diliman.
Thank you for listening
