This document outlines the schedule and topics for a class on SNPs and gene expression. The class will have 6 sessions, with groups of 2 students presenting on selected papers in each session. Topics to be covered include an introduction to terminology like forward and reverse genetics; how often SNPs occur in the general population; databases of SNPs like dbSNP and haplotype projects like HapMap; gene expression analysis using microarrays; and experimental design, data analysis, and visualization techniques for microarray data. Papers for each group to present on are also listed. The next class will involve discussing one particular paper on SNPs.
2. Schedule
• 6 classes
• 2 for SNP and 4 for gene expression
• 2 students per group; each group will present in each class.
• Subjects and groups will be chosen randomly.
• Today's class is introductory; we will discuss fundamental aspects.
4. How often do SNPs occur?
• About one in every 300 bases, or roughly 10 million (10 M) across the human genome.
Not all single-nucleotide changes are SNPs, though. To be classified as a SNP, two or more versions of a sequence must each be present in at least one percent of the general population.
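As a rough worked example of the two numbers above: assuming a human genome of about 3 billion bases (a figure not given on the slide), one SNP per 300 bases gives about 10 million SNPs. The sketch below also applies the one-percent rule to hypothetical allele counts; the function name `is_snp` and all counts are illustrative only.

```python
def is_snp(allele_counts, threshold=0.01):
    """Return True if at least two alleles each reach the frequency threshold."""
    total = sum(allele_counts.values())
    common = [a for a, n in allele_counts.items() if n / total >= threshold]
    return len(common) >= 2

# Assumed genome size (about 3 billion bases); not stated on the slide.
GENOME_BASES = 3_000_000_000
expected_snps = GENOME_BASES // 300  # one SNP per 300 bases

# Hypothetical allele counts at two positions in 1,000 sampled chromosomes.
common_variant = {"A": 900, "G": 100}  # G at 10%: counts as a SNP
rare_variant = {"C": 998, "T": 2}      # T at 0.2%: too rare to be a SNP
```

Under these assumptions, `is_snp(common_variant)` is true and `is_snp(rare_variant)` is false.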
5.
6. Each combination is a haplotype.
Not all 8 possible haplotypes necessarily exist.
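The count of 8 presumably comes from three biallelic SNPs (2 x 2 x 2 = 8); the figure that showed them is not preserved here, so the alleles below are hypothetical. A minimal sketch that enumerates every possible combination and checks which ones are absent from an invented sample:

```python
from itertools import product

# Hypothetical alleles at three biallelic SNP positions.
snp_alleles = [("A", "G"), ("C", "T"), ("G", "T")]

# Every possible combination of one allele per position: 2 * 2 * 2 = 8.
possible = ["".join(combo) for combo in product(*snp_alleles)]

# Invented haplotypes observed in a sample; only some combinations occur.
observed = ["ACG", "ACG", "GTT", "ACT", "GTT"]
missing = sorted(set(possible) - set(observed))
```

Here `possible` holds all 8 combinations, while `missing` lists the ones that never appear in the sample.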
7.
8. dbSNP and HapMap project
• dbSNP: 2.5 million variations
• http://www.ncbi.nlm.nih.gov/SNP/
• Haplotypes are blocks – HapMap focuses on those blocks
• http://hapmap.ncbi.nlm.nih.gov/thehapmap.html.en
– Started in 2002
– Samples from Nigeria, Japan, China, USA
10. Cy3: 570 nm
Cy5: 670 nm
Two-channel and single-channel microarrays
• Two-channel: two conditions per array
– Spike-in control probes are included and used for normalization
– Agilent Dual-Mode; Eppendorf DualChip
• Single-channel: one condition at a time
– The absolute abundance of a transcript is not known, only its relative abundance
– Affymetrix GeneChip; Illumina BeadChip
11. Microarray and Bioinformatics
• Experimental design.
• Standardization.
• Statistical data analysis.
• Data storage and visualization.
12. Contd…
• Experimental design:
– Biological replicates.
– Technical replicates.
– Randomization
• Standardization:
– Difficult – experiments can't be easily replicated.
– "Minimum Information About a Microarray Experiment" (MIAME); 2001, Nature Genetics
• http://fged.org/projects/miame/
13. Contd…
• Data Analysis:
– Image Analysis – gridding of the spots
– Data processing:
• Background correction
• Visualization (MA plot)
– M is the log ratio of the two channel intensities and A is the average log intensity
– Most genes should not change -> M (the Y axis) is close to 0
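A minimal sketch of how the MA-plot values can be computed, assuming the usual definitions M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2 for the red (Cy5) and green (Cy3) intensities R and G of a spot. The function name and the intensities are invented for illustration.

```python
import math

def ma_values(red, green):
    """Return (M, A) for one spot from its two channel intensities."""
    log_r, log_g = math.log2(red), math.log2(green)
    m = log_r - log_g          # log ratio: 0 means no change
    a = 0.5 * (log_r + log_g)  # average log intensity
    return m, a

# Hypothetical (red, green) intensities for three spots.
spots = [(1024, 1024), (2048, 512), (256, 1024)]
ma = [ma_values(r, g) for r, g in spots]
```

A spot with equal intensities in both channels gets M = 0, which is where most genes are expected to sit.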
14. Contd…
• Data Processing:
– Normalization (Remove non-biological variation)
• Simplest way: assume all arrays have the same median gene expression
• Subtract the median from each array
• Quantile normalization:
– Sort the values in each array
– Take the average across arrays at each rank
– Substitute each probe intensity with the average for its rank
– Restore the original order
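The two normalization approaches above can be sketched as follows. This is a simplified illustration, not a production implementation: it assumes log-scale intensities, arrays of equal length, and no special handling of tied values. The function names and data are hypothetical.

```python
from statistics import median

def median_center(arrays):
    """Subtract each array's own median from its values."""
    return [[v - median(arr) for v in arr] for arr in arrays]

def quantile_normalize(arrays):
    """Quantile-normalize arrays of equal length (ties are not averaged)."""
    n = len(arrays[0])
    # Rank order of the probes within each array.
    orders = [sorted(range(n), key=lambda i: arr[i]) for arr in arrays]
    # Mean across arrays of the values sharing the same rank.
    rank_means = [
        sum(arr[order[r]] for arr, order in zip(arrays, orders)) / len(arrays)
        for r in range(n)
    ]
    # Put each rank mean back at the probe's original position.
    result = []
    for order in orders:
        out = [0.0] * n
        for r, i in enumerate(order):
            out[i] = rank_means[r]
        result.append(out)
    return result

# Hypothetical log intensities: two arrays, four probes each.
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 6.0, 2.0]]
centered = median_center(arrays)
normalized = quantile_normalize(arrays)
```

After quantile normalization every array contains the same set of values, only arranged according to each array's original probe ranking.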