Bioinformatics for discovery:
Introduction to GWAS and EWAS
BMI 701: Introduction to Biomedical Informatics

12/1/2015
chirag@hms.harvard.edu

@chiragjp

www.chiragjpgroup.org
Chirag J Patel
P = G + E
Type 2 Diabetes

Cancer

Alzheimer's

Gene expression
Phenotype Genome
Variants
Environment
Infectious agents

Nutrients

Pollutants

Drugs
Complex traits are a function of genes and
environment...
We are great at G investigation!
over 2000 

Genome-wide Association Studies (GWAS)

https://www.ebi.ac.uk/gwas/
G
>2,000 traits/diseases

>15,000 SNPs

>16,000 SNP-trait associations
https://www.ebi.ac.uk/gwas/
Dissecting G in P:
What is a Genome-wide Association Study?
Hypothesis-free "search engine" for genetic variants

associated with a complex trait or disease 

in unrelated populations
[Figure: the same 2x2 table of allele (columns) by disease status (rows), repeated for every SNP genome-wide, from SNP(A)/SNP(a) through SNP(Z)/SNP(z)]

                SNP(A)   SNP(a)
diseased
non-diseased
The road to GWAS...
A new paradigm of GWAS for discovery of G in P:
Human Genome Project to GWAS
Sequencing of the genome
2001
HapMap project:
http://hapmap.ncbi.nlm.nih.gov/
Characterize common variation
2001-current day
High-throughput variant
assay
< $99 for ~1M variants
Measurement tools
~2003 (ongoing)
ARTICLES
Genome-wide association study of 14,000
cases of seven common diseases and
3,000 shared controls
The Wellcome Trust Case Control Consortium*
There is increasing evidence that genome-wide association (GWA) studies represent a powerful approach to the
identification of genes involved in common human diseases. We describe a joint GWA study (using the Affymetrix GeneChip
500K Mapping Array Set) undertaken in the British population, which has examined ~2,000 individuals for each of 7 major
diseases and a shared set of ~3,000 controls. Case-control comparisons identified 24 independent association signals at
P < 5 × 10^-7: 1 in bipolar disorder, 1 in coronary artery disease, 9 in Crohn's disease, 3 in rheumatoid arthritis, 7 in type 1
diabetes and 3 in type 2 diabetes. On the basis of prior findings and replication studies thus-far completed, almost all of these
signals reflect genuine susceptibility effects. We observed association at many previously identified loci, and found
compelling evidence that some loci confer risk for more than one of the diseases studied. Across all diseases, we identified a [...]
Vol 447|7 June 2007|doi:10.1038/nature05911
Nature, 2007
Comprehensive, high-throughput analyses
GWAS
Number of raw publications with subject of "GWAS"
[Figure: number of publications per year, 1994 to 2013 (y-axis 0 to 3,000); PubMed MeSH terms: human + GWAS. Annotated milestones: Risch + Merikangas (linkage vs. association); human genome sequenced; GWAS of age-related macular degeneration; WTCCC; mega-meta-GWAS]
GWAS is relevant today (even with NGS around the corner)
Why execute GWAS?
Geneticists have made substantial progress in identifying the genetic basis of many human
diseases, at least those with conspicuous determinants. These successes include Huntington's
disease, Alzheimer's disease, and some forms of breast cancer. However, the detection of
genetic factors for complex diseases, such as schizophrenia, bipolar disorder, and diabetes,
has been far more complicated. There have been numerous reports of genes or loci that
might underlie these disorders, but few of these findings have been replicated. The modest
nature of the gene effects for these disorders likely explains the contradictory and
inconclusive claims about their identification. Despite the small effects of such genes, the
magnitude of their attributable risk (the proportion of people affected due to them) may be
large because they are quite frequent in the population, making them of public health
significance.

Has the genetic study of complex disorders reached its limits? The persistent lack of
replicability of these reports of linkage between various loci and complex diseases
might imply that it has. We argue below that [...]

[...] linkage analysis we have chosen for this argument is a popular current paradigm in which
pairs of siblings, both with the disease, are examined for sharing of alleles at multiple
sites in the genome defined by genetic markers. The more often the affected siblings
share the same allele at a particular site, the more likely the site is close to the disease
gene. Using the formulas in (1), we calculate the expected proportion Y of alleles shared by
a pair of affected siblings for the best possible case, that is, a closely linked marker locus
(recombination fraction θ = 0) that is fully informative (heterozygosity = 1) (2), as

$$Y = \frac{1 + w}{2 + w}, \qquad \text{where } w = \frac{pq(\gamma - 1)^2}{(p\gamma + q)^2}$$

If there is no linkage of a marker at a particular site to the disease, the siblings
would be expected to share alleles 50% of the time; that is, Y would equal 0.5. Values of Y
for various values of p and γ are given in the third column of the table. For an allele of
moderate frequency (p is 0.1 to 0.5) that con[...]
[The rest of this page is cropped in the slide image. The visible fragments contrast affected-sib-pair linkage analysis, which needs a very large number of families (more than ~2500) when the genotypic risk ratio is about 2 or less, with association testing by the transmission/disequilibrium test, in which transmission of alleles from heterozygous parents to affected offspring is examined.]
The Future of Genetic Studies of
Complex Human Diseases
Neil Risch and Kathleen Merikangas
Science, 1996
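The allele-sharing formula in the Risch and Merikangas excerpt can be tried out numerically. The sketch below assumes the formula reads Y = (1 + w) / (2 + w) with w = pq(γ − 1)² / (pγ + q)², where p is the risk-allele frequency, q = 1 − p, and γ is the genotypic risk ratio; the function name and the example values are illustrative and not taken from the paper's table.

```python
def expected_allele_sharing(p: float, gamma: float) -> float:
    """Expected proportion Y of alleles shared by an affected sib pair.

    p     : frequency of the risk allele (q = 1 - p)
    gamma : genotypic risk ratio
    Assumes a fully informative marker with recombination fraction 0.
    """
    q = 1.0 - p
    w = p * q * (gamma - 1.0) ** 2 / (p * gamma + q) ** 2
    return (1.0 + w) / (2.0 + w)


if __name__ == "__main__":
    # Illustrative grid of risk ratios and allele frequencies (assumed values).
    for gamma in (1.0, 1.5, 2.0, 4.0):
        for p in (0.01, 0.1, 0.5):
            y = expected_allele_sharing(p, gamma)
            print(f"gamma={gamma:<4} p={p:<5} Y={y:.3f}")
```

With γ = 1 (no effect) the function returns exactly 0.5, the no-linkage expectation. For modest γ the value stays close to 0.5, which is the excerpt's point about the low power of linkage analysis for genes of small effect.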
A new paradigm is needed for discovery!
How does a GWAS work?
Single nucleotide polymorphisms (SNPs):
How many SNPs are in the human genome?
>3,000,000,000 bases in human genome
SNPs appear roughly every ~1,000 bases
~3,000,000 SNPs
40-60% have minor allele frequency <5%

GWAS focus on frequency >5%
HapMap Consortium, 2010
Can't measure everything:
Tag SNPs and Linkage Disequilibrium (LD)
LD = co-occurrence of SNPs in a contiguous region
Bush and Moore, 2012
The phenomenon of LD makes GWAS possible:
How and why?: Indirect association
[Figure, cropped in the slide. Visible body text: "... additional studies to map the precise location of the influential SNP."]
Figure 3. Indirect Association. Genotyped SNPs often lie in a region of high linkage [...] will be statistically associated with disease as a surrogate for the disease SNP through [...]
doi:10.1371/journal.pcbi.1002822.g003
Bush and Moore, 2012
LD blocks
Can't measure everything:
Tag SNPs and Linkage Disequilibrium
Tag SNPs are common proxies for other SNPs

500K - 1M per chip
[Figure, cropped from the paper: four panels (a to d) plotting −log10[P] for individual SNPs (rs identifiers) across the SLC30A8, IDE, EXT2 and ALX4 regions, with asterisks marking selected SNPs and the LD block structure drawn beneath each panel. Visible body text: "... identified significant associations for seven SNPs representing four new T2DM loci (Table 1). In all cases, the strongest association for the MAX statistic (see Methods) was obtained with the additive model."]
LD block
2 alleles are correlated because they are inherited
together
Sladek et al, 2007
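The idea that two alleles in an LD block are correlated can be made concrete with the r² statistic. The following is a minimal sketch, assuming phased two-SNP haplotypes written as two-character strings ("AB", "Ab", "aB", "ab"); the function name and the toy haplotypes are hypothetical and not from the slides.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased two-locus haplotypes.

    haplotypes: list of 2-character strings such as "AB", "Ab", "aB", "ab",
    where the first character is the allele at SNP 1 and the second at SNP 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n   # frequency of A at SNP 1
    p_b = sum(h[1] == "B" for h in haplotypes) / n   # frequency of B at SNP 2
    p_ab = sum(h == "AB" for h in haplotypes) / n    # frequency of the A-B haplotype
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


if __name__ == "__main__":
    inherited_together = ["AB"] * 5 + ["ab"] * 5   # alleles always travel together
    independent = ["AB", "Ab", "aB", "ab"]         # every combination equally common
    print(ld_r2(inherited_together))
    print(ld_r2(independent))
```

When r² is close to 1, genotyping one SNP predicts the other almost perfectly, which is why a tag SNP can stand in for its neighbours.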
image: www.lifa-core.de/
Digitizing SNPs:
e.g., Illumina Infinium Array
image: illumina.com
Assessing Thousands of Factors Simultaneously:
Data-driven search for differences in SNP frequencies
~100,000 - ~1,000,000 association tests
disease cases
healthy controls
GCAGGTACATG...GGTA...
GCAGGTACACG...GGTA...
GCAGGTACATG...GGTA...
GCAGGTACACG...GGTA...
GCAGGTACATG...GGTA...
GCAGGTACACG...GGTA...
disease cases
GCAGGTACATG...GGTA...
GCAGGTACATG...GGTA...
GCAGGTACATG...GGTA...
GCAGGTACATG...GGTA...
healthy controls
Associating One SNP with Disease
Case-Control Study Design
SNP (A/a) --?--> Disease

                          A    a
diseased (cases)
non-diseased (controls)
Associating One SNP with Disease
What is an "Odds Ratio"?
SNP (A/a) --?--> Disease

                          A    a
diseased (cases)          c    d
non-diseased (controls)   x    y
Chi-squared test
Odds Ratio a vs A:
Odds of disease with allele a
vs.
Odds of disease with allele A
1: equal odds (no difference)

>1: increased odds (increased risk)

<1: decreased odds (decreased risk)
Associating One SNP with Disease
Calculating the Odds Ratio
SNP (A/a) --?--> Disease

                          A    a
diseased (cases)          c    d
non-diseased (controls)   x    y

Chi-squared test

Odds Ratio a vs A:
Odds with allele a = [d/(d+y)] / [y/(d+y)] = d/y
Odds with allele A = [c/(c+x)] / [x/(c+x)] = c/x

$$\mathrm{OR}_{a\ \mathrm{vs}\ A} = \frac{d/y}{c/x} = \frac{d/c}{y/x} = \frac{dx}{cy}$$
How would you interpret an OR of 2?
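As a worked illustration of the odds ratio and the chi-squared test on one SNP, here is a minimal sketch that uses the slide's cell labels (c, d for cases carrying A and a; x, y for controls carrying A and a). The counts are made up for illustration, and the p-value uses the closed form for a chi-squared variable with 1 degree of freedom rather than any particular GWAS package.

```python
import math


def odds_ratio(c, d, x, y):
    """Odds ratio of allele a vs A: (d/c) / (y/x) = d*x / (c*y)."""
    return (d * x) / (c * y)


def chi_squared_2x2(c, d, x, y):
    """Pearson chi-squared statistic (1 df, no continuity correction) and p-value."""
    n = c + d + x + y
    observed = [c, d, x, y]
    row_totals = [c + d, c + d, x + y, x + y]
    col_totals = [c + x, d + y, c + x, d + y]
    stat = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / n
        stat += (obs - expected) ** 2 / expected
    # Survival function of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical allele counts: cases (A=300, a=200), controls (A=375, a=125)
    c, d, x, y = 300, 200, 375, 125
    print("OR a vs A:", odds_ratio(c, d, x, y))
    print("chi-squared, p:", chi_squared_2x2(c, d, x, y))
```

With these counts the odds ratio is 2: the odds of carrying allele a are twice as high among cases as among controls, or equivalently the odds of disease are twice as high for allele a as for allele A.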
Associating One SNP with Disease
Cohort Study Design
SNP (A/a) --?--> Disease
vs.
SNP (A/a) --?--> Non-diseased
• Direct measure of risk vs. odds ratio

• Need to wait!
• If incidence is low, N needs to be large!
Cox survival regression

Relative Risk
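To illustrate the difference between the relative risk a cohort study measures directly and the odds ratio, here is a small sketch with made-up cohort counts (all numbers and function names are assumptions for illustration). It ignores follow-up time, so it is not a substitute for the Cox regression named on the slide.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in carriers divided by risk in non-carriers."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def cohort_odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Odds of disease in carriers divided by odds of disease in non-carriers."""
    exposed_odds = exposed_cases / (exposed_total - exposed_cases)
    unexposed_odds = unexposed_cases / (unexposed_total - unexposed_cases)
    return exposed_odds / unexposed_odds


if __name__ == "__main__":
    # Hypothetical cohort: 1,000 carriers (20 develop disease), 1,000 non-carriers (10 do)
    counts = (20, 1000, 10, 1000)
    print("relative risk:", relative_risk(*counts))
    print("odds ratio:   ", cohort_odds_ratio(*counts))
```

When incidence is low, as in this example, the odds ratio is close to the relative risk. The low incidence is also why the cohort needs to be large to observe enough cases.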
Models to associate genotypes with disease
Examples for a case-control study
Disease (ND=4): AA, AA, Aa, Aa
Non-diseased (NC=4): Aa, Aa, aa, aa

Allele counts:
                A    a
diseased        6    2
non-diseased    2    6

OR A (vs a)

OR a (vs A)
Models to associate genotypes with disease
Genotypic Test ("2 or 1 df test")
Diseased (ND=4): AA, AA, Aa, Aa
Non-diseased (NC=4): Aa, Aa, aa, aa

                AA   Aa   aa
diseased         2    2    0
non-diseased     0    2    2

OR AA (vs. Aa)

OR aa (vs. Aa)
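The toy case-control example can be tabulated in a few lines. The sketch below assumes the genotypes split as AA, AA, Aa, Aa (diseased) and Aa, Aa, aa, aa (non-diseased), which is the reading consistent with the slide's allele counts of 6/2 and 2/6; the helper names are hypothetical.

```python
from collections import Counter

# Assumed reading of the slide's toy example (4 cases, 4 controls)
diseased = ["AA", "AA", "Aa", "Aa"]
non_diseased = ["Aa", "Aa", "aa", "aa"]


def allele_counts(genotypes):
    """Allelic model: every individual contributes two alleles."""
    return Counter("".join(genotypes))


def genotype_counts(genotypes):
    """Genotypic model: count individuals per genotype, in AA, Aa, aa order."""
    counts = Counter(genotypes)
    return [counts[g] for g in ("AA", "Aa", "aa")]


case_alleles = allele_counts(diseased)
control_alleles = allele_counts(non_diseased)

# Allelic odds ratios from the 2x2 allele table
or_A_vs_a = (case_alleles["A"] * control_alleles["a"]) / (
    case_alleles["a"] * control_alleles["A"]
)
or_a_vs_A = 1.0 / or_A_vs_a

if __name__ == "__main__":
    print("allele table:", dict(case_alleles), dict(control_alleles))
    print("genotype table:", genotype_counts(diseased), genotype_counts(non_diseased))
    print("OR A vs a:", or_A_vs_a, " OR a vs A:", or_a_vs_A)
```

The allelic table has two columns (one test, 1 df), while the genotypic table has three columns, which is where the 2 df genotypic test comes from. Empty cells such as aa among cases make genotype-specific odds ratios undefined in a sample this small.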
Associating One SNP with Quantitative Trait
(e.g., height, weight, cholesterol)
[Figure: two box plots of a quantitative trait (height) by genotype, x-axis factor(SNP) with levels 1, 2, 3. Left: SNP rs1234 (GG, GC, CC). Right: SNP rs123456 (CC, CT, TT).]
Associating One SNP with Quantitative Trait
Linear Regression and Additive Risk Model
$y = \alpha + \beta x + \varepsilon$
[Figure: box plot of height by genotype at SNP rs123456, with genotypes coded CC (0), CT (1), TT (2) and the fitted line annotated with α (intercept) and β (slope)]
height = α + βx
x = 0 if individual is CC
x = 1 if individual is CT
x = 2 if individual is TT
β: change in height for 1 risk allele
T = risk allele
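The additive model on this slide can be fitted with ordinary least squares. Below is a minimal sketch in plain Python, assuming T is the risk allele and genotypes are coded by the number of T alleles (CC = 0, CT = 1, TT = 2); the heights are invented for illustration.

```python
def fit_additive_model(genotypes, trait, risk_allele="T"):
    """Least-squares fit of trait = alpha + beta * (number of risk alleles)."""
    x = [g.count(risk_allele) for g in genotypes]   # additive coding: 0, 1 or 2
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(trait) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, trait))
    beta = sxy / sxx                  # change in trait per copy of the risk allele
    alpha = mean_y - beta * mean_x    # expected trait for the 0-copy genotype
    return alpha, beta


if __name__ == "__main__":
    # Hypothetical individuals and trait values
    genotypes = ["CC", "CC", "CT", "CT", "TT", "TT"]
    heights = [60, 62, 70, 72, 80, 82]
    alpha, beta = fit_additive_model(genotypes, heights)
    print(f"alpha = {alpha:.1f}, beta = {beta:.1f} per T allele")
```

In a real GWAS this regression is repeated for every SNP, usually with covariates such as age, sex and ancestry components, and the p-value for β is what ends up on the Manhattan plot.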
Prototypical "Manhattan plot" to visualize
associations
Science, 2007
~100,000 - ~1,000,000 association tests
[Figure (NATURE | Vol 447 | 7 June 2007): Manhattan plots of −log10(P) (y-axis, 0 to 15) against chromosome (x-axis, 1 to 22 and X), one panel per disease: bipolar disorder, coronary artery disease, Crohn's disease, hypertension, rheumatoid arthritis, type 1 diabetes, type 2 diabetes]
Figure 4 | Genome-wide scan for seven diseases. For each of seven diseases −log10 of the trend test P value for quality-control-positive SNPs, excluding [...] Chromosomes are shown in alternating colours for clarity, with P values < 1 × 10^-5 highlighted in green. All panels are truncated at [...]
Type I Error:
False Positives!
What is a p-value?
Probability of obtaining a result at least as extreme as the one observed if there is no difference (H0)
Many tests: some will be significant (low p-value) by chance alone!
100 tests at a significance level of 0.05...
how many would be significant by chance?
Bonferroni "correction":

Divide the 0.05 significance level by the number of tests
e.g., 1M SNPs: 0.05 / (1 × 10^6) = 5 × 10^-8
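The multiple-testing arithmetic can be checked directly. This short sketch assumes independent tests and uses hypothetical helper names; it computes the expected number of chance hits, the probability of at least one false positive, and the Bonferroni-corrected threshold.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of tests passing the threshold when every null is true."""
    return alpha * n_tests


def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    print(expected_false_positives(0.05, 100))   # about 5 chance hits in 100 tests
    print(family_wise_error_rate(0.05, 100))     # close to 1
    print(bonferroni_threshold(0.05, 1_000_000)) # the usual genome-wide threshold
```

SNPs in LD are not independent, so Bonferroni is conservative for GWAS; the 5 × 10^-8 threshold is nonetheless the conventional genome-wide significance level.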
QQplot:
Distribution of observed p-values vs. H0 p-values
[Figure: histogram of runif(10000), p-values under H0 (random uniform distribution)]
[Figure: histogram of gwas$P.value, p-values of GWAS in Total Cholesterol (Global Lipids Consortium, 2012)]
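A QQ plot pairs each observed p-value with the value expected at the same rank under H0, where p-values are uniform on (0, 1). The sketch below only computes the plotting coordinates on the −log10 scale; the (i − 0.5)/n plotting positions are one common convention and are an assumption here, as are the function name and the simulated input.

```python
import math
import random


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p) for a QQ plot.

    p_values must all be greater than 0 and at most 1.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Under H0 the i-th smallest of n uniform p-values is roughly (i - 0.5) / n
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


if __name__ == "__main__":
    random.seed(0)
    # Simulated null p-values; 1 - random() lies in (0, 1], so log10 is defined
    null_p = [1.0 - random.random() for _ in range(10_000)]
    points = qq_points(null_p)
    print(points[-3:])  # the most extreme points should sit near the diagonal
```

Points that rise above the diagonal only in the upper tail suggest true associations; a curve that lifts off the diagonal across its whole length suggests systematic inflation, for example from population stratification.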
Which diseases show evidence of association?
Examining the QQplot of test statistics in WTCCC
[Body text visible around the figure, cropped at the column edges:]
"[...] present study cannot provide conclusive exclusion of any given gene. This is the consequence of several factors including: less-than-complete coverage of common variation genome-wide on the Affymetrix chip; poor coverage (by design) of rare variants, including many structural variants (thereby reducing power to detect rare, penetrant, alleles) (ref. 25); difficulties with defining the full genomic extent of the gene of interest; and, despite the sample size, relatively low power to detect, at levels of [...]"
"[...] already allow us, for selected diseases, to highlight pathways and mechanisms of particular interest. Naturally, extensive resequencing and fine-mapping work, followed by functional studies will be required before such inferences can be translated into robust statements about the molecular and physiological mechanisms involved. We turn now to a discussion of the main findings for each disease, focusing here only on the most significant and interesting results [...]"
[Figure: quantile-quantile plots of observed test statistic (y-axis, 0 to 30) against expected chi-squared value (x-axis, 0 to 25), one panel per disease: BD, CAD, CD, HT, RA, T1D, T2D]
Figure 3 | Quantile-quantile plots for seven genome-wide scans. For each of the seven disease collections, a quantile-quantile plot of the results of the trend test is shown in black for all SNPs that pass the standard project filters, have a minor allele frequency > 1% and missing data rate < 1%. SNPs that [...] 360,000 SNPs. SNPs at which the test statistic exceeds 30 are represented by triangles. Additional quantile-quantile plots, which also exclude all SNPs located in the regions of association listed in Table 3, are superimposed in blue (for BD, the exclusion of these SNPs has no visible effect on the plot, and [...]
Observational associations do not equal causation...
Ice Cream <--?--> Drowning
Confounding bias
What is a confounder?
Summer!
A confounder is correlated with both the "risk" factor and the disease,

leading to invalid inference.

Common source of bias in observational studies (e.g., case-control,
cohort)
Population Stratification:
A source of possible confounding in GWAS
SNP --?--> Disease
race/ethnicity (possible confounder)
Ancestry is correlated with both allele frequency and disease

GWAS are done on specific populations separately.

(most have been done in populations of European ancestry)
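Population stratification can be demonstrated with two made-up subpopulations in which the SNP has no effect at all. In the sketch below every count is invented: within each population the odds ratio is exactly 1, yet pooling them produces a spurious association because both allele frequency and disease frequency differ by ancestry. The Mantel-Haenszel estimator is used as one standard way to combine strata; the slide itself only says that populations are analysed separately.

```python
def odds_ratio(table):
    """table = (minor_cases, major_cases, minor_controls, major_controls)."""
    minor_cases, major_cases, minor_controls, major_controls = table
    return (minor_cases * major_controls) / (major_cases * minor_controls)


def mantel_haenszel_or(tables):
    """Stratum-adjusted odds ratio across several 2x2 tables."""
    numerator = denominator = 0.0
    for minor_cases, major_cases, minor_controls, major_controls in tables:
        n = minor_cases + major_cases + minor_controls + major_controls
        numerator += minor_cases * major_controls / n
        denominator += major_cases * minor_controls / n
    return numerator / denominator


# Hypothetical allele counts per population
population_1 = (160, 40, 80, 20)    # minor allele common, disease common
population_2 = (20, 80, 40, 160)    # minor allele rare, disease rare
pooled = tuple(sum(cells) for cells in zip(population_1, population_2))

if __name__ == "__main__":
    print("within population 1:", odds_ratio(population_1))
    print("within population 2:", odds_ratio(population_2))
    print("pooled (confounded):", odds_ratio(pooled))
    print("Mantel-Haenszel:    ", mantel_haenszel_or([population_1, population_2]))
```

The pooled odds ratio is well above 1 even though neither population shows any association, which is the confounding pattern the slide describes.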
Mediation
SNPs indicative of a mediator factor?
Example: FTO and Type 2 Diabetes
FTO --> Body Mass --?--> Diabetes
Is the association between FTO and Type 2 Diabetes via BMI?
... or does FTO have an independent role in Type 2 Diabetes?
PLINK:
(Standard) Whole Genome Analysis Software
http://pngu.mgh.harvard.edu/~purcell/plink/
• cited >9000 times since 2007

• allele frequency

• linkage disequilibrium (LD)

• data manipulation/filtering

• association: allelic, genotypic models

• chi-square

• logistic

• linear
Examples: 

GWASs in Type 2 Diabetes
Type 2 Diabetes Mellitus:
A complex, multifactorial disease
• Insulin production vs. use

• beta-cell function

• insulin sensitivity (BMI)

• Insulin moves glucose from blood into cells

• Complications arise due to glucose in blood (hyperglycemia)
• Diagnosed by blood glucose levels

CDC
family history: 25%
body weight, diet, lifestyle, age
ARTICLES
A genome-wide association study
identifies novel risk loci for type 2 diabetes
Robert Sladek (1,2,4), Ghislain Rocheleau (1)*, Johan Rung (4)*, Christian Dina (5)*, Lishuang Shen (1), David Serre (1),
Philippe Boutin (5), Daniel Vincent (4), Alexandre Belisle (4), Samy Hadjadj (6), Beverley Balkau (7), Barbara Heude (7),
Guillaume Charpentier (8), Thomas J. Hudson (4,9), Alexandre Montpetit (4), Alexey V. Pshezhetsky (10), Marc Prentki (10,11),
Barry I. Posner (2,12), David J. Balding (13), David Meyre (5), Constantin Polychronakos (1,3) & Philippe Froguel (5,14)
Type 2 diabetes mellitus results from the interaction of environmental factors with a combination of genetic variants, most of
which were hitherto unknown. A systematic search for these variants was recently made possible by the development of
high-density arrays that permit the genotyping of hundreds of thousands of polymorphisms. We tested 392,935
single-nucleotide polymorphisms in a French case-control cohort. Markers with the most significant difference in genotype
frequencies between cases of type 2 diabetes and controls were fast-tracked for testing in a second cohort. This identified
four loci containing variants that confer type 2 diabetes risk, in addition to confirming the known association with the TCF7L2
gene. These loci include a non-synonymous polymorphism in the zinc transporter SLC30A8, which is expressed exclusively in
insulin-producing β-cells, and two linkage disequilibrium blocks that contain genes potentially involved in β-cell
development or function (IDE-KIF11-HHEX and EXT2-ALX4). These associations explain a substantial portion of disease risk
and constitute proof of principle for the genome-wide approach to the elucidation of complex genetic traits.

The rapidly increasing prevalence of type 2 diabetes mellitus (T2DM) is thought to be due to environmental factors, such as
increased availability of food and decreased opportunity and motivation for physical activity, acting on genetically
susceptible individuals. The heritability of T2DM is one of the best established among common diseases and, consequently,
genetic risk factors for T2DM have been the subject of intense research (ref. 1). Although the genetic causes of many
monogenic forms of diabetes (maturity onset diabetes in the young, neonatal mitochondrial and other syndromic types of
diabetes mellitus) have been elucidated, few variants leading to common T2DM have been clearly identified and individually
confer only a small risk (odds ratio ≈ 1.1-1.25) of developing T2DM (ref. 1). Linkage studies have reported many
T2DM-linked chromosomal regions and have identified putative, causative genetic variants in CAPN10 (ref. 2), ENPP1 (ref. 3),
HNF4A (refs 4, 5) and ACDC (also called ADIPOQ) (ref. 6). In parallel, candidate-gene studies have reported many
T2DM-associated loci, with coding variants in the nuclear receptor PPARG (P12A) (ref. 7) and the potassium channel
KCNJ11 (E23K) (ref. 8) being among the very few that have been convincingly replicated. The strongest known
(odds ratio ≈ 1.7) T2DM association (ref. 9) was recently mapped to the transcription factor TCF7L2 and has been
consistently replicated in multiple populations (refs 10-20).

Subjects and study design
The recent availability of high-density genotyping arrays, which combine the power of association studies with the
systematic nature of a genome-wide search, led us to undertake a two-stage, genome-wide association study to identify
additional T2DM susceptibility loci (Supplementary Fig. 1). In the first stage of this study, we obtained genotypes for
392,935 single-nucleotide polymorphisms (SNPs) in 1,363 T2DM cases and controls (Supplementary Table 1). In order to
enrich for risk alleles (ref. 21), the diabetic subjects studied in stage 1 were selected to have at least one affected
first degree relative and age at onset under 45 yr (excluding patients with maturity onset diabetes in the young).
Furthermore, in order to decrease phenotypic heterogeneity and to enrich for variants determining insulin resistance and
β-cell dysfunction through mechanisms other than severe obesity, we initially studied diabetic patients with a body mass
index (BMI) < 30 kg m^-2. Control subjects were selected to have fasting blood glucose < 5.7 mmol l^-1 in DESIR, a large
prospective cohort for the study of insulin resistance in French subjects (ref. 22).

Genotypes for each study subject were obtained using two platforms: Illumina Infinium Human1 BeadArrays, which assay
109,365 SNPs chosen using a gene-centred design; and Human Hap300 BeadArrays, which assay 317,503 SNPs chosen to tag
haplotype blocks identified by the Phase I HapMap (ref. 23). Of the 409,927 markers that passed quality control
(Supplementary Tables 2 and 3), genotypes were obtained for an average of 99.2% (Human1) and 99.4% (Hap300) of markers
for each subject with a reproducibility of > 99.9% (both platforms). Forty-three subjects were removed from analysis
because of evidence of intercontinental admixture (Supplementary Fig. 3) and an additional four because their
genotype-determined gender disagreed with clinical records. In total, T2DM association was tested for 100,764 (Human1)
and 309,163 (Hap300) SNPs representing 392,935 unique loci (Fig. 1). Because of unequal male/female ratios in our cases
and controls, we analysed the 12,666 sex-chromosome SNPs separately for each gender.
*These authors contributed equally to this work.
(1) Departments of Human Genetics, (2) Medicine and (3) Pediatrics, Faculty of Medicine, McGill University, Montreal H3H 1P3, Canada. (4) McGill University and Genome Quebec Innovation Centre, Montreal H3A 1A4, Canada. (5) CNRS 8090-Institute of Biology, Pasteur Institute, Lille 59019 Cedex, France. (6) Endocrinology and Diabetology, University Hospital, Poitiers 86021 Cedex, France. (7) INSERM U780-IFR69, Villejuif 94807, France. (8) Endocrinology-Diabetology Unit, Corbeil-Essonnes Hospital, Corbeil-Essonnes 91100, France. (9) Ontario Institute for Cancer Research, Toronto M5G 1L7, Canada. (10) Montreal Diabetes Research Center, Montreal H2L 4M1, Canada. (11) Molecular Nutrition Unit and the Department of Nutrition, University of Montreal and the Centre Hospitalier de l'Université de Montréal, Montreal H3C 3J7, Canada. (12) Polypeptide Hormone Laboratory and Department of Anatomy and Cell Biology, Montreal H3A 2B2, Canada. (13) Department of Epidemiology & Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG, UK. (14) Section of Genomic Medicine, Imperial College London W12 0NN, and Hammersmith Hospital, Du Cane Road, London W12 0HS, UK.
Nature ©2007 Publishing Group
Nature, 2/2007
Genome-Wide Association Analysis
Identifies Loci for Type 2 Diabetes
and Triglyceride Levels
Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University,
and Novartis Institutes for BioMedical Research*†
New strategies for prevention and treatment of type 2 diabetes (T2D) require improved insight into
disease etiology. We analyzed 386,731 common single-nucleotide polymorphisms (SNPs) in 1464
patients with T2D and 1467 matched controls, each characterized for measures of glucose
metabolism, lipids, obesity, and blood pressure. With collaborators (FUSION and WTCCC/UKT2D),
we identified and confirmed three loci associated with T2Dβ€”in a noncoding region near CDKN2A
and CDKN2B, in an intron of IGF2BP2, and an intron of CDKAL1β€”and replicated associations near
HHEX and in SLC30A8 found by a recent whole-genome association study. We identified and
confirmed association of a SNP in an intron of glucokinase regulatory protein (GCKR) with serum
triglycerides. The discovery of associated variants in unsuspected genes and outside coding regions
illustrates the ability of genome-wide association studies to provide potentially important clues to
the pathogenesis of common diseases.
T
ype 2 diabetes, obesity, and cardiovascular
risk factors are caused by a combination
of genetic susceptibility, environment, be-
havior, and chance. Whole-genome association
studies (WGAS) offer a new approach to gene
discovery unbiased with regard to presumed
functions or locations of causal variants. This
approach is based on Fisher’s theory for additive
effects at common alleles (1); human heterozy-
to purifying selection, and has been made pos-
sible by genomic advances such as the human
genome sequence, SNP and HapMap databases,
and genotyping arrays (3).
We studied 1464 patients with T2D and
1467 controls from Finland and Sweden, each
characterized for 18 clinical traits: anthropomet-
ric measures, glucose tolerance and insulin se-
cretion, lipids and apolipoproteins, and blood
applying stringent quality-control filters, high-
quality genotypes for 386,731 common SNPs
were obtained (4). To extend the set of putative
causal alleles tested for association, we devel-
oped 284,968 additional multimarker (haplo-
type) tests based on these SNP genotypes (5, 6).
The 671,699 allelic tests capture (correlation co-
efficient r2
β‰₯ 0.8) 78% of common SNPs in
HapMap CEU (3).
Each SNP and haplotype test was assessed
for association to T2D and each of 18 traits with
the software package PLINK (http://pngu.mgh.
harvard.edu/purcell/plink/). For T2D, a weighted
meta-analysis was used to combine results for
the population-based and family-based subsam-
ples (4). For quantitative traits, multivariable
linear or logistic regression with or without co-
variates was performed (4). Association results
for each SNP, haplotype test, and phenotype are
available (www.broad.mit.edu/diabetes/).
In genome-wide analysis involving hundreds
of thousands of statistical tests, modest levels of
bias imposed on the null distribution can over-
whelm a small number of true results. We used
three strategies to search for evidence of sys-
tematic bias from unrecognized population struc-
ture, the analytical approach, and genotyping
artifacts (7, 8). First, we examined the distribu-
tion of P-values in the population-based sam-
ple, observing a close match to that expected
for a null distribution (genomic inflation factor
λGC = 1.05 for T2D). Second, we calculated
A Genome-Wide Association Study of
Type 2 Diabetes in Finns Detects
Multiple Susceptibility Variants
Laura J. Scott,1 Karen L. Mohlke,2 Lori L. Bonnycastle,3 Cristen J. Willer,1 Yun Li,1 William L. Duren,1 Michael R. Erdos,3 Heather M. Stringham,1 Peter S. Chines,3 Anne U. Jackson,1 Ludmila Prokunina-Olsson,3 Chia-Jen Ding,1 Amy J. Swift,3 Narisu Narisu,3 Tianle Hu,1 Randall Pruim,4 Rui Xiao,1 Xiao-Yi Li,1 Karen N. Conneely,1 Nancy L. Riebow,3 Andrew G. Sprau,3 Maurine Tong,3 Peggy P. White,1 Kurt N. Hetrick,5 Michael W. Barnhart,5 Craig W. Bark,5 Janet L. Goldstein,5 Lee Watkins,5 Fang Xiang,1 Jouko Saramies,6 Thomas A. Buchanan,7 Richard M. Watanabe,8,9 Timo T. Valle,10 Leena Kinnunen,10,11 Gonçalo R. Abecasis,1 Elizabeth W. Pugh,5 Kimberly F. Doheny,5 Richard N. Bergman,9 Jaakko Tuomilehto,10,11,12 Francis S. Collins,3* Michael Boehnke1*
Identifying the genetic variants that increase the risk of type 2 diabetes (T2D) in humans has
been a formidable challenge. Adopting a genome-wide association strategy, we genotyped 1161
Finnish T2D cases and 1174 Finnish normal glucose-tolerant (NGT) controls with >315,000
single-nucleotide polymorphisms (SNPs) and imputed genotypes for an additional >2 million
autosomal SNPs. We carried out association analysis with these SNPs to identify genetic variants
that predispose to T2D, compared our T2D association results with the results of two similar studies,
and genotyped 80 SNPs in an additional 1215 Finnish T2D cases and 1258 Finnish NGT controls.
We identify T2D-associated variants in an intergenic region of chromosome 11p12, contribute
to the identification of T2D-associated variants near the genes IGF2BP2 and CDKAL1 and the
Science, 6/2007
Study design: Richa Saxena1–6 and Valeriya Lyssenko7 (Team Leaders), Peter Almgren,7 Paul I. W. de Bakker,1–6 Noël P. Burtt,1 Jose C. Florez,1–6 Hong Chen,8 Joanne Meyer,8 Joel N. Hirschhorn,1,6,9–11 Mark J. Daly,1–3,5 Thomas E. Hughes,8 Leif Groop,7,12 David Altshuler1–6 (Chair)
Clinical characterization and phenotypes: Valeriya Lyssenko7 and Richa Saxena1–6 (Team Leaders), Peter Almgren,7 Kristin Ardlie,1 Kristina Bengtsson Boström,13 Noël P. Burtt,1 Hong Chen,8 Jose C. Florez,1–6 Bo Isomaa,14,15 Sekar Kathiresan,1,3,5 Guillaume Lettre,1,6,9–11 Ulf Lindblad,16 Helen N. Lyon,1,6,9–11 Olle Melander,7 Christopher Newton-Cheh,1–3,5 Peter Nilsson,17 Marju Orho-Melander,7 Lennart Råstam,16 Elizabeth K. Speliotes,1,3,6,9–11 Marja-Riitta Taskinen,12 Tiinamaija Tuomi,12,15 Benjamin F. Voight,1–3,5 David Altshuler,1–6 Joel N. Hirschhorn,1,6,9–11 Thomas E. Hughes,8 Leif Groop7,12 (Chair)
DNA sample QC and diabetes replication genotyping: Candace Guiducci1 and Valeriya Lyssenko7 (Team Leaders), Anna Berglund,7 Joyce Carlson,18 Lauren Gianniny,1 Rachel Hackett,1 Liselotte Hall,18 Johan Holmkvist,7 Esa Laurila,7 Marju Orho-Melander,7 Marketa Sjögren,7 Maria Sterner,18 Aarti Surti,1 Margareta Svensson,7 Malin Svensson,7 Ryan Tewhey,1 Noël P. Burtt1 (Chair)
Whole genome scan genotyping: Brendan Blumenstiel1 (Team Leader), Melissa Parkin,1 Matthew DeFelice,1 Candace Guiducci,1 Ryan Tewhey,1 Rachel Barry,1 Wendy Brodeur,1 Noël P. Burtt,1 Jody Camarata,1 Nancy Chia,1 Mary Fava,1 John Gibbons,1 Bob Handsaker,1 Claire Healy,1 Kieu Nguyen,1 Casey Gates,1 Carrie Sougnez,1 Diane Gage,1 Marcia Nizzari,1 David Altshuler,1–6 Stacey B. Gabriel1 (Chair)
GCKR replication genotyping and analysis (Malmö Diet and Cancer Study): Sekar Kathiresan1,3,5 (Team Leader), Candace Guiducci,1 Aarti Surti,1 Noël P. Burtt,1 Olle Melander,7 Marju Orho-Melander7 (Chair)
Statistical analysis: Benjamin F. Voight1–3,5 and Paul I. W. de Bakker1–6 (Team Leaders), Richa Saxena,1–6 Valeriya Lyssenko,7 Peter Almgren,7 Noël P. Burtt,1 Hong Chen,8 Gung-Wei Chirn,8 Qicheng Ma,8 Hemang Parikh,7 Delwood Richardson,8 Darrell Ricke,8 Jeffrey J. Roix,8 Leif Groop,7,12 Shaun Purcell,1,2 David Altshuler,1–6 Mark J. Daly1–3,5 (Chair)
1 Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02142, USA.
2 Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA.
3 Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.
4 Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA.
5 Department of Medicine, Harvard Medical School, Boston, MA 02115, USA.
6 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.
7 Department of Clinical Sciences, Diabetes and Endocrinology Research Unit, University Hospital Malmö, Lund University, Malmö, Sweden.
8 Diabetes and Metabolism Disease Area, Novartis Institutes for BioMedical Research, 100 Technology Square, Cambridge, MA 02139, USA.
9 Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA.
10 Division of Endocrinology, Children’s Hospital, Boston, MA 02115, USA.
11 Division of Genetics, Children’s Hospital, Boston, MA 02115, USA.
12 Department of Medicine, Helsinki University Hospital, University of Helsinki, Helsinki, Finland.
13 Skaraborg Institute, Skövde, Sweden.
14 Malmska Municipal Health Center and Hospital, Jakobstad, Finland.
15 Folkhälsan Research Center, Helsinki, Finland.
16 Department of Clinical Sciences, Community Medicine Research Unit, University Hospital Malmö, Lund University, Malmö, Sweden.
17 Department of Clinical Sciences, Medicine Research Unit, University Hospital Malmö, Lund University, Malmö, Sweden.
18 Clinical Chemistry, University Hospital Malmö, Lund University, Malmö, Sweden.
19 Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02115, USA.
Supporting Online Material
www.sciencemag.org/cgi/content/full/1142358/DC1
Materials and Methods
Figs. S1 and S2
Tables S1 to S6
References
9 March 2007; accepted 20 April 2007
Published online 26 April 2007;
10.1126/science.1142358
Include this information when citing this paper.
Replication of Genome-Wide
Association Signals in UK Samples
Reveals Risk Loci for Type 2 Diabetes
Eleftheria Zeggini,1,2* Michael N. Weedon,3,4* Cecilia M. Lindgren,1,2* Timothy M. Frayling,3,4* Katherine S. Elliott,2 Hana Lango,3,4 Nicholas J. Timpson,2,5 John R. B. Perry,3,4 Nigel W. Rayner,1,2 Rachel M. Freathy,3,4 Jeffrey C. Barrett,2 Beverley Shields,4 Andrew P. Morris,2 Sian Ellard,4,6 Christopher J. Groves,1 Lorna W. Harries,4 Jonathan L. Marchini,7 Katharine R. Owen,1 Beatrice Knight,4 Lon R. Cardon,2 Mark Walker,8 Graham A. Hitman,9 Andrew D. Morris,10 Alex S. F. Doney,10 The Wellcome Trust Case Control Consortium (WTCCC),† Mark I. McCarthy,1,2‡§ Andrew T. Hattersley3,4‡
The molecular mechanisms involved in the development of type 2 diabetes are poorly
understood. Starting from genome-wide genotype data for 1924 diabetic cases and 2938
population controls generated by the Wellcome Trust Case Control Consortium, we set out to detect
replicated diabetes association signals through analysis of 3757 additional cases and 5346 controls
and by integration of our findings with equivalent data from other international consortia. We
detected diabetes susceptibility loci in and around the genes CDKAL1, CDKN2A/CDKN2B, and
IGF2BP2 and confirmed the recently described associations at HHEX/IDE and SLC30A8. Our findings
provide insight into the genetic architecture of type 2 diabetes, emphasizing the contribution of
Here, we describe how integration of data
from the WTCCC scan and our own replication
studies with similar information generated by the
Diabetes Genetics Initiative (DGI) (6) and the
Finland–United States Investigation of NIDDM
Genetics (FUSION) (7) has identified several
additional susceptibility variants for T2D.
In the WTCCC study, analysis of 490,032
autosomal SNPs in 16,179 samples yielded
459,448 SNPs that passed initial quality control
(5). We considered only the 393,453 autosomal
SNPs with minor allele frequency (MAF) ex-
ceeding 1% in both cases and controls and no
extreme departure from Hardy-Weinberg equilibrium
(P < 10^-4 in cases or controls) (8). This
T2D-specific data set shows no evidence of sub-
stantial confounding from population substruc-
ture and genotyping biases (8).
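The SNP filters described above (MAF above 1% and no extreme Hardy-Weinberg departure at P < 10^-4) can be sketched in a few lines. This is a minimal illustration, not the WTCCC pipeline: the function names and the genotype counts are hypothetical, and a simple 1-df chi-square test stands in for whatever HWE test the authors actually used (an exact test may be preferred in practice).

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(case_counts, control_counts, maf_min=0.01, hwe_p_min=1e-4):
    """Keep a SNP only if both groups pass the MAF and HWE filters."""
    for counts in (case_counts, control_counts):
        if minor_allele_frequency(*counts) <= maf_min:
            return False
        if hwe_chi2_pvalue(*counts) < hwe_p_min:
            return False
    return True


# Hypothetical genotype counts (AA, AB, BB) for one SNP.
keep = passes_qc((250, 500, 250), (240, 510, 250))
```

Applying the filter in cases and controls separately mirrors the wording of the excerpt ("in both cases and controls", "in cases or controls").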
To distinguish true associations from those
reflecting fluctuations under the null or residual
errors arising from aberrant allele calling, we first
submitted putative signals from the WTCCC study
to additional quality control, including cluster-
plot visualization and validation genotyping on
ARTICLES
A genome-wide association study
identifies novel risk loci for type 2 diabetes
Robert Sladek1,2,4, Ghislain Rocheleau1*, Johan Rung4*, Christian Dina5*, Lishuang Shen1, David Serre1, Philippe Boutin5, Daniel Vincent4, Alexandre Belisle4, Samy Hadjadj6, Beverley Balkau7, Barbara Heude7, Guillaume Charpentier8, Thomas J. Hudson4,9, Alexandre Montpetit4, Alexey V. Pshezhetsky10, Marc Prentki10,11, Barry I. Posner2,12, David J. Balding13, David Meyre5, Constantin Polychronakos1,3 & Philippe Froguel5,14
Type 2 diabetes mellitus results from the interaction of environmental factors with a combination of genetic variants, most of
which were hitherto unknown. A systematic search for these variants was recently made possible by the development of
high-density arrays that permit the genotyping of hundreds of thousands of polymorphisms. We tested 392,935
single-nucleotide polymorphisms in a French case–control cohort. Markers with the most significant difference in genotype
frequencies between cases of type 2 diabetes and controls were fast-tracked for testing in a second cohort. This identified
four loci containing variants that confer type 2 diabetes risk, in addition to confirming the known association with the TCF7L2
gene. These loci include a non-synonymous polymorphism in the zinc transporter SLC30A8, which is expressed exclusively in
insulin-producing β-cells, and two linkage disequilibrium blocks that contain genes potentially involved in β-cell
development or function (IDE–KIF11–HHEX and EXT2–ALX4). These associations explain a substantial portion of disease risk
and constitute proof of principle for the genome-wide approach to the elucidation of complex genetic traits.
The rapidly increasing prevalence of type 2 diabetes mellitus (T2DM) is
thought to be due to environmental factors, such as increased availabil-
ity of food and decreased opportunity and motivation for physical
activity, acting on genetically susceptible individuals. The heritability
of T2DM is one of the best established among common diseases and,
consequently, genetic risk factors for T2DM have been the subject of
intense research (ref. 1). Although the genetic causes of many monogenic
forms of diabetes (maturity onset diabetes in the young, neonatal mito-
chondrial and other syndromic types of diabetes mellitus) have been
elucidated, few variants leading to common T2DM have been clearly
identified and individually confer only a small risk (odds ratio ≈ 1.1–1.25)
of developing T2DM (ref. 1). Linkage studies have reported many
T2DM-linked chromosomal regions and have identified putative, cau-
sative genetic variants in CAPN10 (ref. 2), ENPP1 (ref. 3), HNF4A (refs
genotypes for 392,935 single-nucleotide polymorphisms (SNPs) in
1,363 T2DM cases and controls (Supplementary Table 1). In order to
enrich for risk alleles (ref. 21), the diabetic subjects studied in stage 1 were
selected to have at least one affected first degree relative and age at
onset under 45 yr (excluding patients with maturity onset diabetes in
the young). Furthermore, in order to decrease phenotypic heterogeneity
and to enrich for variants determining insulin resistance and
β-cell dysfunction through mechanisms other than severe obesity, we
initially studied diabetic patients with a body mass index (BMI)
<30 kg m^-2. Control subjects were selected to have fasting blood
glucose <5.7 mmol l^-1 in DESIR, a large prospective cohort for the
study of insulin resistance in French subjects (ref. 22).
Genotypes for each study subject were obtained using two plat-
Sladek, 2007
How many SNPs (p-value?)
European-based; N ~ 1000
cases: high fasting blood glucose/non-obese
controls: non-obese
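One way to approach the "how many SNPs (p-value?)" question is a Bonferroni correction: divide the desired family-wise error rate by the number of tests. The sketch below assumes a family-wise alpha of 0.05 and uses the 392,935 SNPs quoted in the abstract; it is an illustration of the arithmetic, not necessarily the cutoff the authors applied in their two-stage design.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


n_snps = 392_935  # SNPs tested in stage 1, per the abstract
threshold = bonferroni_threshold(n_snps)  # about 1.27e-07

# The commonly quoted genome-wide threshold of 5e-8 corresponds to
# roughly one million independent tests at alpha = 0.05.
genome_wide = bonferroni_threshold(1_000_000)
```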
Human Hap300 chip, showing no T2DM association in stage 1
(P > 0.01) and separated by at least 100 kb. Using the first principal
component as a covariate for ancestry differences between cases and
controls, we tested for association between rs932206 and disease
status. Our result suggests that this apparent association is largely
BMI on the association between marker and disease, as it is asymp-
totically equivalent to the Armitage trend test used to detect asso-
ciation in stages 1 and 2. None of the associations (Supplementary
Table 7) was substantially changed by considering the effects of these
covariates.
[Figure 1 plot: one panel per chromosome (1–22 and X)]
Figure 1 | Graphical summary of stage 1 association results. T2DM
association was determined for SNPs on the Human1 and Hap300 chips. The
x axis represents the chromosome position from pter; the y axis shows
−log10[pMAX], the P-value obtained by the MAX statistic, for each SNP
(note the different scale on the y axis of the chromosome 10 plot). SNPs that
passed the cutoff for a fast-tracked second stage are highlighted in red.
Sladek, 2007
Identification of four novel T2DM loci
Our fast-track stage 2 genotyping confirmed the reported association
for rs7903146 (TCF7L2) on chromosome 10, and in addition identified
significant associations for seven SNPs representing four new
T2DM loci (Table 1). In all cases, the strongest association for the
MAX statistic (see Methods) was obtained with the additive model.
The most significant of these corresponds to rs13266634, a non-synonymous
SNP (R325W) in SLC30A8, located in a 33-kb linkage
disequilibrium block on chromosome 8, containing only the 3′ end
of this gene (Fig. 2a). SLC30A8 encodes a zinc transporter expressed
solely in the secretory vesicles of β-cells and is thus implicated in the
final stages of insulin biosynthesis, which involve co-crystallization
Table 1 | Confirmed association results
SNP         Chr  Position (nt)  Risk allele  Major allele  MAF (case)  MAF (ctrl)  OR (het)     OR (hom)     PAR   λs      Stage 2 pMAX  Stage 2 pMAX (perm)  Stage 1 pMAX  Stage 1 pMAX (perm)  Nearest gene
rs7903146   10   114,748,339    T            C             0.406       0.293       1.65 ± 0.19  2.77 ± 0.50  0.28  1.0546  1.5 × 10^-34  <1.0 × 10^-7         3.2 × 10^-17  <3.3 × 10^-10        TCF7L2
rs13266634  8    118,253,964    C            C             0.254       0.301       1.18 ± 0.25  1.53 ± 0.31  0.24  1.0089  6.1 × 10^-8   5.0 × 10^-7          2.1 × 10^-5   1.8 × 10^-5          SLC30A8
rs1111875   10   94,452,862     G            G             0.358       0.402       1.19 ± 0.19  1.44 ± 0.24  0.19  1.0069  3.0 × 10^-6   7.4 × 10^-6          9.1 × 10^-6   7.3 × 10^-6          HHEX
rs7923837   10   94,471,897     G            G             0.335       0.377       1.22 ± 0.21  1.45 ± 0.25  0.20  1.0065  7.5 × 10^-6   2.2 × 10^-5          3.4 × 10^-6   2.5 × 10^-6          HHEX
rs7480010   11   42,203,294     G            A             0.336       0.301       1.14 ± 0.13  1.40 ± 0.25  0.08  1.0041  1.1 × 10^-4   2.9 × 10^-4          1.5 × 10^-5   1.2 × 10^-5          LOC387761
rs3740878   11   44,214,378     A            A             0.240       0.272       1.26 ± 0.29  1.46 ± 0.33  0.24  1.0046  1.2 × 10^-4   2.8 × 10^-4          1.8 × 10^-5   1.3 × 10^-5          EXT2
rs11037909  11   44,212,190     T            T             0.240       0.271       1.27 ± 0.30  1.47 ± 0.33  0.25  1.0045  1.8 × 10^-4   4.5 × 10^-4          1.8 × 10^-5   1.3 × 10^-5          EXT2
rs1113132   11   44,209,979     C            C             0.237       0.267       1.15 ± 0.27  1.36 ± 0.31  0.19  1.0044  3.3 × 10^-4   8.1 × 10^-4          3.7 × 10^-5   2.9 × 10^-5          EXT2
Significant T2DM associations were confirmed for eight SNPs in five loci. Allele frequencies, odds ratios (with 95% confidence intervals) and PAR were calculated using only the stage 2 data. Allele frequencies in the controls were very close to those reported for the CEU set (European subjects genotyped in the HapMap project). Induced sibling recurrent risk ratios (λs) were estimated using stage 2 genotype counts for the control subjects and assuming a T2DM prevalence of 7% in the French population. hom, homozygous; het, heterozygous; major allele, the allele with the higher frequency in controls; pMAX, P-value of the MAX statistic from the χ2 distribution; pMAX (perm), P-value of the MAX statistic from the permutation-derived empirical distribution (pMAX and pMAX (perm) are adjusted for variance inflation); risk allele, the allele with higher frequency in cases compared with controls.
[Figure 2: regional association plots, –log10[P] by position; panel a, SLC30A8; panel b, IDE, KIF11 and HHEX]
Nature, Vol 445, 22 February 2007
Sladek, 2007
[Figure 1, cropped: stage 1 association plots by chromosome; the y axis shows −log10[pMAX], the P-value obtained by the MAX statistic, for each SNP]
How would you interpret the p-values?
Odds ratios?
Confirmed 8 SNPs with N ~ 1000
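As a starting point for interpreting the odds ratios, an allelic odds ratio can be computed directly from the case and control allele frequencies. The sketch below uses the rs7903146 (TCF7L2) frequencies from Table 1 (0.406 in cases, 0.293 in controls); the function name is hypothetical. Note that the table reports genotype-specific odds ratios (heterozygous and homozygous), so the allelic value is a related but different summary.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# rs7903146 (TCF7L2): allele frequency 0.406 in cases, 0.293 in controls.
or_tcf7l2 = allelic_odds_ratio(0.406, 0.293)  # roughly 1.65

# rs13266634 (SLC30A8): the listed allele is less frequent in cases, so its
# odds ratio falls below 1; the reciprocal gives the risk-allele odds ratio.
or_slc30a8_minor = allelic_odds_ratio(0.254, 0.301)
or_slc30a8_risk = 1.0 / or_slc30a8_minor
```

An odds ratio near 1.65 per allele is large by GWAS standards; most of the other confirmed SNPs sit much closer to 1.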
Scaling up discovery by combining populations:
meta-analyses
data from the WTCCC, DGI and FUSION scans) (ref. 10) (Supplementary
Note). We found strong evidence that the minor G allele of
rs10830963 was associated with increased risk of T2D (odds ratio =
1.09 (1.05–1.12), P = 3.3 × 10^-7; Fig. 2 and Supplementary Table 6
online). The possibility that the fasting glucose association might
Study ID                         OR (95% CI)        Weight (%)
DGI                              1.12 (0.96, 1.30)    4.61
FUSION                           1.20 (1.03, 1.39)    4.89
WTCCC                            1.07 (0.95, 1.20)    8.03
deCODE                           1.14 (1.03, 1.27)    9.58
KORA                             1.00 (0.84, 1.19)    3.53
Rotterdam                        1.17 (1.04, 1.30)    8.75
CCC                              1.07 (0.88, 1.31)    2.69
ADDITION/ELY                     1.16 (1.02, 1.33)    6.04
Norfolk                          1.00 (0.90, 1.10)   10.56
UKT2DGC                          1.03 (0.96, 1.10)   23.18
OxGN/58BC                        0.91 (0.75, 1.10)    2.85
FUSION Stage 2                   1.15 (1.02, 1.30)    7.41
METSIM                           1.16 (1.03, 1.30)    7.90
Overall (I2 = 26.6%, P = 0.176)  1.09 (1.05, 1.12)  100.00
Meta-analysis P value = 3.3 × 10^-7
x axis (odds ratio): 0.722, 1, 1.39
Figure 2 Association of rs10830963 with type 2 diabetes (T2D) in 13 case-control studies.
VOLUME 41 | NUMBER 1 | JANUARY 2009 NATURE GENETICS
Meta-analysis of SNP rs10830963:
Combining findings from multiple cohorts
Prokopenko, 2009
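Combining cohorts for rs10830963 can be illustrated with a fixed-effect, inverse-variance meta-analysis built from the 13 odds ratios and 95% confidence intervals in the forest plot above. This is a sketch under stated assumptions: standard errors are back-calculated from the rounded confidence limits, the function name is hypothetical, and the authors' actual software and model are not given in the excerpt. Under those assumptions the pooled estimate comes out close to the reported 1.09 (1.05, 1.12).

```python
import math

# (odds ratio, lower 95% limit, upper 95% limit), in forest-plot order.
STUDIES = [
    (1.12, 0.96, 1.30), (1.20, 1.03, 1.39), (1.07, 0.95, 1.20),
    (1.14, 1.03, 1.27), (1.00, 0.84, 1.19), (1.17, 1.04, 1.30),
    (1.07, 0.88, 1.31), (1.16, 1.02, 1.33), (1.00, 0.90, 1.10),
    (1.03, 0.96, 1.10), (0.91, 0.75, 1.10), (1.15, 1.02, 1.30),
    (1.16, 1.03, 1.30),
]


def inverse_variance_meta(studies, z=1.96):
    """Fixed-effect pooled odds ratio with a 95% confidence interval."""
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, lower, upper in studies:
        # Standard error of log(OR), recovered from the CI width.
        se = (math.log(upper) - math.log(lower)) / (2.0 * z)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(odds_ratio)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


pooled_or, ci_low, ci_high = inverse_variance_meta(STUDIES)
```

No single study is decisive here (several intervals include 1), but the pooled interval excludes 1, which is the point of combining cohorts.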
By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of
European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls,
we identified 12 new T2D association signals with combined P < 5 × 10^-8. These include a second independent signal at the
KCNQ1 locus; the first report, to our knowledge, of an X-chromosomal association (near DUSP9); and a further instance of
overlap between loci implicated in monogenic and multifactorial forms of diabetes (at HNF1A). The identified loci affect both
beta-cell function and insulin action, and, overall, T2D association signals show evidence of enrichment for genes involved in
cell cycle regulation. We also show that a high proportion of T2D susceptibility loci harbor independent association signals
influencing apparently unrelated complex traits.
Type 2 diabetes (T2D) is characterized by insulin resistance and
deficient beta-cell function (ref. 1). The escalating prevalence of T2D and
the limitations of currently available preventative and therapeutic
options highlight the need for a more complete understanding of
T2D pathogenesis. To date, approximately 25 genome-wide significant
common variant associations with T2D have been described, mostly
through genome-wide association (GWA) analyses (refs 2–13). The identities
of the variants and genes mediating the susceptibility effects at most
of these signals have yet to be established, and the known variants
account for less than 10% of the overall estimated genetic contribution
to T2D predisposition. Although some of the unexplained heritability
will reflect variants poorly captured by existing GWA platforms, we
reasoned that an expanded meta-analysis of existing GWA data would
the inverse-variance method (Online Methods, Fig. 1, Supplementary
Tables 1 and 2 and Supplementary Note). We observed only modest
genomic control inflation (λgc = 1.07), suggesting that the observed
results were not due to population stratification. After removing SNPs
within established T2D loci (Supplementary Table 3), the result-
ing quantile-quantile plot was consistent with a modest excess of
disease associations of relatively small effect (Supplementary Note).
Weak evidence for association at HLA variants strongly associated
with autoimmune forms of diabetes (Supplementary Table 3 and
Supplementary Note) suggested some case admixture involving
subjects with type 1 diabetes or latent autoimmune diabetes of adult-
hood; however, failure to detect T2D associations at other non-HLA
type 1 diabetes susceptibility loci (for example, INS, PTPN22 and
Twelve type 2 diabetes susceptibility loci identified
through large-scale association analysis
Voight, 2010
Meta-analyses for T2D:
N>40K and 90K identifies >30 loci among 2,400,000 SNPs
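The genomic control inflation factor quoted in the excerpt above (λgc = 1.07) is conventionally the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below converts two-sided P-values back to chi-square statistics using only the standard library; the function name and the example P-values are hypothetical, and this is not the authors' code.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def genomic_inflation_factor(p_values):
    """Lambda GC: median observed chi-square (1 df) over the null median."""
    normal = NormalDist()
    # A two-sided P-value p corresponds to z = inv_cdf(p / 2); chi-square = z^2.
    chi2_stats = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Evenly spaced P-values mimic a well-behaved null distribution.
n = 1001
null_p_values = [(i + 0.5) / n for i in range(n)]
lambda_null = genomic_inflation_factor(null_p_values)  # close to 1.0
```

Values near 1 suggest little systematic inflation; values well above 1 point to population stratification, cryptic relatedness, or other bias across the whole set of tests.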
13 autosomal loci exceeded the threshold for genome-wide significance
(P ranging from 2.8 × 10^-8 to 1.4 × 10^-22) with allele-specific odds
(r2 < 0.05), and conditional analyses (see below) establish these SNPs
as independent (Fig. 2 and Supplementary Table 4). Further analysis
[Figure 1 plot: –log10(P) by chromosome (1–22, X); top panel, unconditional analysis; lower panel, conditional analysis]
Legend: locus established previously; locus identified by current study; locus not confirmed by current study; suggestive statistical association (P < 1 × 10^-5); association in identified or established region (P < 1 × 10^-4).
Labeled loci: BCL11A, THADA, NOTCH2, ADAMTS9, IRS1, IGF2BP2, WFS1, ZBED3, CDKAL1, HHEX/IDE, KCNQ1 (2 signals*), TCF7L2, KCNJ11, CENTD2, MTNR1B, HMGA2, ZFAND6, PRC1, FTO, HNF1B, DUSP9, TSPAN8/LGR5, HNF1A, CDC123/CAMK1D, CHCHD9, CDKN2A/2B, SLC30A8, TP53INP1, JAZF1, KLF14, PPAR.
Figure 1 Genome-wide Manhattan plots for the DIAGRAM+ stage 1 meta-analysis. Top panel summarizes the results of the unconditional meta-analysis. Previously established loci are denoted in red and loci identified by the current study are denoted in green. The ten signals in blue are those taken forward but not confirmed in stage 2 analyses. The genes used to name signals have been chosen on the basis of proximity to the index SNP and should not be presumed to indicate causality. The lower panel summarizes the results of equivalent meta-analysis after conditioning on 30 previously established and newly identified autosomal T2D-associated SNPs (denoted by the dotted lines below these loci in the upper panel). Newly discovered conditional signals (outside established loci) are denoted with an orange dot if they show suggestive levels of significance (P < 10^-5), whereas secondary signals close to already confirmed T2D loci are shown in purple (P < 10^-4).
Meta-analyses for T2D:
N>40K and 90K identifies >30 loci among 2,400,000 SNPs
[Figure: regional association plot, SLC30A8 region; y axis, −log10(P-value); secondary y axis, recombination rate (cM/Mb); labeled SNP, rs3802177]
● ●
●
●
●●●
●
●
● ●
●
●●
●
●●
●
●●●●●●●●●●
●●●●●
●●
●●●
●●●
●
●
●●●●
●●●●●●●●●●
●
●
●
●
●●
●●●●●
●●●●●●●●●●
●●●●●
●
●
●
●
●
●
●●●●●●●●
●
●
●
●●●●
●●●●
●●●
● ●
●●
●
●
●●
●
●
●
●●●●● ●●
●
●
●
●
●
●
●
●●●●
●
●●●
●
● ●●
●
●
●●
●
●
●
●● ●
●●
●●● ●
● ●
●
●●●
●●
●
●●
●
●
●
●
●
● ●●
●
●
● ● ●
●
●
●
●●
●
●
●
● ● ●●
●
● ● ●
●
●
●●●●
● ●
●
● ●
●
●
● ●● ● ●● ●
●
●
●
●
●
●●
●
●
●
● ●
●● ●●●●
●●
●
●
●● ●
●
●●
● ●
●
●
●
●
●● ●●
●
●●
●
●
●
●
●●
●
●
●
●●
●
●●
●●
●
●
●
rs3802177 stage 1
● r^2: 0.8 βˆ’ 1.0
● r^2: 0.6 βˆ’ 0.8
● r^2: 0.4 βˆ’ 0.6
● r^2: 0.2 βˆ’ 0.4
● r^2: 0.0 βˆ’ 0.2
● r^2 missing
<βˆ’ TRPS1
<βˆ’ EIF3H
UTP23 βˆ’>
<βˆ’ RAD21
LOC441376 βˆ’>
SLC30A8 βˆ’>
MED30 βˆ’>
<βˆ’ EXT1
<βˆ’ SAMD12
<βˆ’ TNFRSF11
COLEC1
117 118 119 120
Position on chromosome 8 (Mb)
CDKN2A/B Region
[Figure: regional association plot for rs10965250 (stage 1). Left y-axis: -log10(P-value), 0 to 10; right y-axis: recombination rate (cM/Mb), 0 to 100. Points are colored by r^2 with the lead SNP (0.0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0, or r^2 missing). Genes shown: MLLT3, KIAA1797, PTPLAD2, IFNB1, IFNW1, IFNA21, IFNA4, IFNA7, IFNA13, MTAP, CDKN2A, CDKN2B, DMRTA1, ELAVL2. x-axis: Position on chromosome 9 (Mb), 21 to 24.]
CDC123/CAMK1D Region
[Figure: regional association plot for rs12779790 (stage 1). y-axes: -log10(P-value) and recombination rate (cM/Mb). Points are colored by r^2 with the lead SNP.]
HHEX/IDE Region
[Figure: regional association plot for rs5015480 (stage 1). y-axes: -log10(P-value) and recombination rate (cM/Mb). Points are colored by r^2 with the lead SNP.]
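The regional plots color each SNP by its r^2 (linkage disequilibrium) with the lead SNP. As a rough sketch of how such a value could be computed and mapped to the legend bins: the function names and the toy haplotypes below are hypothetical, phased 0/1 haplotype alleles are assumed as input, and treating each bin's lower edge as inclusive is an assumption, since the legend does not say.

```python
from typing import Optional, Sequence


def ld_r2(a: Sequence[int], b: Sequence[int]) -> Optional[float]:
    """r^2 between two biallelic SNPs, given phased 0/1 alleles per haplotype."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    p_a = sum(a) / n                                  # frequency of allele 1 at SNP a
    p_b = sum(b) / n                                  # frequency of allele 1 at SNP b
    p_ab = sum(x * y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None                                   # monomorphic SNP: r^2 undefined
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    return d * d / denom


def legend_bin(r2: Optional[float]) -> str:
    """Map an r^2 value to one of the legend labels used in the plots."""
    if r2 is None:
        return "r^2 missing"
    labels = ["0.0 - 0.2", "0.2 - 0.4", "0.4 - 0.6", "0.6 - 0.8", "0.8 - 1.0"]
    for upper_edge, label in zip([0.2, 0.4, 0.6, 0.8], labels):
        if r2 < upper_edge:
            return "r^2: " + label
    return "r^2: " + labels[-1]


# Toy haplotypes (hypothetical): eight chromosomes, three SNPs.
lead = [1, 1, 0, 0, 1, 1, 0, 0]
tagging = [1, 1, 0, 0, 1, 1, 0, 0]      # identical to the lead SNP
unlinked = [1, 0, 1, 0, 1, 0, 1, 0]     # independent of the lead SNP

for name, snp in [("tagging", tagging), ("unlinked", unlinked)]:
    print(name, legend_bin(ld_r2(lead, snp)))
```

SNPs in the top bin are near-perfect proxies for the lead SNP, which is why a GWAS hit usually marks a region and not a single causal variant.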
Not in a gene... In a gene...
~90% of GWAS hits are non-coding!
Stamatoyannopoulos, Science 2012
Systematic Localization of Common Disease-Associated Variation in Regulatory DNA
Matthew T. Maurano,1* Richard Humbert,1* Eric Rynes,1* Robert E. Thurman,1 Eric Haugen,1 Hao Wang,1 Alex P. Reynolds,1 Richard Sandstrom,1 Hongzhu Qu,1,2 Jennifer Brody,3 Anthony Shafer,1 Fidencio Neri,1 Kristen Lee,1 Tanya Kutyavin,1 Sandra Stehling-Sun,1 Audra K. Johnson,1 Theresa K. Canfield,1 Erika Giste,1 Morgan Diegel,1 Daniel Bates,1 R. Scott Hansen,4 Shane Neph,1 Peter J. Sabo,1 Shelly Heimfeld,5 Antony Raubitschek,6 Steven Ziegler,6 Chris Cotsapas,7,8 Nona Sotoodehnia,3,9 Ian Glass,10 Shamil R. Sunyaev,11 Rajinder Kaul,4 John A. Stamatoyannopoulos1,12†
Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure–related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.
Disease- and trait-associated genetic variants are rapidly being identified with genome-wide association studies (GWAS) and related strategies (1). To date, hundreds of GWAS have been conducted, spanning diverse diseases and quantitative phenotypes (2) (fig. S1A). However, the majority (~93%) of disease- and trait-associated variants emerging from these studies lie within noncoding sequence (fig. S1B), complicating their functional evaluation. Several lines of evidence suggest the involvement of a proportion of such variants in transcriptional regulatory mechanisms, including modulation of promoter and enhancer elements (3–6) and enrichment within expression quantitative trait loci (eQTL) (3, 7, 8).
Human regulatory DNA encompasses a variety of cis-regulatory elements within which the cooperative binding of transcription factors creates focal alterations in chromatin structure. Deoxyribonuclease I (DNase I) hypersensitive sites (DHSs) are sensitive and precise markers of this actuated regulatory DNA, and DNase I mapping has been instrumental in the discovery and census of human cis-regulatory elements (9). We performed DNase I mapping genome-wide (10) in 349 cell and tissue samples, including 85 cell types studied under the ENCODE Project (10) and 264 samples studied under the Roadmap Epigenomics Program (11). These encompass several classes [...] nome. In total, we identified 3,899,693 distinct DHS positions along the genome (collectively spanning 42.2%), each of which was detected in one or more cell or tissue types (median = 5).
Disease- and trait-associated variants are concentrated in regulatory DNA. We examined the distribution of 5654 noncoding genome-wide significant associations [5134 unique single-nucleotide polymorphisms (SNPs); fig. S1 and table S2] for 207 diseases and 447 quantitative traits (2) with the deep genome-scale maps of regulatory DNA marked by DHSs. This revealed a collective 40% enrichment of GWAS SNPs in DHSs (fig. S1C, $P < 10^{-55}$, binomial, compared to the distribution of HapMap SNPs). Fully 76.6% of all noncoding GWAS SNPs either lie within a DHS (57.1%, 2931 SNPs) or are in complete linkage disequilibrium (LD) with SNPs in a nearby DHS (19.5%, 999 SNPs) (Fig. 1A) (12). To confirm this enrichment, we sampled variants from the 1000 Genomes Project (13) with the same genomic feature localization (intronic versus intergenic), distance from the nearest transcriptional start site, and allele frequency in individuals of European ancestry. We confirmed significant enrichment both for SNPs within DHSs ($P < 10^{-59}$, simulation) and also including variants in complete LD ($r^2 = 1$) with SNPs in DHSs ($P < 10^{-37}$, simulation) (fig. S2).
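The binomial comparison quoted above can be sketched as a one-sided test. This is only an illustration and does not reproduce the paper's analysis: the counts (2931 of 5134 SNPs in DHSs) come from the excerpt, but the background fraction is an assumption back-calculated from the reported 40% enrichment, and the helper names are hypothetical. The tail probability is accumulated in log space because it is far too small for ordinary floating-point sums.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Natural log of the Binomial(n, p) probability mass at k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def log10_binom_upper_tail(k: int, n: int, p: float) -> float:
    """log10 of P(X >= k) for X ~ Binomial(n, p), via log-sum-exp."""
    logs = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    peak = max(logs)
    total = peak + math.log(sum(math.exp(v - peak) for v in logs))
    return total / math.log(10)


n_snps = 5134        # unique noncoding GWAS SNPs (from the excerpt)
n_in_dhs = 2931      # of those, SNPs lying within a DHS (from the excerpt)

# ASSUMPTION: background DHS fraction implied by "40% enrichment";
# the excerpt does not state the HapMap background rate directly.
background = 0.571 / 1.40

fold = (n_in_dhs / n_snps) / background
log10_p = log10_binom_upper_tail(n_in_dhs, n_snps, background)
print(f"fold enrichment: {fold:.2f}, log10(P): {log10_p:.1f}")
```

Under this assumed background the tail probability is far below the reported bound, so the sketch agrees with the direction of the published result, not its exact value.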
In total, 47.5% of GWAS SNPs fall within gene bodies (fig. S1B); however, only 10.9% of intronic GWAS SNPs within DHSs are in strong LD ($r^2 \geq 0.8$) with a coding SNP, indicating that the vast majority of noncoding genic variants are not simply tagging coding sequence. Analogously, only 16.3% of GWAS variants within coding sequences are in strong LD with variants in DHSs. SNPs on widely used genotyping arrays (e.g., Affymetrix) were modestly enriched within DHSs (fig. S2), possibly due to selection of SNPs with robust experimental performance in genotyping assays. However, we found no evidence for sequence composition bias (table S3).
To further examine the enrichment of GWAS SNPs in regulatory DNA, we systematically classified all noncoding GWAS SNPs by the quality [...]
1 Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA. 2 Laboratory of Disease Genomics [...]
RESEARCH ARTICLE
There have been few, if any, similar bursts of discovery in the
history of medical research.
David Hunter and Peter Kraft, NEJM, 2007
Common claims discussed with regard to GWAS:
Despite issues, GWAS yielded many discoveries relative to cost
... to a doubling of the number of associated variants discovered. The proportion of genetic variation explained by significantly associated SNPs is usually low (typically less than 10%) for many complex traits, but for diseases such as CD and multiple sclerosis (MS [MIM 126200]), and for quantitative traits such as height and lipid traits, between ...
Figure 1. GWAS Discoveries over Time
Data obtained from the Published GWAS Catalog (see Web Resources). Only the top SNPs representing loci with association p values $< 5 \times 10^{-8}$ are included, and so that multiple counting is avoided, SNPs identified for the same traits with LD $r^2 > 0.8$ estimated from the entire HapMap samples are excluded.
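The counting rule in the caption (keep genome-wide significant hits, then drop SNPs for the same trait that are in high LD with an already counted SNP) could look roughly like the following. This is a sketch under assumptions: the record layout, the SNP names, the r^2 lookup table, and the greedy most-significant-first order are all hypothetical, not the procedure used for the figure.

```python
GENOME_WIDE_P = 5e-8   # significance threshold quoted in the caption
LD_R2_MAX = 0.8        # SNPs above this r^2 with a kept SNP are treated as one locus


def count_independent_hits(hits, r2_lookup):
    """Filter (trait, snp, p_value) records down to independent significant hits.

    r2_lookup maps frozenset({snp1, snp2}) to r^2; pairs that are absent are
    treated as unlinked (r^2 = 0), which is an assumption of this sketch.
    """
    kept = []
    for trait, snp, p in sorted(hits, key=lambda h: h[2]):   # most significant first
        if p >= GENOME_WIDE_P:
            continue
        redundant = any(
            kept_trait == trait
            and r2_lookup.get(frozenset((snp, kept_snp)), 0.0) > LD_R2_MAX
            for kept_trait, kept_snp, _ in kept
        )
        if not redundant:
            kept.append((trait, snp, p))
    return kept


# Hypothetical records: snpB tags snpA, snpD misses the threshold.
example_hits = [
    ("T2D", "snpA", 1e-12),
    ("T2D", "snpB", 3e-9),
    ("T2D", "snpC", 2e-8),
    ("T2D", "snpD", 1e-6),
    ("height", "snpB", 1e-10),
]
example_r2 = {
    frozenset(("snpA", "snpB")): 0.95,
    frozenset(("snpA", "snpC")): 0.10,
}

independent = count_independent_hits(example_hits, example_r2)
print(len(independent), [(t, s) for t, s, _ in independent])
```

Pruning is per trait, so the same SNP can still count once for each distinct trait it is associated with.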
~500,000 SNP chips x ~$500/chip = $250M
Five years of GWAS Discovery (Visscher, 2012)
$250M / ~2000 loci = $125K/locus
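The back-of-the-envelope arithmetic above, written out. The inputs are the slide's rounded estimates; the variable names are illustrative.

```python
# Rounded, order-of-magnitude figures taken from the slide.
SNP_CHIPS = 500_000
COST_PER_CHIP_USD = 500
LOCI_DISCOVERED = 2_000

total_cost_usd = SNP_CHIPS * COST_PER_CHIP_USD          # total genotyping spend
cost_per_locus_usd = total_cost_usd / LOCI_DISCOVERED   # average cost of one discovered locus

print(f"Total: ${total_cost_usd / 1e6:.0f}M")           # Total: $250M
print(f"Per locus: ${cost_per_locus_usd / 1e3:.0f}K")   # Per locus: $125K
```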
Candidate genes: >$250M!
100 NIH R01s
Fighter jet
Hadron Collider: $9B
P = G + E
Phenotype (P): Type 2 Diabetes, Cancer, Alzheimer's, Gene expression
Genome (G): Variants
Environment (E): Infectious agents, Nutrients, Pollutants, Drugs
Complex traits are a function of genes and environment...
Nothing comparable to elucidate E influence!
We lack high-throughput methods and data to discover new E in P...
E: ???
A similar paradigm for discovery should exist for E!
Why?
$\sigma^2_P = \sigma^2_G + \sigma^2_E$
$H^2 = \sigma^2_G / \sigma^2_P$
Heritability ($H^2$) is the proportion of phenotypic variability attributed to genetic variability in a population
Indicator of the proportion of phenotypic differences attributed to G.
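A small simulation can make the variance decomposition concrete. This is a sketch under strong assumptions: G and E are drawn as independent normal variables with hypothetical variances (0.6 and 0.4), so there is no gene-environment correlation or interaction, and the function name is illustrative.

```python
import random
import statistics


def simulate_h2(var_g: float, var_e: float, n: int = 20_000, seed: int = 0) -> float:
    """Simulate P = G + E for n individuals and return var(G) / var(P)."""
    rng = random.Random(seed)
    g = [rng.gauss(0.0, var_g ** 0.5) for _ in range(n)]   # genetic component
    e = [rng.gauss(0.0, var_e ** 0.5) for _ in range(n)]   # environmental component
    p = [gi + ei for gi, ei in zip(g, e)]                  # phenotype
    return statistics.pvariance(g) / statistics.pvariance(p)


# Hypothetical variance components chosen so the true H^2 is 0.6.
h2_estimate = simulate_h2(var_g=0.6, var_e=0.4)
print(f"estimated H^2: {h2_estimate:.2f}")
```

In real data G is not observed directly, which is why heritability is usually inferred from resemblance between relatives.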
Height is an example of a heritable trait:
Francis Galton shows how it's done (1887)
"mid-height of 205 parents described 60% of variability of 928 offspring"
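One common way to read the Galton example is as a regression of offspring height on mid-parent height, where the slope is a classical estimate of heritability. The sketch below uses simulated data only, not Galton's records: the mean height, spreads, and the true slope of 0.6 are assumptions chosen to echo the slide, and heights are taken to be in inches.

```python
import random


def regression_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_families(n_offspring=928, true_slope=0.6, seed=1):
    """Simulated mid-parent and offspring heights (all parameters hypothetical)."""
    rng = random.Random(seed)
    midparent = [rng.gauss(68.0, 1.8) for _ in range(n_offspring)]
    offspring = [68.0 + true_slope * (m - 68.0) + rng.gauss(0.0, 1.5)
                 for m in midparent]
    return midparent, offspring


midparent, offspring = simulate_families()
slope = regression_slope(midparent, offspring)
print(f"offspring-on-midparent slope: {slope:.2f}")
```

A slope below 1 is Galton's "regression toward the mean": children of unusually tall or short parents tend to be closer to the population average than their parents.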
Intro to Biomedical Informatics 701

Mais conteΓΊdo relacionado

Mais procurados

Back to Basics: Using GWAS to Drive Discovery for Complex Diseases
Back to Basics: Using GWAS to Drive Discovery for Complex DiseasesBack to Basics: Using GWAS to Drive Discovery for Complex Diseases
Back to Basics: Using GWAS to Drive Discovery for Complex DiseasesGolden Helix Inc
Β 
Development of polygenic risk scores for ambulatory care sensitive hospitalis...
Development of polygenic risk scores for ambulatory care sensitive hospitalis...Development of polygenic risk scores for ambulatory care sensitive hospitalis...
Development of polygenic risk scores for ambulatory care sensitive hospitalis...Dr Arindam Basu
Β 
Genomic Selection in Plants
Genomic Selection in PlantsGenomic Selection in Plants
Genomic Selection in PlantsPrakash Narayan
Β 
Genomic selection, prediction models, GEBV values, genomic selection in plant...
Genomic selection, prediction models, GEBV values, genomic selection in plant...Genomic selection, prediction models, GEBV values, genomic selection in plant...
Genomic selection, prediction models, GEBV values, genomic selection in plant...Mahesh Biradar
Β 
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...Torsten Seemann
Β 
Genome wide association mapping
Genome wide association mappingGenome wide association mapping
Genome wide association mappingAvjinder (Avi) Kaler
Β 
Association mapping
Association mappingAssociation mapping
Association mappingSenthil Natesan
Β 
SNPs analysis methods
SNPs analysis methodsSNPs analysis methods
SNPs analysis methodshad89
Β 
Genomic selection with weighted GBLUP and APY single step
Genomic selection with weighted GBLUP and APY single stepGenomic selection with weighted GBLUP and APY single step
Genomic selection with weighted GBLUP and APY single stepILRI
Β 
Measures of Linkage Disequilibrium
Measures of Linkage DisequilibriumMeasures of Linkage Disequilibrium
Measures of Linkage DisequilibriumAwais Khan
Β 
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...FAO
Β 
Genomic selection
Genomic  selectionGenomic  selection
Genomic selectionpandadebadatta
Β 
Advanced biometrical and quantitative genetics akshay
Advanced biometrical and quantitative genetics akshayAdvanced biometrical and quantitative genetics akshay
Advanced biometrical and quantitative genetics akshayAkshay Deshmukh
Β 
Association mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maizeAssociation mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maizeSenthil Natesan
Β 
Qtl mapping sachin pbt
Qtl mapping sachin pbtQtl mapping sachin pbt
Qtl mapping sachin pbtSachin Ekatpure
Β 

Mais procurados (20)

Back to Basics: Using GWAS to Drive Discovery for Complex Diseases
Back to Basics: Using GWAS to Drive Discovery for Complex DiseasesBack to Basics: Using GWAS to Drive Discovery for Complex Diseases
Back to Basics: Using GWAS to Drive Discovery for Complex Diseases
Β 
Development of polygenic risk scores for ambulatory care sensitive hospitalis...
Development of polygenic risk scores for ambulatory care sensitive hospitalis...Development of polygenic risk scores for ambulatory care sensitive hospitalis...
Development of polygenic risk scores for ambulatory care sensitive hospitalis...
Β 
Genomic Selection in Plants
Genomic Selection in PlantsGenomic Selection in Plants
Genomic Selection in Plants
Β 
GWAS
GWASGWAS
GWAS
Β 
Genomic selection, prediction models, GEBV values, genomic selection in plant...
Genomic selection, prediction models, GEBV values, genomic selection in plant...Genomic selection, prediction models, GEBV values, genomic selection in plant...
Genomic selection, prediction models, GEBV values, genomic selection in plant...
Β 
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
Β 
Genome wide association mapping
Genome wide association mappingGenome wide association mapping
Genome wide association mapping
Β 
Lecture 7 gwas full
Lecture 7 gwas fullLecture 7 gwas full
Lecture 7 gwas full
Β 
Basics of association_mapping
Basics of association_mappingBasics of association_mapping
Basics of association_mapping
Β 
Molecular markers
Molecular markersMolecular markers
Molecular markers
Β 
Association mapping
Association mappingAssociation mapping
Association mapping
Β 
SNPs analysis methods
SNPs analysis methodsSNPs analysis methods
SNPs analysis methods
Β 
Genomic selection with weighted GBLUP and APY single step
Genomic selection with weighted GBLUP and APY single stepGenomic selection with weighted GBLUP and APY single step
Genomic selection with weighted GBLUP and APY single step
Β 
Measures of Linkage Disequilibrium
Measures of Linkage DisequilibriumMeasures of Linkage Disequilibrium
Measures of Linkage Disequilibrium
Β 
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Β 
Genomic selection
Genomic  selectionGenomic  selection
Genomic selection
Β 
Advanced biometrical and quantitative genetics akshay
Advanced biometrical and quantitative genetics akshayAdvanced biometrical and quantitative genetics akshay
Advanced biometrical and quantitative genetics akshay
Β 
QTL mapping for crop improvement
QTL mapping for crop improvementQTL mapping for crop improvement
QTL mapping for crop improvement
Β 
Association mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maizeAssociation mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maize
Β 
Qtl mapping sachin pbt
Qtl mapping sachin pbtQtl mapping sachin pbt
Qtl mapping sachin pbt
Β 

Destaque

Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Chirag Patel
Β 
Repurposing large datasets for exposomic discovery in disease
Repurposing large datasets for exposomic discovery in diseaseRepurposing large datasets for exposomic discovery in disease
Repurposing large datasets for exposomic discovery in diseaseChirag Patel
Β 
The path to implementation of Whole Genome Sequencing (WGS) in PulseNet
The path to implementation of Whole Genome Sequencing (WGS) in PulseNetThe path to implementation of Whole Genome Sequencing (WGS) in PulseNet
The path to implementation of Whole Genome Sequencing (WGS) in PulseNetExternalEvents
Β 
Data analytics to support exposome research course slides
Data analytics to support exposome research course slidesData analytics to support exposome research course slides
Data analytics to support exposome research course slidesChirag Patel
Β 
Informatics and data analytics to support for exposome-based discovery
Informatics and data analytics to support for exposome-based discoveryInformatics and data analytics to support for exposome-based discovery
Informatics and data analytics to support for exposome-based discoveryChirag Patel
Β 
Biomedical Informatics 706: Precision Medicine with exposures
Biomedical Informatics 706: Precision Medicine with exposuresBiomedical Informatics 706: Precision Medicine with exposures
Biomedical Informatics 706: Precision Medicine with exposuresChirag Patel
Β 
NSF Northeast Hub Big Data Workshop
NSF Northeast Hub Big Data WorkshopNSF Northeast Hub Big Data Workshop
NSF Northeast Hub Big Data WorkshopChirag Patel
Β 
Studying the elusive in larger scale
Studying the elusive in larger scaleStudying the elusive in larger scale
Studying the elusive in larger scaleChirag Patel
Β 
Big data exposome and pediatric outcomes
Big data exposome and pediatric outcomesBig data exposome and pediatric outcomes
Big data exposome and pediatric outcomesChirag Patel
Β 
a brief introduction to epistasis detection
a brief introduction to epistasis detectiona brief introduction to epistasis detection
Intro to Biomedical Informatics 701

  • 1. Bioinformatics for discovery: Introduction to GWAS and EWAS. BMI 701: Introduction to Biomedical Informatics. 12/1/2015. Chirag J Patel, chirag@hms.harvard.edu, @chiragjp, www.chiragjpgroup.org
  • 2. P = G + E. Phenotype (P): Type 2 Diabetes, Cancer, Alzheimer’s, Gene expression. Genome (G): Variants. Environment (E): Infectious agents, Nutrients, Pollutants, Drugs. Complex traits are a function of genes and environment...
  • 3. We are great at G investigation! Over 2,000 Genome-wide Association Studies (GWAS): https://www.ebi.ac.uk/gwas/
  • 4. >2,000 traits/diseases; >15,000 SNPs; >16,000 SNP-trait associations. https://www.ebi.ac.uk/gwas/
  • 5. Dissecting G in P: What is a Genome-wide Association Study? Hypothesis-free “search engine” for genetic variants associated with a complex trait or disease in unrelated populations. [Figure: a 2×2 table of allele (SNP(A) vs. SNP(a)) by status (diseased vs. non-diseased), repeated for every SNP through SNP(Z) vs. SNP(z), genome-wide]
  • 6. The road to GWAS...
  • 7. A new paradigm of GWAS for discovery of G in P: Human Genome Project to GWAS. Sequencing of the genome: 2001. Characterize common variation: HapMap project (http://hapmap.ncbi.nlm.nih.gov/), 2001-current day. Measurement tools: high-throughput variant assay, < $99 for ~1M variants, ~2003 (ongoing). Comprehensive, high-throughput analyses: GWAS, 2008. Article shown: The Wellcome Trust Case Control Consortium, “Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls”, Nature, Vol 447, 7 June 2007, doi:10.1038/nature05911. Abstract excerpt: “There is increasing evidence that genome-wide association (GWA) studies represent a powerful approach to the identification of genes involved in common human diseases. We describe a joint GWA study (using the Affymetrix GeneChip 500K Mapping Array Set) undertaken in the British population, which has examined ~2,000 individuals for each of 7 major diseases and a shared set of ~3,000 controls. Case-control comparisons identified 24 independent association signals at P < 5 × 10^-7: 1 in bipolar disorder, 1 in coronary artery disease, 9 in Crohn’s disease, 3 in rheumatoid arthritis, 7 in type 1 diabetes and 3 in type 2 diabetes. On the basis of prior findings and replication studies thus-far completed, almost all of these signals reflect genuine susceptibility effects. We observed association at many previously identified loci, and found compelling evidence that some loci confer risk for more than one of the diseases studied. Across all diseases, we identified a [...]”
  • 8. Number of raw publications with subject of “GWAS”. [Figure: number of publications per year, 1994-2013; y-axis 0 to 3000; PubMed MeSH terms: human + GWAS]
  • 9. Number of raw publications with subject of “GWAS”. [Figure: number of publications per year, 1994-2013; y-axis 0 to 3000; PubMed MeSH terms: human + GWAS. Annotated milestones: Risch + Merikangas (linkage vs. association); human genome sequenced; GWAS of age-related macular degeneration; WTCCC; mega-meta-GWAS] GWAS is relevant today (even with NGS around the corner)
  • 11. Article excerpt: “Geneticists have made substantial progress in identifying the genetic basis of many human diseases, at least those with conspicuous determinants. These successes include Huntington's disease, Alzheimer's disease, and some forms of breast cancer. However, the detection of genetic factors for complex diseases, such as schizophrenia, bipolar disorder, and diabetes, has been far more complicated. There have been numerous reports of genes or loci that might underlie these disorders, but few of these findings have been replicated. The modest nature of the gene effects for these disorders likely explains the contradictory and inconclusive claims about their identification. Despite the small effects of such genes, the magnitude of their attributable risk (the proportion of people affected due to them) may be large because they are quite frequent in the population, making them of public health significance. Has the genetic study of complex disorders reached its limits? The persistent lack of replicability of these reports of linkage between various loci and complex diseases might imply that it has. We argue below that [...] [link]age analysis we have chosen for this argument is a popular current paradigm in which pairs of siblings, both with the disease, are examined for sharing of alleles at multiple sites in the genome defined by genetic markers. The more often the affected siblings share the same allele at a particular site, the more likely the site is close to the disease gene. Using the formulas in (1), we calculate the expected proportion Y of alleles shared by a pair of affected siblings for the best possible case, that is, a closely linked marker locus (recombination fraction θ = 0) that is fully informative (heterozygosity = 1) (2), as Y = (1 + w)/(2 + w), where w = pq(γ − 1)^2 / (pγ + q)^2. If there is no linkage of a marker at a particular site to the disease, the siblings would be expected to share alleles 50% of the time; that is, Y would equal 0.5. Values of Y for various values of p and γ are given in the third column of the table. For an allele of moderate frequency (p is 0.1 to 0.5) that con[...]” [The rest of the page is cropped in the slide image.] Neil Risch and Kathleen Merikangas, “The Future of Genetic Studies of Complex Human Diseases”, Science, 1996. A new paradigm is needed for discovery!
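The sharing formula quoted in the excerpt, Y = (1 + w)/(2 + w) with w = pq(γ − 1)^2 / (pγ + q)^2, can be evaluated directly. The sketch below is only an illustration: it assumes p is the frequency of the risk allele, q = 1 − p, and γ is the relative risk conferred by that allele (the excerpt uses these symbols without defining them on the slide), and the function name is hypothetical.

```python
def expected_allele_sharing(p: float, gamma: float) -> float:
    """Expected proportion Y of alleles shared by an affected sibling pair.

    Assumes (not stated on the slide) that p is the risk-allele frequency,
    q = 1 - p, and gamma is the relative risk conferred by the allele.
    """
    q = 1.0 - p
    w = p * q * (gamma - 1.0) ** 2 / (p * gamma + q) ** 2
    return (1.0 + w) / (2.0 + w)


if __name__ == "__main__":
    # With no effect (gamma = 1), w = 0 and sharing is the null value 0.5.
    print(expected_allele_sharing(0.3, 1.0))
    # Even a fourfold risk at p = 0.5 moves Y only to about 0.576.
    print(round(expected_allele_sharing(0.5, 4.0), 3))
```

Y stays close to 0.5 unless γ is large, which is consistent with the excerpt's point that linkage analysis struggles with alleles of modest effect.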
  • 12. How does a GWAS work?
  • 13. Single nucleotide polymorphisms (SNPs): How many SNPs are in the human genome? >3,000,000,000 bases in the human genome; SNPs appear about every ~1,000 bases; hence ~3,000,000 SNPs. 40-60% have minor allele frequency <5%; GWAS focus on frequency >5%. HapMap Consortium, 2010
  • 14. Can’t measure everything: Tag SNPs and Linkage Disequilibrium (LD). LD = co-occurrence of SNPs in a contiguous region. Bush and Moore, 2012
  • 15. The phenomenon of LD makes GWAS possible: How and why? Indirect association. [Figure from Bush and Moore, 2012 (doi:10.1371/journal.pcbi.1002822.g003), showing LD blocks. Caption, partly cropped: “Figure 3. Indirect Association. Genotyped SNPs often lie in a region of high linkage [...] will be statistically associated with disease as a surrogate for the disease SNP [...]”]
  • 16. Can’t measure everything: Tag SNPs and Linkage Disequilibrium Tag SNPs are common proxies for other SNPs 500K - 1M per chip tified significant associations for seven SNPs representing four new T2DM loci (Table 1). In all cases, the strongest association for the MAX statistic (see Methods) was obtained with the additive model. of this gene (Fig. 2a) solely in the secretory final stages of insulin * * * 0 2 4 –log10[P] –log10[P] * 4954642sr 2373971sr 3373971sr 445409sr 8012261sr 3349941sr 883429sr 2019462sr 0349941sr 90350501sr 036169sr 0415007sr 2225991sr 6136642sr 8136642sr 1869646sr 8798751sr 04928201sr 3926642sr 5926642sr 43666231sr 9926642sr 2954642sr 01350501sr 5769646sr 4577187sr 4769646sr 41350501sr 5784931sr 2173387sr 39250501sr 5050007sr 7492602sr 1255051sr 156868sr 4373387sr 4784931sr 7501107sr 2697402sr 91518711sr 6461001sr 29250501sr 5889103sr 8669646sr 0889103sr 4688392sr SLC30A8 IDE 0 2 4 7912381sr 3148707sr 0283856sr 52078111sr 5227373sr 0491242sr 2369412sr 2297881sr 662155sr 7790197sr 44068701sr 35075221sr 5826807sr 7851092sr 9409522sr –log10[P] –log10[P] EXT2 ALX4 0 2 4 *** * 0 2 4 a b c d LD block 2 alleles are correlated because they are inherited together Sladek et al, 2007
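One way to make "2 alleles are correlated because they are inherited together" concrete is the squared correlation r² between two SNPs, computed from haplotype counts. The slides do not name a specific LD statistic, so r² is an assumption here, and the counts in the usage lines are made up for illustration.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Squared correlation (r^2) between two biallelic SNPs.

    Inputs are counts of the four haplotypes formed by alleles A/a at the
    first SNP and B/b at the second SNP.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n      # frequency of allele A
    p_B = (n_AB + n_aB) / n      # frequency of allele B
    d = p_AB - p_A * p_B         # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


if __name__ == "__main__":
    # A always travels with B: a tag SNP captures the other SNP perfectly.
    print(ld_r_squared(50, 0, 0, 50))    # 1.0
    # Alleles combine at random: the tag SNP carries no information.
    print(ld_r_squared(25, 25, 25, 25))  # 0.0
```

When r² is near 1, genotyping one SNP on the chip stands in for the other, which is the idea behind tag SNPs.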
  • 17. Digitizing SNPs: e.g., Illumina Infinium Array (images: www.lifa-core.de/, illumina.com)
  • 18. Assessing Thousands of Factors Simultaneously: Data-driven search for differences in SNP frequencies between disease cases and healthy controls; ~100,000 - ~1,000,000 association tests. [Figure: aligned sequences from cases and controls, GCAGGTACATG...GGTA... vs. GCAGGTACACG...GGTA..., differing at a single base.]
  • 19. Associating One SNP with Disease: Case-Control Study Design. SNP (A/a) ? Disease. 2x2 table: columns A, a; rows diseased (cases), non-diseased (controls).
  • 20. Associating One SNP with Disease: What is an “Odds Ratio”? SNP (A/a) ? Disease. 2x2 table of allele counts: diseased (cases): A = c, a = d; non-diseased (controls): A = x, a = y. Tests: chi-squared test, odds ratio. Odds ratio a vs. A: odds of disease with allele a vs. odds of disease with allele A. 1: equal odds (no difference); >1: increased odds (increased risk); <1: decreased odds (decreased risk)
  • 21. Associating One SNP with Disease: Calculating the Odds Ratio. SNP (A/a) ? Disease. 2x2 table of allele counts: diseased (cases): A = c, a = d; non-diseased (controls): A = x, a = y. Tests: chi-squared test, odds ratio. Odds with allele a = [d/(d+y)]/[y/(d+y)] = d/y. Odds with allele A = [c/(c+x)]/[x/(c+x)] = c/x. Odds Ratio a vs. A = (d/y)/(c/x) = (d/c)/(y/x) = dx/(cy). How would you interpret an OR of 2?
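The odds ratio and chi-squared test for the 2x2 allele table can be sketched as follows. This is a minimal pure-Python illustration using the slide’s cell names (c, d, x, y); the counts are hypothetical and the helper names are not from the slides. The p-value uses the fact that, for 1 degree of freedom, the chi-squared tail probability equals erfc(sqrt(stat/2)).

```python
import math


def odds_ratio_a_vs_A(c, d, x, y):
    """OR of allele a vs. A: (d/y) / (c/x) = d*x / (c*y)."""
    return (d * x) / (c * y)


def chi_squared_1df(c, d, x, y):
    """Pearson chi-squared statistic (no continuity correction) and p-value, 1 df."""
    n = c + d + x + y
    stat = n * (c * y - d * x) ** 2 / ((c + d) * (x + y) * (c + x) * (d + y))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical allele counts: diseased A=100, a=200; non-diseased A=200, a=200.
c, d, x, y = 100, 200, 200, 200
or_a = odds_ratio_a_vs_A(c, d, x, y)
stat, p_value = chi_squared_1df(c, d, x, y)
print(or_a, round(stat, 3), p_value)
```

With these counts the odds ratio is 2: the odds of disease among chromosomes carrying allele a are twice the odds among chromosomes carrying allele A.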
  • 22. Associating One SNP with Disease: Cohort Study Design. SNP (A/a) ? Disease vs. SNP (A/a) ? Non-diseased. • Direct measure of risk vs. odds ratio • Need to wait! • If incidence is low, N needs to be large! Methods: Cox survival regression; Relative Risk
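To make the “risk vs. odds” distinction concrete, the sketch below computes both measures from the same cohort. The cohort sizes and case counts are hypothetical, and the function names are illustrative only.

```python
def relative_risk(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in carriers divided by risk in non-carriers."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def odds_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Odds of disease in carriers divided by odds in non-carriers."""
    exp_odds = exp_cases / (exp_total - exp_cases)
    unexp_odds = unexp_cases / (unexp_total - unexp_cases)
    return exp_odds / unexp_odds


# Hypothetical cohort: 1000 carriers (30 develop disease), 1000 non-carriers (10 do).
rr = relative_risk(30, 1000, 10, 1000)
or_ = odds_ratio(30, 1000, 10, 1000)
print(round(rr, 3), round(or_, 3))
```

When the disease is rare, as here, the odds ratio is close to the relative risk; the two diverge as incidence rises.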
  • 23. Models to associate genotypes with disease: Examples for a case-control study. Genotypes: Aa, AA, AA, aa, Aa, Aa, aa, Aa. Diseased: ND = 4; non-diseased: NC = 4.
  • 24. Models to associate genotypes with disease: Examples for a case-control study. Genotypes: Aa, AA, AA, aa, Aa, Aa, aa, Aa. Diseased: ND = 4; non-diseased: NC = 4. Allele counts: diseased: A = 6, a = 2; non-diseased: A = 2, a = 6. OR A (vs. a)? OR a (vs. A)?
  • 25. Models to associate genotypes with disease: Genotypic Test (“2 or 1 df test”). Genotypes: Aa, AA, AA, aa, Aa, Aa, aa, Aa. Diseased: ND = 4; non-diseased: NC = 4. Genotype counts: diseased: AA = 2, Aa = 2, aa = 0; non-diseased: AA = 0, Aa = 2, aa = 2. OR AA (vs. Aa)? OR aa (vs. Aa)?
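The allele and genotype tables in this example can be reproduced by counting. In the sketch below, the assignment of individual genotypes to the diseased and non-diseased groups is an assumption, chosen so that the counts match the tables on the slides.

```python
from collections import Counter

# Assumed grouping, consistent with the slide tables (4 diseased, 4 non-diseased).
diseased = ["AA", "AA", "Aa", "Aa"]
non_diseased = ["Aa", "Aa", "aa", "aa"]


def allele_counts(genotypes):
    """Count A and a alleles across a list of genotype strings."""
    counts = Counter("".join(genotypes))
    return counts["A"], counts["a"]


def genotype_counts(genotypes):
    """Count AA, Aa and aa genotypes."""
    counts = Counter(genotypes)
    return counts["AA"], counts["Aa"], counts["aa"]


A_d, a_d = allele_counts(diseased)      # alleles among diseased
A_n, a_n = allele_counts(non_diseased)  # alleles among non-diseased

# Allelic odds ratios as cross-products of the 2x2 allele table.
or_A_vs_a = (A_d * a_n) / (a_d * A_n)
or_a_vs_A = (a_d * A_n) / (A_d * a_n)

print((A_d, a_d), (A_n, a_n), or_A_vs_a, round(or_a_vs_A, 3))
print(genotype_counts(diseased), genotype_counts(non_diseased))
```

Note that the genotype table contains zero cells (no diseased aa, no non-diseased AA), so the genotypic odds ratios against Aa cannot be estimated from a sample this small; the allelic test pools the information and avoids the problem here.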
  • 26. Associating One SNP with Quantitative Trait (e.g., height, weight, cholesterol). [Figure: box plots of trait (height) by genotype for SNP rs1234 (GG, GC, CC) and SNP rs123456 (CC, CT, TT).]
  • 27. Associating One SNP with Quantitative Trait: Linear Regression and Additive Risk Model. y = α + βx + ε. For SNP rs123456: height = α + βx, where x = 0 if the individual is CC, x = 1 if CT, and x = 2 if TT (T = risk allele). α: intercept (expected height for CC); β: change in height for 1 risk allele. [Figure: box plot of height by genotype, CC (0), CT (1), TT (2), with the fitted line.]
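The additive model can be fit by ordinary least squares. The sketch below is a minimal pure-Python version; the genotypes and heights are made-up values for illustration, and `fit_additive` is a hypothetical helper name.

```python
def fit_additive(x, y):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Additive coding: number of copies of the risk allele T.
coding = {"CC": 0, "CT": 1, "TT": 2}

# Made-up data for illustration only.
genotypes = ["CC", "CC", "CT", "CT", "TT", "TT"]
height = [50.0, 54.0, 60.0, 64.0, 70.0, 74.0]

x = [coding[g] for g in genotypes]
alpha, beta = fit_additive(x, height)
print(alpha, beta)
```

Here β is the estimated change in the trait per additional copy of T, and α is the fitted value for CC individuals.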
  • 28. Prototypical “Manhattan plot” to visualize associations: ~100,000 - ~1,000,000 association tests. [Figure: −log10(P) plotted by chromosome (1-22, X); sources: Science, 2007 and Nature, Vol 447, 7 June 2007. Inset: genotype table, AA, Aa, aa by diseased, non-diseased.]
  • 29. [Figure: WTCCC, Nature 2007, Figure 4, “Genome-wide scan for seven diseases”: for each of seven diseases (bipolar disorder, coronary artery disease, Crohn’s disease, hypertension, rheumatoid arthritis, type 1 diabetes, type 2 diabetes), −log10 of the trend test P value for quality-control-positive SNPs, plotted by chromosome (1-22, X). Chromosomes are shown in alternating colours for clarity, with P values < 1 × 10^-5 highlighted in green. All panels are truncated at ...]
  • 30. Type I Error: False Positives! What is a p-value? The chance of attaining the observed result (or one more extreme) if there is no difference (H0). Many tests: some can be significant (low p-value) by chance! 100 tests at a p-value threshold of 0.05... how many would be significant by chance? Bonferroni “correction”: divide the 0.05 significance level by the number of tests, e.g., 1M SNPs: 0.05/(1×10^6) = 5×10^-8
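The arithmetic behind this slide can be checked in a few lines. This is a small sketch; the function name and the example p-values are hypothetical.

```python
ALPHA = 0.05

# Expected number of false positives among 100 independent null tests at 0.05.
expected_false_positives = 100 * ALPHA

# Bonferroni-corrected threshold for 1,000,000 SNP tests.
genome_wide_threshold = ALPHA / 1_000_000


def bonferroni_significant(p_values, alpha=ALPHA):
    """Flag each p-value that passes alpha divided by the number of tests."""
    cutoff = alpha / len(p_values)
    return [p <= cutoff for p in p_values]


# Made-up p-values: with 4 tests the cutoff is 0.05 / 4 = 0.0125.
flags = bonferroni_significant([0.03, 0.0004, 0.2, 0.012])
print(expected_false_positives, genome_wide_threshold, flags)
```

About 5 of 100 null tests are expected to pass 0.05 by chance alone, which is why a genome-wide scan uses a threshold near 5×10^-8 instead.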
  • 31. QQplot: Distribution of observed p-values vs. H0 p-values. [Figure: histogram of runif(10000), a random uniform distribution: p-values under H0; histogram of gwas$P.value: p-values of GWAS in Total Cholesterol (Global Lipids Consortium, 2012).]
  • 32. QQplot: Distribution of observed p-values vs. H0 p-values. [Figure: histogram of gwas$P.value: p-values of GWAS in Total Cholesterol.]
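A QQ plot pairs each observed p-value with the p-value expected at the same rank under H0, where p-values are uniform on (0, 1). The sketch below builds those pairs on the −log10 scale; it is an illustration with simulated null p-values (the Python analogue of runif(10000)), and the plotting position (i − 0.5)/n is one common convention, assumed here.

```python
import math
import random


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p), sorted ascending."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


random.seed(0)
# Simulated p-values under H0; 1 - random() lies in (0, 1], so log10 is defined.
null_p = [1.0 - random.random() for _ in range(10000)]
points = qq_points(null_p)

# Under H0 the points should track the diagonal (expected == observed).
print(points[len(points) // 2])
```

Points rising above the diagonal at the upper end indicate more small p-values than chance predicts, which can reflect true associations or systematic bias such as population stratification.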
  • 33. Which diseases show evidence of association? Examining the QQplot of test statistics in WTCCC. [Figure: WTCCC, Nature 2007, Figure 3, “Quantile-quantile plots for seven genome-wide scans” (BD, CAD, CD, HT, RA, T1D, T2D); axes: observed test statistic vs. expected chi-squared value. For each of the seven disease collections, a quantile-quantile plot of the results of the trend test is shown in black for all SNPs that pass the standard project filters, have a minor allele frequency >1% and missing data rate <1%. SNPs at which the test statistic exceeds 30 are represented by triangles. Additional quantile-quantile plots, which also exclude all SNPs located in the regions of association listed in Table 3, are superimposed in blue (for BD, the exclusion of these SNPs has no visible effect on the plot ...).]
    Surrounding text from the paper: the present study cannot provide conclusive exclusion of any given gene. This is the consequence of several factors including: less-than-complete coverage of common variation genome-wide on the Affymetrix chip; poor coverage (by design) of rare variants, including many structural variants (thereby reducing power to detect rare, penetrant, alleles); difficulties with defining the full genomic extent of the gene of interest; and, despite the sample size, relatively low power to detect, at levels of ... already allow us, for selected diseases, to highlight pathways and mechanisms of particular interest. Naturally, extensive resequencing and fine-mapping work, followed by functional studies, will be required before such inferences can be translated into robust statements about the molecular and physiological mechanisms involved.
  • 34. Observational associations do not equal causation...
  • 35. Confounding bias: What is a confounder? Ice Cream $ ? Drowning; confounder: Summer! A confounder is correlated with both the “risk” factor and the disease, leading to invalid inference. Common source of bias in observational studies (e.g., case-control, cohort, etc.)
  • 36. Population Stratification: A source of possible confounding in GWAS. SNP ? Disease; confounder: race/ethnicity. Ancestry is correlated with allele frequency and disease. GWAS are done on specific populations separately (most have been done in populations of European ancestry).
  • 37. Mediation: SNPs indicative of a mediator factor? Example: FTO and Type 2 Diabetes. FTO ? Body Mass ? Diabetes. Is the association between FTO and Type 2 Diabetes via BMI? ... or does FTO have an independent role in Type 2 Diabetes...?
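A small numeric example shows how stratification alone can create an association. The counts below are invented for illustration: two populations differ in both allele frequency and disease prevalence, yet within each population the allele is unrelated to disease.

```python
def allelic_or(case_a, case_A, ctrl_a, ctrl_A):
    """Odds ratio of allele a vs. A from a 2x2 allele-count table."""
    return (case_a * ctrl_A) / (case_A * ctrl_a)


# Hypothetical allele counts (1000 alleles per population).
# Population 1: allele a frequency 0.2, 10% of alleles from cases.
pop1 = dict(case_a=20, case_A=80, ctrl_a=180, ctrl_A=720)
# Population 2: allele a frequency 0.8, 40% of alleles from cases.
pop2 = dict(case_a=320, case_A=80, ctrl_a=480, ctrl_A=120)

# Naive analysis that ignores ancestry: add the two tables together.
pooled = {cell: pop1[cell] + pop2[cell] for cell in pop1}

or_pop1 = allelic_or(**pop1)
or_pop2 = allelic_or(**pop2)
or_pooled = allelic_or(**pooled)
print(or_pop1, or_pop2, round(or_pooled, 2))
```

The within-population odds ratios are both 1, but the pooled odds ratio is well above 1: ancestry is acting as the confounder. Analyzing populations separately, as the slide notes, removes the spurious signal.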
  • 38. PLINK: (Standard) Whole Genome Analysis Software
  • 39. PLINK: (Standard) Whole Genome Analysis Software http://pngu.mgh.harvard.edu/~purcell/plink/ • cited >9000 times since 2007 • allele frequency • linkage disequilibrium (LD) • data manipulation/filtering • association: allelic, genotypic models • chi-square • logistic • linear
  • 40. Examples: GWASs in Type 2 Diabetes
  • 41. Type 2 Diabetes Mellitus: A complex, multifactorial disease • Insulin production vs. use • beta-cell function • insulin sensitivity (BMI) • Insulin moves glucose from blood into cells • Complications arise due to glucose in blood (hyperglycemia) • Diagnosed by blood glucose levels. CDC: family history: 25%; body weight, diet, lifestyle, age
  • 42. ARTICLES A genome-wide association study identifies novel risk loci for type 2 diabetes Robert Sladek1,2,4 , Ghislain Rocheleau1 *, Johan Rung4 *, Christian Dina5 *, Lishuang Shen1 , David Serre1 , Philippe Boutin5 , Daniel Vincent4 , Alexandre Belisle4 , Samy Hadjadj6 , Beverley Balkau7 , Barbara Heude7 , Guillaume Charpentier8 , Thomas J. Hudson4,9 , Alexandre Montpetit4 , Alexey V. Pshezhetsky10 , Marc Prentki10,11 , Barry I. Posner2,12 , David J. Balding13 , David Meyre5 , Constantin Polychronakos1,3 & Philippe Froguel5,14 Type 2 diabetes mellitus results from the interaction of environmental factors with a combination of genetic variants, most of which were hitherto unknown. A systematic search for these variants was recently made possible by the development of high-density arrays that permit the genotyping of hundreds of thousands of polymorphisms. We tested 392,935 single-nucleotide polymorphisms in a French case–control cohort. Markers with the most significant difference in genotype frequencies between cases of type 2 diabetes and controls were fast-tracked for testing in a second cohort. This identified four loci containing variants that confer type 2 diabetes risk, in addition to confirming the known association with the TCF7L2 gene. These loci include a non-synonymous polymorphism in the zinc transporter SLC30A8, which is expressed exclusively in insulin-producing b-cells, and two linkage disequilibrium blocks that contain genes potentially involved in b-cell development or function (IDE–KIF11–HHEX and EXT2–ALX4). These associations explain a substantial portion of disease risk and constitute proof of principle for the genome-wide approach to the elucidation of complex genetic traits. 
The rapidly increasing prevalence of type 2 diabetes mellitus (T2DM) is thought to be due to environmental factors, such as increased availabil- ity of food and decreased opportunity and motivation for physical activity, acting on genetically susceptible individuals. The heritability of T2DM is one of the best established among common diseases and, consequently, genetic risk factors for T2DM have been the subject of intense research1 . Although the genetic causes of many monogenic forms of diabetes (maturity onset diabetes in the young, neonatal mito- chondrial and other syndromic types of diabetes mellitus) have been elucidated, few variants leading to common T2DM have been clearly identified and individually confer only a small risk (odds ratio < 1.1– 1.25) of developing T2DM1 . Linkage studies have reported many T2DM-linked chromosomal regions and have identified putative, cau- sative genetic variants in CAPN10 (ref. 2), ENPP1 (ref. 3), HNF4A (refs 4, 5) and ACDC (also called ADIPOQ)6 . In parallel, candidate-gene studieshavereportedmanyT2DM-associatedloci,withcodingvariants in the nuclear receptor PPARG (P12A)7 and the potassium channel KCNJ11 (E23K)8 being among the very few that havebeen convincingly replicated. The strongest known (odds ratio < 1.7) T2DM association9 was recently mapped to the transcription factor TCF7L2 and has been consistently replicated in multiple populations10–20 . Subjects and study design The recent availability of high-density genotyping arrays, which com- bine the power of association studies with the systematic nature of a genome-wide search, led us to undertake a two-stage, genome-wide association study to identify additional T2DM susceptibility loci (Supplementary Fig. 1). In the first stage of this study, we obtained genotypes for 392,935 single-nucleotide polymorphisms (SNPs) in 1,363 T2DM cases and controls (Supplementary Table 1). 
In order to enrich for risk alleles21 , the diabetic subjects studied in stage 1 were selected to have at least one affected first degree relative and age at onset under 45 yr (excluding patients with maturity onset diabetes in the young). Furthermore, in order to decrease phenotypic hetero- geneity and to enrich for variants determining insulin resistance and b-cell dysfunction through mechanisms other than severe obesity, we initially studied diabetic patients with a body mass index (BMI) ,30 kg m22 . Control subjects were selected to have fasting blood glucose ,5.7 mmol l21 in DESIR, a large prospective cohort for the study of insulin resistance in French subjects22 . Genotypes for each study subject were obtained using two plat- forms: Illumina Infinium Human1 BeadArrays, which assay 109,365 SNPs chosen using a gene-centred design; and Human Hap300 BeadArrays, which assay 317,503 SNPs chosen to tag haplotype blocks identified by the Phase I HapMap23 . Of the 409,927 markers that passed quality control (Supplementary Tables 2 and 3), geno- types were obtained for an average of 99.2% (Human1) and 99.4% (Hap300) of markers for each subject with a reproducibility of .99.9% (both platforms). Forty-three subjects were removed from analysis because of evidence of intercontinental admixture (Sup- plementary Fig. 3) and an additional four because their genotype- determined gender disagreed with clinical records. In total, T2DM association was tested for 100,764 (Human1) and 309,163 (Hap300) SNPs representing 392,935 unique loci (Fig. 1). Because of unequal male/female ratios in our cases and controls, we analysed the 12,666 sex-chromosome SNPs separately for each gender. *These authors contributed equally to this work. 1 Departments of Human Genetics, 2 Medicine and 3 Pediatrics, Faculty of Medicine, McGill University, Montreal H3H 1P3, Canada. 4 McGill University and Genome Quebec Innovation Centre, Montreal H3A 1A4, Canada. 
5 CNRS 8090-Institute of Biology, Pasteur Institute, Lille 59019 Cedex, France. 6 Endocrinology and Diabetology, University Hospital, Poitiers 86021 Cedex, France. 7 INSERM U780-IFR69, Villejuif 94807, France. 8 Endocrinology-Diabetology Unit, Corbeil-Essonnes Hospital, Corbeil-Essonnes 91100, France. 9 Ontario Institute for Cancer Research, Toronto M5G 1L7, Canada. 10 Montreal Diabetes Research Center, Montreal H2L 4M1, Canada. 11 Molecular Nutrition Unit and the Department of Nutrition, University of Montreal and the Centre Hospitalier de l’UniversiteΒ΄ de MontreΒ΄al, Montreal H3C 3J7, Canada. 12 Polypeptide Hormone Laboratory and Department of Anatomy and Cell Biology, Montreal H3A 2B2, Canada. 13 Department of Epidemiology & Public Health, Imperial College, St Mary’s Campus, Norfolk Place, London W2 1PG, UK. 14 Section of Genomic Medicine, Imperial College London W12 0NN, and Hammersmith Hospital, Du Cane Road, London W12 0HS, UK. 881 NatureΒ©2007 Publishing Group Nature, 2/2007 References and Notes 1. B. G. Richmond, D. S. Strait, Nature 404, 382 (2000). 2. J. Kingdon, Lowly Origins (Princeton Univ. Press, Princeton, NJ, 2003). 3. C. V. Ward, M. G. Leakey, A. Walker, Evol. Anthropol. 7, 197 (1999). 4. Y. Haile-Selassie, Nature 412, 178 (2001). 5. T. D. White et al., Nature 440, 883 (2006). 6. K. Kovarovic, P. Andrews, J. Hum. Evol., in press (available at http://dx.doi.org./doi:10.1016/j.jhevol.2007.01.001; doi: 10.1016/j.jhevol.2007.01.001). 7. N. Patterson, D. J. Richter, S. Gnerre, E. S. Lander, D. Reich, Nature 441, 1103 (2006). 8. K. D. Hunt et al., Primates 37, 363 (1996). 9. J. G. Fleagle et al., Symp. Zool. Soc. London 48, 359 (1981). 10. R. H. Crompton et al., Cour. Forsch-Inst. Senckenb. 243, 115 (2003). 11. J. T. Stern, Yrb. Phys. Anthropol. 19, 59 (1975). 12. S. K. S. Thorpe, R. H. Crompton, Am. J. Phys. Anthropol. 131, 384 (2006). 13. K. D. Hunt, J. Hum. Evol. 26, 183 (1994). 15. E. Larney, S. Larsen, Am. J. Phys. Anthropol. 125, 42 (2004). 16. S. 
K. S. Thorpe, R. H. Crompton, Am. J. Phys. Anthropol. 127, 58 (2005). 17. S. K. S. Thorpe, R. H. Crompton, M. M. Gunther, R. F. Ker, R. McN. Alexander, Am. J. Phys. Anthropol. 110, 179 (1999). 18. R. McN. Alexander, Principles of Animal Locomotion (Princeton Univ. Press, Princeton, NJ, 2003). 19. C. V. Ward, Yrbk. Phys. Anthropol. 45, 185 (2002). 20. R. W. Wrangham, N. L. Conklin-Brittain, K. D. Hunt, Int. J. Primatol. 19, 949 (1998). 21. H. Pontzer, R. W. Wrangham, J. Hum. Evol. 46, 317 (2004). 22. R. C. Payne et al., J. Anat. 208, 709 (2006). 23. M. Pickford, B. Senut, B. Gommery, in Late Cenozoic Environments and Hominid Evolution: a Tribute to Bill Bishop, P. Andrews, P. Banham, Eds. (Geological Society, London, 1999), pp. 27–38. 24. N. M. Young, L. MacLatchy, J. Hum. Evol. 46, 163 (2004). 25. D. Gommery, B. Senu, M. Pickford, E. Musiime, Ann. PalΓ©ontol. 88, 167 (2002). 26. C. V. Ward, in Handbook of Paleoanthropology Vol. 2: Primate Evolution and Human Origins, W. Henke, I. Tattersall, Eds. (Springer, Heidelberg, Germany, 2007), pp. 1011–1030. N. Ogihara, M. Nakatsukasa, Eds. (Springer, Heidelberg, Germany, 2006), pp. 199–208. 28. C. P. E. Zollikofer et al., Nature 434, 755 (2005). 29. M. Pickford, Anthropologie 69, 191 (2005). 30. We thank the Indonesian Institute of Science, Indonesian Nature Conservation Service, and Leuser Development Programme for granting permission and giving support for research in the Leuser Ecosystem. R. McN. Alexander, T. M. Blackburn, S. Burtles. J. Rees, N. Jeffery, E. E. Vereecke, A. Walker, A. Wilson, and B. Wood commented on the manuscript. R. Savage developed the animation (fig. S1). Studies of captive animals were hosted by the North of England Zoological Society. This research was supported by grants from the Leverhulme Trust, the Royal Society, the L.S.B. Leakey Foundation, and the Natural Environment Research Council. 
Supporting Online Material www.sciencemag.org/cgi/content/full/316/5829/1328/DC1 Table S1 Movies S1 to S3 5 February 2007; accepted 18 April 2007 10.1126/science.1140799 Genome-Wide Association Analysis Identifies Loci for Type 2 Diabetes and Triglyceride Levels Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes for BioMedical Research*† New strategies for prevention and treatment of type 2 diabetes (T2D) require improved insight into disease etiology. We analyzed 386,731 common single-nucleotide polymorphisms (SNPs) in 1464 patients with T2D and 1467 matched controls, each characterized for measures of glucose metabolism, lipids, obesity, and blood pressure. With collaborators (FUSION and WTCCC/UKT2D), we identified and confirmed three loci associated with T2Dβ€”in a noncoding region near CDKN2A and CDKN2B, in an intron of IGF2BP2, and an intron of CDKAL1β€”and replicated associations near HHEX and in SLC30A8 found by a recent whole-genome association study. We identified and confirmed association of a SNP in an intron of glucokinase regulatory protein (GCKR) with serum triglycerides. The discovery of associated variants in unsuspected genes and outside coding regions illustrates the ability of genome-wide association studies to provide potentially important clues to the pathogenesis of common diseases. T ype 2 diabetes, obesity, and cardiovascular risk factors are caused by a combination of genetic susceptibility, environment, be- havior, and chance. Whole-genome association studies (WGAS) offer a new approach to gene discovery unbiased with regard to presumed functions or locations of causal variants. This approach is based on Fisher’s theory for additive effects at common alleles (1); human heterozy- to purifying selection, and has been made pos- sible by genomic advances such as the human genome sequence, SNP and HapMap databases, and genotyping arrays (3). 
We studied 1464 patients with T2D and 1467 controls from Finland and Sweden, each characterized for 18 clinical traits: anthropomet- ric measures, glucose tolerance and insulin se- cretion, lipids and apolipoproteins, and blood applying stringent quality-control filters, high- quality genotypes for 386,731 common SNPs were obtained (4). To extend the set of putative causal alleles tested for association, we devel- oped 284,968 additional multimarker (haplo- type) tests based on these SNP genotypes (5, 6). The 671,699 allelic tests capture (correlation co- efficient r2 β‰₯ 0.8) 78% of common SNPs in HapMap CEU (3). Each SNP and haplotype test was assessed for association to T2D and each of 18 traits with the software package PLINK (http://pngu.mgh. harvard.edu/purcell/plink/). For T2D, a weighted meta-analysis was used to combine results for the population-based and family-based subsam- ples (4). For quantitative traits, multivariable linear or logistic regression with or without co- variates was performed (4). Association results for each SNP, haplotype test, and phenotype are available (www.broad.mit.edu/diabetes/). In genome-wide analysis involving hundreds of thousands of statistical tests, modest levels of bias imposed on the null distribution can over- whelm a small number of true results. We used three strategies to search for evidence of sys- tematic bias from unrecognized population struc- ture, the analytical approach, and genotyping artifacts (7, 8). First, we examined the distribu- tion of P-values in the population-based sam- ple, observing a close match to that expected for a null distribution (genomic inflation factor lGC = 1.05 for T2D). Second, we calculated G. Brice,6 B. Bullman,7 J. Campbell,8 B. Castle,9 R. Cetnarsyj,8 C. Chapman,10 C. Chu,11 N. Coates,12 T. Cole,10 R. Davidson,4 A. Donaldson,13 H. Dorkins,3 F. Douglas,2 D. Eccles,9 R. Eeles,1 F. Elmslie,6 D. G. Evans,7 S. Goff,6 S. Goodman,5 D. Goudie,2 J. Gray,15 L. Greenhalgh,16 H. 
Gregory,17 S. V. Hodgson,6 T. Homfray,6 R. S. Houlston,1 L. Izatt,18 L. Jackson,18 L. Jeffers,19 V. Johnson-Roffey,12 F. Kavalier,18 C. Kirk,19 F. Lalloo,7 C. Langman,18 I. Locke,1 M. Longmuir,4 J. Mackay,20 A. Magee,19 S. Mansour,6 Z. Miedzybrodzka,17 J. Miller,11 P. Morrison,19 V. Murday,4 J. Paterson,21 G. Pichert,18 M. Porteous,8 N. Rahman,6 M. Rogers,15 S. Rowe,22 S. Shanley,1 A. Saggar,6 G. Scott,2 L. Side,23 L. Snadden,4 M. Steel,2 M. Thomas,5 S. Thomas,1 1 Clinical Genetics Service, Royal Marsden Hospital, Downs Road, Sutton, Surrey, SM2 5PT, UK. 2 Department of Clinical Genetics, Ninewells Hospital, Dundee, DD1 9SY, UK. 3 Medical and Community Genetics, Kennedy-Galton Centre, Level 8V, Northwick Park and St. Mark’s NHS Trust, Watford Rd, Harrow, HA1 3UJ, UK. 4 Institute of Medical Genetics, Yorkhill NHS Trust, Dalnair Street, Glasgow, G3 8SJ, UK. 5 Clinical Genetics Department, Royal Devon and Exeter Hospital (Heavitree), Gladstone Road, Exeter, EX1 2ED, UK. 6 Department of Clinical Genetics, St. George’s Hospital Medical School, Jenner Wing, Cranmer Terrace, London, SW17 0RE, UK. 7 Department of Medical Genetics, St. Mary’s Hospital, Hathersage Road, Manchester, M13 0JH, UK. 8 South East of Scotland Clinical Genetics Service, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK. 9 Department of Medical Genetics, The Princess Anne Hospital, Coxford Road, Southampton, S016 5YA, UK. 10 Clinical Genetics Unit, Birmingham Women’s Hospital, Metchley Park Road, Edgbaston, Birmingham, B15 2TG, UK. 11 Yorkshire Regional Genetic Service, Department of Clinical Genetics, Cancer Genetics Building, St. James University Hospital, Beckett Street, Leeds, LS9 7TF, UK. 12 Department of Clinical Genetics, Leicester Royal Infirm- ary, Leicester, LE1 5WW, UK. 13 Department of Clinical Genetics, St Michael’s Hospital, Southwell Street, Bristol, BS2 8EG, UK. 14 Institute of Human Genetics, International Centre for Life, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK. 
15 Institute of Medical Genetics, University Hospital of Wales, Heath Park, Cardiff, CF14 4XW, UK. 16 Department of Clinical Genetics, Alder Hey Children’s Hospital, Eaton Road, Liverpool L12 2AP, UK. 17 Clinical Genetics Centre, Argyll House, Foresterhill, Aberdeen, AB25 2ZR, UK. 18 Clinical Genetics, 7th Floor New Guy’s House, Guy’s UK. 19 Clinical Belvoir Park H 20 Clinical and Health, 30 G 21 Department Trust, Box 13 22 Department of Chester Ho 23 Department Road, Headin Supporting www.sciencema Materials and Figs. S1 to S8 Tables S1 to S References 9 March 2007 Published onli 10.1126/scien Include this in A Genome-Wide Association Study of Type 2 Diabetes in Finns Detects Multiple Susceptibility Variants Laura J. Scott,1 Karen L. Mohlke,2 Lori L. Bonnycastle,3 Cristen J. Willer,1 Yun Li,1 William L. Duren,1 Michael R. Erdos,3 Heather M. Stringham,1 Peter S. Chines,3 Anne U. Jackson,1 Ludmila Prokunina-Olsson,3 Chia-Jen Ding,1 Amy J. Swift,3 Narisu Narisu,3 Tianle Hu,1 Randall Pruim,4 Rui Xiao,1 Xiao-Yi Li,1 Karen N. Conneely,1 Nancy L. Riebow,3 Andrew G. Sprau,3 Maurine Tong,3 Peggy P. White,1 Kurt N. Hetrick,5 Michael W. Barnhart,5 Craig W. Bark,5 Janet L. Goldstein,5 Lee Watkins,5 Fang Xiang,1 Jouko Saramies,6 Thomas A. Buchanan,7 Richard M. Watanabe,8,9 Timo T. Valle,10 Leena Kinnunen,10,11 GonΓ§alo R. Abecasis,1 Elizabeth W. Pugh,5 Kimberly F. Doheny,5 Richard N. Bergman,9 Jaakko Tuomilehto,10,11,12 Francis S. Collins,3 * Michael Boehnke1 * Identifying the genetic variants that increase the risk of type 2 diabetes (T2D) in humans has been a formidable challenge. Adopting a genome-wide association strategy, we genotyped 1161 Finnish T2D cases and 1174 Finnish normal glucose-tolerant (NGT) controls with >315,000 single-nucleotide polymorphisms (SNPs) and imputed genotypes for an additional >2 million autosomal SNPs. 
We carried out association analysis with these SNPs to identify genetic variants that predispose to T2D, compared our T2D association results with the results of two similar studies, and genotyped 80 SNPs in an additional 1215 Finnish T2D cases and 1258 Finnish NGT controls. We identify T2D-associated variants in an intergenic region of chromosome 11p12, contribute to the identification of T2D-associated variants near the genes IGF2BP2 and CDKAL1 and the ria (8). We ciation with the log-odd (8). We ob versus 31.6 P values < against the with a large consistent w SNPs that also sugges trols by birt successful; genomic co Analysi allowed us variation in portion, w (8, 13) that equilibrium Centre d’E (Utah resid 1 Department Genetics, Uni USA. 2 Depar Science, 6/2007 Study design: Richa Saxena1–6 and Valeriya Lyssenko7 (Team Leaders), Peter Almgren,7 Paul I. W. de Bakker,1–6 NoΓ«l P. Burtt,1 Jose C. Florez,1–6 Hong Chen,8 Joanne Meyer,8 Joel N. Hirschhorn,1,6,9–11 Mark J. Daly,1–3,5 Thomas E. Hughes,8 Leif Groop,7,12 David Altshuler1–6 (Chair) Clinical characterization and phenotypes: Valeriya Lyssenko7 and Richa Saxena1–6 (Team Leaders), Peter Almgren,7 Kristin Ardlie,1 Kristina Bengtsson BostrΓΆm,13 NoΓ«l P. Burtt,1 Hong Chen,8 Jose C. Florez,1–6 Bo Isomaa,14,15 Sekar Kathiresan,1,3,5 Guillaume Lettre,1,6,9–11 Ulf Lindblad,16 Helen N. Lyon,1,6,9–11 Olle Melander,7 Christopher Newton-Cheh,1–3,5 Peter Nilsson,17 Marju Orho- Melander,7 Lennart RΓ₯stam,16 Elizabeth K. Speliotes,1,3,6,9–11 Marja-Riitta Taskinen,12 Tiinamaija Tuomi,12,15 Benjamin F. Voight,1–3,5 David Altshuler,1–6 Joel N. Hirschhorn,1,6,9–11 Thomas E. 
Hughes,8 Leif Groop7,12 (Chair) DNA sample QC and diabetes replication genotyping: Candace Guiducci1 and Valeriya Lyssenko7 (Team Leaders), Anna Berglund,7 Joyce Carlson,18 Lauren Gianniny,1 Rachel Hackett,1 Liselotte Hall,18 Johan Holmkvist,7 Esa Laurila,7 Marju Orho-Melander,7 Marketa SjΓΆgren,7 Maria Sterner,18 Aarti Surti1 Margareta Svensson,7 Malin Svensson,7 Ryan Tewhey,1 NoΓ«l P. Burtt1 (Chair) Whole genome scan genotyping: Brendan Blumenstiel1 (Team Leader), Melissa Parkin,1 Matthew DeFelice,1 Candace Guiducci,1 Ryan Tewhey,1 Rachel Barry,1 Wendy Brodeur,1 NoΓ«l P. Burtt,1 Jody Camarata,1 Nancy Chia,1 Mary Fava,1 John Gibbons,1 Bob Handsaker,1 Claire Healy,1 Kieu Nguyen,1 Casey Gates,1 Carrie Sougnez,1 Diane Gage,1 Marcia Nizzari,1 David Altshuler,1–6 Stacey B. Gabriel1 (Chair) GCKR replication genotyping and analysis (MalmΓΆ Diet and Cancer Study): Sekar Kathiresan1,3,5 (Team Leader), Candace Guiducci,1 Aarti Surti,1 NoΓ«l P. Burtt,1 Olle Melander,7 Marju Orho-Melander7 (Chair) Statistical analysis: Benjamin F. Voight1–3,5 and Paul I. W. de Bakker1–6 (Team Leaders), Richa Saxena,1–6 Valeriya Lyssenko,7 Peter Almgren,7 NoΓ«l P. Burtt,1 Hong Chen,8 Gung-Wei Chirn,8 Qicheng Ma,8 Hemang Parikh,7 Delwood Richardson,8 Darrell Ricke,8 Jeffrey J. Roix,8 Leif Groop,7,12 Shaun Purcell,1,2 David Altshuler,1–6 Mark J. Daly1–3,5 (Chair) 1 Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02142, USA. 2 Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA. 3 Department of Medicine, Mas- sachusetts General Hospital, Boston, MA 02114, USA. 4 Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA. 5 Department of Medicine, Harvard Medical School, Boston, MA 02115, USA. 6 Depart- ment of Genetics, Harvard Medical School, Boston, MA 02115, USA. 
7 Department of Clinical Sciences, Diabetes and Endocrinology Research Unit, University Hospital Malmö, Lund University, Malmö, Sweden. 8 Diabetes and Metabolism Disease Area, Novartis Institutes for BioMedical Research, 100 Technology Square, Cambridge, MA 02139, USA. 9 Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA. 10 Division of Endocrinology, Children’s Hospital, Boston, MA 02115, USA. 11 Division of Genetics, Children’s Hospital, Boston, MA 02115, USA. 12 Department of Medicine, Helsinki University Hospital, University of Helsinki, Helsinki, Finland. 13 Skaraborg Institute, Skövde, Sweden. 14 Malmska Municipal Health Center and Hospital, Jakobstad, Finland. 15 Folkhälsan Research Center, Helsinki, Finland. 16 Department of Clinical Sciences, Community Medicine Research Unit, University Hospital Malmö, Lund University, Malmö, Sweden. 17 Department of Clinical Sciences, Medicine Research Unit, University Hospital Malmö, Lund University, Malmö, Sweden. 18 Clinical Chemistry, University Hospital Malmö, Lund University, Malmö, Sweden. 19 Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02115, USA.
Supporting Online Material: www.sciencemag.org/cgi/content/full/1142358/DC1. Materials and Methods; Figs. S1 and S2; Tables S1 to S6; References.
9 March 2007; accepted 20 April 2007. Published online 26 April 2007; 10.1126/science.1142358. Include this information when citing this paper.
Replication of Genome-Wide Association Signals in UK Samples Reveals Risk Loci for Type 2 Diabetes
Eleftheria Zeggini,1,2* Michael N. Weedon,3,4* Cecilia M. Lindgren,1,2* Timothy M. Frayling,3,4* Katherine S. Elliott,2 Hana Lango,3,4 Nicholas J. Timpson,2,5 John R. B. Perry,3,4 Nigel W. Rayner,1,2 Rachel M. Freathy,3,4 Jeffrey C. Barrett,2 Beverley Shields,4 Andrew P. Morris,2 Sian Ellard,4,6 Christopher J. Groves,1 Lorna W. Harries,4 Jonathan L. Marchini,7 Katharine R. Owen,1 Beatrice Knight,4 Lon R. Cardon,2 Mark Walker,8 Graham A. Hitman,9 Andrew D. Morris,10 Alex S. F. Doney,10 The Wellcome Trust Case Control Consortium (WTCCC),† Mark I. McCarthy,1,2‡§ Andrew T. Hattersley3,4‡
The molecular mechanisms involved in the development of type 2 diabetes are poorly understood. Starting from genome-wide genotype data for 1924 diabetic cases and 2938 population controls generated by the Wellcome Trust Case Control Consortium, we set out to detect replicated diabetes association signals through analysis of 3757 additional cases and 5346 controls and by integration of our findings with equivalent data from other international consortia. We detected diabetes susceptibility loci in and around the genes CDKAL1, CDKN2A/CDKN2B, and IGF2BP2 and confirmed the recently described associations at HHEX/IDE and SLC30A8. Our findings provide insight into the genetic architecture of type 2 diabetes, emphasizing the contribution of
Here, we describe how integration of data from the WTCCC scan and our own replication studies with similar information generated by the Diabetes Genetics Initiative (DGI) (6) and the Finland–United States Investigation of NIDDM Genetics (FUSION) (7) has identified several additional susceptibility variants for T2D. In the WTCCC study, analysis of 490,032 autosomal SNPs in 16,179 samples yielded 459,448 SNPs that passed initial quality control (5). We considered only the 393,453 autosomal SNPs with minor allele frequency (MAF) exceeding 1% in both cases and controls and no extreme departure from Hardy-Weinberg equilibrium (P < 10^-4 in cases or controls) (8). This T2D-specific data set shows no evidence of substantial confounding from population substructure and genotyping biases (8).
To distinguish true associations from those reflecting fluctuations under the null or residual errors arising from aberrant allele calling, we first submitted putative signals from the WTCCC study to additional quality control, including cluster-plot visualization and validation genotyping on
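The SNP filters quoted above (MAF above 1% in both cases and controls, and no Hardy-Weinberg departure with P < 10^-4 in either group) can be sketched in a few lines. This is only an illustration: the function names are made up for this example, the genotype counts in the usage note are hypothetical, and a 1-degree-of-freedom chi-square goodness-of-fit test is assumed for Hardy-Weinberg, which may differ from the test the authors actually used.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: chi-square goodness of fit, not an exact test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(case_counts, control_counts, maf_min=0.01, hwe_p_min=1e-4):
    """Keep a SNP only if both groups pass the MAF and HWE thresholds."""
    for counts in (case_counts, control_counts):
        if minor_allele_frequency(*counts) <= maf_min:
            return False
        if hwe_chi2_pvalue(*counts) < hwe_p_min:
            return False
    return True
```

For example, with hypothetical counts, `passes_qc((90, 420, 490), (90, 420, 490))` keeps the SNP, while a SNP with no heterozygotes in cases, such as `(300, 0, 700)`, is dropped for its Hardy-Weinberg departure.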
  • 43. ARTICLES A genome-wide association study identifies novel risk loci for type 2 diabetes
Robert Sladek1,2,4, Ghislain Rocheleau1*, Johan Rung4*, Christian Dina5*, Lishuang Shen1, David Serre1, Philippe Boutin5, Daniel Vincent4, Alexandre Belisle4, Samy Hadjadj6, Beverley Balkau7, Barbara Heude7, Guillaume Charpentier8, Thomas J. Hudson4,9, Alexandre Montpetit4, Alexey V. Pshezhetsky10, Marc Prentki10,11, Barry I. Posner2,12, David J. Balding13, David Meyre5, Constantin Polychronakos1,3 & Philippe Froguel5,14
Type 2 diabetes mellitus results from the interaction of environmental factors with a combination of genetic variants, most of which were hitherto unknown. A systematic search for these variants was recently made possible by the development of high-density arrays that permit the genotyping of hundreds of thousands of polymorphisms. We tested 392,935 single-nucleotide polymorphisms in a French case–control cohort. Markers with the most significant difference in genotype frequencies between cases of type 2 diabetes and controls were fast-tracked for testing in a second cohort. This identified four loci containing variants that confer type 2 diabetes risk, in addition to confirming the known association with the TCF7L2 gene. These loci include a non-synonymous polymorphism in the zinc transporter SLC30A8, which is expressed exclusively in insulin-producing β-cells, and two linkage disequilibrium blocks that contain genes potentially involved in β-cell development or function (IDE–KIF11–HHEX and EXT2–ALX4). These associations explain a substantial portion of disease risk and constitute proof of principle for the genome-wide approach to the elucidation of complex genetic traits.
The rapidly increasing prevalence of type 2 diabetes mellitus (T2DM) is thought to be due to environmental factors, such as increased availability of food and decreased opportunity and motivation for physical activity, acting on genetically susceptible individuals. The heritability of T2DM is one of the best established among common diseases and, consequently, genetic risk factors for T2DM have been the subject of intense research1. Although the genetic causes of many monogenic forms of diabetes (maturity onset diabetes in the young, neonatal mitochondrial and other syndromic types of diabetes mellitus) have been elucidated, few variants leading to common T2DM have been clearly identified and individually confer only a small risk (odds ratio ≈ 1.1–1.25) of developing T2DM1. Linkage studies have reported many T2DM-linked chromosomal regions and have identified putative, causative genetic variants in CAPN10 (ref. 2), ENPP1 (ref. 3), HNF4A (refs
genotypes for 392,935 single-nucleotide polymorphisms (SNPs) in 1,363 T2DM cases and controls (Supplementary Table 1). In order to enrich for risk alleles21, the diabetic subjects studied in stage 1 were selected to have at least one affected first degree relative and age at onset under 45 yr (excluding patients with maturity onset diabetes in the young). Furthermore, in order to decrease phenotypic heterogeneity and to enrich for variants determining insulin resistance and β-cell dysfunction through mechanisms other than severe obesity, we initially studied diabetic patients with a body mass index (BMI) < 30 kg m^-2. Control subjects were selected to have fasting blood glucose < 5.7 mmol l^-1 in DESIR, a large prospective cohort for the study of insulin resistance in French subjects22. Genotypes for each study subject were obtained using two plat-
Sladek, 2007
How many SNPs (p-value?)
European-based; N ~ 1000
cases: high fasting blood glucose/non-obese
controls: non-obese
  • 44. Human Hap300 chip, showing no T2DM association in stage 1 (P > 0.01) and separated by at least 100 kb. Using the first principal component as a covariate for ancestry differences between cases and controls, we tested for association between rs932206 and disease status. Our result suggests that this apparent association is largely
BMI on the association between marker and disease, as it is asymptotically equivalent to the Armitage trend test used to detect association in stages 1 and 2. None of the associations (Supplementary Table 7) was substantially changed by considering the effects of these covariates.
[Figure 1 plot: one panel per chromosome (1–22 and X).]
Figure 1 | Graphical summary of stage 1 association results. T2DM association was determined for SNPs on the Human1 and Hap300 chips. The x axis represents the chromosome position from pter; the y axis shows -log10[pMAX], the P-value obtained by the MAX statistic, for each SNP (note the different scale on the y axis of the chromosome 10 plot). SNPs that passed the cutoff for a fast-tracked second stage are highlighted in red.
882 ©2007 Nature Publishing Group
Sladek, 2007
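The excerpt above mentions the Armitage trend test used in stages 1 and 2. As a rough sketch of how such a test works for one SNP, the function below computes the Cochran-Armitage trend statistic with additive scores 0, 1, 2 for the number of risk alleles. The function name and the genotype counts in the usage note are hypothetical and are not taken from the paper; the paper's MAX statistic, which considers several genetic models, is not reproduced here.

```python
import math


def armitage_trend_test(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x 3 genotype table.

    case_counts / control_counts: genotype counts ordered by the number
    of copies of the tested allele (0, 1, 2).
    Returns (chi-square statistic with 1 df, two-sided P-value).
    """
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n_total = n_cases + n_controls
    col_totals = [r + s for r, s in zip(case_counts, control_counts)]

    sum_xr = sum(x * r for x, r in zip(scores, case_counts))
    sum_xn = sum(x * n for x, n in zip(scores, col_totals))
    sum_x2n = sum(x * x * n for x, n in zip(scores, col_totals))

    numerator = n_total * (n_total * sum_xr - n_cases * sum_xn) ** 2
    denominator = n_cases * n_controls * (n_total * sum_x2n - sum_xn ** 2)
    chi2 = numerator / denominator
    # Survival function of a chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts, `armitage_trend_test((50, 200, 150), (150, 200, 50))` gives a chi-square of 100, while identical genotype distributions in cases and controls give a statistic of 0 and a P-value of 1.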
  • 45. Identification of four novel T2DM loci
Our fast-track stage 2 genotyping confirmed the reported association for rs7903146 (TCF7L2) on chromosome 10, and in addition identified significant associations for seven SNPs representing four new T2DM loci (Table 1). In all cases, the strongest association for the MAX statistic (see Methods) was obtained with the additive model. The most significant of these corresponds to rs13266634, a non-synonymous SNP (R325W) in SLC30A8, located in a 33-kb linkage disequilibrium block on chromosome 8, containing only the 3′ end of this gene (Fig. 2a). SLC30A8 encodes a zinc transporter expressed solely in the secretory vesicles of β-cells and is thus implicated in the final stages of insulin biosynthesis, which involve co-crystallization
Table 1 | Confirmed association results
SNP | Chromosome | Position (nucleotides) | Risk allele | Major allele | MAF (case) | MAF (ctrl) | Odds ratio (het) | Odds ratio (hom) | PAR | λs | Stage 2 pMAX | Stage 2 pMAX (perm) | Stage 1 pMAX | Stage 1 pMAX (perm) | Nearest gene
rs7903146 | 10 | 114,748,339 | T | C | 0.406 | 0.293 | 1.65 ± 0.19 | 2.77 ± 0.50 | 0.28 | 1.0546 | 1.5 × 10^-34 | <1.0 × 10^-7 | 3.2 × 10^-17 | <3.3 × 10^-10 | TCF7L2
rs13266634 | 8 | 118,253,964 | C | C | 0.254 | 0.301 | 1.18 ± 0.25 | 1.53 ± 0.31 | 0.24 | 1.0089 | 6.1 × 10^-8 | 5.0 × 10^-7 | 2.1 × 10^-5 | 1.8 × 10^-5 | SLC30A8
rs1111875 | 10 | 94,452,862 | G | G | 0.358 | 0.402 | 1.19 ± 0.19 | 1.44 ± 0.24 | 0.19 | 1.0069 | 3.0 × 10^-6 | 7.4 × 10^-6 | 9.1 × 10^-6 | 7.3 × 10^-6 | HHEX
rs7923837 | 10 | 94,471,897 | G | G | 0.335 | 0.377 | 1.22 ± 0.21 | 1.45 ± 0.25 | 0.20 | 1.0065 | 7.5 × 10^-6 | 2.2 × 10^-5 | 3.4 × 10^-6 | 2.5 × 10^-6 | HHEX
rs7480010 | 11 | 42,203,294 | G | A | 0.336 | 0.301 | 1.14 ± 0.13 | 1.40 ± 0.25 | 0.08 | 1.0041 | 1.1 × 10^-4 | 2.9 × 10^-4 | 1.5 × 10^-5 | 1.2 × 10^-5 | LOC387761
rs3740878 | 11 | 44,214,378 | A | A | 0.240 | 0.272 | 1.26 ± 0.29 | 1.46 ± 0.33 | 0.24 | 1.0046 | 1.2 × 10^-4 | 2.8 × 10^-4 | 1.8 × 10^-5 | 1.3 × 10^-5 | EXT2
rs11037909 | 11 | 44,212,190 | T | T | 0.240 | 0.271 | 1.27 ± 0.30 | 1.47 ± 0.33 | 0.25 | 1.0045 | 1.8 × 10^-4 | 4.5 × 10^-4 | 1.8 × 10^-5 | 1.3 × 10^-5 | EXT2
rs1113132 | 11 | 44,209,979 | C | C | 0.237 | 0.267 | 1.15 ± 0.27 | 1.36 ± 0.31 | 0.19 | 1.0044 | 3.3 × 10^-4 | 8.1 × 10^-4 | 3.7 × 10^-5 | 2.9 × 10^-5 | EXT2
Significant T2DM associations were confirmed for eight SNPs in five loci. Allele frequencies, odds ratios (with 95% confidence intervals) and PAR were calculated using only the stage 2 data. Allele frequencies in the controls were very close to those reported for the CEU set (European subjects genotyped in the HapMap project). Induced sibling recurrent risk ratios (λs) were estimated using stage 2 genotype counts for the control subjects and assuming a T2DM prevalence of 7% in the French population. hom, homozygous; het, heterozygous; major allele, the allele with the higher frequency in controls; pMAX, P-value of the MAX statistic from the χ2 distribution; pMAX (perm), P-value of the MAX statistic from the permutation-derived empirical distribution (pMAX and pMAX (perm) are adjusted for variance inflation); risk allele, the allele with higher frequency in cases compared with controls.
[Figure 2 plot, panels a and b: -log10[P] across the SLC30A8 region and the IDE–KIF11–HHEX region.]
NATURE | Vol 445 | 22 February 2007 ARTICLES
Sladek, 2007
[Figure 1 excerpt: per-chromosome panels of -log10[pMAX], the P-value obtained by the MAX statistic, for each SNP.]
How would you interpret the p-values? Odds ratios?
Confirmed 8 SNPs with N ~ 1000
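One way to approach the slide's question about interpreting odds ratios is to compute a simple allelic odds ratio from the case and control allele frequencies in Table 1. The sketch below is only an approximation for building intuition: it compares allele odds between groups, whereas the table's het and hom odds ratios were estimated from genotype counts, so the numbers need not agree. The function name is made up for this example.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# rs7903146 (TCF7L2): the risk allele T is the minor allele,
# so the MAF columns can be used directly.
tcf7l2_or = allelic_odds_ratio(0.406, 0.293)

# rs13266634 (SLC30A8): the risk allele C is the major allele,
# so its frequency is 1 - MAF in each group.
slc30a8_or = allelic_odds_ratio(1 - 0.254, 1 - 0.301)

print(f"TCF7L2 allelic OR:  {tcf7l2_or:.2f}")
print(f"SLC30A8 allelic OR: {slc30a8_or:.2f}")
```

An allelic odds ratio above 1 means the allele is more common among cases than controls. The P-value answers a different question: how surprising a frequency difference this large would be if the SNP had no association with disease.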
  • 46. Scaling up discovery by combining populations: meta-analyses
  • 47. g the Diabetes Genetics nvestigation of NIDDM nd (iv) the Framingham omponent studies (n = ry Table 1 online. aring, the four consortia n 10 and 20 SNPs promi- their individual, interim, mentary Table 2 online). oci with consistent effects dies. Two of these repre- 6PC2 and GCK. In addi- nerated evidence for an NPs around the MTNR1B rs1387153, P = 2.2 × 10^-11; DFS: rs10830963, 5.8 × 10^-4, for the most ch analysis). The associa- d on formal meta-analysis r exclusion of individuals = 1.1 × 10^-57; rs4607517 NR1B), P = 3.2 × 10^-50; pplementary Table 3 and ent efforts to harmonize
(including the additional data from the WTCCC, DGI and FUSION scans)10 (Supplementary Note). We found strong evidence that the minor G allele of rs10830963 was associated with increased risk of T2D (odds ratio = 1.09 (1.05–1.12), P = 3.3 × 10^-7; Fig. 2 and Supplementary Table 6 online). The possibility that the fasting glucose association might
Study ID | OR (95% CI) | Weight (%)
DGI | 1.12 (0.96, 1.30) | 4.61
FUSION | 1.20 (1.03, 1.39) | 4.89
WTCCC | 1.07 (0.95, 1.20) | 8.03
deCODE | 1.14 (1.03, 1.27) | 9.58
KORA | 1.00 (0.84, 1.19) | 3.53
Rotterdam | 1.17 (1.04, 1.30) | 8.75
CCC | 1.07 (0.88, 1.31) | 2.69
ADDITION/ELY | 1.16 (1.02, 1.33) | 6.04
Norfolk | 1.00 (0.90, 1.10) | 10.56
UKT2DGC | 1.03 (0.96, 1.10) | 23.18
OxGN/58BC | 0.91 (0.75, 1.10) | 2.85
FUSION Stage 2 | 1.15 (1.02, 1.30) | 7.41
METSIM | 1.16 (1.03, 1.30) | 7.90
Overall (I^2 = 26.6%, P = 0.176) | 1.09 (1.05, 1.12) | 100.00
Meta-analysis P value = 3.3 × 10^-7
Figure 2 Association of rs10830963 with type 2 diabetes (T2D) in 13 case-control studies.
VOLUME 41 | NUMBER 1 | JANUARY 2009 NATURE GENETICS
Meta-analysis of SNP rs10830963: Combining findings from multiple cohorts
Prokopenko, 2009
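To make the idea of combining findings from multiple cohorts concrete, here is a minimal sketch of a fixed-effect, inverse-variance meta-analysis applied to the 13 odds ratios in the forest plot. It assumes the study labels, odds ratios, and confidence limits line up in the order they were extracted from the figure, and it recovers each standard error from the 95% interval on the log scale. Because the inputs are rounded, the output should land close to, but not exactly on, the published pooled estimate of 1.09 (1.05, 1.12) and I^2 of 26.6%.

```python
import math

# (study, odds ratio, lower 95% limit, upper 95% limit), read from the forest plot.
STUDIES = [
    ("DGI", 1.12, 0.96, 1.30),
    ("FUSION", 1.20, 1.03, 1.39),
    ("WTCCC", 1.07, 0.95, 1.20),
    ("deCODE", 1.14, 1.03, 1.27),
    ("KORA", 1.00, 0.84, 1.19),
    ("Rotterdam", 1.17, 1.04, 1.30),
    ("CCC", 1.07, 0.88, 1.31),
    ("ADDITION/ELY", 1.16, 1.02, 1.33),
    ("Norfolk", 1.00, 0.90, 1.10),
    ("UKT2DGC", 1.03, 0.96, 1.10),
    ("OxGN/58BC", 0.91, 0.75, 1.10),
    ("FUSION Stage 2", 1.15, 1.02, 1.30),
    ("METSIM", 1.16, 1.03, 1.30),
]


def inverse_variance_meta(studies, z_crit=1.96):
    """Fixed-effect inverse-variance pooling of log odds ratios."""
    betas, weights = [], []
    for _, odds_ratio, lower, upper in studies:
        beta = math.log(odds_ratio)
        # Standard error recovered from the width of the 95% CI on the log scale.
        se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
        betas.append(beta)
        weights.append(1.0 / se ** 2)

    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q and the I^2 heterogeneity statistic.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    df = len(studies) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    z = pooled_beta / pooled_se
    return {
        "or": math.exp(pooled_beta),
        "ci_low": math.exp(pooled_beta - z_crit * pooled_se),
        "ci_high": math.exp(pooled_beta + z_crit * pooled_se),
        "p_value": math.erfc(abs(z) / math.sqrt(2)),  # two-sided normal P
        "i_squared": i_squared,
        "weights_pct": [100 * w / total_weight for w in weights],
    }


result = inverse_variance_meta(STUDIES)
print(f"Pooled OR = {result['or']:.2f} "
      f"({result['ci_low']:.2f}, {result['ci_high']:.2f}), "
      f"I^2 = {result['i_squared']:.1f}%")
```

The design point is that each study contributes in proportion to its precision: the study with the narrowest interval (UKT2DGC) carries the largest weight, which is why no single small cohort dominates the pooled estimate.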
  • 48. ARTICLES By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combined P < 5 × 10^-8. These include a second independent signal at the KCNQ1 locus; the first report, to our knowledge, of an X-chromosomal association (near DUSP9); and a further instance of overlap between loci implicated in monogenic and multifactorial forms of diabetes (at HNF1A). The identified loci affect both beta-cell function and insulin action, and, overall, T2D association signals show evidence of enrichment for genes involved in cell cycle regulation. We also show that a high proportion of T2D susceptibility loci harbor independent association signals influencing apparently unrelated complex traits.
Type 2 diabetes (T2D) is characterized by insulin resistance and deficient beta-cell function1. The escalating prevalence of T2D and the limitations of currently available preventative and therapeutic options highlight the need for a more complete understanding of T2D pathogenesis. To date, approximately 25 genome-wide significant common variant associations with T2D have been described, mostly through genome-wide association (GWA) analyses2–13. The identities of the variants and genes mediating the susceptibility effects at most of these signals have yet to be established, and the known variants account for less than 10% of the overall estimated genetic contribution to T2D predisposition. Although some of the unexplained heritability will reflect variants poorly captured by existing GWA platforms, we reasoned that an expanded meta-analysis of existing GWA data would
the inverse-variance method (Online Methods, Fig. 1, Supplementary Tables 1 and 2 and Supplementary Note). We observed only modest genomic control inflation (λgc = 1.07), suggesting that the observed results were not due to population stratification. After removing SNPs within established T2D loci (Supplementary Table 3), the resulting quantile-quantile plot was consistent with a modest excess of disease associations of relatively small effect (Supplementary Note). Weak evidence for association at HLA variants strongly associated with autoimmune forms of diabetes (Supplementary Table 3 and Supplementary Note) suggested some case admixture involving subjects with type 1 diabetes or latent autoimmune diabetes of adulthood; however, failure to detect T2D associations at other non-HLA type 1 diabetes susceptibility loci (for example, INS, PTPN22 and
Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis
Voight, 2010
Meta-analyses for T2D: N>40K and 90K identifies >30 loci among 2,400,000 SNPs
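The genomic control inflation factor mentioned above is commonly computed as the median of the observed 1-df chi-square association statistics divided by the median expected under the null (about 0.455). The sketch below illustrates that calculation on simulated statistics, since the study's actual test statistics are not in the slides; the function name and the simulated data are assumptions made for this example.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats):
    """Genomic control inflation factor: observed median / expected null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Simulated null statistics: the square of a standard normal is chi-square (1 df).
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(20000)]
lambda_null = genomic_control_lambda(null_stats)

# Uniformly inflating every statistic by 7% mimics mild stratification.
inflated_stats = [1.07 * s for s in null_stats]
lambda_inflated = genomic_control_lambda(inflated_stats)

print(f"lambda (null):     {lambda_null:.3f}")
print(f"lambda (inflated): {lambda_inflated:.3f}")
```

A value near 1 suggests the bulk of the test statistics behave as expected under the null, so a modest value such as 1.07 is read as little evidence of confounding from population stratification.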
  • 49. ARTICLES 13 autosomal loci exceeded the threshold for genome-wide significance (P ranging from 2.8 × 10^-8 to 1.4 × 10^-22) with allele-specific odds
(r^2 < 0.05), and conditional analyses (see below) establish these SNPs as independent (Fig. 2 and Supplementary Table 4). Further analysis
[Figure 1 plot: -log10(P) by chromosome (1–22 and X); top panel, unconditional analysis; bottom panel, conditional analysis. Legend: locus established previously; locus identified by current study; locus not confirmed by current study; suggestive statistical association (P < 1 × 10^-5); association in identified or established region (P < 1 × 10^-4). Labelled loci: BCL11A, THADA, NOTCH2, ADAMTS9, IRS1, IGF2BP2, WFS1, ZBED3, CDKAL1, HHEX/IDE, KCNQ1 (2 signals*), TCF7L2, KCNJ11, CENTD2, MTNR1B, HMGA2, ZFAND6, PRC1, FTO, HNF1B, DUSP9, TSPAN8/LGR5, HNF1A, CDC123/CAMK1D, CHCHD9, CDKN2A/2B, SLC30A8, TP53INP1, JAZF1, KLF14, PPAR.]
Figure 1 Genome-wide Manhattan plots for the DIAGRAM+ stage 1 meta-analysis. Top panel summarizes the results of the unconditional meta-analysis. Previously established loci are denoted in red and loci identified by the current study are denoted in green. The ten signals in blue are those taken forward but not confirmed in stage 2 analyses. The genes used to name signals have been chosen on the basis of proximity to the index SNP and should not be presumed to indicate causality. The lower panel summarizes the results of equivalent meta-analysis after conditioning on 30 previously established and newly identified autosomal T2D-associated SNPs (denoted by the dotted lines below these loci in the upper panel). Newly discovered conditional signals (outside established loci) are denoted with an orange dot if they show suggestive levels of significance (P < 10^-5), whereas secondary signals close to already confirmed T2D loci are shown in purple (P < 10^-4).
Meta-analyses for T2D: N>40K and 90K identifies >30 loci among 2,400,000 SNPs
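The genome-wide significance threshold and the Manhattan plot's y axis are both easy to reproduce by hand. The sketch below assumes the conventional reading of P < 5 × 10^-8 as a Bonferroni correction of 0.05 for roughly one million independent common-variant tests; that interpretation is a widely used convention rather than something stated in the slides, and the function names are made up for this example.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


def manhattan_height(p_value):
    """Height of a SNP on a Manhattan plot: -log10 of its P-value."""
    return -math.log10(p_value)


# Conventional genome-wide threshold: 0.05 over ~1 million independent tests.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# A stricter threshold if every one of the 2,400,000 SNPs on the slide
# were treated as an independent test.
all_snps = bonferroni_threshold(0.05, 2_400_000)

print(f"genome-wide threshold: {genome_wide:.1e} "
      f"(plot height {manhattan_height(genome_wide):.2f})")
print(f"2.4M-SNP threshold:    {all_snps:.2e}")
```

On the plot, a SNP clears genome-wide significance when its point sits above a height of about 7.3.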
  • 50. [Figure: stage 1 regional association plots for the SLC30A8 region (index SNP rs3802177, chromosome 8, about 117–120 Mb), the CDKN2A/B region (index SNP rs10965250, chromosome 9, about 21–24 Mb), the CDC123/CAMK1D region (index SNP rs12779790) and the HHEX/IDE region (index SNP rs5015480). Each panel shows -log10(P-value) and recombination rate (cM/Mb) against chromosome position, with SNPs shaded by r^2 to the index SNP (0.8–1.0, 0.6–0.8, 0.4–0.6, 0.2–0.4, 0.0–0.2, or missing) and nearby genes marked below.]
Not in a gene... In a gene...
~90% of GWAS hits are non-coding!
  • 51. ~90% of GWAS hits are non-coding! Stamatoyannopoulos, Science 2012. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Matthew T. Maurano,1* Richard Humbert,1* Eric Rynes,1* Robert E. Thurman,1 Eric Haugen,1 Hao Wang,1 Alex P. Reynolds,1 Richard Sandstrom,1 Hongzhu Qu,1,2 Jennifer Brody,3 Anthony Shafer,1 Fidencio Neri,1 Kristen Lee,1 Tanya Kutyavin,1 Sandra Stehling-Sun,1 Audra K. Johnson,1 Theresa K. Canfield,1 Erika Giste,1 Morgan Diegel,1 Daniel Bates,1 R. Scott Hansen,4 Shane Neph,1 Peter J. Sabo,1 Shelly Heimfeld,5 Antony Raubitschek,6 Steven Ziegler,6 Chris Cotsapas,7,8 Nona Sotoodehnia,3,9 Ian Glass,10 Shamil R. Sunyaev,11 Rajinder Kaul,4 John A. Stamatoyannopoulos1,12†
Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure–related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.
Disease- and trait-associated genetic variants are rapidly being identified with genome-wide association studies (GWAS) and related strategies (1). To date, hundreds of GWAS have been conducted, spanning diverse diseases and quantitative phenotypes (2) (fig. S1A). However, the majority (~93%) of disease- and trait-associated variants emerging from these studies lie within noncoding sequence (fig. S1B), complicating their functional evaluation. Several lines of evidence suggest the involvement of a proportion of such variants in transcriptional regulatory mechanisms, including modulation of promoter and enhancer elements (3–6) and enrichment within expression quantitative trait loci (eQTL) (3, 7, 8).
Human regulatory DNA encompasses a variety of cis-regulatory elements within which the cooperative binding of transcription factors creates focal alterations in chromatin structure. Deoxyribonuclease I (DNase I) hypersensitive sites (DHSs) are sensitive and precise markers of this actuated regulatory DNA, and DNase I mapping has been instrumental in the discovery and census of human cis-regulatory elements (9). We performed DNase I mapping genome-wide (10) in 349 cell and tissue samples, including 85 cell types studied under the ENCODE Project (10) and 264 samples studied under the Roadmap Epigenomics Program (11). These encompass several classes [...] nome. In total, we identified 3,899,693 distinct DHS positions along the genome (collectively spanning 42.2%), each of which was detected in one or more cell or tissue types (median = 5).
Disease- and trait-associated variants are concentrated in regulatory DNA. We examined the distribution of 5654 noncoding genome-wide significant associations [5134 unique single-nucleotide polymorphisms (SNPs); fig. S1 and table S2] for 207 diseases and 447 quantitative traits (2) with the deep genome-scale maps of regulatory DNA marked by DHSs. This revealed a collective 40% enrichment of GWAS SNPs in DHSs (fig. S1C, P < 10^-55, binomial, compared to the distribution of HapMap SNPs). Fully 76.6% of all noncoding GWAS SNPs either lie within a DHS (57.1%, 2931 SNPs) or are in complete linkage disequilibrium (LD) with SNPs in a nearby DHS (19.5%, 999 SNPs) (Fig. 1A) (12). To confirm this enrichment, we sampled variants from the 1000 Genomes Project (13) with the same genomic feature localization (intronic versus intergenic), distance from the nearest transcriptional start site, and allele frequency in individuals of European ancestry. We confirmed significant enrichment both for SNPs within DHSs (P < 10^-59, simulation) and also including variants in complete LD (r^2 = 1) with SNPs in DHSs (P < 10^-37, simulation) (fig. S2).
In total, 47.5% of GWAS SNPs fall within gene bodies (fig. S1B); however, only 10.9% of intronic GWAS SNPs within DHSs are in strong LD (r^2 ≥ 0.8) with a coding SNP, indicating that the vast majority of noncoding genic variants are not simply tagging coding sequence. Analogously, only 16.3% of GWAS variants within coding sequences are in strong LD with variants in DHSs. SNPs on widely used genotyping arrays (e.g., Affymetrix) were modestly enriched within DHSs (fig. S2), possibly due to selection of SNPs with robust experimental performance in genotyping assays. However, we found no evidence for sequence composition bias (table S3). To further examine the enrichment of GWAS SNPs in regulatory DNA, we systematically classified all noncoding GWAS SNPs by the quality [...]
Affiliations: 1 Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA. 2 Laboratory of Disease Genomics [...]
  • 52. There have been few, if any, similar bursts of discovery in the history of medical research. David Hunter and Peter Kraft, NEJM, 2007
  • 53. Common claims discussed with regard to GWAS: despite issues, GWAS has yielded many discoveries vs. cost. "[...] to a doubling of the number of associated variants discovered. The proportion of genetic variation explained by significantly associated SNPs is usually low (typically less than 10%) for many complex traits, but for diseases such as CD and multiple sclerosis (MS [MIM 126200]), and for quantitative traits such as height and lipid traits, between [...]" Figure 1. GWAS Discoveries over Time. Data obtained from the Published GWAS Catalog (see Web Resources). Only the top SNPs representing loci with association p values < 5 × 10^-8 are included, and so that multiple counting is avoided, SNPs identified for the same traits with LD r^2 > 0.8 estimated from the entire HapMap samples are excluded. Five years of GWAS Discovery (Visscher, 2012). ~500,000 SNP chips × ~$500/chip = $250M. $250M / ~2,000 loci = $125K/locus. Candidate genes: >$250M! 100 NIH R01s. Fighter jet. Hadron Collider: $9B.
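The back-of-the-envelope cost estimate on this slide can be reproduced in a few lines of Python. This is only a sketch: the variable names are illustrative, and the inputs are the slide's approximate figures (about 500,000 chips, about $500 per chip, about 2,000 loci), not audited costs.

```python
# Rough GWAS cost-per-discovery estimate from the slide's approximate figures.
n_chips = 500_000      # ~500,000 SNP chips genotyped (slide's estimate)
cost_per_chip = 500    # ~$500 per chip (slide's estimate)
n_loci = 2_000         # ~2,000 associated loci discovered (slide's estimate)

total_cost = n_chips * cost_per_chip    # total genotyping spend, in dollars
cost_per_locus = total_cost / n_loci    # dollars per discovered locus

print(f"Total cost: ${total_cost / 1e6:.0f}M")
print(f"Cost per locus: ${cost_per_locus / 1e3:.0f}K")
```

Changing any of the three inputs shows how sensitive the per-locus figure is to the assumed chip price and locus count.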
  • 54. P = G + E. Phenotype (P): Type 2 Diabetes, Cancer, Alzheimer's, Gene expression. Genome (G): Variants. Environment (E): Infectious agents, Nutrients, Pollutants, Drugs. Complex traits are a function of genes and environment...
  • 55. Nothing comparable to elucidate E influence! We lack high-throughput methods and data to discover new E in P... E: ???
  • 56. A similar paradigm for discovery should exist for E! Why?
  • 57. $\sigma^2_P = \sigma^2_G + \sigma^2_E$
  • 58. $H^2 = \sigma^2_G / \sigma^2_P$. Heritability ($H^2$) is the proportion of phenotypic variance attributed to genetic variance in a population. It indicates how much of the phenotypic differences between individuals is attributable to G.
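As an illustration of the variance ratio behind heritability, the sketch below simulates a phenotype as the sum of independent genetic and environmental components and computes the genetic share of the phenotypic variance. Everything here is an assumption made for the example: the function name `simulate_heritability`, the normal distributions, the variances 0.6 and 0.4, and the independence of G and E. In real studies G is not observed directly, so heritability has to be estimated indirectly, for example from relatives.

```python
import random
import statistics


def simulate_heritability(n=20_000, var_g=0.6, var_e=0.4, seed=1):
    """Simulate P = G + E with independent G and E; return var(G) / var(P)."""
    rng = random.Random(seed)
    # Hypothetical genetic and environmental contributions (standard deviations
    # are the square roots of the assumed variances).
    g = [rng.gauss(0.0, var_g ** 0.5) for _ in range(n)]
    e = [rng.gauss(0.0, var_e ** 0.5) for _ in range(n)]
    # Phenotype is the sum of the two components.
    p = [gi + ei for gi, ei in zip(g, e)]
    # Heritability: genetic variance as a share of phenotypic variance.
    return statistics.pvariance(g) / statistics.pvariance(p)


h2 = simulate_heritability()
print(f"Estimated H^2 = {h2:.2f}")
```

With these assumed variances the estimate should land close to 0.6, since the genetic variance makes up 60% of the simulated total.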
  • 59. Height is an example of a heritable trait: Francis Galton shows how it's done (1887): "mid-height of 205 parents described 60% of variability of 928 offspring"