The CIMMYT Global Maize Program: Progress and Challenges

Gary Atlin and the GMP team

El Batan
22 June 2012

Outline

1. The role of GMP in the world’s maize seed system
2. How do our products compare to those of the multi-nationals?
3. Adaptation to mega-environments: implications for breeding
4. The role of managed stress testing in the breeding pipeline
5. Identifying donors and delivering markers for abiotic and biotic stress
tolerance
6. Applying high density genotyping to maize breeding and managing
the “data tsunami”
7. An “open-source” model for delivering the benefits of high-density
genotyping and genomic selection to small breeding programs
8. Some things to watch out for

1. CIMMYT’s role in the world’s
maize seed system
• Only source of freely available maize parental lines
• Our products support dozens of independent regional seed companies in Africa, Latin America, and Asia
• Our products help local companies compete with multinationals
• We provide direct support to seed companies in the commercialization of our hybrids (DTMA, IMIC)
• We are a key source of donors of drought tolerance and disease resistance

CIMMYT’s maize breeding effort

• Africa: 5 line development breeders, 2 molecular breeders, 4
seed specialists, 1 physiologist, 2 biotic stress specialists
• Latin America: 4 line development breeders, 1 physiologist, 1
nutritional specialist, 1 molecular breeder, 1 seed specialist
• India: 1 line development breeder, 1 physiologist
• China: 1 molecular breeding lead, 1 pathologist/breeder, 1
bioinformaticist, 2 molecular geneticists
• ca. 10,000 lines genotyped with 500K SNPs via GBS
• ca. 5000 DH lines produced in 2011 in-house
• ca. 400,000 nursery and yield plots world-wide*
• At least 2 million phenotypic data points annually
• At least 25 billion genotypic data points annually

How do our products get to
farmers?
• Hybrids are marketed mainly through
regional seed companies
• OPVs are distributed mainly through
national subsidy schemes

2. Where do we stand relative to the
multinationals?
Latin American tropics: PCCMCA trial 2011 (21 locations)

Hybrid        Pedigree                   Grain yield (t/ha)  Bad husk cover (%)  Ear rot (%)
MJ-9297       -                          8.08                7.2                 5.2
MH-9058       -                          8.02                5.1                 6.8
DK-357        Best commercial check      7.72                8.0                 9.8
CIMMYT-2      CLRCW100/CLRCW96//CML494   7.68                5.2                 9.5
CIMMYT-4      CML491/CLQ6316//CLRCWQ48   7.36                5.8                 10.8
P4092W        -                          6.93                3.3                 6.5
P4063W        -                          6.79                4.8                 8.0

Heritability                             0.91                0.89                0.91
LSD (0.05)                               0.29                1.6                 1.7

Regional tropical hybrid trial (PCCMCA), 2009: 28 public and private
sector hybrids, 18 locations in Mexico and Central America

Pedigree                           Yield  Ear rot (%)  Root lodging (%)  Stem lodging (%)
P-4082W                            7.25   7.45         3.42              5.27
DK-357                             6.99   9.80         5.29              3.53
(CML-264/CML-269)/CML494           6.64   8.67         4.11              6.00
MG9051                             6.56   11.44        3.60              3.57
P-4081W                            6.43   9.63         1.42              9.13
(CLQ-RCWQ26/CLQ-RCWQ108)/CML-491   6.23   6.69         8.13              10.16
NC 7218                            6.20   12.84        1.84              1.24
INIFAP check, Mexico               5.43   9.01         9.38              5.49

LSD (0.05)                         0.52   3.93         5.15              4.66
Repeatability                      0.94   0.67         0.53              0.47

Validation trials, 18 locations México, 2010
Hybrid                                      Yield  Bad husk cover (%)  Ear rot (%)  Root lodging (%)  Stem lodging (%)
(CML269/CML264)//CML494                     7.033  11.70               5.48         3.49              5.52
P4082W                                      6.299  10.55               6.37         3.78              2.59
(CLG2312/CML495)//CML494                    6.457  8.46                7.11         3.60              5.10
H565                                        6.294  8.74                10.14        1.82              3.45
H561                                        6.134  4.90                6.78         4.36              5.76
H564C                                       5.899  4.94                11.71        1.74              5.17
(LPS C7 F64-2-6-2-2-BBB / CML-495)//CML494  5.552  9.61                8.55         2.64              2.73
H520                                        5.091  6.64                8.70         8.37              5.73

No. of locs.                                18     11                  13           13                13
LSD                                         0.62   7.60                2.84         2.69              2.94
Repeatability                               0.86   0.00                0.78         0.80              0.41

Regional on-farm trials in ESA (2010/11 season)
Name             GY (locations > 3 t/ha)  GY (locations < 3 t/ha)  Days to anthesis
CZH0616          5.96                     2.37                     64.4
CZH0946          4.48                     2.22                     56.8
CZH0837          5.62                     2.08                     62.7
SC627            4.82                     2.03                     64.6
SC403            4.61                     2.03                     57.8
ZM627            4.74                     1.90                     65.0
ZM309            4.09                     1.74                     55.1
ZM521            4.27                     1.73                     59.6
SC513            4.76                     1.60                     64.0
Farmers Variety  4.64                     1.54                     64.7
Pan53            5.39                     1.51                     64.5

Mean             4.87                     1.80                     62.30

n                30                       19
H                0.80                     0.72

Mean yield of CIMMYT hybrids in the 2005 and 2010 Early and
Intermediate Regional Hybrid Trials (EIHYB) for Southern Africa

                                        Optimal    Optimal    Managed
                                        < 3 t/ha   > 3 t/ha   drought + low N

CIMMYT hybrid mean, % of checks, 2005   102.3      104.3      86.9
CIMMYT hybrid mean, % of checks, 2010   101.9      104.8      107.2

Mean of checks 2005 (t/ha)              1.73       4.63       1.30
Mean of checks 2010 (t/ha)              2.11       6.24       2.09

No. of trials 2005                      6          14         6
No. of trials 2010                      7          29         6

Mean yield of CIMMYT hybrids in the 2005 and 2010
Intermediate and Late Regional Hybrid Trials (ILHYB) for
Southern Africa.

                                        Optimal    Managed
                                        > 3 t/ha   drought + low N

CIMMYT hybrid mean, % of checks, 2005   92.0       88.0
CIMMYT hybrid mean, % of checks, 2010   94.5       101.8

Mean of checks 2005                     6.08       1.57
Mean of checks 2010                     7.29       2.08

No. of trials 2005                      15         6
No. of trials 2010                      24         7

So, overall, where do we stand?
1. In Latin America, our materials compete with the best
multinational products, but we are not ahead
• Low-cost three-way and double crosses are
competitive!

2. In ESA, our materials are superior in low-yield, short-
duration locations. We are equivalent or ahead in high-
yield locations

3. Investment by MNSCs is increasing in the tropics. We
need to increase our rates of gain, especially in
favorable rainfed environments

3. Adaptation to mega-environments:
implications for breeding

1. Within and across huge regions, there is little local
adaptation that is not explained by local diseases,
elevation, and rainfall

- Breeding programs in Eastern and Southern Africa must
be fully integrated
- Germplasm moves easily from one continent to another
- We need efficient methods for transferring resistances to
adaptive diseases
- This means we need markers linked to QTLs!
- This means we need a marker-development pipeline!

Retrospective analysis in EIHYB and
ILHYB
Years: 2001-2009
Genotypes: 448
(24-65/year)
Maturity: early and late
513 trials with h² > 0.15 in
17 countries
α-lattice design with 3
reps

Weber et al. (2012a, b), Crop Science

Subdivision strategies of the TPE

Subdivision        Typical environment
Climate            A: Mid altitude, humid warm
                   B: Mid altitude, humid hot
                   C: Mid altitude, dry
                   D: Lowland, tropical humid
                   E: Lowland, tropical dry

Yield level        Low-yielding subregion, < 3 t ha-1
                   High-yielding subregion, ≥ 3 t ha-1

Geographic region  East
                   South

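Bänziger et al., 2006

Variance components estimated for each subdivision system: $\sigma^2_{g}$ (VG), $\sigma^2_{gs}$ (VGS), $\sigma^2_{gy(s)}$ (VGY(S)), $\sigma^2_{ge(ys)}$ (VGE(YS))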

Variance components of maize grain yield in five different
subdivision systems of the undivided target population of
environments from 2001 to 2009: Southern Africa.

Early maturity group (n=219) †
Subdivision        VG         VGS        VGY(S)     VGE(YS)    VE
Climate            0.18±0.10  0.01±0.01  0.06±0.08  0.32±0.09  0.56±0.09
Altitude           0.15±0.09  0.01±0.01  0.07±0.10  0.33±0.09  0.56±0.09
Yield level        0.09±0.04  0.05±0.05  0.08±0.12  0.30±0.09  0.56±0.10
Geographic region  0.19±0.09  0.00±0.00  0.06±0.12  0.33±0.09  0.57±0.10
Country            0.21±0.11  0.01±0.01  0.06±0.07  0.30±0.09  0.57±0.11

Rank changes over yield levels in the
2011 Southern African regional trial

Top 10 of 54 entries in 14 high-yield trials and 9 low-yield trials

Rank        All trials  High yield trials  Low yield trials
1           PEX 501     PEX 501            CZH1033
2           SC535       X7A344W            CZH0935
3           AS113       AS113              CZH1036
4           X7A344W     SC535              CZH0928
5           AS115       AS115              CZH1031
6           013WH63     CZH0923            CZH0946
7           CZH0935     013WH63            CZH1030
8           CZH0923     013WH29            AS115
9           CZH1036     CZH0935            013WH63
10          013WH29     CZH1036            CZH0831

Mean yield  4.81        6.51               2.17
H           0.88        0.89               0.75
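
The rank changes in the table above can be summarized by counting how many entries two top-10 lists share. The sketch below is only an illustration: the entry names are copied from the table, while the helper name `shared_entries` is hypothetical.

```python
# Top-10 entries copied from the 2011 Southern African regional trial table.
top10 = {
    "all": ["PEX 501", "SC535", "AS113", "X7A344W", "AS115",
            "013WH63", "CZH0935", "CZH0923", "CZH1036", "013WH29"],
    "high": ["PEX 501", "X7A344W", "AS113", "SC535", "AS115",
             "CZH0923", "013WH63", "013WH29", "CZH0935", "CZH1036"],
    "low": ["CZH1033", "CZH0935", "CZH1036", "CZH0928", "CZH1031",
            "CZH0946", "CZH1030", "AS115", "013WH63", "CZH0831"],
}


def shared_entries(list_a, list_b):
    """Return the entries found in both lists, in the order of list_a."""
    in_b = set(list_b)
    return [entry for entry in list_a if entry in in_b]


all_high = shared_entries(top10["all"], top10["high"])
all_low = shared_entries(top10["all"], top10["low"])
high_low = shared_entries(top10["high"], top10["low"])

print(len(all_high), len(all_low), len(high_low))  # 10 4 4
print(high_low)  # ['AS115', '013WH63', 'CZH0935', 'CZH1036']
```

The across-trial top 10 is identical in membership to the high-yield top 10, while only four entries appear in both the high-yield and low-yield lists.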

Correlations among yield levels in the 2011 Southern African regional trial
(54 entries in 14 high-yield trials and 9 low-yield trials)

        All    High
High    0.97
Low     0.57   0.36

Some important points about maize hybrid
adaptation:

2. Genotype x trial interaction and field “noise” are
huge constraints on precision of screening

- Large multi-location testing networks drive gains
- Genotype x trial interaction and plot-to-plot variability are
greater in managed stress trials than in optimally managed
trials
- Too much weight on low-H managed stress trials can
reduce gains

2
 g2ge
Means, variances, and H for ESA regional trials conducted
under optimal, managed drought (MD), low N, and random
abiotic stress* (RAB) 2001-9

Test No. Grain VG VGE VE Predicted H for testing
environment of yield in:
trials (t ha-1)
5 trials 20 trials
Int-late trials
Optimal 175 6.26 22.2 22.4 55.3 0.68 0.92
RAB 63 1.73 10.4 18.2 71.5 0.38 0.83
MD 22 2.11 17.6 15.7 66.7 0.49 0.90
Low-N 34 1.82 15.7 15.3 68.9 0.49 0.89
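
To illustrate why adding trials raises H, the sketch below uses the textbook entry-mean formula H = VG / (VG + VGE/e + VE/(e·r)) for e trials and r replicates. This is an assumption, not the model stated on the slide: the slide does not give the replication or the exact formula behind its predicted H values, so the numbers printed here need not match the table. The function name and `N_REPS` are hypothetical.

```python
def entry_mean_heritability(vg, vge, ve, n_trials, n_reps):
    """Textbook H of entry means: VG / (VG + VGE/e + VE/(e*r))."""
    return vg / (vg + vge / n_trials + ve / (n_trials * n_reps))


# Variance components copied from the Int-late trials table (VG, VGE, VE).
components = {
    "Optimal": (22.2, 22.4, 55.3),
    "RAB": (10.4, 18.2, 71.5),
    "MD": (17.6, 15.7, 66.7),
    "Low-N": (15.7, 15.3, 68.9),
}
N_REPS = 2  # hypothetical: the replication behind the slide's values is not stated

for env, (vg, vge, ve) in components.items():
    h5 = entry_mean_heritability(vg, vge, ve, n_trials=5, n_reps=N_REPS)
    h20 = entry_mean_heritability(vg, vge, ve, n_trials=20, n_reps=N_REPS)
    print(f"{env:8s} H(5 trials) = {h5:.2f}   H(20 trials) = {h20:.2f}")
```

Under this formula, both the interaction and error terms shrink as trials are added, which is the sense in which large multi-location testing networks drive gains.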

Managing field variation: developing
comprehensive field maps
[Figure: field maps from EM38, penetrometer, and NDVI surveys; sites shown are Kiboko, Chiredzi, and Harare; colour scale: soil penetration resistance (MPa)]

4. The role of managed stress testing in
the breeding pipeline

PH Zaidi, CIMMYT

Managed stress
screening
Notable border effect
indicates N depletion was
successful

60-80% yield
reduction
targeted for
both low N and
drought

Managed stress screening over 30
years led to the development of
the world’s most drought tolerant
maize germplasm

Edmeades, Lafitte, Bolaños, Bänziger

Pedigree selection for drought tolerance by CIMMYT
in eastern and southern Africa: Stage 1 evaluation

Management Season Sites Weight

Optimal Main 3-5 ?

Managed low N Main 1 ?

Managed drought Dry 1 ?

3000+ genotypes per year in Stage I testcross evaluation
Screens weighted based on their (assumed) importance in the target
environment (= southern and eastern Africa)

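We select in selection environments (SE) to
make gains in the target population of
environments (TPE) via correlated response:

$$CR_{(TPE-SE)} = i \, r_{G(SE-TPE)} \sqrt{H_{SE}} \, \sigma_{P(SE)}$$

where $i$ is the selection intensity, $r_{G(SE-TPE)}$ is the genetic correlation between the SE and the TPE, and $H_{SE}$ is the heritability in the SE.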

Using managed-stress data to improve breeding
gains is complicated!

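[Diagram: stress and non-stress environments in the SE and the TPE, connected by the genetic correlations rGSS, rGSN, rGNS, rGNN, and rG(SE); heritabilities Hstress and Hnonstress]

Hnonstress > Hstress
All of the rG’s are positive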

Genetic correlations for yield between low-N and random abiotic
stress (RAB) target environments and optimal, managed drought,
and low-N selection environments: ESA 2001-9
Selection environment Random abiotic stress*

Genetic correlation
Early maturity group
Optimal 0.80
Managed drought 0.64
Low-N 0.91

Late maturity group
Optimal 0.75
Managed drought 0.76
Low-N 0.90
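
One way to read these correlations is through the standard ratio of correlated to direct response, CR/R = rG · √(H_SE / H_TPE), which holds when selection intensity is the same in both cases. The sketch below applies it to the late maturity group with RAB as the target. Pairing these genetic correlations with the predicted H for 5 trials from the Int-late trials table is an assumption made here for illustration, as is the function name.

```python
import math


def relative_efficiency(r_g, h_se, h_tpe):
    """CR/R = r_G * sqrt(H_SE / H_TPE), assuming equal selection intensity."""
    return r_g * math.sqrt(h_se / h_tpe)


# Genetic correlations with the RAB target, late maturity group (table above).
r_g = {"Optimal": 0.75, "Managed drought": 0.76, "Low-N": 0.90}
# Predicted H for 5 trials from the Int-late trials table (assumed pairing).
h_se = {"Optimal": 0.68, "Managed drought": 0.49, "Low-N": 0.49}
H_RAB = 0.38  # predicted H for 5 RAB trials, same table

efficiency = {env: relative_efficiency(r_g[env], h_se[env], H_RAB) for env in r_g}
for env, value in efficiency.items():
    print(f"{env:16s} CR/R = {value:.2f}")
```

A ratio near 1 means indirect selection in that environment is predicted to be about as effective as selecting directly under random abiotic stress. The higher heritability of optimal trials partly offsets their lower genetic correlation with the target.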

5. Success in identifying donors for
abiotic and biotic stress tolerance

• A massive effort has been undertaken by the
breeders and physiologists to characterize AM sets
to identify donors for drought, heat, and low N
tolerance

• George has established a large hot-spot screening
network to characterize donors for MSV, GLS,
turcicum, tar spot, rust, ear rots

• Sudha and Babu have implemented a pipeline for
developing breeder-ready markers.

• MSV is in validation now

CIMMYT donors of drought and heat tolerance identified through
screening in multiple environments in Mexico, Africa, and Asia

Grain yield (t ha-1) under drought, drought + heat, and well-watered conditions

Pedigree                        Colour  Texture  Drought  Drought + heat  Well-watered
DTPWC9-F24-4-3-1                White   Flint    3.10     1.43            6.97
DTPYC9-F46-1-2-1-1-2            Yellow  Flint    3.07     1.58            7.12
La Posta Sequia C7-F64-2-6-2-2  White   Flint    3.06     1.39            7.72

Check (CML442/CML444)           -       -        2.36     0.96            7.70

Number of locations                              7        3               7
H                                                0.64     0.50            0.84
Trial mean                                       2.58     1.13            6.88

Finally on the DTMA website!
…but these lines are at least 15 years old!

Best-bet sources of disease resistance (G. Mahuku)
Mean disease rating (1-5)

Stock ID  GLS       MSV       NCLB       Rust      Pedigree
          (6 locs)  (3 locs)  (12 locs)  (5 locs)
DTMA-3    1.43      1.12      1.74       1.30      [(CML395/CML444)-B-4-1-3-1-B/CML395//DTPWC8F31-1-1-2-2]-5-1-2-2-BB
DTMA-10   2.06      1.60      1.67       2.13      CIMCALI8843/S9243-BB-#-B-5-1-BB-2-3-4
DTMA-17   1.87      1.12      1.80       1.59      [CML312/CML445//[TUXPSEQ]C1F2/P49-SR]F2-45-3-2-1-BBB]-1-2-1-1-2-BBB-B
DTMA-90   2.24      2.37      2.50       1.59      CML311/MBR C3 Bc F112-1-1-1-B-B-B-B-B
DTMA-146  2.25      2.45      1.94       1.71      [CML-384 X CML-176]F3-107-3-1-1-B-B-B
DTMA-268  2.25      2.23      1.99       1.58      La Posta Sequia C7-F33-1-2-1-B-B
DTMA-293  2.50      2.35      2.33       2.43      La Posta Seq C7-F153-1-1-1-2-B-B-B
DTMA-40   2.01      2.03      1.70       1.52      [CML144/[CML144/CML395]F2-8sx]-1-2-3-2-B*5
DTMA-19   2.20      1.61      1.77       1.23      [CML312/CML445//[TUXPSEQ]C1F2/P49-SR]F2-45-3-2-1-BBB]-1-2-1-1-1-BBB-B
DTMA-26   1.71      1.60      1.76       1.51      P502SRC0-F2-54-2-3-1-B

Association Mapping for Disease Resistance

[Figure panels: MSV, Harare 2010 data (heritability = 0.79); GLS, combined analysis (heritability = 0.6)]

Msv1: case study
• QTL mapping in three populations and identification of a consensus interval
• Initial Msv1 interval identified at about 75-132 Mb on chr 1
• Large F2 populations screened for the flanking markers of Msv1 and other QTLs
• QTL isogenic recombinants identified

[Figure: resistant (R) and susceptible (S) QTL isogenic recombinants genotyped at markers on Chr. 1 (Msv1), Chr. 3, Chr. 4, and Chr. 8. Markers shown: PZE01132220936, PHM14104_23, PZE0175698629, PZA00529_4, PZA02090_1, PZA03527_1, PZA02614_2, PZA03651_1]

• Phenotyping of recombinants under artificial disease pressure in field conditions at Harare and in IITA greenhouse facilities
• Association analysis in the DTMA panel with the 55K SNP chip and GBS genotypes identified SNP hits in the same interval
• The SNP hits and other markers in the interval were used in further linkage mapping on recombinants for fine-scale mapping
• The mapping confidence interval was reduced to 7 Mb
• 8 SNPs in this interval tested for validation in breeders’ populations
• Initial results are encouraging!
• Further reduction of the interval to a probable gene-based marker is expected with the recombinants in this interval

6. Applying high density genotyping to maize
breeding and managing the “data tsunami”

[Figure: a maize breeder facing the genotypic data tsunami (25 billion data points annually)]

Reduced representation sequencing for rapidly
genotyping highly diverse species

RJ Elshire, JC Glaubitz, Q Sun, JA Poland, K Kawamoto,
ES Buckler, and SE Mitchell

Institute for
Genomic Diversity http://www.maizegenetics.net/

Genotyping-by-sequencing (GBS)

[Figure: genomes are reduced to genome representations; polymorphisms within the sequenced fragments are scored as SNPs, e.g. ATGACATATCAG vs ATGAAATATCAG]

Main genotyping options used by
CIMMYT
Low density: KASPar uniplex assays through KBiosciences
• KBio uniplex SNP assays cost $20 each to develop
• CIMMYT has about 3000 and can share them
• KBio SNPs are used for low-density QTL mapping and for tracking
specific alleles (“forward breeding”) at ca. $0.10 per data point
($20/DNA sample for 200 markers)
- Heterozygote calls are easily made
High density: genotyping-by-sequencing (GBS)
• Used for GWAS, genomic selection, and soon forward breeding, at
$20/DNA sample for 500K+ markers
- ca. 50% missing data that must be imputed
- Heterozygotes are not easily called, but heterozygote calls
probably don’t matter for GS applications
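
The per-data-point costs implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above. The function name and the `call_rate` parameter are hypothetical, and treating the ca. 50% missing data as a call rate of 0.5 is an assumption made to show the cost per observed (non-imputed) data point.

```python
def cost_per_data_point(cost_per_sample_usd, markers_per_sample, call_rate=1.0):
    """Cost of one marker call, given the fraction of markers actually called."""
    return cost_per_sample_usd / (markers_per_sample * call_rate)


kasp = cost_per_data_point(20.0, 200)                      # low-density uniplex panel
gbs = cost_per_data_point(20.0, 500_000)                   # GBS, counting imputed calls
gbs_observed = cost_per_data_point(20.0, 500_000, 0.5)     # GBS, observed calls only

print(f"KASPar: ${kasp:.2f} per data point")
print(f"GBS:    ${gbs:.5f} per data point (${gbs_observed:.5f} before imputation)")
print(f"Ratio:  {kasp / gbs:,.0f}x")
```

At the same $20 per DNA sample, the two platforms differ by more than three orders of magnitude in cost per data point, even before imputation.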

Status of our breeding informatics effort
• All breeders, but not all phenotypers, are routinely generating
pedigrees in the IMIS database
• All lines have a Genotype Identification Number (GID) to link pedigree,
phenotypic data, and genotypic data
• We have no high-density genotype database. Relational databases do
not work with more than 100K data points per element. Flat files are
searched with custom scripts. New database systems are being
developed by Cornell
• We have mixed-model software for combined analysis available via
SAS and R scripts in Fieldbook, in routine use by breeders.
• Plan is for all lines entering replicated testing to be genotyped at high
density next year
• Statistical support is excellent, informatics support is inadequate

Current status of high-density genotyping
application in CIMMYT GMP
• All new CIMMYT lines have GID and are in IMIS
pedigree database
• Over 10000 breeding lines have been GBS’d by
the Cornell IGD
• Past phenotypic data are poorly linked to pedigree
and genotype data
• No database capable of storing and searching
500+K allele calls in place
• GS pipeline is conceptualized but not in place;
models are developed de novo for each GS
experiment

Where should we be in two years?

• Over half of breeding lines should be DH
• All lines entering replicated field trials should
be genotyped at high density
• All phenotypic data should be linked through
the GID to pedigree and genotype
• Imputation, allele calling, and prediction
pipeline should be delivering predictions to
breeders
• SAGA should be operational

Lessons from our experience with high-
density genotypic data
• As a rule of thumb, 25% of the PYs in a modern maize breeding
program in a MNSC are devoted to breeding informatics
• Breeding informatics and breeding pipeline teams must be
closely linked
• If you have no database, you have no molecular breeding
program
• Pedigree and phenotypic databases must be linked and in very
good condition
• Development teams are led by breeders or other agricultural
scientists, preferably with programming skills.
• Development scientists are the interface between breeders and
programmers
• These scientists do not manage breeding programs but are
devoted full-time to application development
• Support must be available in real time.

At Pioneer, molecular breeding scientists
support the adoption and use of new tools

[Diagram: line breeders 1, 2, and 3 are each supported by an MB scientist, who links them to application teams 1, 2, and 3]

What is genomic selection?
• Much research shows that the inheritance of quantitative traits like
yield in maize is controlled by many genes with small effects. QTL-
based breeding approaches do not work well for such traits
• Genomic selection (GS) is the selection of genotypes for
advancement or use as parents based on a high-density marker
genotype, rather than phenotype
• GS differs from older QTL-based breeding approaches in that it uses
all markers in a prediction of performance, the genomic estimated
breeding value (GEBV)
• Low-cost genotyping systems make selection based on high-density
markers feasible
• Bioinformatics requirements and breeding methods are complex
• Being used by multinational companies
• Networked approaches needed for small companies

Genomic selection systems can be used to:

- Discard unpromising lines based on genotype for
disease resistance, abiotic stress tolerance

- Predict the best lines within a full-sib family for
advancement of lines that have not been
phenotyped

- Drastically reduce breeding cycle time through the
use of recurrent selection schemes with selection
based on genotype rather than phenotype

Basic steps in the GS process:
1. A set of lines (training population) is genotyped at high density.
- These lines can be unselected testcrosses in the breeding
pipeline
2. Lines are phenotyped in testcross and/or per se.
3. Effects of markers or haplotype alleles are estimated.
4. Sum of marker effects in a line is the Genomic Estimated Breeding
Value (GEBV)
5. GEBVs are calculated on the next cohort of unselected lines and
used to predict their performance
6. GEBVs can be calculated for any trait for which the training
population has been phenotyped
7. Accuracy of the GEBV is expressed as the correlation between the
phenotype and the GEBV. Depends on population size, heritability,
marker number
8. The accuracy of a GEBV doesn’t need to be 1. It just needs to be
close to √H for the screening system
(see Heffner et al. 2009 Crop Sci. 49:1-12)
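
The steps above can be sketched on simulated data. The example below is a minimal illustration, not the pipeline used by CIMMYT: it assumes ridge regression as the way to estimate marker effects (step 3), codes DH lines as -1/+1, and uses hypothetical function names, population sizes, effect sizes, and ridge penalty throughout.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ridge_marker_effects(genotypes, phenotypes, ridge=1.0):
    """Step 3: estimate marker effects by ridge regression on centered phenotypes."""
    n_markers = len(genotypes[0])
    mean_y = sum(phenotypes) / len(phenotypes)
    centered = [y - mean_y for y in phenotypes]
    ztz = [[sum(row[j] * row[k] for row in genotypes) + (ridge if j == k else 0.0)
            for k in range(n_markers)] for j in range(n_markers)]
    zty = [sum(row[j] * c for row, c in zip(genotypes, centered))
           for j in range(n_markers)]
    return mean_y, solve_linear(ztz, zty)


def gebv(genotype, mean_y, effects):
    """Step 4: GEBV = training mean + sum of the marker effects in the line."""
    return mean_y + sum(g * b for g, b in zip(genotype, effects))


def pearson(x, y):
    """Step 7: accuracy as the correlation between GEBV and phenotype."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Simulated data (all values hypothetical): homozygous DH lines coded -1/+1.
rng = random.Random(1)
n_lines, n_markers = 120, 30
true_effects = [rng.gauss(0.0, 0.3) for _ in range(n_markers)]
lines = [[rng.choice((-1, 1)) for _ in range(n_markers)] for _ in range(n_lines)]
yields = [5.0 + sum(g * b for g, b in zip(line, true_effects)) + rng.gauss(0.0, 1.0)
          for line in lines]

train, cohort = slice(0, 60), slice(60, n_lines)  # steps 1-2, then step 5
mean_y, effects = ridge_marker_effects(lines[train], yields[train], ridge=5.0)
cohort_gebvs = [gebv(line, mean_y, effects) for line in lines[cohort]]
accuracy = pearson(cohort_gebvs, yields[cohort])
print(f"accuracy in the unselected cohort: {accuracy:.2f}")
```

The solver is written out in plain Python only to keep the sketch dependency-free. In practice a mixed-model or linear-algebra library would be used, and the penalty would be estimated from the data rather than fixed.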

Factors that affect GS accuracy

1. Relatedness between training and
selected populations

2. Training population size

3. Broad-sense heritability in the phenotyping
system used for model training

4. Marker density

Advantages of GS for stress-prone environments

• GS allows programs to select for traits for which they cannot
screen, if they can have access to haplotype effects from other
programs
• Breeding cycle times could be reduced five-fold, greatly increasing
gains
• Sharing haplotype effects permits novel and synergistic ways to
network small breeding programs
• GS networks could make available to NARS and SME breeding
programs tools, methods, and scale now only available to
multinationals

There are 3 main ways to use GS in cultivar development

1. Incorporate GEBVs into a conventional pedigree
breeding pipeline to discard lines with weaknesses.
• As the number of DH lines increases, we will need to discard many lines without
phenotyping, based on GEBV
• First use will be for defensive traits, with slightly higher H than yield.
• Breeder will receive a two-way table of GEBVs for all traits, and discard lines
predicted to have a serious weakness.
• Breeders will assess the reliability of predictions by comparing validation r
with √H achieved in field testing.
• To achieve gains, many more lines must be genotyped than phenotyped

Entry                                       GY-Opt  GY-DT  GLS   Ear rot
CKL001                                      4.69    1.4    2.5   14.5
CKL002                                      5.24    4.2    4.0   3.8
CKL003                                      7.15    3.1    2.2   4.9

r between geno. and pheno. in training pop  0.34    0.22   0.62  0.58
√H                                          0.80    0.55   0.85  0.80
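
The comparison of validation r with √H can be made explicit by taking their ratio for each trait. The sketch below uses the values in the table above. Expressing the comparison as a simple ratio is an assumption made for illustration, and the variable names are hypothetical.

```python
traits = ["GY-Opt", "GY-DT", "GLS", "Ear rot"]
validation_r = [0.34, 0.22, 0.62, 0.58]  # r between genotype and phenotype
sqrt_h = [0.80, 0.55, 0.85, 0.80]        # square root of H from field testing

# Ratio of prediction accuracy to the accuracy of phenotypic screening.
relative_accuracy = {t: r / s for t, r, s in zip(traits, validation_r, sqrt_h)}

for trait, ratio in relative_accuracy.items():
    print(f"{trait:8s} r / sqrt(H) = {ratio:.2f}")
```

In this example the predictions for the defensive traits (GLS, ear rot) come much closer to the accuracy of field screening than those for grain yield, consistent with first using GEBVs for defensive traits.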

Empirical results to date

- Zhao et al. (2012), Theor Appl Genet 124:769–776: for grain yield, r = 0.54
across half-sib pops summing to 788 lines

- Albrecht et al. (2011): for grain yield, r = 0.7 when prediction and validation
sets contain close relatives; 0.5 for prediction across distantly related families

- Crossa et al. (2010): for yield and other traits, r up to 0.79

These are all huge over-estimates of GS accuracy!!

GS prediction ability across breeding groups for grain yield (GY)
and anthesis date (AD) on 55K markers.

GY AD

Breeding populations 0.12±0.28 0.02±0.25

• Cross-validation studies that use random lines with population structure
overestimate GS accuracy
• Markers simply assign the lines to groups, and the means of the groups predict
the phenotype
• Not relevant to real breeding situations

2. Use GEBVs to select unphenotyped DH lines within
full-sib families for advancement from Stage 1 to
Stage 2.
• As the number of DH lines increases, we will need to discard many lines without
phenotyping, based on GEBV
• We know predictions are very poor across families, and only work for close
relatives in high-LD populations
• Models can be trained on part of a large full-sib family, then used to advance
some unphenotyped lines to Stage 2

Example

• A set of 200 DH lines is extracted from an elite cross
• All lines are genotyped
• 50 are phenotyped and used as a training set to build a GS model
• Best lines from the training set are advanced based on phenotype
• Best lines from the unphenotyped group are advanced based on GEBV
• Should result in modest gains from increased selection intensity

Correlation between GEBV and phenotype within
full-sib families: mean of cross-validation in 6 bi-
parental populations

Mean
Size of training pop accuracy

50 0.38
70 0.40
90 0.41

√H 0.70
No. of lines 236.5
No. of markers 240.2
No. of trials 4.33

3. Set up closed synthetic populations of key inbreds,
and conduct recurrent selection
• Advantages for GS are greatest with rapid cycling
• Closed populations where a few elite parents contribute
equally ensure that marker allele effect estimates relate
directly to the population under selection
• High LD means low marker density is required
• Improved populations can be used directly or as sources
of new inbreds
• Most CIMMYT breeding programs have now set up these
populations in the A and B heterotic groups, and are
beginning to phenotype

7. Implementing an open-source GS
network
“Open-source” breeding networks can provide
companies with proprietary lines, but allow
haplotypes to be shared

• Sharing haplotype effects allows phenotyping done by one program to
benefit another, even if they don’t test the same lines.
• Small programs could receive unique, unphenotyped DH lines (say,
500) from a “hub” program, with a GEBV predicting their performance
• Lines would then be testcrossed
• Company would phenotype the testcrossed set, and contribute the
phenotypes to the “training population” for the next cycle
• Company advances the lines with the best performance into product
testing.

“Open-source” genomic selection breeding plan

[Diagram: successive cycles of rapid-cycle marker-only selection feed line extraction and genotyping; untested, proprietary DH lines are provided to companies based on GEBVs; companies 1, 2, and 3 each carry out their own phenotyping, followed by their own commercialization]

Distribution of roles in an open-source
breeding network

Hub program

• Manages rapid-cycle source pops
• Extracts DH lines
• Genotypes DH lines at high density
• Coordinates managed stress screening
• Estimates GEBVs
• Updates model with new phenotypic data from partners
• Maintains database

Partner (spoke?) programs

• Receive and own proprietary DH lines with GEBV
• Phenotype, and contribute phenotypes to model
• Commercialize and deliver to farmers the best lines on the basis
of their own phenotyping
• Form new pedigree breeding populations, provide to hub for DH
line extraction, genotyping

Does this model make sense for pre-breeding in
China?

Advantages of open-source network model
• Small programs can access haplotype effect estimates for stresses,
environments, and traits for which they cannot do evaluation
• Partners benefit from the phenotyping done by other network
members, without having to share germplasm
• The small partner program accesses DH lines without the cost of
setting up a DH facility
• Lines are proprietary- only haplotype (marker) effects are shared
• The hub program provides partners with efficient DH, genotyping,
and informatics pipeline services, with economies of scale
• Low-cost out-sourced genotyping allows breeding programs to focus
on screening, selection, seed production, and marketing

The open-source GS network model can provide SMEs
and NARS with powerful breeding technologies now only
available to multinationals

Things to watch out for:

• Projects vs pipelines
• Over-weighting and inappropriate use of managed
stress data
• Failure to deliver the products of molecular breeding
to the product development pipeline
• Failure to exploit synergies and economies of scale
across regions
• Failure to exploit synergies and economies of scale
across maize and wheat
• Failure to come to grips with our data and breeding
informatics needs
• Thinking small about our science

The CIMMYT biparental populations: the
world’s largest resource for GS, GWAS in
tropical maize
• 28 biparental populations from DTMA and WEMA
MARS pops
• >200 lines/pop, over 5000 lines in total
• All elite Africa-adapted parents or drought donors
• Several linked half-sib families
• All genotyped with ca. 200 SNPs
• 100 lines per family GBS’d
• Imputation will permit assignment of genotypes for
>500K SNPs to each of the >5000 lines
• Phenotyped in 3-4 drought and 3-4 optimal
environments
• We will find genes for drought tolerance and disease
resistance, and pilot GS methods that work

Conclusions
1. GMP is the world’s most important source of elite and stress-resistant
germplasm, and the only large “open” public breeding program
2. Our germplasm is competitive with MNSC hybrids in most of our
target regions, and usually superior in low-yield environments
3. Gains in favorable conditions are inadequate. We must remain
competitive in commercial systems to interest seed company partners
4. We need to think hard about how to use managed stress data
5. Our drought and heat-tolerant germplasm is well-characterized and
unequalled: it needs to be used.
6. Using our stress-tolerant germplasm requires development of
breeder-ready markers
7. We have made no gains on maximum DT since the end of the
physiology breeding program
8. We have unparalleled resources for genetic and breeding research for
development. Are we up to the task?
