Bioinformatics and Food Technology

Bioinformatics and data knowledge: the new frontiers for nutrition and foods
Frank Desiere*,†,
Bruce German*,§,
Heribert Watzke*,
Andrea Pfeifer* and
Sam Saguy¶
*Nestlé Research Center, PO Box 44, 1000 Lausanne
26, Switzerland (tel: +41-21-785-8054;
fax: +41-21-785-8925; e-mail: frank.desiere@rdls.nestle.com)
¶The Institute of Biochemistry, Food Science and
Nutrition, Faculty of Agriculture, Food and
Environmental Quality Sciences, The Hebrew
University of Jerusalem, PO Box 12,
Rehovot 76100, Israel
§Department of Food Science and Technology,
University of California, Davis, California 95616, USA
The recent publication of the Human Genome poses the
question: how will genome technologies influence food
development? Food products will be very different within
the decade with considerable new values added as a result
of the biological and chemical data that bioinformatics is
rapidly converting to usable knowledge. Bioinformatics will
provide details of the molecular basis of human health. The
immediate benefits of this information will be to extend our
understanding of the role of food in the health and well-
being of consumers. In the future, bioinformatics will impact
foods at a more profound level, defining the physical,
structural and biological properties of food commodities
leading to new crops, processes and foods with greater
quality in all aspects. Bioinformatics will improve the tox-
icological assessment of foods making them even safer.
Eventually, bioinformatics will extend the already existing
trend of personalized choice in the food marketplace to
enable consumers to match their food product choices
with their own personal health. To build this new knowl-
edge and to take full advantage of these tools there is a
need for a paradigm shift in assessing, collecting and shar-
ing databases, in developing new integrative models of
biological structure and function, in standardized experi-
mental methods, in data integration and storage, and in
analytical and visualization tools. © 2001 Elsevier Science
Ltd. All rights reserved.
Introduction
Bioinformatics and genomics are rapidly expanding
fields and, in a matter of months, have become crucial
technologies in Life Science Research. Bioinformatics
and knowledge integration have played and will continue
to play an enabling role in Food Research, integrating
the massive amounts of data that are generated
through new genome-wide experimental procedures
with other more traditional techniques.
Bioinformatics is defined as: “Research, development,
or application of computational tools and approaches
for expanding the use of biological, medical, behavioral,
health and nutrition data, including those to acquire,
store, organize, archive, analyze, visualize or build
biological knowledge from very large and traditionally
unrelated sources”. It is about to revolutionize biological
research and more importantly to apply this research to
the human condition. With the availability of the human
genome, the completion of the rice genome, the mapping
and sequencing of other major crop plants and the publicly
available complete genome sequences of an ever-growing
number of micro-organisms
(http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/org.html), Bioinformatics has,
out of necessity, become a key aspect in Life Science
Research and Food Research. Bioinformatics is essen-
tially a cross-disciplinary activity which includes aspects
of computer science, software-engineering and mole-
cular and physiological biology.
0924-2244/02/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S0924-2244(01)00089-9
Trends in Food Science & Technology 12 (2002) 215–229
† Corresponding author.
Viewpoint

Although database management seems to be the
major task, bioinformatics goes much deeper; it provides
possible gene functions and cellular roles of molecular
entities, new theoretical frameworks for complex biological
systems and new biological hypotheses for wet-lab
research. The combination of genomic data, informa-
tion technology and other advanced research tools will
give biologists the opportunity to think more broadly—
to investigate not only the workings of a single gene, but
to study all of the elements of a complex biological system
at the same time. In the future, the starting point
for a biological investigation will still be the generation
of a hypothesis, but that hypothesis will first be tested
theoretically, by modeling and querying existing databases.
A scientist will begin with a theoretical conjecture,
test it on existing data and only then turn to
experiment as a last, not first, resort.
The same knowledge doctrine is applicable to food
science. Food science is a coherent and systematic body
of knowledge and understanding of the nature and
composition of food biomaterials, and their behavior
under the various conditions to which they may be
subject. Food technology is the application of food sci-
ence to the practical treatment of food materials so as to
convert them into food products of the kind, quality and
stability, and packaged and distributed, so as to meet
the needs of consumers for safe, wholesome, nutritious
and attractive foods (http://www.ifst.org/fst.htm).
In this respect, food science integrates the knowledge
of several sciences. It includes the knowledge of the
chemical composition of food materials, their physical,
biological and biochemical properties and behaviors as
well as human nutritional requirements and the nutri-
tional and trophic factors in food materials; the nature
and behavior of enzymes; the microbiology of foods; the
interaction of food components with each other, with
additives and contaminants, and with packaging mate-
rials; the pharmacology and toxicology of food materi-
als; and the effects of various manufacturing operations,
processes and storage conditions. Thus, food science is
an information-based science which integrates knowl-
edge from widely disparate sources.
The research focus in the food industry is directed by
the consumer's need for high-quality, convenient, tasty,
safe and affordable food. The scientific advances in
genome research and their biotechnological exploitation
alike represent unique opportunities to enhance food
performance and to build sound scientific knowledge
about its multiple functionalities. In the era before
bioinformatics and genomics, biological effects were
measurable only according to markers for specific con-
ditions (e.g. nutrient deficiencies and impairment of
health). Research was therefore targeted solely to con-
sumer health problems such as high blood pressure,
high cholesterol, lactose intolerance, osteoporosis and
diabetes. As our biological knowledge develops in this
new era, metabolic conditions consistent with improve-
ments in health will be the new markers (Watkins,
Hammock, Newman, & German, 2001). This knowl-
edge will allow intervention through foods to prevent
health problems long before deleterious effects are
apparent and the consumer will finally take advantage
of the technological breakthrough in these areas which
will yield healthy, high quality foods with positive
nutritive properties. This is just a part of the promise of
how new scientific knowledge of food, gained and made
available through bioinformatics, will influence the
everyday lives of consumers.
Information and computer technology
Bioinformatics is absolutely dependent on integrated
and mature software solutions, which are available
through electronic telecommunications to the individual
scientist (Table 1). With the massive computing power
of modern computer systems we are facing fewer and
fewer limitations in storage space and calculation time,
the only limiting factor becoming the lack of informa-
tion on specific topics.
Applications and examples in the food industry
Food-grade organisms like bacteria, molds and yeasts
are the basis for a variety of biologically based indus-
trial food processes (Kuipers, 1999). The fast growing
number of complete genomic sequences of organisms
relevant to food research (Table 2) promotes the rapid
increase in valuable knowledge that can be used in
many different areas such as metabolic engineering,
improvement of cells as microprocess factories and the
development of novel preservation methods. Bioinformatics
will hasten the development of novel risk assessment
procedures (Fig. 1). Furthermore, genomic
knowledge of bacteria and other microorganisms will
revolutionize pre- and probiotic research, making it
possible to characterize the broad range of bacterial
properties, from growth to stress responses to
multi-species microbial ecology within the human host.
Metabolic pathway reconstruction
Microbial metabolism has been the basis of a major
segment of food processing for centuries. Fermentation
of food takes advantage of the ability of desirable
microbes to convert substrates (usually carbohydrates)
to organic tailor-made compounds contributing to the
flavor, structure, texture, stability and safety of the food
product. Due to its fundamental importance to such a
wide variety of foods from breads to cheeses, wines to
sausage, literally over a century of research has focused
on understanding microbial metabolism. The potential
to build this knowledge into even greater value in foods
has been dramatically expanded by the availability of
tools to understand and control microbial metabolism
using modern genomic and bioinformatic approaches.
The production of diacetyl, alanine and ethanol from
this sugar metabolism has already been engineered in
lactic acid bacteria. With the metabolic reaction network

Table 1. Several bioinformatics resources (a)
Bioinformatics companies
Company URL Product Area
Affymetrix www.affymetrix.com Gene Chip Data
Mining Tool
Micro-array analysis
Applied Biosynthesis www.appliedbiosynthesis.com BioMerge Server,
BioLIMS
Genetic analysis system, LIMS
Axon Instruments Inc www.axon.com GenePix Pro 3.0 Micro-array analysis
Biodiscovery GeneSight www.biodiscovery.com GeneSight Micro-array analysis
Biomax Informatics www.biomax.de BioRS Databases
GMBH Pedant-Pro Bioinformatics analysis
HarvESTer EST-clustering
Compugen Inc. www.cgen.com Z3 2D-GE analysis
LEADS Expression analysis
Gencarta database
Doubletwist.com www.doubletwist.com Prophecy Human genome DB
GeneForest DB of expressed genes
Clustering Alignment
Tools (CAT)
EST-clustering
Genomica www.genomica.com LinkMapper Information management
Discovery Manager
Hitachi Genetic www.miraibio.com analysis DNASIS Mol-bio application
Systems CHIP Space ChipSpace Expression-analysis
DNASpace Bioinformatics analysis
IBM www-4.ibm.com/software/data DB2 DB-management
Incyte Genomics www.incyte.com LifeExpress, GEMTools, Bioinformatics tools
LifeArray Human genome database
LifeSeq Gold Gene-expression microarrays
Informax www.informax.com GenoMax Bioinformatics tools
Vector NTI Suite Mol-bio tools
Integrated Genomics Inc. www.integratedgenomics WITpro, MPW,
MicroAceTM
Sequencing, genome analysis,
metabolic design
Lion Bioscience www.lionbioscience.com bioSCOUT Bioinformatics tools
arraySCOUT Expression analysis
genomeSCOUT Genome comparisons
SRS DB management
ArrayTAG CDNA
arrayBase DB of annotated cDNA
Molecular Mining Corp. www.molecularmining.com GeneLinker Expression analysis
Packard Biochip www.packardbiochip.com QuantArray Windows Expression analysis
Technologies
Celera www.paracel.com GeneMatcher Hardware accelerator
Paracel Inc CAP4 EST-clustering
GeneWise Bioinformatics tools
Rosetta Inpharmatics www.rii.com Rosetta Resolver Expression analysis
Silicon Genetics www.sigenetics.com Gene Spring Expression analysis, DB
Allele Sorter SNP Analysis
Silicon Graphics Inc. www.sgi.com MineSet Data-mining
Spotfire Inc. www.spotfire.com Spotfire.net Data-mining
Spotfire Array Expression analysis
Commercial bioinformatics web-portals
Company Tool URL
Ebioinformatics Inc. Bionavigator www.bionavigator.com Over 200 bioinformatics tools, more
than 20 databases, access to GCG
Doubletwist.com Doubletwist.com www.doubletwist.com Integrated Genomics portal, access
to an annotated
Human Genome sequence, research
agents with many bioinformatics tools
Incyte IncyteGenomics www.incyte.com LifeSeq-ZooSeq-sequence DBs and
bioinformatics
(Continued on next page)

established it becomes possible to determine its under-
lying pathway structure by pathway models (Schilling &
Palsson, 2000). An important approach to a holistic
look at such biological processes uses genomic infor-
mation to reconstruct entire metabolic pathways. The
integration of the extensive information on metabolic
pathways available in the literature and databases
(as in KEGG (http://www.genome.ad.jp/kegg/), EcoCyc
(http://ecocyc.doubletwist.com/ecocyc/) and WIT
(http://wit.integratedgenomics.com/IGwit)) with the genomic
sequences of bacteria and eventually with stoichiometric
models will deliver tools to describe cellular processes in
detail and to link genotype and phenotype. The match-
ing of well annotated genes and their expression level
from a new organism with a collection of known meta-
bolic pathways from databases is already feasible today.
However, the inclusion of kinetic information, which is
indispensable to describing the dynamic evolution of
these models, remains extremely complex. Beyond that,
many of the transcription, regulation and enzymatic
control pathways are not well understood. As the
knowledge increases in these areas, metabolic recon-
struction models will become more important in study-
ing the dynamic response of cells to external stimuli.
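
The matching of annotated genes against known pathways can be illustrated with a minimal Python sketch. It assumes that each annotated gene carries an Enzyme Commission (EC) number and that a reference pathway is simply a set of EC numbers. The gene identifiers, the pathway names and the small pathway table are hypothetical stand-ins chosen for illustration; they are not records retrieved from KEGG, EcoCyc or WIT, and the sketch ignores expression levels and kinetics.

```python
from typing import Dict, Set

# Hypothetical reference pathways: pathway name -> EC numbers of its enzymes.
REFERENCE_PATHWAYS: Dict[str, Set[str]] = {
    "glycolysis (partial)": {"2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13", "2.7.1.40"},
    "lactate formation": {"1.1.1.27"},
    "acetoin/diacetyl branch": {"2.2.1.6", "4.1.1.5"},
}

# Hypothetical annotation of a newly sequenced organism: gene id -> EC number.
GENOME_ANNOTATION: Dict[str, str] = {
    "gene_0001": "2.7.1.1",
    "gene_0002": "5.3.1.9",
    "gene_0003": "2.7.1.11",
    "gene_0004": "2.7.1.40",
    "gene_0005": "1.1.1.27",
}


def pathway_coverage(annotation: Dict[str, str],
                     pathways: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of each pathway's enzymes that have an annotated gene."""
    found = set(annotation.values())
    return {name: len(ecs & found) / len(ecs) for name, ecs in pathways.items()}


def missing_steps(annotation: Dict[str, str],
                  pathways: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """EC numbers of each pathway for which no gene was annotated."""
    found = set(annotation.values())
    return {name: ecs - found for name, ecs in pathways.items()}


coverage = pathway_coverage(GENOME_ANNOTATION, REFERENCE_PATHWAYS)
gaps = missing_steps(GENOME_ANNOTATION, REFERENCE_PATHWAYS)

for name in sorted(coverage):
    print(f"{name}: {coverage[name]:.0%} covered, missing {sorted(gaps[name])}")
```

A pathway with partial coverage points either to a genuinely absent function or to a gene that has not yet been annotated, which is the kind of hypothesis that can then be taken to the wet lab.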
Plants
Plant genome research will provide the knowledge to
increase the success of genetics and breeding to produce
plants of interest for the food industry. Major objectives
of plant research are to improve the raw materials of the
food supply for higher-quality, better processability,
lower cost and safer food. The nutritional health and
well-being that plant-based foods provide are traditionally
dominated (DellaPenna, 1999) by their provision of
essential vitamins and minerals and only recently has
the potential of a number of other health-promoting
phytochemicals been recognized to be valuable in the
daily diet. Genome sequencing projects are providing
novel approaches for identifying plant biosynthetic
genes of more specific health importance. Genome
research can therefore directly be used to increase the
efficiency and effectiveness of breeding for improvement
of plants. Biotechnology, accelerated by genomics and
bioinformatics, will increase the quality of food, reducing
Table 1 (continued)
Commercial bioinformatics web-portals
Company Tool URL
OnLine Research Tools, LifeExpress expression DB
Compugen LabOnWeb.com www.labonweb.com Bioinformatics tools and genome,
transcriptome and
Z3OnWeb.com www.2dgels.com Proteome DBs, access to
PathoGenome
Celera Celera Discovery System www.celera.com Access the Celera Human
genome sequence, many
bioinformatics tools
Free bioinformatics resources
EMBL www.embl-heidelberg.de/
CMS Molecular Biology Resource www.sdsc.edu/restools
National Center for Biotechnology Information NCBI www.ncbi.nlm.nih.gov
European Bioinformatics Institute EBI www.ebi.ac.uk
ExPASy www.expasy.ch/
The Institute for Genomic Research TIGR www.tigr.org
UK Human Genome Mapping Project Resource Centre www.hgmp.mrc.ac.uk/
Weizmann Institute of Science http://bioinformatics.weizmann.ac.il/
Whitehead Institute http://www-genome.wi.mit.edu/
MIPS www.mips.biochem.mpg.uk
The Sanger Centre www.sanger.ac.uk
GOLD: Genomes OnLine Database http://wit.integratedgenomics.com/GOLD/
Food Research related public bioinformatics sites
USDA Biotechnology Information Centre www.nal.usda.gov/bic/
UK Crop Plant Bioinformatics Network (UK CropNet) http://ukcrop.net/
The USDA-ARS Centre for Bioinformatics and Comparative Genomics http://ars-genome.cornell.edu/
(a) The selection of companies and web-links is not exhaustive and is not an endorsement of the entities mentioned. These resources
represent the current status. Due to the dynamic nature of bioinformatics, they may change rapidly.

all aspects of the cost including the impact of food crop
production on the environment.
Cocoa (Theobroma cacao; Fig. 2), for example, is the
raw material for all chocolate-containing foods and
drinks. The breeding and selection of higher-quality
beans with superior flavor characteristics has been difficult
in the past, since the trees must be maintained for at
least 3–5 years before the cacao bean can be harvested
and analyzed. With the establishment of DNA fingerprinting
technologies for screening plant collections,
printing technologies for screening plant collections,
RFLP markers for the detection of genotypic relation-
ships between breeds or species and the determination
of more than 300 molecular markers, breeding pro-
grams have been greatly enhanced. The future avail-
ability of EST sequences and genome comparisons to
other sequenced plants, which rely heavily on bioinfor-
matic tools, will result in a further acceleration with the
possibility of selecting for desired traits at an early stage of
plant development based on the genotype and the phe-
notype (Pridmore et al., 2000).
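
Early selection on genotype can be sketched in a few lines of Python. The example assumes that a small set of molecular markers has already been linked to a desired trait and that each seedling has been genotyped at those markers. The marker names, favorable alleles and seedling genotypes are invented for illustration and do not correspond to published cacao markers.

```python
from typing import Dict, List, Tuple

# Hypothetical trait-linked markers: marker name -> favorable allele.
FAVORABLE_ALLELES: Dict[str, str] = {"marker_A": "G", "marker_B": "T"}

# Hypothetical diploid genotypes of young seedlings: two alleles per marker.
SEEDLINGS: Dict[str, Dict[str, str]] = {
    "seedling_1": {"marker_A": "GG", "marker_B": "TT"},
    "seedling_2": {"marker_A": "GA", "marker_B": "TT"},
    "seedling_3": {"marker_A": "AA", "marker_B": "CT"},
}


def favorable_allele_count(genotype: Dict[str, str],
                           favorable: Dict[str, str]) -> int:
    """Count favorable alleles carried across all trait-linked markers."""
    return sum(genotype.get(marker, "").count(allele)
               for marker, allele in favorable.items())


def rank_seedlings(seedlings: Dict[str, Dict[str, str]],
                   favorable: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank seedlings by favorable allele count, highest first."""
    scores = [(name, favorable_allele_count(genotype, favorable))
              for name, genotype in seedlings.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


ranking = rank_seedlings(SEEDLINGS, FAVORABLE_ALLELES)
for name, score in ranking:
    print(name, score)
```

A simple allele count is only one possible scoring rule; a breeding program would weight markers by their estimated effect on the trait.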
Implication of genomics/bioinformatics for health
and nutrition
Genomics, enabled by bioinformatics, will contribute
to an improved understanding of the molecular mech-
anisms underlying the relationships between food and
health, from basic nutrient actions to the interactions
between food microorganisms and the human intestinal
system, including the gut and immunocompetent cells,
and the mechanisms underlying the interactions of the
microbial community in the intestinal tract (German,
Schiffrin, Reniero, Mollet, Pfeifer, & Neeser, 1999).
With the recent explosion of genome data, including
genomics, transcriptomics, proteomics, metabolomics
and structural genomics, bioinformatics is addressing
the task of developing computational methods to deal
with the massive flows of data emerging from modern
experimental approaches in relating genotype to pheno-
type (Lee & Lee, 2000). The approaches include func-
tional and comparative genomics and high-throughput
technologies such as genome sequencing and DNA
microarrays. The knowledge developed from this new
science will expand nutrition in three dimensions
(mechanism, human variation and time): the genetic
mechanisms underlying health, the basis of individual
variations in metabolism and the time scales during
which diet influences metabolism.
The scientific knowledge of both the genetic variation
amongst humans and the response of individual genes to
ingested molecules (drugs, foods and toxins) is growing
Table 2. Genome projects of organisms interesting for the food industry (a)
Organism | Genome size (Mbp) (b) / status

Spoilage/pathogens
Bacillus anthracis | 4.5/progr.
Bacillus stearothermophilus | 10/progr.
Candida albicans | 15/progr.
Campylobacter jejuni | 1.641/published
Clostridium acetobutylicum | 4.1/progr.
Enterococcus faecalis | 3/progr.
Escherichia coli O157:H7 | 4.1/published
Helicobacter pylori | 1.667/published
Listeria innocua | 3.2/progr.
Listeria monocytogenes | 2.9/completed
Mycobacterium bovis | 4.4/progr.
Mycobacterium leprae | 3.2/published
Mycobacterium tuberculosis | 4.411/published
Pseudomonas aeruginosa | 6.264/published
Pseudomonas putida | 6.1/progr.
Salmonella enteritidis | 4.5/progr.
Salmonella paratyphi A | 4.6/progr.
Salmonella typhi | 4.5/progr.
Salmonella typhimurium | 4.5/progr.
Shewanella putrefaciens | 4.5/progr.
Shigella flexneri | 4.7/progr.
Staphylococcus aureus | 2.8/published
Staphylococcus epidermidis | 2.4/progr.
Streptococcus mutans | 2.2/progr.
Streptococcus pneumoniae | 2/completed
Streptococcus pyogenes | 1.8/published
Thermus thermophilus | 1.8/progr.
Vibrio cholerae | 4/published

Food-grade
Aspergillus nidulans | 29/progr.
Bacillus subtilis | 4.20/published
Lactobacillus acidophilus | 1.9/progr.
Lactobacillus sp. | 2/progr.
Lactococcus lactis | 2.365/published
Saccharomyces cerevisiae | 12.069/published
Streptococcus thermophilus | 2/progr.

Others
Arabidopsis thaliana (thale cress) | 115.428/published
Bos taurus (Cattle) | Mapping
Canis familiaris (Dog) | Mapping
Felis catus (Cat) | Mapping
Glycine max (Soybean) | Mapping
Homo sapiens (Human) | 3200/published
Mus musculus (Mouse) | Progr.
Oryza sativa (Rice) | 450/finished
Phaseolus vulgaris (Bean) | Progr.
Rattus norvegicus (Rat) | Progr.
Solanum tuberosum (Potato) | EST-sequencing
Triticum aestivum (Wheat) | Mapping
Zea mays (Maize) | Mapping

(a) This table represents the current status. Due to the dynamic nature of bioinformatics it may change rapidly.
(b) Mbp, number of mega base pairs; progr., project in progress.

exponentially as a result of the arrival of the human
genome and the tools of functional genomics (DNA
arrays, etc.). This explosion of information is only being
converted into usable knowledge because of the arrival
of the massive computing power and the bioinformatic
tools needed to apply it to the large data sets being
generated by nutrition-related research. This knowledge
will not only drive a new generation of foods with
additional values but change dramatically the ability of
foods to influence individual quality of life. This
knowledge promises also to drive a new value system
for agriculture itself.
Genetic responsiveness or gene expression
The ability of nutrients to directly control the expres-
sion of particular genes is at the heart of a new generation
of nutritional science allowing researchers to apply
genomic information to technologies that can quantify
the amount of actively transcribing genes in any cell at
any time (e.g. gene expression arrays). With this tech-
nology in place, scientists of every biological discipline
are discovering the interaction between organisms and
their environment with an intimacy never thought possible.
Nutrition is, at its heart, a multidisciplinary field
focusing on integrative metabolism of animals and
humans. Nutritionists have strived for the last century
to deduce the mechanistic basis of the apparent strong
relationship between diet and health through under-
standing the interaction of nutrients with metabolic
pathways. Needless to say, this was a daunting task with
the traditional tools of reductionist biochemistry. Most
nutrients affect a wide range of biochemical pathways.
The net result is that nutrients exert multiple effects:
pleiotropic dysfunctions in their relative absence, i.e.
deficiencies, and pleiotropic benefits in their return to
appropriate, optimal intakes. Reductionist biochemical
Fig. 1. Electron micrograph of Streptococcus thermophilus (oval
chains) and Lactobacillus johnsonii (rod-like chains) cells used as
starter cultures in food fermentations.
Fig. 2. Example of a cacao plant (Theobroma cacao L.) in natural
form as fruits, as beans and finally as ground powder. Cacao trees
must be maintained for approx. 3–5 years before the cacao can be
harvested. Selection of specific traits based on genotype early in the
development of the plant is therefore highly desirable.
Fig. 3. The perceived food qualities are driven by flavors and
texture. Both are composite events whose disparate elements show
specific interactions. While the elements can be controlled
separately, only understanding the underlying neuro-physiological
processes will lead to optimizing the flavor and texture impact of
foods.
Fig. 4. Food production is based on biological raw materials which
are refined into food ingredients. A unifying approach is proposed
on the basis of common basic and material properties of the
constituent molecules in both domains. Moreover, the vast store of
knowledge currently being produced by the biomical sciences
(genomics, proteomics, metabolomics) will improve the knowledge
on ingredient characteristics and behaviours.

approaches describe very well the effects of a single
nutrient’s interaction with a single target; however, they
fail to adequately explore the multiplicity of metabolic
effects on the entire organism. The perspective of mod-
ern genomics is ostensibly the reverse (expansionist)
approach, to measure everything. Genomic-based
investigations do not avoid pleiotropic behavior of exo-
genous nutrients; quite the contrary, they reveal it. The
goal of differential gene expression array experiments
is to describe the full spectrum of transcriptional
responses to any variable, including nutrients. Such
global experimental designs are only possible due to the
advent of bioinformatic tools to adequately manage and
analyze the sheer volume of data that are produced.
With the arrival of broadly parallel assessment tools
including gene expression arrays and metabolomics,
single biomarkers of disease risk will no longer be con-
sidered useful (Watkins et al., 2001). Since it will be as
straightforward to measure the expression of 30,000
genes as the expression of one gene, knowledge from
expression profiling will impact health assessment. It is
equally certain that the days of building dossiers of effi-
cacy and safety based on a single metabolic endpoint,
e.g. cholesterol, are limited. Such comprehensive
knowledge of the effects of discrete food and nutritional
variables on overall metabolism will add new understanding
of their health value.
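
A differential expression comparison of the kind described here can be sketched as follows. The example assumes one normalized expression value per gene in a control and in a nutrient-treated sample, and it flags genes whose expression changes at least two-fold. The gene names, the values and the two-fold threshold are illustrative assumptions; a real array experiment would also need replicates and a statistical test.

```python
import math
from typing import Dict, List

# Hypothetical normalized expression values (arbitrary units).
CONTROL: Dict[str, float] = {"gene_a": 100.0, "gene_b": 200.0, "gene_c": 150.0}
TREATED: Dict[str, float] = {"gene_a": 400.0, "gene_b": 100.0, "gene_c": 160.0}


def log2_fold_changes(control: Dict[str, float],
                      treated: Dict[str, float]) -> Dict[str, float]:
    """log2(treated / control) for every gene measured in both samples."""
    return {gene: math.log2(treated[gene] / control[gene])
            for gene in control if gene in treated}


def responsive_genes(fold_changes: Dict[str, float],
                     threshold: float = 1.0) -> List[str]:
    """Genes whose absolute log2 fold change reaches the threshold."""
    return sorted(gene for gene, change in fold_changes.items()
                  if abs(change) >= threshold)


fold_changes = log2_fold_changes(CONTROL, TREATED)
responsive = responsive_genes(fold_changes)

for gene in sorted(fold_changes):
    print(f"{gene}: log2 fold change = {fold_changes[gene]:+.2f}")
print("responsive genes:", responsive)
```

The same two functions apply unchanged whether the dictionaries hold three genes or 30,000, which is the sense in which measuring many genes becomes as straightforward as measuring one.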
Genetic variability
With the genome of one ‘individual’ human completed,
effective technologies to establish variations
from that single genome are being implemented. The
Single Nucleotide Polymorphisms (SNP) Consortium
(http://snp.cshl.org/) is mapping the polymorphic
regions of the genome that control individual pheno-
typic differences among the population (Sachida-
nandam et al., 2001). While these variations are being
viewed initially as the key to the discovery of genetic
diseases, they are also the keys to individual variation in
diet and health. Sequence variation in particular genes
even as slight as single nucleotides can influence the
quantitative need for and physiological response to
various nutrients. Knowing that genes influence nutrition,
of course, is not new. An understanding of this
variation is inherent in population recommendations for
essential nutrients (Young & Scrimshaw, 1979). However,
allowing for the variation in human genetics by
incorporating a large margin for error in quantitative
recommendations is not the same as designing diets for
specific individuals according to their genetic profiles
(Eckhardt, 2001; Nichols, 2000). An example of poly-
morphisms that influence nutrition and disease is phe-
nylketonuria, in which the inability to metabolize
phenylalanine renders this nutrient toxic (Lindee, 2000).
The occurrence of lactose intolerance is due to poly-
morphisms both in the structure of the lactase gene
which produce dysfunctional enzyme and in regulatory
regions of the genome that prevent perfectly functional
lactase enzyme from being produced in adults (Harvey
et al., 1998). With genomics will come the knowledge of
the integrative nature of multiple genes in predicting
health. The potential opportunity of bioinformatics to
deliver that knowledge to the individual consumer will
eventually lead to individualized dietary choices in the
hands of the consumer. This bold future is arriving
because of bioinformatic tools capable of managing the
volume of data implied by quantitatively assessing individual
metabolism and intervening in that individual’s
metabolism using foods to improve their health.
Genomic and bioinformatic tools will improve human
clinical research. Historically, many nutrition trials
failed to find statistically significant effects of various
nutrients and food choices not because there was no
benefit, but because the magnitude of the benefit was
small relative to the overall variability in a sample of
humans chosen at random from the population.
Humans do not respond homogeneously to even the
most straightforward nutritional variables. A great
value of genotyping individuals in clinical trials is to
begin to assign the variation of the population to spe-
cific genetic differences. Clinical and epidemiological
trials are now being analyzed using SNP data as inde-
pendent input variables (Takeoka et al., 2001). Most
clinical trials are already cataloguing the SNPs of genes
whose variation in function has been shown to be important
to the endpoint measures of these trials, for example
cancer, autoimmunity and heart disease (Marth et al.,
2001). Such ‘data-mining’ approaches have been suc-
cessful not only in identifying the causes of statistical
variation among trial participants but in identifying the
potential biochemical mechanisms responsible for the
variation in response. This approach is already proving
so powerful that scientific agencies are recognizing that
traditional avenues of scientific publishing aren’t ade-
quate and the processes of scientific discovery of genetic
polymorphism and health are accelerated by the avail-
ability of SNP data sets and bioinformatic packages
on the internet (Clifford, Edmonson, Hu, Nguyen,
Scherpbier, & Buetow, 2000).
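
The value of using genotype as an input variable can be shown with a small worked example. The sketch below assumes a trial in which each participant has a genotype at a single SNP and a measured change in some endpoint after a dietary intervention. It then computes how much of the response variance lies between genotype groups (the ratio of between-group to total sum of squares, as in a one-way analysis of variance). The genotype labels and response values are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical trial records: (genotype at one SNP, change in endpoint).
TRIAL: List[Tuple[str, float]] = [
    ("CC", -10.0), ("CC", -12.0), ("CC", -8.0),
    ("CT", -1.0), ("CT", 1.0), ("CT", 0.0),
]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def group_by_genotype(records: List[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Collect the responses of all participants sharing a genotype."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for genotype, response in records:
        groups[genotype].append(response)
    return dict(groups)


def variance_explained(records: List[Tuple[str, float]]) -> float:
    """Between-genotype sum of squares divided by total sum of squares."""
    responses = [response for _, response in records]
    grand_mean = mean(responses)
    total_ss = sum((r - grand_mean) ** 2 for r in responses)
    between_ss = sum(len(values) * (mean(values) - grand_mean) ** 2
                     for values in group_by_genotype(records).values())
    return between_ss / total_ss


pooled_mean = mean([response for _, response in TRIAL])
group_means = {g: mean(v) for g, v in group_by_genotype(TRIAL).items()}
explained = variance_explained(TRIAL)

print("pooled mean response:", pooled_mean)
print("mean response by genotype:", group_means)
print(f"share of variance between genotypes: {explained:.4f}")
```

In this invented data set the pooled mean suggests a modest effect, while the stratified means show a strong response in one genotype and none in the other, which is the pattern that a randomly sampled, unstratified trial can obscure.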
Genetic polymorphism and nutrient requirements
Polymorphisms in the various genes encoding
enzymes, transporters and regulatory proteins affect the
absolute quantities of essential nutrients that are neces-
sary to achieve sufficiency, including vitamins, minerals,
etc. (Bailey & Gregory, 1999). Thus, the variation in the
population’s nutrient status is not simply the result of
variations in food intakes but also the result of inherent
variation amongst individuals within the population in
their genetically defined abilities to absorb, metabolize
and utilize these nutrients. Recommended daily allow-
ances of each nutrient are determined to meet the needs
of a statistically representative fraction of the popula-
tion; however, the range of responses to both micro-
and macronutrients in the general population is large.
Very recent research using genomic tools is highlighting
just how specifically individual food choices, genetics
and nutrition are linked. Polymorphism in a recently
identified sweet receptor protein has been proposed to
be the basis for the varying intakes of caloric-rich foods,
i.e. the famous sweet tooth (Davenport, 2001).
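
The idea of an allowance that covers a defined fraction of the population can be made concrete with a short calculation. The sketch assumes that individual requirements are normally distributed and adopts one common convention, which is an assumption here rather than something stated in the text: the allowance is set at the mean requirement plus two standard deviations. All numbers, including the shifted requirement of a hypothetical subgroup carrying a polymorphism, are invented for illustration.

```python
from statistics import NormalDist


def allowance(mean_requirement: float, sd: float) -> float:
    """Allowance set two standard deviations above the mean requirement."""
    return mean_requirement + 2.0 * sd


def fraction_covered(allowance_value: float,
                     mean_requirement: float, sd: float) -> float:
    """Share of a normally distributed group whose needs the allowance meets."""
    return NormalDist(mu=mean_requirement, sigma=sd).cdf(allowance_value)


# Hypothetical requirement distribution in the general population (units/day).
POPULATION_MEAN = 100.0
POPULATION_SD = 10.0
# Hypothetical subgroup whose polymorphism raises the mean requirement.
CARRIER_MEAN = 115.0

rda = allowance(POPULATION_MEAN, POPULATION_SD)
covered_population = fraction_covered(rda, POPULATION_MEAN, POPULATION_SD)
covered_carriers = fraction_covered(rda, CARRIER_MEAN, POPULATION_SD)

print(f"allowance: {rda:.1f}")
print(f"general population covered: {covered_population:.3f}")
print(f"carrier subgroup covered:   {covered_carriers:.3f}")
```

Under these assumptions the allowance meets the needs of nearly all of the general population but of a much smaller share of the subgroup, which illustrates why a single population-wide figure is not the same as a recommendation tailored to genotype.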
As genomics begins to reveal the basis for food preference
and the respective roles of genetics and environment,
nutritionally superior foods could be made more
organoleptically attractive to precisely the subset of the
population for whom they are most appropriate. How-
ever, an important step is still missing. At this point,
while the technologies to describe the effects of diet on
various individuals experimentally are widely used, for
example in clinical trials, the technologies are not yet
part of routine consumer assessment. Therefore, con-
sumers cannot take advantage of nutritional knowledge
about themselves, because they do not have it. This lack
of knowledge transfer is clearly the largest single factor
constraining a more widespread improvement in nutri-
tional health in the consumer population.
Genetic variation and the response to variations in
overall diet
Genetic differences affect the basic metabolism of
macronutrients and in particular fat and carbohydrate
in humans. For example, polymorphisms in the apo-
protein genes (apoE, apoAIV) or lipoprotein catalysts
(lipoprotein lipase) have been shown to directly affect
the clearance of dietary lipids. Hence polymorphisms in
lipid metabolic genes dictate the response of these indi-
viduals to dietary fat (Hockey et al., 2001; Pimstone et
al., 1996). Polymorphisms in the gene encoding the
apoE protein influence the functionality of this protein
in clearing liver-derived lipoproteins (VLDL and LDL)
from blood (Weintraub, Eisenberg, Breslow, 1986).
Health outcomes beyond heart disease including Alz-
heimer’s disease have been shown to be correlated to
apoE phenotypes. Once again, diet plays a differential
role in the development of these diseases according to
genotype through the role of diet in influencing the
quantitative flux of hepatic lipoprotein metabolism
(Corella et al., 2001).
Many consumers are concerned about the widespread
application of genomic testing in the population because
they see little value to themselves. However, there is
great value in acquiring knowledge about individual
variation in diet-responsive genes if it can lead to suc-
cessful intervention. For example, genotype predicts a
difference in post-prandial lipid metabolism of dietary
fat (Hockey et al., 2001). The most exciting aspect of
this discovery is the realization that this knowledge is
not just academic, but leads to an immediate individual
recommendation on how to alter the intake of dietary fat
for those affected. Thus, information on how an
individual responds to foods provides that individual
with the means to change their diet to improve their
health. With each new discovery of genetic polymorphisms
linked to health, the complexity of the science
increases. Fortunately, modern bioinformatics tools are
inherently integrative, adding each new discovery to a
rapidly expanding, coherent picture of the diet and health
of individual consumers.
Food quality
Food is one of life’s great delights. Modern science
and technology have provided unparalleled value to
consumers in the breadth of individual choices in deli-
cious, safe and nutritious foods. This great value has
been driven by scientific knowledge at all levels of the
agricultural food chain from genetic improvements in
production agriculture to food process engineering to
precision in the analysis of consumer sensation. With their
power to build detailed molecular knowledge of biological
organisms, modern bioinformatic technologies are
assembling the means to re-invent the food supply. In
no other aspect of life do humans interface with other
biological organisms to the same extent as in the con-
sumption of food. Thus, the most tangible, daily value
that genomics will eventually produce for humans is a
dramatic increase in the quality of their lives through
the quality of their foods. Bioinformatics will help us
understand the basis of different food flavors and textures,
why we find them delicious, and hence how to enhance
that experience. Bioinformatics
will not only define in molecular detail which foods are
safe, but develop foods that make consumers themselves
safer. Bioinformatics will not only improve the processes
of forming foods, but design foods that form themselves.
The understanding of the biomolecular basis of flavor
perception has been a major success of the last 5 years
of scientific investigation in the molecular biology of
sensation (Fig. 3).
Success in identifying the taste and flavor receptors, in
molecular and genetic detail, has been remarkable in
recent months. These include:
Bitter: A family of 50 G protein-coupled receptors (GPCRs) identified in human taste cells (Chandrashekar et al., 2000);
Salt: The epithelial ion channel, ENaC, is responsible for over 80% of salt taste transduction (Nagel, Szellas, Riordan, Friedrich, & Hartung, 2001);
Sour: An ion channel, identical to degenerin-1, is proposed to be the receptor (Ugawa et al., 1998);
Umami: A ‘splice variant’ of the brain glutamate receptor mGluR4, identified in rat taste cells (Matsunami, Montmayeur, & Buck, 2000); and
Sweet: The putative sweetness receptor, identified as the G protein-coupled receptor Tas1r3 (Max et al., 2001).
The discovery of these taste receptors is being trans-
lated rapidly into a variety of research programs
designed to discover the next generation of taste modifiers
for foods. Sugar substitutes demonstrated the
potential for replacing the traditional sweet molecules
(simple sugars) with non-caloric, non-cariogenic and
non-glycemic alternatives in a variety of food products.
Now, with the balance of taste receptors known, it will
be possible to develop flavor systems that either produce
or enhance positive or mask negative tastes. Much of
this work will be possible using combinatorial chemistry
approaches that use bioinformatic tools to screen thou-
sands of molecules and combinations at a time. Such
molecular simulations once took weeks and very large
super-computer installations. New developments in
computing power, computational algorithms and software,
together with the available databases of known structures
and successful simulations, have brought molecular
modeling into mainstream food chemistry. Such simulations
will make it possible not only to develop more intensely
tasting compounds as food additives, but also to understand
the basis of taste persistence, antagonism and
complementation. Flavor systems will become more complex,
more attractive and more individualized to
consumers.
Olfaction: a family of 1000 GPCRs, about 300
identified
Not far behind the taste receptors, the much more
abundant odor receptors are being identified as well.
The full olfactory complement of genes has been published
(Glusman, Yanai, Rubin, & Lancet, 2001). The
number of odor receptors exceeds the number of taste
receptors by a factor of 100. In spite of this expansion in
size and complexity, bioinformatics will have little diffi-
culty in translating the principles of ligand–receptor
interaction developed with taste into similar applica-
tions to odor sensations. With such capabilities, sophis-
ticated flavor systems will be designed from the
perspective not simply of what is available in natural
commodities and foods, but with final flavor perception
as the goal. Ultimately, it will be possible to design fla-
vor systems that optimize flavor perception in highly
nutritious foods that are currently organoleptically
undesirable in spite of their superior health value.
Making the next connection, i.e. understanding the
basis for healthy and unhealthy food choices, is already
proceeding.
Recently, the connection between gratification and
the brain was verified in rats (Cardinal et al., 2001).
Similar developments in our understanding of the brain
could lead the way to foods with tailor-made organoleptic
attributes that also meet specific nutritional needs.
Bioinformatics and food processing
The most immediate application of bioinformatics to
food processing will be in optimizing the quantitative
compositional parameters of traditional unit operations.
Food commodities are processed largely to achieve sto-
rage stability and safety with considerable excess of
energy applied to ensure a large margin for error. This
margin of error is necessary due to our inexact knowl-
edge of the composition and structural complexity of
biological materials, the natural variability of living
organisms as food process input streams and the
response of these materials to processing parameters.
With the considerable knowledge of biological organ-
isms from bacteria and viruses to plants and animals
that is emerging from bioinformatics, food process
design will become optimized with narrower margins of
all cost-important inputs, especially energy.
The great future for food processing however is not in
simply processing for greater safety, but in merging
biological knowledge of living organisms with the bio-
material knowledge necessary to convert them to foods.
Traditional food processing relies on the aggressive
input of energy to restructure the biomaterials of living
organisms into simpler macrostructure forms of stable,
relatively uniform foods. In most cases the inherent
biological properties of the living systems are lost to the
final food product in the need to eliminate potentially
hazardous properties of some of the constituent mole-
cules (protease inhibitors, etc.). The arrival of the
knowledge base of modern bioinformatics, however, is
providing a detailed description of the inherent com-
plexity of biological macromolecules within living cells
together with the structural properties of these mole-
cules that provide much of their functions. Such
knowledge is the cornerstone of functional genomics
and proteomics. It also provides an unprecedented
opportunity to translate this knowledge into an equally
accurate assessment of the biomaterial properties of each
of the molecules in a complex mixture. It will soon be
possible to use the inherent structural properties of
natural food commodities to self-assemble new foods with
a minimum of external energy, retaining a maximum of
biological and nutritional value. The biological
structure–function relationships of living systems
discovered through bioinformatics will be mapped onto the
structure–function relationships of the next generation of
foods, with delightful results (Fig. 4).
All foodstuffs are ostensibly modified tissues. Thus,
the natural biomaterial properties of the molecules that
make up living organisms underlie the basic biomaterial
properties of foods. In most traditional food processing,
however, little advantage is taken of the unique
properties of specific molecules and instead, all
biomolecules of a particular class, e.g. proteins, are exposed
to substantial physical, thermal and mechanical energy
to make these properties uniform in order to restructure
the material into more stable, and/or more bioavailable
food systems. Such processing eliminates the subtle
differences within most of the classes of the major bio-
molecules that are inherent to and the basis of complex
structure–function relationships of living organisms.
Processing replaces biological complexity with the
statistical average properties of the broad classes of
biomaterials, i.e. proteins, carbohydrates, lipids.
The processing of commodities to eliminate the complexity
of their biological structures is not necessary to
the quality of foods; in fact, the opposite is true. There are vivid
examples in which highly specific biological properties
of the original living organism are a key to the proces-
sing strategy and ultimately the organoleptic attractive-
ness of final food products. The renneting of bovine
milk to induce the natural aggregation of milk caseins
leading to the gelation events of cheese manufacture is
such a process. The final product takes advantage of the
unique self-assembly properties of milk casein micelles
that are colloidally stabilized in milk by kappa caseins
but destabilized when enzymatically cleaved of their
solubilizing glycomacropeptide. Another example is
leavened bread in which a combination of both compo-
site processing and biological restructuring is the basis
of breads’ structures, textures and nutrition. In this
case, wheat seeds are ground to disassemble the major-
ity of their biological structures through mechanical
energy, but then the biological processes of yeast fer-
mentation achieve simultaneously the enzymatic elim-
ination of phytic acid during dough incubation and the
biochemical production of carbon dioxide gas as lea-
vening within a mechanically reworked protein gel
structure. In each of these cases, bread and cheese,
taking advantage of the biological properties of the living
organisms has led to substantial value, both
organoleptically and in greater safety and nutritional
value. Furthermore, the inherent variation in biological
organisms that plagues the standardization of simpler food
processing objectives is not a disadvantage to these two
food staples, but rather a wonderful benefit leading to
literally hundreds of distinctly flavored and textured
cheeses and varieties of breads. Thus, cheeses and
breads provide proof of what is possible when the
biological processes of catalysis, self-assembly and
restructuring are retained as the basis of food processing.
Heretofore, empirical trial and error was the major
route to discovery of biodriven food processing. How-
ever, the biological knowledge that is emerging with
functional genomics, proteomics and metabolomics is
providing precisely the knowledge necessary to readdress
food processing using biomolecular activities rather
than simply composite biomaterial properties. The
entire protein–protein interaction map of yeast, i.e. all
possible interactions between the 6000 proteins of yeast,
has been completed (Ito, Chiba, Ozawa, Yoshida, Hattori,
& Sazaki, 2001). In the future, the structure–function
properties of living organisms that are emerging so
rapidly with bioinformatics will increasingly dictate the
design of new foods and new food processes. Once such
tools are in hand, process design engineers can then
work in a coordinated fashion with plant bioengineers
to produce crops that are not simply enriched in a single
valuable component, but instead redesigned with a
renewed purpose to increase the myriad values of foods
in providing quality of life.
Flavor analysis
The complex flavor profiles of many delightful commodities
(e.g. fruits, baked goods) are not due to single
compounds but rather are the result of the presence and
interactions of literally dozens of different molecules.
This knowledge will provide the link integrating
processing, quality and nutrition, paving the way for
new product development based on insight into
actual consumers’ preferences and needs.
The impact of genomics on the quality assurance of
foods
Food safety is increasingly a major area of concern for
consumers, and the food industry has developed a coherent
research programme to ensure food safety using both
well-established classical methodologies and new
state-of-the-art research tools. The goals are to ensure
that undesired microbes can be inactivated or inhibited
using the minimum treatment of foods necessary, to
increase understanding of the ecology of food-borne
microbial populations, to find out how these populations
respond to environmental factors such as stress and, last
but not least, to evaluate the toxicology of foods and
food compounds.
The genomics era delivers many new tools like pro-
teomics and DNA-array technology to tackle the
abovementioned problems. These new technologies are
now a vital part of the scientific strategic plan to serve
the diet and health theme and to provide safe food to
the consumer.
Toxicogenomics, for example, is an emerging field
which utilizes DNA arrays (tox-chips) to test the tox-
icological effects of a specific compound. These DNA
arrays probe human or animal genetic material printed
on miniature devices to profile gene expression in cells
exposed to test compounds rather than using animal
pathology to define illness (Lovett, 2000). The advantages
of this test go beyond the speed and ease of use typical
of DNA expression analysis; it also massively reduces
animal testing. A real challenge here is the analysis and
interpretation of the massive amounts of data produced
by these high-density DNA arrays. Once this task has
been tackled, tox-chip data must be integrated into the
knowledge base of the research institution to draw
maximum benefit for the acceleration of the development
pipeline.
Data integration
The explosion of data, ever increasing developments
in information technology, abundant availability of
powerful computers and the ability to connect them
worldwide are effecting enormous changes in knowledge
management. However, in order to gain full access to
these emerging powerful tools, it is paramount to
resolve the enormous challenge of unifying complex and
dissimilar data, each describing a large spectrum of
applications, each of which could be extremely far
apart. The need to combine observations from numer-
ous sources and domains, into a unified, seamlessly
searchable database and turning it into knowledge is
only the beginning of this uphill battle that will impact
every facet of food and nutrition science.
Advances in data collection, storage and distribution
technologies have far outpaced techniques to assist the
analysis and digestion of this information. In the past,
most databases were quite small and utilized as typeset
tables or simple online documents. Today, far larger
and more complex databases are emerging in many
fields at a level well beyond the reach of the traditional
model of solitary workers or small groups (Maurer,
Firestone, & Scriver, 2000). This has led to an
all-too-common data glut, creating a strong need and
a valuable opportunity for extracting knowledge from
databases collected throughout R&D and elsewhere.
One of the greatest challenges we are facing is how to
turn this rapidly expanding or even exploding data into
accessible and actionable knowledge. Moreover, food
and nutrition R&D is engaged in an assortment of
complex studies producing endpoint measures comprising
numeric, sensory, perceptual, structural, biological,
chemical and vision data. The need to manage such
disparate inputs is critical, as the amount of data doubles
almost every 20 months (Colbourn & Rowe, 2000).
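To make the quoted doubling time concrete, the short sketch below converts it into a growth factor over a chosen horizon. It assumes idealized exponential growth with a constant 20-month doubling time; the function name `growth_factor` and the horizons shown are illustrative choices, not taken from the cited source.

```python
def growth_factor(months: float, doubling_months: float = 20.0) -> float:
    """Return the multiplicative growth in data volume after `months`,
    assuming a constant doubling time (idealized exponential growth)."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # Illustrative horizons only; real growth need not stay exponential.
    for years in (1, 5, 10):
        factor = growth_factor(12 * years)
        print(f"after {years} years: about {factor:.1f} times the starting volume")
```

Under this assumption, the volume of data grows eightfold in five years and 64-fold in ten, which illustrates why manual curation by small groups cannot keep pace.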
Underlying the need to convert data into actionable
knowledge, organizations have started an aggressive
effort to deploy Knowledge Discovery in Databases
(KDD), Knowledge Management (KM), Data Mining
(DM) and Intellectual Asset Management (IAM). Areas
of common interest to researchers in these fields include
pattern recognition, statistics and statistical inference,
intelligent databases, knowledge acquisition, data
visualization, high performance computing and expert
systems, to mention just a few. Although these
high-technology information management systems are starting
to play a fundamental role for the experts who are working
on their development, they remain almost invisible to
most users.
Data mining refers to a new genre of bioinformatics
tools used to sift through the mass of raw data, finding
and extracting relevant information and developing
relationships among them. As advances in instrumenta-
tion and experimental techniques have led to the accu-
mulation of massive amounts of information, data
mining applications are providing the tools to harvest
the fruits of these labors. Maximally useful data mining
applications should:
Process information from disparate experimental techniques and technologies, including data that have both temporal (time studies) and spatial (organism, organ, cell type, sub-cellular location) dimensions;
Identify and interpret outlying, spurious and rare data;
Analyze data in an iterative process, re-applying gained knowledge to constantly examine and re-examine data;
Utilize novel text-mining and pattern recognition algorithms.
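As one illustration of the second requirement above (identifying outlying data), the sketch below flags suspect values with a modified z-score based on the median and the median absolute deviation (MAD). This is a common robust-statistics convention chosen here for illustration, not a method described in the article; the function name, the 0.6745 scaling constant and the 3.5 cut-off are assumptions of that convention, and the readings are invented.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    The score uses the median and the median absolute deviation (MAD),
    which are less distorted by extreme values than the mean and the
    standard deviation.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be scored
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - center) / mad) > threshold
    ]


if __name__ == "__main__":
    readings = [4.9, 5.1, 5.0, 5.2, 4.8, 9.7]  # invented replicate measurements
    print(flag_outliers(readings))
```

Flagged values still need interpretation by a person: a rare reading may be an instrument fault or a genuine biological finding.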
In the early years of modern scientific discovery,
research findings would appear in a journal and then get
buried in the depths of poorly accessible library space.
Information existed in various formats (e.g. graphic,
hard copy, tape), and was not easily retrievable. Data
analyses were generally limited to slide rule and manual
manipulation. However, technological advances in
computational science and scientific instrumentation
have facilitated the exponential growth, not only in
data, but also the tools to record and analyze these data.
What was the Computer Age as we entered the 1990s
has been supplanted by the Information Age. This
change was made possible by the advent of the Internet,
in particular the World Wide Web. This innovative,
truly universal mechanism of information dissemina-
tion, in concert with new computation-based analytical
tools, has provided practically endless opportunities for
scientific discovery.
The exponential rate of discovery in the era of mod-
ern molecular biology is phenomenal, culminating with
the June 2000 announcement that preliminary sequen-
cing of the human genome had been completed. This
landmark is just a taste of the scientific successes that
are to come. As impressive as it is, the determination of
the sequence of the approximately 3.2 billion nucleo-
tides of the human genome, encoding an estimated
100,000 proteins, represents only the first step down a
long road of knowledge discovery and its application to
adding value for consumers.
Another application of bioinformatics that is growing
extremely fast is chemometrics, the chemical discipline
that applies mathematical and statistical methods, and
uses design of experiments, to understand the effects and
interactions of several process parameters and to
optimize specific outcomes (Otto, 1999). Chemometrics,
originally rooted in analytical chemistry, is currently
more focused on addressing issues related to molecular
conformations and behavior. With the increasing
availability of databases (e.g. through the WWW), the need
for improved techniques that help extract information
and turn it into knowledge has been ever growing
(Brazma, Robinson, Cameron, & Ashburner, 2000).
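The design-of-experiments idea mentioned above can be illustrated with a two-level full factorial design, in which every combination of low (-1) and high (+1) settings is run and the main effects and interaction are read off as simple contrasts. The sketch below is a minimal, assumed example: the factor names and the response values are invented, and the helper functions are hypothetical rather than part of any chemometrics package.

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of coded levels (-1, +1) for the named factors."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product((-1, 1), repeat=len(names))]


def main_effect(runs, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(runs, responses) if run[factor] == 1]
    low = [y for run, y in zip(runs, responses) if run[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


def interaction_effect(runs, responses, a, b):
    """Two-factor interaction: contrast on the product of the coded levels."""
    same = [y for run, y in zip(runs, responses) if run[a] * run[b] == 1]
    diff = [y for run, y in zip(runs, responses) if run[a] * run[b] == -1]
    return sum(same) / len(same) - sum(diff) / len(diff)


if __name__ == "__main__":
    runs = full_factorial(["temperature", "time"])
    yields = [60, 70, 72, 90]  # invented responses, one per run, in run order
    print(main_effect(runs, yields, "temperature"))
    print(main_effect(runs, yields, "time"))
    print(interaction_effect(runs, yields, "temperature", "time"))
```

With only four runs, the design separates the effect of each parameter from their interaction, which is the kind of economy that makes such designs attractive for process optimization.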
It should be highlighted that food and nutrition are
related topics and are prone to another more crucial
problem. Generally, advanced data mining and other
sophisticated search tools are no better than the
information provided. As the scientific literature may
contain both editorial and more fundamental errors (e.g.
false methodology, unjustified conclusions, faulty
apparatus), the impartial scrutiny of human editorial
judgement is indispensable. One might make a compelling
case that the value of the databases is compromised most
by their inherent bias: in concept and design towards
only benefit, and in publication towards only a positive
outcome. Databases are most valuable to data mining and
bioinformatics searchers when they are balanced. If data
mining techniques poll databases that are inherently
unbalanced, then no matter what the truth is, the data
mining will invariably reflect the bias in the databases
that has resulted from conscious or unconscious
editorial influence. Hence, like
most other computer applications, the outcome in the
short term will be only as good as the quality of the
data. Moreover, the more complex the calculation is,
the more paramount is the need for adequate checks
and balances. The solution is for more balanced data
collection. At present, this is not the norm for nutri-
tional research.
The following examples, though far from representative,
demonstrate how knowledge management is utilized:
1. Food industry—A software package (NetStat)
was developed for analyzing reams of data, and
is reported to have changed every aspect of the
Pillsbury company (i.e. from the way it develops
new products to how it capitalizes on consumers’
tastes). What makes NetStat unique is its ability to
share information across all the company’s nine
brands, including manufacturing lines. The program
is implemented as a Web site shared by
researchers across a 70-country conglomerate,
and allows engineers and scientists to perform
rigorous tests and compare the results with data,
specifications and consumer information
(Crockett, 2000).
2. Pharmaceutical industry—Building of huge
combinatorial libraries by automatically synthe-
sizing all possible combinations of components is
underway. The number of compounds in such a
database can now be confidently stated to be in
the hundreds of thousands or even millions. The
new automated screening technologies can test
each of these compounds, giving an indication of
whether a compound is going to be effective
against a specific biochemical target and a spe-
cific disease.
3. Chemical industry—Chemical reaction databases
are available and could be used to derive knowl-
edge for predicting the course and products of
chemical reactions as well as to design organic
syntheses. To reach this goal, the essential fea-
tures of the chemical reaction have only to be
recognized and generalized. This was achieved by
classifying a set of reactions by unsupervised
learning techniques such as self-organizing maps
(Kohonen). In this approach, reactions are characterized
by physicochemical features directly
derived by computations from the constitution of
the starting materials or products of a reaction
(Gasteiger & Sacher, 1999).
4. Information industry—Chemical Abstracts Ser-
vice (CAS) has launched its SciFinder 2000,
empowering the user with greater visualization
tools and the ability to cross-tabulate and display
searches graphically. This ‘wizard’ allows a
researcher to simultaneously locate information
within a multitude of databases and subse-
quently explore the relationship between them.
The retrieved data may be displayed in a 3D
representation that can be further manipulated
to zero in on the requisite research. The use
of such data mining could revolutionize the
way scientists approach their research projects
(Massie, 2000).
5. Environmental safety—To reduce the need for
animal testing, Unilever has applied data mining
techniques (Clementine) to model skin corrosiv-
ity of organic acids, bases and phenols. This
facilitated uncovering new information from the
existing database, and eventually will furnish
toxicologists with neural network based packa-
ges to help assess and predict corrosivity and
other toxicological properties. This approach is
much more accessible than current techniques
(e.g. principal component analysis). It
is hoped that it will lead to a movement away
from in vivo and in vitro experimentation towards
‘in silico’ analyses, reducing costs and time scales
for product development, and minimizing the
need for animal testing
(http://www.spss.com/clementine/).

6. Consumers—Data mining techniques are now
being used to extract a surprising amount of
information on individual customers and their
buying patterns. These data are then used to
develop customer loyalty programs, for carefully
focused marketing or additional services that fit
the customer’s individual preferences, and for
identifying possible synergies with other compa-
nies who might share the same or similar base of
customers. Applications range from direct
marketers and booksellers to credit card companies,
which identify trends, potential users, and target
marketing strategies.
Development needs for data integration
Computational biology and electronic technologies
will be crucial for the future of life science research and,
in addition, offer promising opportunities to many
industries. Data integration will be the central issue in
shortening research-driven product development and
gaining competitive advantage.
Companies which started initiatives in this area are now
struggling to integrate legacy enterprise resource plan-
ning and data warehouse technologies with bioinfor-
matics. Compared to this challenge all other issues
including electronic commerce fade into insignificance.
To be successful, companies are now focusing on spe-
cific enabling technologies like Java, message-oriented
middleware and XML to encourage web-based colla-
boration between research teams and operating units.
Clear integration paths and benchmarks are, however,
still lacking.
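Since XML is named above as one of the enabling technologies for exchanging data between teams, the sketch below shows one way a measurement record could be serialized to XML and read back. It uses Python's standard `xml.etree.ElementTree` module; the element and attribute names (`sample`, `measurement`, `analyte`, `unit`) and the example values are hypothetical and do not correspond to any schema described in the article.

```python
import xml.etree.ElementTree as ET


def record_to_xml(sample_id, measurements):
    """Serialize one sample to an XML string.

    `measurements` maps an analyte name to a (value, unit) pair.
    """
    root = ET.Element("sample", attrib={"id": sample_id})
    for analyte, (value, unit) in measurements.items():
        node = ET.SubElement(
            root, "measurement", attrib={"analyte": analyte, "unit": unit}
        )
        node.text = repr(value)  # repr keeps full float precision
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Parse the XML produced by record_to_xml back into Python objects."""
    root = ET.fromstring(xml_text)
    measurements = {
        node.get("analyte"): (float(node.text), node.get("unit"))
        for node in root.findall("measurement")
    }
    return root.get("id"), measurements


if __name__ == "__main__":
    xml_text = record_to_xml("S1", {"protein": (3.4, "g/100g")})
    print(xml_text)
    print(xml_to_record(xml_text))
```

A shared, self-describing format of this kind lets a research team and an operating unit exchange records without agreeing on a common database engine, although an agreed vocabulary for names and units is still required.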
The ability to make better, faster and more innovative
research decisions is paramount to progress. Emerging
technologies and the exploding amount of data highlight
the need for new approaches. The availability of a
large number of fast PCs connected together allows
parallel processing, overcoming barriers due to speed
and computer resources. However, the ability to integrate
the data and utilize KM is a real challenge, which
is compounded by increased economic pressures, a
demanding marketplace, global competition, regulation
and consumer demands. Implementing these new
methodologies could open new avenues, improving our ability
to quickly and efficiently gain new knowledge and
insights from cell structure to consumer-perceived sensory
attributes. Ultimately, one should envision ‘an
engine’ able to ‘plug and play’ into various data
domains, integrating all the facets of a business and
increasing the likelihood of identifying the next target or
new food product for development and quality improvement
that addresses the consumers’ real and perceived needs.
Planning for the future is no longer a luxury; it is a
standard operating procedure for the existence and well-
being of the enterprise.
Areas requiring future development are:
Models—Models that describe a class of reac-
tions in an actual food system or food concept
‘in silico’ (Hultzman, 2000). These models should
be designed so that they could also be applied for
testing the validity of previously reported data.
This goal also mandates that terminology be
harmonized, to improve accessibility. It could
lead to a movement away from in vivo and in
vitro experimentation towards ‘in silico’ ana-
lyses, reducing costs, time scales for product
development, and minimizing the need for ani-
mal testing.
Standardized protocols—Standard experimental
design and replication must be set if data accumulated
by different groups and various techniques
are to be integrated. This will lead to
improved reproducibility, reduce variability, furnish
truly quantitative data, increase sensitivity
and provide a means for comparing data
obtained from different sets (e.g. Lee, Kuo,
Whitmore, & Sklar, 2000).
Data integration and storage—Linking and integrating
large interoperable databases with heterogeneous
structures and data types is far from
straightforward; the vast differences that exist between
various domains make this task immense. Similarly,
the ever-growing amount of information
needs adequate storage and maintenance. Cataloging
and automated extraction (e.g. Andrade &
Bork, 2000) are paramount. As the complexity and
quantity of information grow, food
practitioners need to define and develop a unified
and acceptable approach. This task requires significant
planning in which all facets of the food,
nutrition, biology and other domains are
involved.
Predictive tools—Techniques allowing the auto-
mated discovery from large and different data
sets need to be further developed before they
could be fully utilized in the food and nutrition
domains. Once implemented, they would open new
avenues towards a broad interdisciplinary science
that involves both conceptual and practical tools
for the generation, processing, analysis and propagation
of information, leading ultimately to fundamental
understanding.
Data visualization—A large volume of the
human brain is devoted to visual data processing
(Going & Gusterson, 1999). Data visualization
methods will therefore play a significant role in
enabling pattern characterization and recognition.
Paradigm shift—Food and nutrition science
should develop a holistic approach, by moving
away from studying ‘vertically’ the role(s) of a few
variables to ‘horizontally’ studying many variables
simultaneously and applying advanced
modeling and analysis techniques (e.g., Fiehn,
Kloska, & Altmann, 2001).
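The needs for harmonized terminology and for integrating heterogeneous data sources, listed above, can be pictured with a small sketch that maps source-specific field names onto one shared vocabulary, converts units, and merges records by sample identifier. Everything in it is an assumption made for illustration: the field names, the alias table, the choice of kilojoules as the shared energy unit, and the two example sources are all hypothetical.

```python
# Hypothetical mapping from source-specific field names to a shared vocabulary.
FIELD_ALIASES = {
    "vit_c": "vitamin_c_mg",
    "ascorbic_acid": "vitamin_c_mg",
    "vitamin_c_mg": "vitamin_c_mg",
    "energy_kcal": "energy_kj",
    "energy_kj": "energy_kj",
}

# Factors that convert a source field into the unit of its shared field
# (1 kcal = 4.184 kJ); fields not listed are assumed to be in the shared unit.
UNIT_FACTORS = {"energy_kcal": 4.184}


def harmonize(record):
    """Rename fields to the shared vocabulary and convert units."""
    unified = {}
    for field, value in record.items():
        target = FIELD_ALIASES.get(field)
        if target is None:
            continue  # unknown field: skipped here, a real system would log it
        unified[target] = value * UNIT_FACTORS.get(field, 1.0)
    return unified


def merge_sources(*sources):
    """Merge harmonized records from several sources, keyed by sample id."""
    merged = {}
    for source in sources:
        for sample_id, record in source.items():
            merged.setdefault(sample_id, {}).update(harmonize(record))
    return merged


if __name__ == "__main__":
    lab_a = {"S1": {"vit_c": 50.0}}
    lab_b = {"S1": {"energy_kcal": 100.0}, "S2": {"ascorbic_acid": 12.0}}
    print(merge_sources(lab_a, lab_b))
```

The hard part in practice is agreeing on the alias and unit tables across domains, which is why the planning effort described above cannot be delegated to software alone.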
Conclusions
Biomics, comprised of genomics, proteomics and
metabolomics, is taking up its position as a lead science
for the 21st century. Its influence is already felt throughout
the biological sciences. Moreover, its influence on
nutrition and food science will generate a unified area of
research where both nutritional benefit and traditional
food values become parts of an extended life science
driving towards enhanced quality of life. The knowledge
obtained through this research on raw
materials, ingredients, safety, quality and nutrition can
be expected to have a far greater impact on product
improvements than today’s functional food research
imagines. Future developments in biomics, bioinformatics
and information technology based approaches to
foods will truly revolutionize the way the food
industry satisfies consumer needs and wants.
Uncited references
Bender (1999), Firestein (2000), Gasch et al. (2000)
and Weggemans et al. (2001).
References
Andrade, M. A., Bork, P. (2000). Minireview: Automated extrac-
tion of information in molecular biology. FEBS Letters, 476, 12–
17.
Bailey, L. B., Gregory, J. F., 3rd (1999). Polymorphisms of
methylenetetrahydrofolate reductase and other enzymes: metabolic
significance, risks and impact on folate requirement. Journal of
Nutrition, 129, 919–922.
Bender, D. A. (1999). Optimum nutrition: thiamin, biotin and
pantothenate. Proceedings of the Nutrition Society, 58, 427–433.
Brazma, A., Robinson, A., Cameron, G., Ashburner, M. (2000).
One-stop shop for microarray data. Nature, 403, 699–700.
Cardinal, R. et al. (2001). Impulsive choice induced in rats by lesions
of the nucleus accumbens core. Science, 292.
Chandrashekar, J., Mueller, K. L., Hoon, M. A., Adler, E., Feng, L.,
Guo, W., Zuker, C. S., Ryba, N. J. (2000). T2Rs function as bitter
taste receptors. Cell, 100, 703–711.
Clifford, R., Edmonson, M., Hu, Y., Nguyen, C., Scherpbier, T.,
Buetow, K. H. (2000). Expression-based genetic/physical maps
of single-nucleotide polymorphisms identified by the cancer
genome anatomy project. Genome Research, 10, 1259–1265.
Colbourn, E., Rowe, R. (2000, April 3). A logical step forward.
Chem Ind., 252–254.
Corella, D., Tucker, K., Lahoz, C., Coltell, O., Cupples, L. A., Wilson,
P. W., Schaefer, E. J., Ordovas, J. M. (2001). Alcohol drinking
determines the effect of the APOE locus on LDL-cholesterol
concentrations in men: the Framingham Offspring Study.
American Journal of Clinical Nutrition, 73, 736–745.
Crockett, R. O. (2000, April 3). Pillsbury: a digital doughboy.
Business Week.
Davenport, R. F. (2001). Taste research. New gene may be key to
sweet tooth. Science, 292(5517), 620.
DellaPenna, D. (1999). Nutritional genomics: manipulating plant
micronutrients to improve human health. Science, 285, 375–379.
Eckhardt, R. B. (2001). Genetic research and nutritional indivi-
duality. Journal of Nutrition, 131, 336S–339S.
Fiehn, O., Kloska, S., Altmann, T. (2001). Integrated studies using
multiparallel techniques. Current Opinion in Biotechnology, 12,
82–86.
Firestein, S. (2000). The good taste of genomics [news; comment].
Nature, 404, 552–553.
Gasch, A. P., Spellman, P. T., Kao, C. M., Carmel-Harel, O., Eisen,
M. B., Storz, G., Botstein, D., Brown, P. O. (2000). Genomic
expression programs in the response of yeast cells to
environmental changes. Molecular Biology of the Cell, 11, 4241–4257.
Gasteiger, J., Sacher, O. (1999). Unsupervised learning in reaction
databases. ACS Meeting, March, Anaheim, CA, USA.
German, B., Schiffrin, E. J., Reniero, R., Mollet, B., Pfeifer, A.,
Neeser, J. R. (1999). The development of functional foods: lessons
from the gut. Trends in Biotechnology, 17, 492–499.
Glusman, G., Yanai, I., Rubin, I., Lancet, D. (2001). The complete
human olfactory subgenome. Genome Research, 11, 685–702.
Going, I. J., Gusterson, B. A. (1999). Molecular pathology and
future developments. European Journal of Cancer, 35, 1895–
1904.
Harvey, C. B., Hollox, E. J., Poulter, M., Wang, Y., Rossi, M.,
Auricchio, S., Iqbal, T. H., Cooper, B. T., Barton, R., Sarner, M.,
Korpela, R., Swallow, D. M. (1998). Lactase haplotype
frequencies in Caucasians: association with the lactase
persistence/non-persistence polymorphism. Annals of Human
Genetics, 62(Pt 3), 215–223.
Hockey, K., Anderson, R., Cook, V., Hantgan, R., Weinberg, R.
(2001). Effect of the apolipoprotein A-IV Q360H polymorphism
on postprandial plasma triglyceride clearance. Journal of Lipid
Research, 42, 211–217.
Hultzman, S. (2000). In silico toxicology. Annals of the New York
Academy of Sciences, 919, 68–74.
Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., Sakaki, Y.
(2001). A comprehensive two-hybrid analysis to explore the yeast
protein interactome. Proceedings of the National Academy of Sciences
of the United States of America, 98(8), 4569–4574.
Kuipers, O. P. (1999). Genomics for food biotechnology: prospects
of the use of high-throughput technologies for the improvement
of food microorganisms. Current Opinion in Biotechnology, 10,
511–516.
Lee, M. L. T., Kuo, F. C., Whitmore, G. A., Sklar, J. (2000).
Importance of replication in microarray gene expression studies:
statistical methods and evidence from repetitive cDNA
hybridizations. Proceedings of the National Academy of Sciences
of the United States of America, 97, 9834–9839.
Lee, P. S., Lee, K. H. (2000). Genomic analysis. Current Opinion in
Biotechnology, 11, 171–175.
Lindee, M. S. (2000). Genetic disease since 1945. Nature Reviews
Genetics, 1, 236–241.
Lovett, R. A. (2000). Toxicogenomics. Toxicologists brace for
genomics revolution. Science, 289, 536–537.
Marth, G., Yeh, R., Minton, M., Donaldson, R., Li, Q., Duan, S.,
Davenport, R., Miller, R. D., Kwok, P. Y. (2001). Single-
nucleotide polymorphisms in the public domain:
how useful are they? Nature Genetics, 27, 371–372.
Massie, B. (2000). Moving towards a new digital environment. Am.
Chem. Soc. 219th National Meeting, Part XI, 26–30 March,
San Francisco, CA.
Matsunami, H., Montmayeur, J. P., Buck, L. B. (2000). A family of
candidate taste receptors in human and mouse. Nature,
404(6778), 601–604.

Maurer, S. M., Firestone, R. B., Scriver, C. R. (2000). Science’s
neglected legacy. Nature, 405, 117–120.
Max, M., Shaker, Y., Huang, L., Rong, M., Liu, Z., Campagne, F.,
Weinstein, H., Damak, S., Margolskee, R. F. (2001). Nature
Genetics, 28, 58–63.
Nagel, G., Szellas, T., Riordan, J. R., Friedrich, T., Hartung, K.
(2001). Non-specific activation of the epithelial sodium channel
by the CFTR chloride channel. EMBO Reports, 2, 249–254.
Nichols, B. L. (2000). Nutrigenetics and child development in the
21st century. Nutrition, 16, 493–495.
Otto, M. (1999). Chemometrics: statistics and computer application
in analytical chemistry. Weinheim, Germany: Wiley-VCH.
Pimstone, S. N., Clee, S. M., Gagne, S. E., Miao, L., Zhang, H., Stein,
E. A., Hayden, M. R. (1996). A frequently occurring mutation in
lipoprotein lipase gene (Asn291Ser) results in altered
postprandial chylomicron triglyceride and retinyl palmitate response
in normolipidemic carriers. Journal of Lipid Research, 37, 1675–
1684.
Pridmore, R. D., Crouzillat, D., Walker, C., Foley, S., Zink, R.,
Zwahlen, M. C., Brussow, H., Petiard, V., Mollet, B. (2000).
Genomics, molecular genetics and the food industry. Journal of
Biotechnology, 78, 251–258.
Sachidanandam, R., Weissman, D., Schmidt, S. C., Kakol, J. M.,
Stein, L. D., Marth, G., Sherry, S., Mullikin, J. C., Mortimore, B. J.,
Willey, D. L., Hunt, S. E., Cole, C. G., Coggill, P. C., Rice, C. M.,
Ning, Z., Rogers, J., Bentley, D. R., Kwok, P. Y., Mardis, E. R., Yeh,
R. T., Schultz, B., Cook, L., Davenport, R., Dante, M., Fulton, L.,
Hillier, L., Waterston, R. H., McPherson, J. D., Gilman, B.,
Schaffner, S., Van Etten, W. J., Reich, D., Higgins, J., Daly, M. J.,
Blumenstiel, B., Baldwin, J., Stange-Thomann, N., Zody, M. C.,
Linton, L., Lander, E. S., Altshuler, D.; The International
SNP Map Working Group (2001). A map of human genome sequence
variation containing 1.42 million single nucleotide
polymorphisms. Nature, 409(6822), 928–933.
Schilling, C. H., Palsson, B. O. (2000). Assessment of the meta-
bolic capabilities of Haemophilus influenzae Rd through a
genome-scale pathway analysis. Journal of Theoretical Biology,
203, 249–283.
Takeoka, S., Unoki, M., Onouchi, Y., Doi, S., Fujiwara, H., Miyatake,
A., Fujita, K., Inoue, I., Nakamura, Y., Tamari, M. (2001). Amino-
acid substitutions in the IKAP gene product significantly increase
risk for bronchial asthma in children. Journal of Human Genetics,
46, 57–63.
Ugawa, S., Minami, Y., Guo, W., Saishin, Y., Takatsuji, K.,
Yamamoto, T., Tohyama, M., Shimada, S. (1998). Receptor that
leaves a sour taste in the mouth. Nature, 395(6702), 555–556.
Watkins, S. M., Hammock, B. D., Newman, J. W., German, J. B.
(2001). Individual metabolism should guide agriculture toward
foods for improved health and nutrition. American Journal of
Clinical Nutrition, 74, 283–286.
Weggemans, R. M., Zock, P. L., Ordovas, J. M., Pedro-Botet, J.,
Katan, M. B. (2001). Apoprotein E genotype and the response of
serum cholesterol to dietary fat, cholesterol and cafestol.
Atherosclerosis, 154, 547–555.
Weintraub, M. S., Eisenberg, S., Breslow, J. L. (1987). Dietary fat
clearance in normal subjects is regulated by genetic variation in
apolipoprotein E. Journal of Clinical Investigation, 80, 1571–1577.
Young, V. R., Scrimshaw, N. S. (1979). Genetic and biological
variability in human nutrient requirements. American Journal of
Clinical Nutrition, 32, 486–500.
