2. What is bioinformatics?
• The science of storing, retrieving and analyzing large amounts of biological information
• An interdisciplinary science, involving biologists, computer scientists and mathematicians
• At the heart of modern biology
3. Biology is changing
• Data explosion
• New types of data
• High-throughput biology
• Emphasis on systems, not reductionism
• Growth of applied biology
– molecular medicine
– agriculture
– food
– environmental sciences…
[Figure: Growth of raw storage at EMBL-EBI (in terabytes). Axes: Year vs. Disks (TB), scale 0 to 12000.]
4. What is EMBL-EBI?
• Bioinformatics research and services institute
• Non-profit organization
• www.ebi.ac.uk/embl and www.ebi.ac.uk
• ~ 500 staff
• Part of the European Molecular Biology Laboratory
5. The resources of the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) are divided into two categories: databases, and tools (bio-software).
EBI’s FTP server provides open access to downloadable databases and bio-software.
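The open FTP access mentioned above could be explored with a few lines of Python. This is only a minimal sketch: the host name `ftp.ebi.ac.uk`, the anonymous login and the `/pub/databases` path are assumptions that do not appear in the slides, and the helper names are hypothetical.

```python
from ftplib import FTP

# Assumed host name and directory; neither is given in the slides.
EBI_FTP_HOST = "ftp.ebi.ac.uk"
DEFAULT_PATH = "/pub/databases"


def list_directory(ftp, path=DEFAULT_PATH):
    """Return the sorted entry names under `path`.

    `ftp` is any object with an `nlst(path)` method, such as a
    connected ftplib.FTP instance.
    """
    return sorted(ftp.nlst(path))


def show_first_entries(limit=10):
    """Connect anonymously and print the first `limit` entries."""
    with FTP(EBI_FTP_HOST, timeout=30) as ftp:
        ftp.login()  # no arguments means anonymous login
        for name in list_directory(ftp)[:limit]:
            print(name)
```

Calling `show_first_entries()` needs network access. `list_directory` takes the connection as a parameter so that it can be tried against a stand-in object without contacting the server.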
6. New types of data
Genomes
Nucleotide sequence
Gene expression
Proteomes
Protein families, domains and motifs
Protein structure
Protein-protein interactions
Chemical entities
Pathways
Systems
Literature
Protein sequence
7. The five branches of EMBL
• Heidelberg: basic research in molecular biology; administration; EMBO
• Hinxton: bioinformatics
• Hamburg: structural biology
• Grenoble: structural biology
• Monterotondo: mouse biology
• 1500 staff
• >60 nationalities
8. EMBL member states
Austria, Belgium, Croatia, Denmark,
Finland, France, Germany, Greece,
Iceland, Ireland, Israel, Italy,
Luxembourg, the Netherlands,
Norway, Portugal, Spain, Sweden,
Switzerland and the United Kingdom
Associate member state: Australia
9. EMBL-EBI’s Mission
• To provide freely available data and bioinformatics services to
all facets of the scientific community in ways that promote
scientific progress
• To contribute to the advancement of biology through basic
investigator-driven research in bioinformatics
• To provide advanced bioinformatics training to scientists at all
levels, from PhD students to independent investigators
• To help disseminate cutting-edge technologies to industry
• To coordinate biological data provision across Europe
10. Key Goal
• The key goal of the EMBL nucleotide sequence database is to build and maintain biological databases and other computational services that support data deposition and data analysis, and to make them available to the scientific community.
• EMBL-EBI is a huge warehouse of biological data and bio-software.
12. Key facts about services
• European node for globally coordinated data collection and dissemination projects
• Core databases produced in collaboration with other world leaders, including NCBI (US), National Institute of Genetics (Japan), Swiss Institute of Bioinformatics, Cold Spring Harbor Laboratory (US)
• The world’s most comprehensive collection of molecular databases
13. Databases: molecules to systems
• Genomes: Ensembl, Ensembl Genomes, EGA
• Nucleotide sequence: ENA
• Functional genomics: ArrayExpress, Expression Atlas
• Protein sequences: UniProt
• Protein families, motifs and domains: InterPro
• Macromolecular: PDBe
• Protein activity: IntAct, PRIDE
• Chemical entities: ChEBI
• Pathways: Reactome
• Systems: BioModels, BioSamples
• Literature and ontologies: CiteXplore, GO
• Chemogenomics: ChEMBL
15. Standards development – international collaborations
• Genome annotation: www.geneontology.org
• Functional Genomics Data Society: www.fged.org
• Protein sequence: www.uniprot.org
• HUPO Proteomics Standards Initiative (PSI): www.psidev.info/
• Protein structure: www.wwpdb.org
• Cheminformatics: www.ebi.ac.uk/chebi
• Pathways: www.reactome.org, www.biopax.org
• Systems modelling standards: www.sbml.org
• Metabolomics Standards Initiative (MSI): www.metabolomicssociety.org
• Genomics Standards Consortium (GSC): http://gensc.org
• Nucleotide sequence: www.insdc.org
16. New search service
Access from the EBI’s homepage
Data organised according to:
• gene
• expression
• protein
• structure
• literature
Species selector allows for easy comparison
Explore data, return easily to your results
17. Goals of the new EBI Search
• Relevant to ‘wet-lab’ biologists
• Organises information based around a single gene (or a small number of genes)
• User-expectation centric (not database centric)
• Smooth transition to the detailed information in many of EBI’s core databases
• NOT for bioinformaticians: does not provide programmatic access
19. Research themes
• Genomes: Nick Goldman, Ewan Birney, Paul Flicek
• Transcriptomes: Anton Enright, John Marioni, Alvis Brazma
• Proteins: Janet Thornton, Rolf Apweiler, Gerard Kleywegt
• Chemistry: Christoph Steinbeck, John Overington
• Pathways and systems: Nicolas Le Novère, Nick Luscombe, Paul Bertone, Julio Saez-Rodriguez
• Text mining: Dietrich Rebholz-Schuhmann
Service team leaders who also have research groups are in italics
20. Pre- and postdocs at EMBL-EBI
• EMBL International PhD Programme
• Postdoctoral fellowships:
– EIPOD – EMBL-sponsored interdisciplinary fellowships
– ESPOD – EBI–Sanger combined experimental and computational
fellowships