SlideShare uma empresa Scribd logo
1 de 18
ECOCYC DATABASE
Presented by:
Shiv kumar
M.Sc (Microbial Biotechnology)
Introduction:
 EcoCyc word comes from E.coli (Eco) and Encyclopaedia (cyc).
 The encyclopedia of Escherichia coli K-12 genes and metabolism
(EcoCyc) is a database that combines information about the genome
and the intermediary metabolism of E.coli .
 It describes the known genes of E.coli , the enzymes encoded by
these genes, the reactions catalyzed by each enzyme and the
organization of these reactions into metabolic pathways.
 It is a freely accessible, comprehensive database that collects and
summarizes experimental data for Escherichia coli K-12 at
EcoCyc.org .
 It has graphical user interface (GUI) which allows query, exploration
and visualization of the EcoCyc database. EcoCyc spans the space
from sequence to function to allow investigatation of an unusually
broad range of questions .
 It is Curation based database in which Curation is the process of
manually refining and updating a bioinformatics database. The EcoCyc
project uses a literature-based Curation approach in which database
updates are based on evidence in the experimental literature.
Cont…
 The EcoCyc system consists of a knowledge base (KB) that describes
the genes and intermediary metabolism of E. coli, and a graphical user
interface for accessing that knowledge.
 EcoCyc is joint work between our group at SRI International, and a
group at the Marine Biological Laboratory (MBL) led by Monica
Riley.
 Taken together, the knowledge base and the graphical user interface
(GUI) constitute an electronic encyclopedia that allows scientists to
visualize and explore an integrated collection of genomic and
biochemical information.
 By integrating genomic data with comprehensive knowledge of the
metabolic functions of gene products, we greatly increase the value of
the genomic data and the types of analyses that can be applied to it.
Why it is designed?
It is designed for several different modes of interactive use via both the
EcoCyc.org web site and in conjunction with the downloadable
Pathway Tools software:
 It is a Visualization tools in which genome browser, metabolic
map display, and regulatory network diagram and aid in the
comprehension of these complex data found.
 EcoCyc facilitates analysis of high-throughput data such as gene-
expression and metabolomics data via tools for enrichment
analysis, and for visualizing omics data on a metabolic map
diagram, complete genome diagram, or regulatory network
diagram.
 The EcoCyc metabolic flux model can predict growth or no-growth
of wildtype and knock-out E. coli strains under different nutrient
conditions.
Problems that might be solved by using
EcoCyc:
 Genomic investigations
Coupled with sequence databases EcoCyc could be used to perform
function-based retrieval of DNA or protein sequences, such as to prepare
data sets for studies of protein structure-function relationships. Microbial
geneticists who work with bacteria other than E.coli will find EcoCyc useful
as a point of reference for gene functions, as well as for similarities and
differences in gene-product relationships.
 Studies of the metabolism
Scientists who study the evolution of the metabolism could use EcoCyc to
search out examples of duplication and divergence of enzymes and
pathways. Systematic computational studies of pathway evolution can
compare related pathways from different organisms. EcoCyc provides a
foundation for automatically generating simulations of the metabolism,
although it lacks the kinetics data needed by most simulation techniques.
 Pathway design for biotechnology
Biotechnologists seek to design novel biochemical pathways that produce
useful chemical products (such as pharmaceuticals) or that catabolize
unwanted chemicals such as toxins. EcoCyc provides the wiring diagram
of E.coli K-12, which approximates the starting point for engineering.
EcoCyc also describes the potential engineering variations that can result
from importing E.coli enzymes into other organisms.
 The Ecocyc Graphical User Interface
 The EcoCyc GUI provides graphical tools for visualizing and navigating
through an integrated collection of metabolic and genomic information
(its retrieval capabilities are described in Retrieval operations). For
each type of biological object in the EcoCyc database the GUI
provides a corresponding visualization tool.
 These tools dynamically query the underlying database to produce
display windows. Other displays are provided for genes, enzymes,
compounds and for browsing genomic maps.
 All the display algorithms are parameterized to allow the user to select
the visual presentation of an object that is most informative.
 For example, the algorithms that produce automatic layouts of
metabolic pathways can suppress the display of enzyme names or
side compound names. They can also draw chemical structures for the
compounds within a pathway.
Circular map browser for the E.coli chromosome.The bar at right shows a magnification
of the region between 51 and 55 centisomes. Bars over gene names indicate counter
clockwise transcription.
 The Ecocyc Data
 The EcoCyc data are stored within a frame knowledge representation system (FRS),
which is similar to an object-oriented database.
 FRSs use an object-oriented data model and have several advantages over relational
database management systems . FRSs organize information within classes,
collections of objects that share similar properties and attributes.
 Each EcoCyc frame contains slots describing attributes or properties of the biological
object that the frame represents or encoding a relationship between that object and
other objects. For example, the slots of a polypeptide frame encode the molecular
weight of the polypeptide, the gene that encodes it and its cellular location.
1. Genes
 Most genes whose product is an enzyme to an EcoCyc object that
represents the polypeptide and added a textual description of the
product for genes whose product is not an EcoCyc object.
 Data taken from EcoCyc v15.0:
2. Compounds
 This discussion focuses on small metabolites, which are instances of the class
Compounds.
 The data on compounds involved in the intermediary metabolism of E.coli , but
data on some other compounds are also present, for example metabolites from
other organisms.
 Among the properties encoded for compounds are synonyms, molecular weight,
empirical formula, lists of bonds and atoms that encode chemical structures and
two-dimensional display coordinates for each atom that permit the drawing of
compound structures. EcoCyc contains 1830 compounds, of which 964 have
recorded two-dimensional structures.
3. Reactions
 The initial set of biochemical reactions in EcoCyc were derived from the ENZYME
database, Because enzyme nomenclature concerns enzymes from all species,
many of the reactions in the ENZYME database, and therefore in EcoCyc, do not
actually occur in E.coli .
 Reaction frames contain information such as lists of reactants and products for
the reaction equation, the EC number of the reaction and AG 0 for the reaction in
the direction it is written. Reaction objects are linked to the pathway(s) that
contains them and to the enzyme(s) that catalyzes them. EcoCyc contains 2901
reactions organized into the 258 classes defined by the enzyme committee. Of
these, 580 reactions are known to occur in E.coli . Only 14 of the reactions have
no EC number.
 Proteins
 Template files organize information as frames (such as enzymes and
pathways) with labeled slots (attributes). The template files also permit
association of chosen literature citations with the appropriate data. In each
frame there are multiple opportunities for liberal comment, to describe the
metabolic functions and the unique complex properties of the reaction or the
enzyme.
 Among the topics covered by comments in EcoCyc are reaction mechanisms,
subreactions of complex reactions, interactions of subunits of complex
enzymes, formation of complexes with other proteins, breadth of substrate
specificity, mode of action of inhibitors and activators, place and function of
reactions in metabolic pathways, other reactions catalyzed by the protein and
the relationship of the protein to other proteins catalyzing the same reaction.
 Karp has developed a computer program that parses the template files to
extract their constituent data items and then inserts those data items into the
EcoCyc database. The parser program also performs consistency checks on
the data and allows interactive correction of problems that are found.
Consistency checkers can correct minor typographical errors and verify, for
example, that the entry in a field that is supposed to contain a gene does in fact
refer to a gene in the database. These tools have proven vital in detecting
errors throughout the database.
In the EcoCyc schema all enzyme objects are instances of the
class Proteins, which is partitioned into two subclasses:
 Protein Complexes and Polypeptides.
 These two classes have a number of common properties, such as molecular
weight, pI, cellular location and a relationship to one or more catalyzed
reactions. They differ in that Protein Complexes have slots that link them to
their subunits, whereas Polypeptides have a slot that identifies their gene.
 We also record known sequence similarity relationships among a set of
isozymes and we provide links to the SWISS-PROT and PDB entries for a
polypeptide.
 Proteins are listed as a subclass of chemical compounds, since in some cases
enzymes themselves are substrates in a reaction (such as phosphorylation
reactions).
 The database contains 468 polypeptides and 238 protein complexes that
comprise a total of 306 enzymes (i.e. 306 of the polypeptides and protein
complexes have defined catalytic activities).
Enzymatic reactions
 We define a high fidelity representation as a formal conceptualization (i.e. a portion of a
schema) that allows a database to accurately capture subtleties of biology. For example, a
number of other metabolic databases do not explicitly distinguish enzymes from reactions
nor polypeptides from protein complexes.
 But the properties of a reaction (such as its Δ G 0 and its substrates) are independent of the
enzyme(s) that catalyzes it and the properties of an enzyme (such as its molecular weight
and amino acid sequence) are independent of the reactions it catalyzes.
 The relationships between enzymes and reactions are many to many, since one enzyme
can catalyze many reactions and one reaction can be catalyzed by more than one enzyme.
This distinction has led to interesting and perhaps counterintuitive observations. EC
numbers are actually a property of reactions, rather than of enzymes, i.e. there is a one to
one correspondence between reactions and EC numbers, but not between enzymes and
EC numbers. An enzyme that catalyzes two reactions will have two EC numbers and two
enzymes that catalyze the same reaction have the same EC number.
 A further distinction is required because some properties of an enzyme are meaningful only
in the context of a particular reaction that the enzyme catalyzes. Properties such as
activators, inhibitors and cofactors pertain to the pairing of an enzyme and a reaction,
because a single enzyme that catalyzes two reactions may be sensitive to different
inhibitors for each reaction and we wish to capture this complex relationship.
Design Requirements
The EcoCyc GUI was created to facilitate the latter types of consultation
— to provide biologists with powerful and intuitive tools for exploring
and comprehending the elements of a complex information space. The
design of the EcoCyc GUI was mainly influenced by the scientific tasks
for which the information will be used, and the properties of the
information itself.
 Molecular biologists and biotechnologists designing a cloning project
will have instant access to the best current information on map
locations, identities of linked genes and their functions, lists of
available clones, and locations of pertinent restriction sites.
 Molecular evolutionists could use such DBs to search out examples of
duplication and divergence, and ancestral relationships among genes,
enzymes, and pathways.
 The metabolic pathways present in an organism can be predicted from
the DNA sequence of the organism plus its nutritional requirements,
given sufficient background knowledge of metabolism.
 A scientist who has found sequence similarities between a gene of
interest and one or more E. coli genes could use EcoCyc to obtain
descriptions of the functions of the E. coli genes. EcoCyc contains
more detailed descriptions of function than do the sequence DBs.
Cont…
 Biotechnologists seek to design novel biochemical pathways that
produce useful chemical products (such as flavor enhancers in food,
amino acids and vitamins, or pharmaceuticals), or that catabolize
unwanted chemicals such as toxins. Metabolic DBs can provide
information about enzymes from other organisms with novel substrate
specificities, kinetics, or regulatory characteristics, that can modify a
metabolic network.
 Scientists who study the metabolism itself will be able to pose novel
questions of broader scope than previously possible. Systematic
computational studies of pathway evolution can compare many related
pathways from different organisms. Those who employ numerical
simulation techniques will benefit from collections of enzyme-kinetics
data. •
 The comprehensive characterizations of enzyme function in metabolic
DBs will make possible systematic computational studies of protein
structure–function relationships.
SoftwareArchitecture
The EcoCyc software architecture is shown in Figure . The
major components of the system are
 A frame knowledge representation system called HyperTHEO that
manages the EcoCyc KB.
 A graphical KB editor and browser called the GKB Editor.
 The EcoCyc GUI. All communication between the EcoCyc GUI and
HyperTHEO flows through a well-defined KB-access library called the
Generic-Frame Protocol (Karp et al., 1995), making the EcoCyc GUI a
modular component that is in principle separable from HyperTHEO.
 A system for manipulating and displaying graphs, called Grasper-CL.
Another reason for the quick development of EcoCyc is the fact that its
pathway displays were implemented using the high-level graph display
and layout capabilities within Grasper-CL.
 CWEST is a tool for retrofitting CLIM applications to run through the
WWW. CWEST dynamically translates CLIM graphics to a combination
of HTML and GIF images for transmission via the WWW. CWEST
allows the EcoCyc GUI to operate over the WWW in a manner that is
very similar to its X-windows operation.
Figure :The architecture of the EcoCyc system
Thank You

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Clustal W - Multiple Sequence alignment
Clustal W - Multiple Sequence alignment   Clustal W - Multiple Sequence alignment
Clustal W - Multiple Sequence alignment
 
Proteins databases
Proteins databasesProteins databases
Proteins databases
 
Genome annotation 2013
Genome annotation 2013Genome annotation 2013
Genome annotation 2013
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
 
Gen bank (genetic sequence databank)
Gen bank (genetic sequence databank)Gen bank (genetic sequence databank)
Gen bank (genetic sequence databank)
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
sequence alignment
sequence alignmentsequence alignment
sequence alignment
 
sequence of file formats in bioinformatics
sequence of file formats in bioinformaticssequence of file formats in bioinformatics
sequence of file formats in bioinformatics
 
Finding ORF
Finding ORFFinding ORF
Finding ORF
 
Kegg databse
Kegg databseKegg databse
Kegg databse
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics
 
2d Page
2d Page2d Page
2d Page
 
(Expasy)
(Expasy)(Expasy)
(Expasy)
 
Sequence database
Sequence databaseSequence database
Sequence database
 
Secondary Structure Prediction of proteins
Secondary Structure Prediction of proteins Secondary Structure Prediction of proteins
Secondary Structure Prediction of proteins
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
 
Scop database
Scop databaseScop database
Scop database
 
Proteome databases
Proteome databasesProteome databases
Proteome databases
 
Sequence alignment belgaum
Sequence alignment belgaumSequence alignment belgaum
Sequence alignment belgaum
 

Semelhante a Ecocyc database

Subliminal: exploiting semantic annotations in the reconstruction of metaboli...
Subliminal: exploiting semantic annotations in the reconstruction of metaboli...Subliminal: exploiting semantic annotations in the reconstruction of metaboli...
Subliminal: exploiting semantic annotations in the reconstruction of metaboli...
Neil Swainston
 
1PhylogeneticAnalysisHomeworkassignmentThisa.docx
1PhylogeneticAnalysisHomeworkassignmentThisa.docx1PhylogeneticAnalysisHomeworkassignmentThisa.docx
1PhylogeneticAnalysisHomeworkassignmentThisa.docx
felicidaddinwoodie
 
57 bio infomark
57 bio infomark57 bio infomark
57 bio infomark
phdcao
 
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
dolleyj
 

Semelhante a Ecocyc database (20)

BITS: Overview of important biological databases beyond sequences
BITS: Overview of important biological databases beyond sequencesBITS: Overview of important biological databases beyond sequences
BITS: Overview of important biological databases beyond sequences
 
Protein database ..... of NCBI
Protein database ..... of NCBI Protein database ..... of NCBI
Protein database ..... of NCBI
 
Metabolic pathway mapping against KEGG, Reactome, HMDB and CPDB
Metabolic pathway mapping against KEGG, Reactome, HMDB and CPDBMetabolic pathway mapping against KEGG, Reactome, HMDB and CPDB
Metabolic pathway mapping against KEGG, Reactome, HMDB and CPDB
 
Project report: Investigating the effect of cellular objectives on genome-sca...
Project report: Investigating the effect of cellular objectives on genome-sca...Project report: Investigating the effect of cellular objectives on genome-sca...
Project report: Investigating the effect of cellular objectives on genome-sca...
 
Subliminal: exploiting semantic annotations in the reconstruction of metaboli...
Subliminal: exploiting semantic annotations in the reconstruction of metaboli...Subliminal: exploiting semantic annotations in the reconstruction of metaboli...
Subliminal: exploiting semantic annotations in the reconstruction of metaboli...
 
A Cell-Cycle Knowledge Integration Framework
A Cell-Cycle Knowledge Integration FrameworkA Cell-Cycle Knowledge Integration Framework
A Cell-Cycle Knowledge Integration Framework
 
Databases
DatabasesDatabases
Databases
 
Types of biological databases-protein database
Types of biological databases-protein databaseTypes of biological databases-protein database
Types of biological databases-protein database
 
Synthetic Biology
Synthetic BiologySynthetic Biology
Synthetic Biology
 
evolutionary game theory presentation
evolutionary game theory presentationevolutionary game theory presentation
evolutionary game theory presentation
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
 
1PhylogeneticAnalysisHomeworkassignmentThisa.docx
1PhylogeneticAnalysisHomeworkassignmentThisa.docx1PhylogeneticAnalysisHomeworkassignmentThisa.docx
1PhylogeneticAnalysisHomeworkassignmentThisa.docx
 
Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Mapping metabolites against pathway databases
Mapping metabolites against pathway databases
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
57 bio infomark
57 bio infomark57 bio infomark
57 bio infomark
 
Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Una estrategia para la integración de ontologías, servicios web y PLN en el a...Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Una estrategia para la integración de ontologías, servicios web y PLN en el a...
 
MoM2010: Bioinformatics
MoM2010: BioinformaticsMoM2010: Bioinformatics
MoM2010: Bioinformatics
 
Deliverable_5.1.2
Deliverable_5.1.2Deliverable_5.1.2
Deliverable_5.1.2
 
Metabolomics
MetabolomicsMetabolomics
Metabolomics
 
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
 

Último

biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
1301aanya
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.ppt
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptx
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptx
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 

Ecocyc database

  • 1. ECOCYC DATABASE Presented by: Shiv kumar M.Sc (Microbial Biotechnology)
  • 2. Introduction:  EcoCyc word comes from E.coli (Eco) and Encyclopaedia (cyc).  The encyclopedia of Escherichia coli K-12 genes and metabolism (EcoCyc) is a database that combines information about the genome and the intermediary metabolism of E.coli .  It describes the known genes of E.coli , the enzymes encoded by these genes, the reactions catalyzed by each enzyme and the organization of these reactions into metabolic pathways.  It is a freely accessible, comprehensive database that collects and summarizes experimental data for Escherichia coli K-12 at EcoCyc.org .  It has graphical user interface (GUI) which allows query, exploration and visualization of the EcoCyc database. EcoCyc spans the space from sequence to function to allow investigatation of an unusually broad range of questions .  It is Curation based database in which Curation is the process of manually refining and updating a bioinformatics database. The EcoCyc project uses a literature-based Curation approach in which database updates are based on evidence in the experimental literature.
  • 3. Cont…  The EcoCyc system consists of a knowledge base (KB) that describes the genes and intermediary metabolism of E. coli, and a graphical user interface for accessing that knowledge.  EcoCyc is joint work between our group at SRI International, and a group at the Marine Biological Laboratory (MBL) led by Monica Riley.  Taken together, the knowledge base and the graphical user interface (GUI) constitute an electronic encyclopedia that allows scientists to visualize and explore an integrated collection of genomic and biochemical information.  By integrating genomic data with comprehensive knowledge of the metabolic functions of gene products, we greatly increase the value of the genomic data and the types of analyses that can be applied to it.
  • 4. Why it is designed? It is designed for several different modes of interactive use via both the EcoCyc.org web site and in conjunction with the downloadable Pathway Tools software:  It is a Visualization tools in which genome browser, metabolic map display, and regulatory network diagram and aid in the comprehension of these complex data found.  EcoCyc facilitates analysis of high-throughput data such as gene- expression and metabolomics data via tools for enrichment analysis, and for visualizing omics data on a metabolic map diagram, complete genome diagram, or regulatory network diagram.  The EcoCyc metabolic flux model can predict growth or no-growth of wildtype and knock-out E. coli strains under different nutrient conditions.
  • 5. Problems that might be solved by using EcoCyc:  Genomic investigations Coupled with sequence databases EcoCyc could be used to perform function-based retrieval of DNA or protein sequences, such as to prepare data sets for studies of protein structure-function relationships. Microbial geneticists who work with bacteria other than E.coli will find EcoCyc useful as a point of reference for gene functions, as well as for similarities and differences in gene-product relationships.  Studies of the metabolism Scientists who study the evolution of the metabolism could use EcoCyc to search out examples of duplication and divergence of enzymes and pathways. Systematic computational studies of pathway evolution can compare related pathways from different organisms. EcoCyc provides a foundation for automatically generating simulations of the metabolism, although it lacks the kinetics data needed by most simulation techniques.  Pathway design for biotechnology Biotechnologists seek to design novel biochemical pathways that produce useful chemical products (such as pharmaceuticals) or that catabolize unwanted chemicals such as toxins. EcoCyc provides the wiring diagram of E.coli K-12, which approximates the starting point for engineering. EcoCyc also describes the potential engineering variations that can result from importing E.coli enzymes into other organisms.
  • 6.  The Ecocyc Graphical User Interface  The EcoCyc GUI provides graphical tools for visualizing and navigating through an integrated collection of metabolic and genomic information (its retrieval capabilities are described in Retrieval operations). For each type of biological object in the EcoCyc database the GUI provides a corresponding visualization tool.  These tools dynamically query the underlying database to produce display windows. Other displays are provided for genes, enzymes, compounds and for browsing genomic maps.  All the display algorithms are parameterized to allow the user to select the visual presentation of an object that is most informative.  For example, the algorithms that produce automatic layouts of metabolic pathways can suppress the display of enzyme names or side compound names. They can also draw chemical structures for the compounds within a pathway.
  • 7. Circular map browser for the E.coli chromosome.The bar at right shows a magnification of the region between 51 and 55 centisomes. Bars over gene names indicate counter clockwise transcription.
  • 8.  The Ecocyc Data  The EcoCyc data are stored within a frame knowledge representation system (FRS), which is similar to an object-oriented database.  FRSs use an object-oriented data model and have several advantages over relational database management systems . FRSs organize information within classes, collections of objects that share similar properties and attributes.  Each EcoCyc frame contains slots describing attributes or properties of the biological object that the frame represents or encoding a relationship between that object and other objects. For example, the slots of a polypeptide frame encode the molecular weight of the polypeptide, the gene that encodes it and its cellular location.
  • 9. 1. Genes  Most genes whose product is an enzyme to an EcoCyc object that represents the polypeptide and added a textual description of the product for genes whose product is not an EcoCyc object.  Data taken from EcoCyc v15.0:
  • 10. 2. Compounds  This discussion focuses on small metabolites, which are instances of the class Compounds.  The data on compounds involved in the intermediary metabolism of E.coli , but data on some other compounds are also present, for example metabolites from other organisms.  Among the properties encoded for compounds are synonyms, molecular weight, empirical formula, lists of bonds and atoms that encode chemical structures and two-dimensional display coordinates for each atom that permit the drawing of compound structures. EcoCyc contains 1830 compounds, of which 964 have recorded two-dimensional structures. 3. Reactions  The initial set of biochemical reactions in EcoCyc were derived from the ENZYME database, Because enzyme nomenclature concerns enzymes from all species, many of the reactions in the ENZYME database, and therefore in EcoCyc, do not actually occur in E.coli .  Reaction frames contain information such as lists of reactants and products for the reaction equation, the EC number of the reaction and AG 0 for the reaction in the direction it is written. Reaction objects are linked to the pathway(s) that contains them and to the enzyme(s) that catalyzes them. EcoCyc contains 2901 reactions organized into the 258 classes defined by the enzyme committee. Of these, 580 reactions are known to occur in E.coli . Only 14 of the reactions have no EC number.
  • 11.  Proteins  Template files organize information as frames (such as enzymes and pathways) with labeled slots (attributes). The template files also permit association of chosen literature citations with the appropriate data. In each frame there are multiple opportunities for liberal comment, to describe the metabolic functions and the unique complex properties of the reaction or the enzyme.  Among the topics covered by comments in EcoCyc are reaction mechanisms, subreactions of complex reactions, interactions of subunits of complex enzymes, formation of complexes with other proteins, breadth of substrate specificity, mode of action of inhibitors and activators, place and function of reactions in metabolic pathways, other reactions catalyzed by the protein and the relationship of the protein to other proteins catalyzing the same reaction.  Karp has developed a computer program that parses the template files to extract their constituent data items and then inserts those data items into the EcoCyc database. The parser program also performs consistency checks on the data and allows interactive correction of problems that are found. Consistency checkers can correct minor typographical errors and verify, for example, that the entry in a field that is supposed to contain a gene does in fact refer to a gene in the database. These tools have proven vital in detecting errors throughout the database.
  • 12. In the EcoCyc schema all enzyme objects are instances of the class Proteins, which is partitioned into two subclasses:  Protein Complexes and Polypeptides.  These two classes have a number of common properties, such as molecular weight, pI, cellular location and a relationship to one or more catalyzed reactions. They differ in that Protein Complexes have slots that link them to their subunits, whereas Polypeptides have a slot that identifies their gene.  We also record known sequence similarity relationships among a set of isozymes and we provide links to the SWISS-PROT and PDB entries for a polypeptide.  Proteins are listed as a subclass of chemical compounds, since in some cases enzymes themselves are substrates in a reaction (such as phosphorylation reactions).  The database contains 468 polypeptides and 238 protein complexes that comprise a total of 306 enzymes (i.e. 306 of the polypeptides and protein complexes have defined catalytic activities).
  • 13. Enzymatic reactions  We define a high fidelity representation as a formal conceptualization (i.e. a portion of a schema) that allows a database to accurately capture subtleties of biology. For example, a number of other metabolic databases do not explicitly distinguish enzymes from reactions nor polypeptides from protein complexes.  But the properties of a reaction (such as its Δ G 0 and its substrates) are independent of the enzyme(s) that catalyzes it and the properties of an enzyme (such as its molecular weight and amino acid sequence) are independent of the reactions it catalyzes.  The relationships between enzymes and reactions are many to many, since one enzyme can catalyze many reactions and one reaction can be catalyzed by more than one enzyme. This distinction has led to interesting and perhaps counterintuitive observations. EC numbers are actually a property of reactions, rather than of enzymes, i.e. there is a one to one correspondence between reactions and EC numbers, but not between enzymes and EC numbers. An enzyme that catalyzes two reactions will have two EC numbers and two enzymes that catalyze the same reaction have the same EC number.  A further distinction is required because some properties of an enzyme are meaningful only in the context of a particular reaction that the enzyme catalyzes. Properties such as activators, inhibitors and cofactors pertain to the pairing of an enzyme and a reaction, because a single enzyme that catalyzes two reactions may be sensitive to different inhibitors for each reaction and we wish to capture this complex relationship.
  • 14. Design Requirements The EcoCyc GUI was created to facilitate the latter types of consultation — to provide biologists with powerful and intuitive tools for exploring and comprehending the elements of a complex information space. The design of the EcoCyc GUI was mainly influenced by the scientific tasks for which the information will be used, and the properties of the information itself.  Molecular biologists and biotechnologists designing a cloning project will have instant access to the best current information on map locations, identities of linked genes and their functions, lists of available clones, and locations of pertinent restriction sites.  Molecular evolutionists could use such DBs to search out examples of duplication and divergence, and ancestral relationships among genes, enzymes, and pathways.  The metabolic pathways present in an organism can be predicted from the DNA sequence of the organism plus its nutritional requirements, given sufficient background knowledge of metabolism.  A scientist who has found sequence similarities between a gene of interest and one or more E. coli genes could use EcoCyc to obtain descriptions of the functions of the E. coli genes. EcoCyc contains more detailed descriptions of function than do the sequence DBs.
  • 15. Cont…  Biotechnologists seek to design novel biochemical pathways that produce useful chemical products (such as flavor enhancers in food, amino acids and vitamins, or pharmaceuticals), or that catabolize unwanted chemicals such as toxins. Metabolic DBs can provide information about enzymes from other organisms with novel substrate specificities, kinetics, or regulatory characteristics, that can modify a metabolic network.  Scientists who study the metabolism itself will be able to pose novel questions of broader scope than previously possible. Systematic computational studies of pathway evolution can compare many related pathways from different organisms. Those who employ numerical simulation techniques will benefit from collections of enzyme-kinetics data. •  The comprehensive characterizations of enzyme function in metabolic DBs will make possible systematic computational studies of protein structure–function relationships.
  • 16. SoftwareArchitecture The EcoCyc software architecture is shown in Figure . The major components of the system are  A frame knowledge representation system called HyperTHEO that manages the EcoCyc KB.  A graphical KB editor and browser called the GKB Editor.  The EcoCyc GUI. All communication between the EcoCyc GUI and HyperTHEO flows through a well-defined KB-access library called the Generic-Frame Protocol (Karp et al., 1995), making the EcoCyc GUI a modular component that is in principle separable from HyperTHEO.  A system for manipulating and displaying graphs, called Grasper-CL. Another reason for the quick development of EcoCyc is the fact that its pathway displays were implemented using the high-level graph display and layout capabilities within Grasper-CL.  CWEST is a tool for retrofitting CLIM applications to run through the WWW. CWEST dynamically translates CLIM graphics to a combination of HTML and GIF images for transmission via the WWW. CWEST allows the EcoCyc GUI to operate over the WWW in a manner that is very similar to its X-windows operation.
  • 17. Figure :The architecture of the EcoCyc system