Exploring proteins, chemicals and their interactions with STRING and STITCH
Michael Kuhn
(talk and practical session)
interactions of proteins
     and chemicals
example



Tryptophan synthase beta chain
         E. coli K12
example



  aspirin
Homo sapiens
STRING: version 8.3 (soon: version 9)
interactions of proteins

STITCH: version 2
interactions of proteins and chemicals
content




STRING 8
630 genomes



only completely sequenced genomes

    STRING 9: >1100 genomes
2.5 million genes (not proteins)
74,000 chemicals (including 2200 drugs)
many sources of
 interactions
genomic context methods
gene neighborhood
gene fusion
phylogenetic profiles
curated knowledge
experimental evidence




co-expression




GEO: Gene Expression Omnibus
experimental databases
literature
variable quality




different “raw scores”
benchmarking



calibrate against “gold standard”
             (KEGG)
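
One way to picture the calibration step is to bin the raw scores of an evidence type and, for each bin, measure the fraction of predicted pairs that the gold standard confirms. The Python sketch below is only an illustration under stated assumptions: the function name `calibrate`, the fixed-width binning and the toy data are hypothetical and are not taken from the STRING or STITCH pipelines.

```python
from collections import defaultdict

def calibrate(raw_scores, in_gold_standard, bin_width=1.0):
    """Return, per raw-score bin, the fraction of pairs confirmed by the gold standard.

    Hypothetical helper: fixed-width bins are an assumption made for illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for score, confirmed in zip(raw_scores, in_gold_standard):
        bin_index = int(score // bin_width)
        totals[bin_index] += 1
        if confirmed:
            hits[bin_index] += 1
    # The confirmed fraction in each bin serves as the calibrated probability.
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Toy data (invented): raw scores and whether each pair shares a gold-standard pathway.
raw = [0.2, 0.7, 1.1, 1.5, 1.9, 2.3, 2.4, 2.8]
gold = [False, False, False, True, True, True, False, True]
table = calibrate(raw, gold)
```

A raw score can then be looked up in `table` to obtain a probability that is comparable across evidence types, which is what makes the later combination of scores meaningful.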
probabilistic scores




e.g. “70% chance for an association”
combine all evidence
Bayesian scoring scheme
e.g.: two scores of 0.7
combined probability: 0.91



      $1 - (1 - 0.7)^{2} = 0.91$
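
The arithmetic on this slide generalizes to any number of evidence channels: the combined score is one minus the probability that every channel is wrong. The short Python sketch below assumes the channels are independent, as the slide's formula does; the function name `combine_scores` is hypothetical, and any further corrections the databases may apply are not covered here.

```python
from math import prod

def combine_scores(scores):
    """Combine independent probabilistic scores as 1 - prod(1 - s_i)."""
    return 1.0 - prod(1.0 - s for s in scores)

# The slide's example: two pieces of evidence, each with a score of 0.7.
combined = combine_scores([0.7, 0.7])
print(round(combined, 2))  # 0.91
```

Under this scheme adding evidence never lowers the combined score, and a single score passes through unchanged.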
evidence spread
over many species
evidence transfer
transfer by orthology




  (or “fuzzy orthology”)
von Mering et al., Nucleic Acids Research, 2005
two modes
proteins mode
maximum specificity, lower coverage

information will be relevant for selected species
COG mode




“clusters of orthologous groups”
higher coverage, lower specificity

includes all available evidence

some orthologous groups are too large to be meaningful
STRING plans

• next big release (9.0):
 • coming end of 2010 / early 2011
 • more genomes
 • allow users to add more data to the network
STITCH plans

• next minor release (2.1):
 • add ChEMBLdb
• next big release (3.0):
 • “zoom” into stereo-isomers, salt forms
Acknowledgements
STRING
Christian von Mering
Lars Juhl Jensen
Manuel Stark
Samuel Chaffron
Chris Creevey
Jean Muller
Tobias Doerks
Philippe Julien
Alexander Roth
Milan Simonovic
Peer Bork

STITCH
Damian Szklarczyk
Andrea Franceschini
Monica Campillos
Christian von Mering
Lars Juhl Jensen
Andreas Beyer
Peer Bork

string-db.org
Jensen et al., NAR Database Issue 2009




      stitch-db.org
Kuhn et al., NAR Database Issue 2010

STRING/STITCH tutorial

Editor's Notes

  1. next: neighborhood view
  2. next: text mining
  3. next: actions aspirin -- PTGS1
  4. four different areas of sources
  5. especially for prokaryotes
  6. In this mode, STRING predicts interaction partners for one protein in a specific species. This allows for maximum specificity but has slightly lower coverage. Why? Because in protein mode STRING does not know the exact orthologs in other species; instead, it estimates orthology through sequence similarity searches. In short, interaction information is transferred between species based on a 'degree of orthology', a measure of how confident STRING is that two proteins are orthologs. The measure is derived from all-against-all similarity searches and takes into account putative paralogs in both species: the fewer paralogs there are, the more confident STRING is about orthology.
  7. In this mode, STRING predicts interaction partners for a group of orthologous proteins. This generally gives higher coverage but may result in slightly lower specificity. Again, the reason lies in how STRING derives orthologs. In COG mode, information about orthology is taken from the database 'Clusters of Orthologous Groups' (Tatusov & Koonin, NCBI). There, orthology is an 'all-or-nothing' decision, and all proteins considered orthologous are grouped into a single entity. A prediction made for one protein therefore applies to all proteins in the group, which is why STRING shows its predictions at the level of the groups. Coverage is higher because the groups are partly based on manual curation and contain orthology assignments that are difficult to derive through an automated procedure. Specificity is lower, however, because some groups are (for technical reasons) relatively 'inclusive', i.e. they contain a large number of proteins that cannot be resolved further. For example, almost all serine/threonine kinases are grouped into one COG, which makes predictions for a specific subset impossible. Nevertheless, COGs are very powerful and are the first choice for proteins that do not show much lineage-specific expansion, especially in prokaryotes.