Aligning Subunits of Internally Symmetric Proteins with CE-Symm
(Left) Fibroblast growth factor 1 [3JUT], colored to show internal symmetry. (Right) Dot plot 
showing equivalent residues within the protein. Red lines correspond to a 120° clockwise 
rotation of the protein around the 3-fold axis, and cyan to the 240° rotation. After 
duplicating the matrix, each alignment forms a sequential diagonal line which can be fully 
detected by CE. Gray shading indicates regions near the diagonal which are penalized by the 
scoring function. 
Screenshot of the CE-Symm interface, showing a two-fold axis of EPSP synthase [1G6S]. 
This work is licensed under a Creative Commons Attribution 3.0 Unported License. 
Background 
Proteins can have quaternary symmetry and/or internal symmetry 
Symmetry is widespread in proteins and can be observed at a number of levels, from 
crystal symmetry within complexes to pseudo-symmetry in individual chains and 
domains. Symmetry is known to play a role in protein evolution,1 allosteric regulation,2 
DNA binding,3 and cooperative enzyme effects.4 Symmetry has 
also been utilized to understand protein folding5 and to aid the computational design 
of large proteins.6 
Quaternary symmetry consists of multiple identical polypeptide chains arranged in 
a symmetric fashion. Such symmetry is extremely common in proteins, occurring in 
approximately 80% of structures in the Protein Data Bank (PDB). Detecting 
quaternary symmetry relies on accurate assignment of the correct biological assembly 
for each protein. The PDB now annotates protein structures with their quaternary symmetry (Peter Rose et al., in 
preparation). 
Proteins can also have internal or ternary symmetry, when a single chain contains two or more equivalent 
subunits. The subunits generally differ in exact sequence but have substantially similar structures. Internal 
symmetry is sometimes styled as pseudosymmetry to reflect that the equivalence between subunits is generally at 
the level of residues or secondary structure elements rather than atoms or electron density, as is common with 
quaternary symmetry. 
Internal symmetry can arise from quaternary symmetry by gene duplication or fusion. Thus, in addition to the 
many functional implications of symmetry, identifying protein symmetry can provide information about the 
evolutionary history of a protein. Such fission and fusion events often preserve the overall structure and 
function of the active complex. 
Existing methods for finding internal symmetry 
Several computational methods are available to detect symmetry. Some methods search for periodic sequences or 
structure (e.g. DAVROS7). These are generally limited in their ability to handle large insertions. Methods based on 
structural alignment algorithms (SymD,8 GANGSTA+9) can tolerate large insertions, but produce pairwise 
alignments between adjacent symmetric subunits rather than a global alignment of all subunits. This leads to 
ambiguous alignments, where a single residue could be aligned to several residues in each other subunit, depending 
on the order in which rotation operations are performed. 
Conclusion 
CE-Symm was run over a large hand-curated benchmark, and is able to 
detect symmetric proteins with a high degree of accuracy, even in the 
presence of large insertions. The resulting alignment includes exactly 
one residue from each subunit, as expected for a multiple alignment. It 
runs quickly and is able to detect symmetry broadly across a variety of 
folds. 
The refinement stage can also be used as an independent tool in 
conjunction with seed alignments from other tools. This allows the 
circularly permuted alignments from tools such as SymD8 to be refined 
into multiple alignments between individual subunits. 
Because symmetry is hypothesized to derive from gene duplications and 
fusions,12 aligning subunits within symmetric proteins can reveal ancient 
homologies and conserved sequences. CE-Symm is useful both for 
identifying symmetric proteins and for aligning the subunits for further 
study. 
Availability: 
CE-Symm source code is available under the LGPL license from https://github.com/rcsb/symmetry 
An online server is available at http://source.rcsb.org/jfatcatserver/symmetry.jsp 
Spencer Bliven 
Bioinformatics and Systems Biology Program 
University of California San Diego 
Douglas Myers-Turnbull 
Dept. of Computer Science & Engineering 
University of California San Diego 
Philip Bourne 
Skaggs School of Pharmacy and Pharmaceutical Sciences 
University of California San Diego 
Andreas Prlić 
San Diego Supercomputer Center 
University of California San Diego 
(Left) Beta-carbonic anhydrase from Porphyridium purpureum [1I6O] is a tetramer with D2 
quaternary symmetry. (Right) The beta-carbonic anhydrase in E. coli [1DDZ] consists of 
only two chains, which each have internal C2 symmetry in addition to the C2 quaternary 
symmetry. The two halves of the chain have 68% sequence identity, strongly indicating 
that a duplication and fusion event has occurred in the evolution of the E. coli enzyme. 
D5 quaternary symmetry of GTP 
cyclohydrolase I [1A8R]. The 
main 5-fold axis is shown in red; 
the five 2-fold axes are in blue. 
Methods 
The CE-Symm program is able to detect internal symmetry in proteins. It first identifies structurally similar 
regions within the protein structure. It then refines this alignment to improve the correspondence between 
subunits. 
1. Identify structurally similar regions 
The CE-Symm algorithm starts by 
identifying a non-trivial structural 
alignment between a protein and itself 
using Combinatorial Extension10 (CE). 
This uses the dynamic programming and 
progressive refinement of CE, but with 
two modifications. 
1. A strong penalty term is added to self-aligned 
residues to prevent the trivial 0° 
rotation from dominating. 
2. The alignment matrix is duplicated in the 
manner of Uliel et al.11 to account for the 
circular permutation which is introduced 
when comparing a symmetric protein 
against a rotated copy of itself. 
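The two modifications above can be illustrated with a minimal Python sketch. It assumes a precomputed n × n residue similarity matrix `sim`, which is a simplification of how CE scores alignments. The names `duplicated_self_matrix`, `min_offset`, and `penalty` are hypothetical and not taken from the CE-Symm source. The matrix is doubled along one axis so that a circularly permuted alignment can appear as one continuous diagonal, and cells close to the trivial diagonal are overwritten with a penalty.

```python
def duplicated_self_matrix(sim, min_offset=1, penalty=-1000.0):
    """Return an n x 2n matrix for aligning a chain against itself.

    sim        -- n x n list of lists; sim[i][j] is an assumed similarity
                  score between residues i and j of the same chain
    min_offset -- cells closer than this to the trivial diagonal are
                  penalized (assumed parameter)
    penalty    -- score written into the penalized cells (assumed value)
    """
    n = len(sim)
    doubled = []
    for i in range(n):
        row = []
        for j in range(2 * n):
            # Columns j and j + n refer to the same residue.
            d = abs(i - j % n)
            # Circular distance, so both copies of the diagonal are covered.
            circular = min(d, n - d)
            if circular < min_offset:
                row.append(penalty)
            else:
                row.append(float(sim[i][j % n]))
        doubled.append(row)
    return doubled
```

With `min_offset=1` only exact self-matches are penalized; a larger value widens the penalized band, which corresponds to the gray shading in the dot plot.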
2. Refinement to ensure transitivity 
The structural alignment from the first step is then refined to produce a residue-level equivalence map between 
subunits. Refinement produces a consistent multiple alignment between all identified subunits. 
The order, k, of rotational symmetry present in the protein (if any) is determined by successively applying the seed 
alignment until the original orientation is found. 
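One way to picture the order determination is to follow each residue through repeated applications of the seed alignment and count how many steps it takes to return near its starting position. The sketch below is an assumption about how this could be done on residue indices alone; the poster does not specify the actual procedure, and the names `estimate_order`, `max_order`, and `tolerance` are hypothetical.

```python
from collections import Counter

def estimate_order(f, max_order=8, tolerance=0):
    """Estimate the symmetry order k from a seed alignment.

    f         -- dict mapping residue index i to its aligned residue f(i)
    max_order -- largest order to test (assumed limit)
    tolerance -- how far f^k(i) may be from i and still count as a return
    """
    votes = Counter()
    for start in f:
        pos = start
        for k in range(1, max_order + 1):
            pos = f.get(pos)
            if pos is None:          # the chain of aligned residues ended
                break
            if abs(pos - start) <= tolerance:
                votes[k] += 1        # this residue returned after k steps
                break
    # The most frequent return length is taken as the order; 1 means none found.
    return votes.most_common(1)[0][0] if votes else 1
```

For a perfect 3-fold alignment of nine residues, `estimate_order({i: (i + 3) % 9 for i in range(9)})` returns 3.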
Let f be a function over all residues in the protein, such that $f(i) = j$ when residue i is aligned to residue j. The goal is to modify f such 
that k applications of f (i.e. k rotations of the protein) give the trivial alignment. Formally, $\forall i:\ f^k(i) = i$. To constrain the 
modifications, we introduce a penalty function $\sigma(i)$ which goes to zero when the previous condition is met. Two 
such penalty functions were considered: 
1. $\sigma(i) = |f^k(i) - i|$. This measures the number of insertions or deletions which would need to be 
made in order to bring residue i into alignment. 
2. $\sigma(i) = |d(f^{k-1}(i), f^k(i)) - d(i, f^{k-1}(i))|$, where $d(i, j)$ gives the distance between the alpha carbons of residues i and j. 
This minimizes the changes in RMSD required during refinement. 
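The two penalty functions can be written out directly. The following is a minimal sketch, assuming the alignment is stored as a dict `f` from residue index to aligned residue index and that `coords` maps residue indices to alpha-carbon coordinates; the helper names `iterate`, `sigma_index`, and `sigma_distance` are illustrative only.

```python
import math

def iterate(f, i, times):
    """Apply the alignment map f to residue i `times` times; None if undefined."""
    for _ in range(times):
        if i is None:
            return None
        i = f.get(i)
    return i

def sigma_index(f, i, k):
    """Penalty 1: |f^k(i) - i|, the index shift left after k rotations."""
    end = iterate(f, i, k)
    return None if end is None else abs(end - i)

def sigma_distance(f, coords, i, k):
    """Penalty 2: |d(f^(k-1)(i), f^k(i)) - d(i, f^(k-1)(i))|."""
    prev = iterate(f, i, k - 1)
    end = iterate(f, i, k)
    if prev is None or end is None:
        return None
    return abs(math.dist(coords[prev], coords[end])
               - math.dist(coords[i], coords[prev]))
```

Both functions return 0 when $f^k(i) = i$ and return `None` when the chain of aligned residues is broken before k steps.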
The algorithm works by choosing the residue with minimal score and modifying the alignment such that $f^k(i) = i$. To 
ensure that the alignment remains sequential and well-formed, the selection of the residue to modify is limited by the 
following “eligibility criteria.” 
1. $f^{k-1}(i)$ is defined ($f^k(i)$ may be undefined) 
2. $\sigma(i) > 0$ 
3. $\sigma(f^{k-1}(i)) > 0$ 
4. $\forall j$ s.t. $\sigma(j) = 0$: $\operatorname{sign}(f^{k-1}(i) - j) = \operatorname{sign}(i - f(j))$ 
Eligible residues are chosen in order of increasing score, and the alignment is modified so that residue $f^{k-1}(i)$ 
is aligned to i, i.e. $f(f^{k-1}(i)) \leftarrow i$. This process is repeated until no eligible residues remain, at which 
point the remaining residues are removed from the alignment. 
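The four eligibility criteria can be transcribed almost literally into a predicate. This is a sketch under stated assumptions: the alignment is a dict `f`, the index-based penalty is used, and an undefined penalty is treated as "not yet resolved" (that is, not equal to zero). The function names are hypothetical.

```python
def iterate(f, i, times):
    """Apply the alignment map f to residue i `times` times; None if undefined."""
    for _ in range(times):
        if i is None:
            return None
        i = f.get(i)
    return i

def sigma(f, i, k):
    """Index-based penalty |f^k(i) - i|; None when f^k(i) is undefined."""
    end = iterate(f, i, k)
    return None if end is None else abs(end - i)

def sign(x):
    return (x > 0) - (x < 0)

def is_eligible(f, i, k):
    """Check the four eligibility criteria for residue i."""
    prev = iterate(f, i, k - 1)
    if prev is None:                  # 1. f^(k-1)(i) must be defined
        return False
    if sigma(f, i, k) == 0:           # 2. residue i is not yet resolved
        return False
    if sigma(f, prev, k) == 0:        # 3. f^(k-1)(i) is not yet resolved
        return False
    # 4. The new pair (f^(k-1)(i) -> i) must keep the same order as every
    #    resolved pair (j -> f(j)).
    resolved = [j for j in f if sigma(f, j, k) == 0]
    return all(sign(prev - j) == sign(i - f[j]) for j in resolved)
```

How criterion 4 should treat pairs that wrap across the circular permutation point is not stated here; this literal version compares raw residue indices.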
This algorithm terminates in a multiple alignment between the symmetric subunits with exactly one residue per 
subunit in each aligned column. The process can also be interleaved with structure-based refinement to iteratively 
improve the alignment RMSD while preserving the multiple alignment property. 
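Once every remaining residue satisfies $f^k(i) = i$, the multiple alignment can be read off by following each residue around its cycle. The sketch below assumes the refined alignment is a dict `f` and uses the hypothetical name `alignment_columns`; it also checks the one-residue-per-subunit property.

```python
def alignment_columns(f, k):
    """Group residues of a refined alignment into columns of k residues.

    f -- dict with f^k(i) = i for every key (refined alignment)
    k -- symmetry order
    Returns a sorted list of columns, each a sorted list of k residues.
    """
    columns = []
    seen = set()
    for start in sorted(f):
        if start in seen:
            continue
        orbit = [start]
        pos = f[start]
        # Follow the cycle until it closes, breaks, or grows past k residues.
        while pos != start and pos in f and len(orbit) < k:
            orbit.append(pos)
            pos = f[pos]
        if pos != start or len(orbit) != k:
            raise ValueError(f"residue {start} is not in a closed cycle of length {k}")
        seen.update(orbit)
        columns.append(sorted(orbit))
    return sorted(columns)
```

Transposing the result with `zip(*columns)` gives one row of residues per subunit.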
Results 
Symmetry detection 
Table: number of superfamilies in each SCOP class and the percentage with internal symmetry, as detected by CE-Symm. 
Refinement 
Trypanosoma sialidase [SCOP domain d2agsa2], a six-bladed 
beta propeller. The alignment shown corresponds to a 120° 
rotation, permuting the structure by two blades. 
Superposition of the structure with itself (a) prior to 
refinement, and (b) after one iteration of refinement. A 
number of extraneous loops not shared by all blades are 
marked as unaligned by the refinement procedure. 
(c) Multiple alignment of the three two-blade subunits 
considered here. 
(c) 
SSRVE---LFKRKNSTVPFEESNGTIRERVVH---SFRIPT-IVNVD----GVMVAIADARYETSFDNSFIETAVKYSVDDGA 
GKPVS---LKP--LFPAEFDGI------LTKE---FIGGVGAAIVASN---GNLVYPVQIADMG----GRVFTKIMYSEDDGN 
WVEALGTLSHV--WTN------------SPTSNQQDCQSS--FVAVTIEGKRVMLFTHPLNLKGRW--MRDRLHLWMTD--NQ 
TWNTQIAIKNSRASSVSRVMDATVIVKGNKLYILVGSFNKTRNSWTQHRDGSDWEPLLVVGE-----VTKSAANGKTTATISW 
TWKFAEGRSKF------GCSEPAVLEWEGKLIINNRVD--------------GNRRLVYESS-----DMGKT----------- 
RIFDVGQISIGDE----NSGYSSVLYKDDKLYSLHEINTND-----------VYSLVFVRLIGELQLM--------------- 
Poster first presented at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (2013). 
The RCSB PDB is supported by the National Science Foundation [NSF DBI 0829586]; National Institute of General Medical Sciences; Office 
of Science, Department of Energy; National Library of Medicine; National Cancer Institute; National Institute of Neurological Disorders 
and Stroke; and the National Institute of Diabetes & Digestive & Kidney Diseases. The RCSB PDB is a member of the wwPDB. 
SCOP class      Number of superfamilies    % symmetric 
α               503                        17.4% 
β               354                        17.5% 
α/β             244                        17.6% 
α+β             549                        12.5% 
multi-domain    66                         3.0% 
membrane        108                        22.0% 
All classes     1,832                      16.0% 
ROC curves showing the performance of CE-Symm for detecting symmetry, on a benchmark of 1000 randomly 
selected and manually annotated SCOP superfamilies. Two scoring functions were considered for classification 
power: TM-Score,13 and an alternate score incorporating the detection of symmetry order. The TM-Score 
classifier has an AUC of 0.94. 
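For readers unfamiliar with the AUC statistic, it can be computed from per-structure scores and manual annotations as the probability that a symmetric case scores higher than an asymmetric one. The sketch below uses the rank-based (Mann-Whitney) formulation; the function name `roc_auc` is hypothetical and any scores passed to it in the usage note are made-up toy values, not benchmark data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores -- classifier scores (e.g. TM-scores), higher means more symmetric
    labels -- truthy for annotated-symmetric cases, falsy otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With toy inputs `roc_auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])`, three of the four positive/negative pairs are ordered correctly, giving 0.75.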
Abstract 
The CE-Symm algorithm has been developed to detect internal symmetry within protein chains. Symmetry is 
common across protein fold space and is tied to a number of important biological functions. Using CE-Symm we 
find that 16% of SCOP superfamilies contain internal symmetry. 
The algorithm can produce unambiguous multiple alignments between symmetric subunits. It can also be applied 
to the output of other symmetry detection algorithms to refine alignments and identify conserved regions between 
all subunits. 
References 
1. Lee, J. & Blaber, M. PNAS 108, 126–130 (2011). 
2. Monod, J. et al. J Mol Biol 12, 88–118 (1965). 
3. Juo, Z. S. et al. J Mol Biol 261, 239–254 (1996). 
4. Goodsell, D. S. & Olson, A. J. Annu Rev Biophys Biomol Struct 29, 105–153 (2000). 
5. Gosavi, S. et al. J Mol Biol 357, 986–996 (2006). 
6. Fortenberry, C. et al. J Am Chem Soc 133, 18026–18029 (2011). 
7. Murray, K. B. et al. J Mol Biol 316, 341–363 (2002). 
8. Kim, C. et al. BMC Bioinformatics 11, 303 (2010). 
9. Guerler, A. et al. J Chem Inf Model 49, 2147–2151 (2009). 
10. Shindyalov, I. N. & Bourne, P. E. Protein Eng 11, 739–747 (1998). 
11. Uliel, S. et al. Bioinformatics 15, 930–936 (1999). 
12. Abraham, A.-L. et al. J Mol Biol 394, 522–534 (2009). 
13. Zhang, Y. & Skolnick, J. Proteins 57, 702–710 (2004). 

 
Introduction,importance and scope of horticulture.pptx
Introduction,importance and scope of horticulture.pptxIntroduction,importance and scope of horticulture.pptx
Introduction,importance and scope of horticulture.pptx
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
STS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATION
STS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATIONSTS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATION
STS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATION
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 

Aligning Subunits of Internally Symmetric Proteins with CE-Symm

(Left) Fibroblast growth factor 1 [3JUT], colored to show internal symmetry. (Right) Dot plot showing equivalent residues within the protein. Red lines correspond to a 120° clockwise rotation of the protein around the 3-fold axis, and cyan lines to the 240° rotation. After duplicating the matrix, each alignment forms a sequential diagonal line which can be fully detected by CE. Gray shading indicates regions near the diagonal which are penalized by the scoring function.

Screenshot of the CE-Symm interface, showing a two-fold axis of EPSP synthase [1G6S].

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Background

Proteins can have quaternary symmetry and/or internal symmetry

Symmetry is widespread in proteins and can be observed at a number of levels, from crystal symmetry within complexes to pseudo-symmetry in individual chains and domains. Symmetry is known to play a role in protein evolution,¹ allosteric regulation,² DNA binding,³ and cooperative enzyme effects.⁴ Symmetry has also been utilized to understand protein folding⁵ and to aid the computational design of large proteins.⁶

Quaternary symmetry consists of multiple identical polypeptide chains arranged in a symmetric fashion. Such symmetry is extremely common in proteins, occurring in approximately 80% of structures in the Protein Data Bank (PDB). Detecting quaternary symmetry relies on accurate assignment of the correct biological assembly for each protein. The PDB now annotates protein structures with their quaternary symmetry (Peter Rose et al., in preparation).

Proteins can also have internal or tertiary symmetry, when a single chain contains two or more equivalent subunits. The subunits generally differ in their exact sequence, but have substantially similar structures.
Internal symmetry is sometimes styled as pseudosymmetry to reflect that the equivalence between subunits is generally at the level of residues or secondary structure elements rather than atoms or electron density, as is common with quaternary symmetry.

Internal symmetry can arise from quaternary symmetry by gene duplication or fusion. Thus, in addition to the many functional implications of symmetry, identifying protein symmetry can provide information about the evolutionary history of a protein. Such fission and fusion events often preserve the overall structure and function of the active complex.

Existing methods for finding internal symmetry

Several computational methods are available to detect symmetry. Some methods search for periodic sequences or structure (e.g. DAVROS⁷). These are generally limited in their ability to handle large insertions. Methods based on structural alignment algorithms (SymD,⁸ GANGSTA+⁹) can tolerate large insertions, but produce pairwise alignments between adjacent symmetric subunits rather than a global alignment of all subunits. This leads to ambiguous alignments, where a single residue could be aligned to several residues in each other subunit, depending on the order in which rotation operations are performed.

Conclusion

CE-Symm was run over a large hand-curated benchmark, and is able to detect symmetric proteins with a high degree of accuracy, even in the presence of large insertions. Each aligned position in the resulting alignment contains exactly one residue from each subunit, as expected for a multiple alignment. It runs quickly and is able to detect symmetry broadly across a variety of folds. The refinement stage can also be used as an independent tool in conjunction with seed alignments from other tools. This allows the circularly permuted alignments from tools such as SymD⁸ to be refined into multiple alignments between individual subunits.
Because symmetry is hypothesized to derive from gene duplications and fusions,¹² aligning subunits within symmetric proteins can reveal ancient homologies and conserved sequences. CE-Symm is useful both for identifying symmetric proteins and for aligning the subunits for further study.

Availability: CE-Symm source code is available under the LGPL license from https://github.com/rcsb/symmetry
An online server is available at http://source.rcsb.org/jfatcatserver/symmetry.jsp

Spencer Bliven, Bioinformatics and Systems Biology Program, University of California San Diego
Douglas Myers-Turnbull, Dept. of Computer Science & Engineering, University of California San Diego
Philip Bourne, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
Andreas Prlić, San Diego Supercomputer Center, University of California San Diego

(Left) Beta-carbonic anhydrase from Porphyridium purpureum [1I6O] is a tetramer with D2 quaternary symmetry. (Right) The beta-carbonic anhydrase in E. coli [1DDZ] consists of only two chains, each of which has internal C2 symmetry in addition to the C2 quaternary symmetry. The two halves of the chain have 68% sequence identity, strongly indicating that a duplication and fusion event has occurred in the evolution of E. coli.

D5 quaternary symmetry of GTP cyclohydrolase I [1A8R]. The main 5-fold axis is shown in red; the five 2-fold axes are in blue.

Methods

The CE-Symm program is able to detect internal symmetry in proteins. It first identifies structurally similar regions within the protein structure. It then refines this alignment to improve the correspondence between subunits.

1. Identify structurally similar regions

The CE-Symm algorithm starts by identifying a non-trivial structural alignment between a protein and itself using Combinatorial Extension¹⁰ (CE). This uses the dynamic programming and progressive refinement of CE, but with two modifications:
  1. A strong penalty term is added to self-aligned residues to prevent the trivial 0° rotation from dominating.
  2. The alignment matrix is duplicated in the manner of Uliel et al.¹¹ to account for the circular permutation which is introduced when comparing a symmetric protein against a rotated copy of itself.

2. Refinement to ensure transitivity

The structural alignment from the first step is then refined to produce a residue-level equivalence map between subunits. Refinement produces a consistent multiple alignment between all identified subunits.

The order, $k$, of rotational symmetry present in the protein (if any) is determined by successively applying the seed alignment until the original orientation is found. Let $f$ be a function over all residues in the protein, such that $f(i) = j$ when residue $i$ is aligned to residue $j$. The goal is to modify $f$ such that $k$ applications of $f$ (i.e. rotations of the protein) give a trivial alignment. Formally, $\forall i:\ f^k(i) = i$.

To constrain the modifications, we introduce a penalty function $\sigma(i)$ which goes to zero when the previous condition is met. Two such penalty functions were considered:

  1. $\sigma(i) = |f^k(i) - i|$. This measures the number of insertions or deletions which would need to be made in order to bring residue $i$ into alignment.
  2. $\sigma(i) = |d(f^{k-1}(i), f^k(i)) - d(i, f^{k-1}(i))|$, where $d(i, j)$ gives the distance between the alpha carbons of residues $i$ and $j$. This minimizes the changes in RMSD required during refinement.

The algorithm works by choosing the residue with minimal score and modifying the alignment such that $f^k(i) = i$. To ensure that the alignment remains sequential and well-formed, the selection of the residue to modify is limited by the following "eligibility criteria":

  1. $f^{k-1}(i)$ is defined ($f^k(i)$ may be undefined)
  2. $\sigma(i) > 0$
  3. $\sigma(f^{k-1}(i)) > 0$
  4. $\forall j$ s.t. $\sigma(j) = 0$: $\operatorname{sign}(f^{k-1}(i) - j) = \operatorname{sign}(i - f(j))$

Eligible residues are chosen in order of increasing score, and the alignment is modified to set $f^{k-1}(i) \leftarrow i$ (that is, residue $f^{k-1}(i)$ is realigned to $i$, which gives $f^k(i) = i$).
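The definitions above can be illustrated with a small Python sketch. It is not the CE-Symm implementation: the dictionary representation of $f$, the function names `apply_k`, `sigma_offset` and `estimate_order`, the `max_order` cutoff, and the strict "all penalties are zero" rule for picking $k$ are all assumptions made for this toy example. It covers only the first penalty function, $\sigma(i) = |f^k(i) - i|$, and does not implement the eligibility criteria.

```python
from typing import Dict, Optional

# Hypothetical representation: residue index i -> aligned residue f(i).
Alignment = Dict[int, int]


def apply_k(f: Alignment, i: int, k: int) -> Optional[int]:
    """Return f^k(i), or None if any step of the chain is undefined."""
    current = i
    for _ in range(k):
        if current not in f:
            return None
        current = f[current]
    return current


def sigma_offset(f: Alignment, i: int, k: int) -> Optional[int]:
    """First penalty from the text: sigma(i) = |f^k(i) - i| (None if undefined)."""
    j = apply_k(f, i, k)
    return None if j is None else abs(j - i)


def estimate_order(f: Alignment, max_order: int = 8) -> Optional[int]:
    """Smallest k >= 2 for which every defined f^k(i) equals i.

    The strict all-zero test is an assumption that only suits a clean toy input.
    """
    for k in range(2, max_order + 1):
        penalties = [sigma_offset(f, i, k) for i in f]
        defined = [p for p in penalties if p is not None]
        if defined and all(p == 0 for p in defined):
            return k
    return None


# Toy example: 12 residues forming three 4-residue subunits with perfect 3-fold symmetry.
seed = {i: (i + 4) % 12 for i in range(12)}
order = estimate_order(seed)

# Imperfect seed alignment: residue 3 is shifted by one and residue 4 is left unaligned.
noisy = dict(seed)
noisy[3] = 8
del noisy[4]

# Residues whose penalty is defined and non-zero for k = 3.
unresolved = {i: sigma_offset(noisy, i, 3) for i in noisy if sigma_offset(noisy, i, 3)}
```

In the toy input, three applications of `seed` return every residue to itself, so the sketch reports order 3. In `noisy`, the residues whose chain of alignments passes through the shifted pair keep a non-zero penalty, and these are the kind of residues the refinement would have to fix or drop. A real seed alignment is rarely exact, so the order test would need some tolerance rather than the strict rule assumed here.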
The refinement step is repeated until no eligible residues remain, at which point the remaining residues that do not satisfy $f^k(i) = i$ are removed from the alignment. The algorithm terminates in a multiple alignment between the symmetric subunits with exactly one residue per subunit in each aligned column. The process can also be interleaved with structure-based refinement to iteratively improve the alignment RMSD while preserving the multiple alignment property.

Results

Symmetry detection

Percentage of SCOP superfamilies with internal symmetry, as detected by CE-Symm:

SCOP class      Number of superfamilies   % symmetric
α               503                       17.4%
β               354                       17.5%
α/β             244                       17.6%
α+β             549                       12.5%
multi-domain    66                        3.0%
membrane        108                       22.0%
All classes     1,832                     16.0%

ROC curves showing the performance of CE-Symm for detecting symmetry, on a benchmark of 1000 randomly selected and manually annotated SCOP superfamilies. Two scoring functions were considered for classification power: TM-Score,¹³ and an alternate score incorporating the detection of symmetry order. The TM-Score classifier has an AUC of 0.94.

Refinement

Trypanosoma sialidase [SCOP domain d2agsa2], a six-bladed beta propeller. The alignment shown corresponds to a 120° rotation, permuting the structure by two blades. Superposition of the structure with itself (a) prior to refinement, and (b) after one iteration of refinement. A number of extraneous loops not shared by all blades are marked as unaligned by the refinement procedure. (c) Multiple alignment of the three two-blade subunits considered here.

(c)
SSRVE---LFKRKNSTVPFEESNGTIRERVVH---SFRIPT-IVNVD----GVMVAIADARYETSFDNSFIETAVKYSVDDGA
GKPVS---LKP--LFPAEFDGI------LTKE---FIGGVGAAIVASN---GNLVYPVQIADMG----GRVFTKIMYSEDDGN
WVEALGTLSHV--WTN------------SPTSNQQDCQSS--FVAVTIEGKRVMLFTHPLNLKGRW--MRDRLHLWMTD--NQ

TWNTQIAIKNSRASSVSRVMDATVIVKGNKLYILVGSFNKTRNSWTQHRDGSDWEPLLVVGE-----VTKSAANGKTTATISW
TWKFAEGRSKF------GCSEPAVLEWEGKLIINNRVD--------------GNRRLVYESS-----DMGKT-----------
RIFDVGQISIGDE----NSGYSSVLYKDDKLYSLHEINTND-----------VYSLVFVRLIGELQLM---------------

Abstract

The CE-Symm algorithm has been developed to detect internal symmetry within protein chains. Symmetry is common across protein fold space and is tied to a number of important biological functions. Using CE-Symm we find that 16% of SCOP superfamilies contain internal symmetry. The algorithm can produce unambiguous multiple alignments between symmetric subunits. It can also be applied to the output of other symmetry detection algorithms to refine alignments and identify conserved regions between all subunits.

References

1. Lee, J. & Blaber, M. PNAS 108, 126–130 (2011).
2. Monod, J. et al. J Mol Biol 12, 88–118 (1965).
3. Juo, Z. S. et al. J Mol Biol 261, 239–254 (1996).
4. Goodsell, D. S. & Olson, A. J. Annu Rev Biophys Biomol Struct 29, 105–153 (2000).
5. Gosavi, S. et al. J Mol Biol 357, 986–996 (2006).
6. Fortenberry, C. et al. J Am Chem Soc 133, 18026–18029 (2011).
7. Murray, K. B. et al. J Mol Biol 316, 341–363 (2002).
8. Kim, C. et al. BMC Bioinformatics 11, 303 (2010).
9. Guerler, A. et al. J Chem Inf Model 49, 2147–2151 (2009).
10. Shindyalov, I. N. & Bourne, P. E. Protein Eng 11, 739–747 (1998).
11. Uliel, S. et al. Bioinformatics 15, 930–936 (1999).
12. Abraham, A.-L. et al. J Mol Biol 394, 522–534 (2009).
13. Zhang, Y. & Skolnick, J. Proteins 57, 702–710 (2004).

The RCSB PDB is supported by the National Science Foundation [NSF DBI 0829586]; National Institute of General Medical Sciences; Office of Science, Department of Energy; National Library of Medicine; National Cancer Institute; National Institute of Neurological Disorders and Stroke; and the National Institute of Diabetes & Digestive & Kidney Diseases. The RCSB PDB is a member of the wwPDB.

Poster first presented at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (2013).