2. Overview
  About me
  Software projects
  What is TOPSAN?
  TOPSAN and the semantic web
3. About me
  Work
    Currently work at the Sanford-Burnham Medical Research Institute
    Previously worked at the Genomics Institute of the Novartis Research Foundation (GNF)
    PhD in Prof. Sean Eddy’s lab at Washington University Medical School
  Research interests
    Regulatory pathway evolution
    Comparative functional genomics
    Sequence function analysis and prediction
    Phylogenetic inference
    Genome evolution
    Protein structure evolution
    Evo-devo
  Computer
    Java
    Ruby (BioRuby)
    Learning Scala (http://www.scala-lang.org/) and re-learning C++
    Graphics, Visualization, Linked Data
  Website: http://www.cmzmasek.net/
4. Software projects
  forester
    Open source libraries of Java and Ruby software for comparative genomics and evolutionary biology research
    http://www.phylosoft.org/forester/
  phyloXML
    XML for evolutionary biology and comparative genomics
    http://www.phyloxml.org/
6. What is TOPSAN?
  TOPSAN: The Open Protein Structure Annotation Network
    http://www.topsan.org/
  Tens of thousands of protein structures have been determined by structural genomics (SG) centers, and many more are expected
  While these structures are available in the PDB (Protein Data Bank), annotations for most of them are limited to one-line PDB titles
  TOPSAN is the first database that specifically focuses on providing extensive annotations for almost 10,000 structures solved by the SG centers
7. What is TOPSAN?
  TOPSAN contains collaborative annotations of protein structures
  Combines automated with human-edited elements
  Spans the range of
    analysis of single proteins
    characterization of protein families
    reconstruction of entire genomes
  Created by SG staff and over 400 external users
  Contains over 7,250 proteins
  Collaborating with PFAM to use JCSG structures to refine and create PFAM families
8. “Semantic” TOPSAN
  Using the principles of the semantic web to turn TOPSAN into a database that can be:
    Searched
    Exported
  Thus, by linking curated and automatic annotations to the semantic web, TOPSAN content can be connected to a larger set of analysis tools and databases
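As a rough illustration of what "searched" and "exported" could mean for semantic annotations, the sketch below stores annotations as subject/predicate/object triples, filters them by a search term, and writes them out as N-Triples lines. It is not TOPSAN's actual data model or API: the `example.org` URIs, the function names, the protein identifier, and the annotation fields are all made up for illustration.

```python
from typing import NamedTuple

# Hypothetical namespaces; these URIs are placeholders, not real TOPSAN URIs.
BASE = "http://example.org/topsan/protein/"
VOCAB = "http://example.org/topsan/vocab#"


class Triple(NamedTuple):
    subject: str    # URI of the annotated protein
    predicate: str  # URI of the annotation field
    obj: str        # literal value of the annotation


def annotate(protein_id, fields):
    """Turn a dict of annotation fields into a list of triples."""
    return [Triple(BASE + protein_id, VOCAB + key, value)
            for key, value in fields.items()]


def search(triples, term):
    """Keep triples whose literal contains the term (case-insensitive)."""
    needle = term.lower()
    return [t for t in triples if needle in t.obj.lower()]


def export_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    lines = []
    for t in triples:
        # Escape backslashes, quotes, and newlines inside the literal.
        literal = (t.obj.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        lines.append(f'<{t.subject}> <{t.predicate}> "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example record, for illustration only.
    triples = annotate("EXAMPLE1", {
        "title": "Hypothetical protein of unknown function",
        "source": "human-edited",
    })
    print(export_ntriples(search(triples, "unknown")))
```

Keeping each annotation as a separate triple is what lets other tools consume only the statements they understand, which is the kind of linking to external tools and databases the slide describes.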
9. Acknowledgements
  Kyle Ellrott
  Dana Weekes
  Constantina Bakolitsa
  John Wooley
  Adam Godzik
  Joint Center for Structural Genomics
  Sanford-Burnham Medical Research Institute, La Jolla, California, USA
  University of California, San Diego, La Jolla, California, USA
  Joint Center for Molecular Modeling
10. Thank You!!
  Biohackathon organizers
  JST – Japanese funding agency for Science and Technology
  NBDC – National Bioscience Database Center
  DBCLS – Database Center for Life Science