Kazuharu Arakawa
Institute for Advanced Biosciences, Keio University
Graduate School of Media and Governance
Expertise: Bioinformatics, Systems Biology

G-language Web Service Interface
Institute for Advanced Biosciences, Keio University
KAZUHARU ARAKAWA
NOBUHIRO KIDO
KAZUKI OSHITA
MASARU TOMITA
2009.06.27

G-language Project
•   First release in 2001 (now 1.8.8)
•   Perl library, interactive shell, 100+ applications, GUI
•   Focus on analysis of bacterial genomes
•   Compatible with BioPerl (10-50x faster for manipulating
    genome flatfiles)
•   http://www.g-language.org/
•   References:
    Arakawa et al. (2003) Bioinformatics
    Arakawa et al. (2006) Journal of Pesticide Science
    Arakawa et al. (2008) Genes Genomes Genomics
    Arakawa et al. (2009) BMC Bioinformatics

100+ applications
BAS_engine                  _eri_reader                          blastcutting         graphical_LTR_search       redundancy_fasta
BAS_parser                  _eri_update_with_kegg                blastparser          icdi                       redundancy_sim4
BAS_scripter                _fasta                               bui                  leading_strand             rep_ori_ter
CHI_engine                  _file_list_for_mapping               cai                  load_kegg_api              rmpolya
CHI_parser                  _find_bad_substance                  calc_pI              load_kegg_api3             rscu
CHI_scripter                _find_pathway_gap                    cap3_parse           load_rcluster              run_glimmerM
COMGA_correlation           _foreach_blastpointer_for_mapping    cbi                  longest_ORF                sdb_load
COMGA_engine                _foreach_mask_repeat_for_mapping     cds_echo             ma_filter                  sdb_save
COMGA_parser                _formatdb                            cei                  ma_normalize               seq2png
COMGA_scripter              _formatdb_for_mapping                cluster              ma_rfilter                 seqinfo
COMGA_table_maker           _gblaster                            codon_compiler       mapping_blast2             set_cogpath
DONT_USE_ERRO               _h2v                                 codon_counter        mapping_sim4               set_goa
GEMS_engine                 _hmmpfam                             codon_usage          markov                     set_gpac
GEMS_parser                 _jstat_for_STeP                      cognitor             maskseq                    set_operon
GEMS_scripter               _jstat_for_mapping                   complement           match_test                 shannon_cu
KeySearch                   _key_printer                         consensus_z          molecular_weight           sim4_parse
PubMedSearch                _list_clusterer                      cum_gcskew           msg_ask_interface          splitprintseq
RNAfold                     _list_sorter                         diffseq              msg_error                  ss2er
STeP_engine                 _makegaplist                         dignitor             msg_gimv                   stderr
STeP_parser                 _mask_repeat_for_mapping             ecell                msg_interface              stdin
STeP_scripter               _oligomer_translation                eliminate_atg        msg_percent                stdout
_R_RNA_graph                _over_lapping_printer                eliminate_pat        msg_progress               substance_layout
_R_base_graph               _post_blast_clusterer                enc                  msg_send                   substance_layout2
_STS_divider_for_STeP       _print_tandem                        enzyme_layout        msg_set_gimv               test_gpac
_STS_modifer_for_STeP       _repeatmasker                        equitability         msg_system_console         translate
_UniMultiGrapher            _sdb_path                            er2eri               msg_term_console           usage_dist
_UniUniGrapher              _set_sdb_path                        fasta_parse          oligomer_counter           valid_CDS
_acc2ftp_bacteria           _sim4                                file_maker            opt_as_gb                  view_cds
_base_printer               _sts2pg_for_STeP                     file_maker_fasta      opt_default                w_value
_blast                      _translate                           find_dnaAbox          opt_get
_blast_db_for_mapping       _trf                                 find_identical_gene   opt_val
_blast_for_mapping          _value_printer                       find_king_of_gene     ori_search
_blast_tp_finder             aa_codon_compiler                    find_ori_ter          output_maker
_blastpointer_for_mapping   aa_codon_usage                       find_seq              over_lapping_finder
_cap3                       aaui                                 find_tandem           palindrome
_clustalw                   alignment                            fop                  pasteseq
_codon_amino_printer        amino_counter                        foreach_RNAfold      peptide_mass
_codon_table                amino_info                           foreach_tandem       phx
_codon_usage_printer        annotate_with_glimmerM               form_sim4            plasmid_map
_codon_usage_table          atcgcon                              funcD                print_gene_function_list
_complement                 base_counter                         gcskew               pseudo_atg
_csv_h2v                    base_entropy                         gcwin                qstat
_cutquery_for_mapping       base_individual_information_matrix   genome_map           qsub
_distance_cu                base_information_content             genome_map2          query_strand
_ePCR_for_STeP              base_relative_entropy                genomicskew          read_goa
_ecell_name2kegg_compound   base_z_value                         gopac                redundancy
_eri_extracter              blast_parse                          gpac                 redundancy_cap3
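Perl API: BioPerl vs G
# BioPerl: read a local GenBank file and print the note of every CDS
use Bio::SeqIO;

$in = Bio::SeqIO->new(-file => "ecoli.gbk", -format => 'GenBank');
$seq = $in->next_seq();

foreach $feat ($seq->all_SeqFeatures()){
   next unless($feat->primary_tag eq 'CDS');
   next unless($feat->has_tag('note'));   # get_tag_values throws if the tag is absent
   print $feat->get_tag_values('note'), "\n";
}

# BioPerl: fetch the same genome from GenBank by accession
use Bio::DB::GenBank;
use Bio::Seq;

$gb = Bio::DB::GenBank->new;
$seq = $gb->get_Seq_by_acc("NC_000913");

# G-language: load the genome and print the note of every CDS
use G;

$gb = load ecoli; # $gb = load("genbank:NC_000913");

foreach $cds ($gb->cds()){
    say $gb->{$cds}->{note};
}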
Interactive Shell
• fully functional Perl shell
• basic UNIX commands
• mix of the above (weird)
    • print togoWS('NC_000908') |head -n 10 |wc > out.txt
• tab completion (files, functions), history, editing with
    Emacs key bindings
•   persistent data
•   logging
•   search for functions (like wossname in EMBOSS) and
    reading documentation (like tfm in EMBOSS), both for the
    G-language API and BioPerl classes
•   database search (NCBI, KEGG, UniProt ... and more)
•   sequence and data retrieval
Web Service Interface - Overview
Development supported by BioHackathon 2009 in Okinawa, Japan
REST Interface
    http://rest.g-language.org
    http://useG.jp
1. Accessing genome flatfile data
    http://useG.jp/[species]/[gene]/[feature]
a. http://useG.jp/ecoli/                  - Nucleotide composition of E. coli genome
b. http://useG.jp/ecoli/recA              - Feature information about recA gene
c. http://useG.jp/ecoli/recA/start        - Start position of recA gene
d. http://useG.jp/ecoli/*/translation     - Amino acid sequence of all genes (FASTA)

2. Manipulating genome data
    http://useG.jp/[species]/[gene]/[method]/[option=value]/...
a. http://useG.jp/method_list/gb                  - List all available methods
b. http://useG.jp/NC_000913/*/before_startcodon   - Retrieve upstream sequence of all genes

3. Genome sequence analysis
    http://useG.jp/[species]/[method]/[option=value]/...
a. http://useG.jp/method_list/                         - List all available methods
b. http://useG.jp/mgen/gcskew/window=1000/             - GC skew of M. genitalium with 1000bp windows
c. http://useG.jp/mgen/gcskew/cumulative=1/output=f/   - Get the raw GC skew result as CSV data

4. Other methods (not requiring genome sequence input)
    http://useG.jp/[method]/[option=value]/...
a. http://useG.jp/togoWS/C00001        - Retrieve KEGG C00001 through togoWS
b. http://useG.jp/help/gcskew          - Show manual for gcskew method
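
The four URL patterns above share one shape: path segments (species, gene, feature or method) followed by optional option=value pairs. As a rough illustration only, the Python sketch below assembles such URLs as strings. The helper name rest_url is hypothetical and not part of G-language, and no request is sent, so nothing here says anything about how the server responds. Note that some slide examples end in a trailing slash, which this sketch does not add.

```python
from urllib.parse import quote

# Short alias for the REST service, as given on the slide.
BASE_URL = "http://useG.jp"


def rest_url(*segments, **options):
    """Build a URL of the form BASE_URL/segment/.../option=value/...

    `rest_url` is a hypothetical helper for illustration; it only
    concatenates strings and never contacts the service.
    """
    # Keep '*' literal because the slides use it as the "all genes" wildcard.
    parts = [quote(str(s), safe="*") for s in segments]
    # Options become extra path segments of the form name=value.
    parts += [f"{quote(str(k))}={quote(str(v))}" for k, v in options.items()]
    return BASE_URL + "/" + "/".join(parts)


if __name__ == "__main__":
    print(rest_url("ecoli", "recA", "start"))         # pattern 1
    print(rest_url("ecoli", "*", "translation"))      # pattern 1 with wildcard
    print(rest_url("mgen", "gcskew", window=1000))    # pattern 3 with an option
    print(rest_url("togoWS", "C00001"))               # pattern 4
```

Keyword arguments keep their call order, so rest_url("mgen", "gcskew", cumulative=1, output="f") lists cumulative before output, as in example 3c.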
AJAX/CGI Interface http://ws.g-language.org/atelier/
Bio::Glite - lightweight version using the REST service
• 32kb in size
• only requires LWP::UserAgent
• easy install via "cpan Bio::Glite"
SOAP Service
    http://soap.g-language.org/g-language.wsdl
Works with Taverna2
9 example workflows are already available at
ISMB Posters
Poster U16
Command-line-based integration of online bioinformatics resources
Kazuki Oshita, Kazuharu Arakawa, Masaru Tomita

Poster U22
G-language Genome Analysis Environment Version 2: Integrated
workbench for computational genome sequence analysis
Kazuharu Arakawa, Masaru Tomita

Poster X023
Automatic layout tool for large-scale metabolic pathway models based
on KEGG Atlas and SBML/SBGN
Nobuhiro Kido, Nobuaki Kono, Kazuharu Arakawa, Masaru Tomita

Poster X031
Pathway Projector: Web-based Zoomable Pathway Browser using
KEGG Atlas and Google Maps API
Nobuaki Kono, Kazuharu Arakawa, Nobuhiro Kido, Ryu Ogawa, Kazuki
Oshita, Keita Ikegami, Satoshi Tamaki, Masaru Tomita
Acknowledgements
Nobuhiro Kido developed the REST/CGI server
Kazuki Oshita developed the SOAP server
BioHackathon 2009 sponsored by DBCLS and OIST of Japan
Yamagata Prefectural Government and Tsuruoka City.

Arakawa_Glanguage_BOSC2009
