1. Kazuharu Arakawa
Institute for Advanced Biosciences, Keio University
Graduate School of Media and Governance
Expertise: Bioinformatics, Systems Biology
2. G-language
Web Service Interface
Institute for Advanced Biosciences, Keio University
KAZUHARU ARAKAWA
NOBUHIRO KIDO
KAZUKI OSHITA
MASARU TOMITA
2009.06.27
3. G-language Project
• First release in 2001 (now at version 1.8.8)
• Perl library, interactive shell, 100+ applications, GUI
• Focus on analysis of bacterial genomes
• Compatible with BioPerl (10~50x faster for manipulating genome flatfiles)
• http://www.g-language.org/
References:
Arakawa et al. (2003) Bioinformatics
Arakawa et al. (2006) Journal of Pesticide Science
Arakawa et al. (2008) Genes Genomes Genomics
Arakawa et al. (2009) BMC Bioinformatics
6. Perl API: BioPerl vs G
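# BioPerl: read a local GenBank file and print the note of every CDS
use Bio::SeqIO;
$in = Bio::SeqIO->new(-file => "ecoli.gbk", -format => 'GenBank');
$seq = $in->next_seq();
foreach $feat ($seq->all_SeqFeatures()){
    next unless($feat->primary_tag eq 'CDS');
    next unless($feat->has_tag('note'));   # skip CDS features without a note
    print $feat->each_tag_value("note"), "\n";
}

# BioPerl: fetch the same genome from GenBank by accession
use Bio::DB::GenBank;
use Bio::Seq;
$gb = Bio::DB::GenBank->new;
$seq = $gb->get_Seq_by_acc("NC_000913");

# G-language: load the genome and print the note of every CDS
use G;
$gb = load ecoli; # or: $gb = load("genbank:NC_000913");
foreach $cds ($gb->cds()){
    say $gb->{$cds}->{note};
}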
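
The G-language snippet above reads annotations through nested hash lookups of the form $gb->{$cds}->{note}. As a rough illustration of that access pattern only, here is a minimal sketch using a plain Perl hash. It is not the G-language implementation: the %genome hash, the cds_ids helper, the feature IDs, and the notes are all made-up placeholders.

```perl
use strict;
use warnings;

# Made-up records standing in for a loaded genome (placeholders, not real
# annotation data). Each feature ID maps to a hash of its qualifiers.
my %genome = (
    FEATURE1 => { type => 'CDS',  note => 'placeholder note A' },
    FEATURE2 => { type => 'tRNA', note => 'placeholder note B' },
    FEATURE3 => { type => 'CDS',  note => 'placeholder note C' },
);

# Hypothetical stand-in for $gb->cds(): return the IDs of all CDS features,
# sorted so the output order is stable.
sub cds_ids {
    my ($g) = @_;
    return grep { $g->{$_}{type} eq 'CDS' } sort keys %$g;
}

# Same loop shape as the G example: iterate over CDS IDs, look up the note.
for my $id (cds_ids(\%genome)) {
    print "$id\t$genome{$id}{note}\n";
}
```

Keeping every feature as a hash keyed by its ID means a qualifier lookup is a single expression, which is what makes the G loop shorter than the BioPerl one.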
7. Interactive Shell
• fully functional Perl shell
• basic UNIX commands
• mix of the above (weird)
• print togoWS('NC_000908') |head -n 10 |wc > out.txt
• tab completion (files, functions), history, editing with
Emacs key bindings
• persistent data
• logging
• search for functions (like wossname in EMBOSS) and
read documentation (like tfm in EMBOSS), both for the
G-language API and BioPerl classes
• database search (NCBI, KEGG, UniProt ... and more)
• sequence and data retrieval
8. Web Service Interface - Overview
Development supported by BioHackathon 2009 in Okinawa, Japan
9. REST Interface http://rest.g-language.org
http://useG.jp
1. Accessing genome flatfile data
http://useG.jp/[species]/[gene]/[feature]
a. http://useG.jp/ecoli/ - Nucleotide composition of E. coli genome
b. http://useG.jp/ecoli/recA - Feature information about recA gene
c. http://useG.jp/ecoli/recA/start - Start position of recA gene
d. http://useG.jp/ecoli/*/translation - Amino acid sequence of all genes (FASTA)
2. Manipulating genome data
http://useG.jp/[species]/[gene]/[method]/[option=value]/...
a. http://useG.jp/method_list/gb - List all available methods
b. http://useG.jp/NC_000913/*/before_startcodon - Retrieve upstream sequence of all genes
3. Genome sequence analysis
http://useG.jp/[species]/[method]/[option=value]/...
a. http://useG.jp/method_list/ - List all available methods
b. http://useG.jp/mgen/gcskew/window=1000/ - GC skew of M. genitalium with 1000 bp windows
c. http://useG.jp/mgen/gcskew/cumulative=1/output=f/ - Get the raw GC skew result as CSV data
4. Other methods (not requiring genome sequence input)
http://useG.jp/[method]/[option=value]/...
a. http://useG.jp/togoWS/C00001 - Retrieve KEGG C00001 through togoWS
b. http://useG.jp/help/gcskew - Show manual for gcskew method
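
The four URL patterns above share one shape: path segments joined by "/", followed by optional option=value segments. A minimal Perl sketch of assembling such URLs is shown below. The useg_url helper is hypothetical and not part of G-language. The trailing slash after options simply mirrors the examples on this slide; whether the server requires it is an assumption. No URL escaping is done, so the "*" wildcard passes through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of G-language): build a useG.jp URL from
# path segments (species, gene, feature or method) and option=value pairs.
sub useg_url {
    my ($segments, $options) = @_;
    $options ||= {};
    # Options become extra path segments; keys are sorted for stable output.
    my @pairs = map { "$_=$options->{$_}" } sort keys %$options;
    my $url = 'http://useG.jp/' . join('/', @$segments, @pairs);
    # Mirror the slide's examples, which end with "/" when options are given.
    $url .= '/' if @pairs;
    return $url;
}

print useg_url(['ecoli', 'recA', 'start']), "\n";
print useg_url(['ecoli', '*', 'translation']), "\n";
print useg_url(['mgen', 'gcskew'], { window => 1000 }), "\n";
print useg_url(['mgen', 'gcskew'], { cumulative => 1, output => 'f' }), "\n";
```

The resulting strings can be passed to any HTTP client, for example curl on the command line or LWP::UserAgent in Perl.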
15. ISMB Posters
Poster U16
Command-line-based integration of online bioinformatics resources
Kazuki Oshita, Kazuharu Arakawa, Masaru Tomita
Poster U22
G-language Genome Analysis Environment Version 2: Integrated
workbench for computational genome sequence analysis
Kazuharu Arakawa, Masaru Tomita
Poster X023
Automatic layout tool for large-scale metabolic pathway models based
on KEGG Atlas and SBML/SBGN
Nobuhiro Kido, Nobuaki Kono, Kazuharu Arakawa, Masaru Tomita
Poster X031
Pathway Projector: Web-based Zoomable Pathway Browser using
KEGG Atlas and Google Maps API
Nobuaki Kono, Kazuharu Arakawa, Nobuhiro Kido, Ryu Ogawa, Kazuki
Oshita, Keita Ikegami, Satoshi Tamaki, Masaru Tomita
16. Acknowledgements
Nobuhiro Kido developed the REST/CGI server
Kazuki Oshita developed the SOAP server
BioHackathon 2009 sponsored by DBCLS and OIST of Japan
Yamagata Prefectural Government and Tsuruoka City.