
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes

Precise elucidation of the many different biological features encoded in a genome requires a careful curation process that involves reviewing all available evidence to allow researchers to resolve discrepancies and validate automated gene models, protein alignments, and other biological elements. Genome annotation is an inherently collaborative task; researchers only rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families.
The i5K initiative seeks to sequence the genomes of 5,000 insect and related arthropod species. The selected species include those important to worldwide agriculture, food safety, medicine, and energy production; those used as models in biology; those most abundant in world ecosystems; and representatives of every branch of the insect phylogeny, chosen in an effort to better understand arthropod evolution and phylogeny. Because computational genome analysis remains an imperfect art, each newly sequenced genome will require visualization and curation.
Apollo is an instantaneous, collaborative genome annotation editor. The new JavaScript-based version gives researchers real-time interactivity and breaks large amounts of data into manageable portions, mobilizing groups of researchers with shared interests. The i5K is a broad and inclusive effort that seeks to involve scientists from around the world in its genome curation process, and Apollo serves as the platform to empower this community. Here we offer details about this collaboration.


  1. APOLLO + i5K
     Collaborative Curation and Interactive Analysis of Genomes
     Monica Munoz-Torres, PhD | @monimunozto
     Nathan Dunn, Monica Poelchau, Ian Holmes, Colin Diesh, Deepak Unni, Christine Elsik, and Suzanna Lewis.
     Berkeley Bioinformatics Open-Source Projects (BBOP)
     Genomics Division, Lawrence Berkeley National Laboratory
     XXIII Plant and Animal Genome Conference. San Diego, CA. January 14, 2015
  2. OUTLINE
     • CURATING GENOMES: steps involved
     • MANUAL ANNOTATION: is necessary, but does not always scale
     • WEB APOLLO: empowering curators
     • i5K: pursuing common goals
  3. CURATING GENOMES
     steps involved
     1. Creation of gene models: calling ORFs, one or more rounds of gene prediction, etc.
     2. Annotation of gene models: describing function, expression patterns, and metabolic network memberships.
     3. Manual annotation
  4. AUTOMATED ANNOTATION
     remains an imperfect art
     Unlike the more highly polished genomes of earlier projects, today's genomes have:
     a. lower coverage.
     b. more frequent assembly errors and annotation of genes across multiple scaffolds.
     c. automated annotations that must be curated to resolve discrepancies, providing clarity and validation.
     Image: www.BroadInstitute.org
  5. ACCURACY OF ANNOTATION
     … it depends
     EXAMPLE
     • Eight methods for differential alternative splicing detection in plants, using RNAseq.
     • Conclusion: NO single method performs the best in all situations.
     “The accuracy of annotation has a major impact on which method should be chosen for analysis.”
     Liu et al. BMC Bioinformatics 2014, 15:364
  6. MANUAL ANNOTATION
     objectives
     1. Identifies elements that best represent the underlying biology (including missing genes) and eliminates elements that reflect systemic errors of automated analyses.
     2. Assigns function through comparative analysis of similar genome elements from closely related species using literature, databases, and researchers’ lab data.
     http://GeneOntology.org
  7. BUT, MANUAL CURATION
     does not always scale
     Too many sequences and not enough hands to approach curation.
     1. Museum: a small group of highly trained experts; e.g. GO.
     2. Jamboree: a few very good biologists and a few very good bioinformaticians camp together, during intense but short periods of time.
     3. Cottage: researchers work by themselves, then may or may not publicize results; may be a dead-end with very few people ever aware of these results.
     Elsik et al. 2006. Genome Res. 16(11):1329-33.
  8. POWER TO THE CURATORS
     augment existing tools
     Fill in the gap for all the things that won’t be easy to cover with these approaches and allow researchers to better contribute their efforts.
     Give more people the power to curate!
     • Enable more curators to work
     • Enable better scientific publishing
     • Credit curators for their work
     “Big data are not a substitute for, but a supplement to traditional data collection and analysis.”
     The Parable of Google Flu. Lazer et al. 2014. Science 343 (6176): 1203-1205.
  9. GENOME ANNOTATION
     an inherently collaborative task
     Researchers often turn to colleagues for second opinions and insight from those with expertise in particular areas (e.g., domains, families). To facilitate and encourage this, we continue to improve Apollo.
     The new JavaScript-based Apollo:
     • Web based for easy access.
     • Concurrent access supports real-time collaboration.
     • Built-in support for standards (transparently compliant).
     • Automatic generation of ready-made computable data.
     • Client-side application relieves server bottleneck and supports privacy.
     • Supports annotation of genes, pseudogenes, tRNAs, snRNAs, snoRNAs, ncRNAs, miRNAs, TEs, and repeats.
  10. COLLABORATIONS
      also crowdsourcing development
      • New avenues for landing on Apollo and customization of additional applications.
      • Web services for alignment and functional annotation tools.
      • RNAseq datasets being used to re-annotate the bovine genome, finding genes that neither RefSeq nor Ensembl predicted. Also creating a track of disagreement between sets.
      • Bovine genome consortium making previous iterations of manual annotation efforts (from 3 assemblies ago) available for integration of curated models.
      University of Missouri | National Agricultural Library
  11. i5K
      5,000 insects and related Arthropod species
      • Species are selected in an effort to better understand arthropod evolution and phylogeny through:
          • worldwide agriculture
          • food safety
          • medicine
          • energy production
          • models in biology
          • those species most abundant in world ecosystems
          • every branch of the insect phylogeny
      • Each new genome requires visualization and curation!
      http://arthropodgenomes.org/wiki/i5K
  12. i5K
      who can join?
      • All Arthropods are welcome!
      • Pilot project: 39 species
          • 3 with completed manual annotation
          • 25 undergoing manual annotation
      • We offer a platform for collaborative genome analysis.
      • We do not offer funding for sequencing projects.
      Pictured: Wasmannia auropunctata, Phlebotomus papatasi
      http://arthropodgenomes.org/wiki/i5K
  13. i5K
      current workflow: pilot project
      Workflow diagram (elements listed in the order they were extracted from the slide):
      • Sequencing, assembly, & annotation
      • Research Plan
      • Select genes of interest
      • Calling all collaborators
      • Manual Annotation
      • Merge automated & manual annotations
      • Set time frame; Training; Q&A
      • Update gene set for computational analysis
      • Gatekeeping; More curation
      • Phase labels: Collaborative, Computational
      • Publication
  14. i5K
      tools at workspace@NAL
      • Web Apollo
          • Registration module
          • Differential user permissions
      • Django BLAST
          • Queries multiple species at once
          • Links directly to Apollo
      • Species pages & Gene pages
          • project details, metrics, statistics
      • Widget to track all WA annotations
      Tripal, Chado, JBrowse, Apollo
  15. i5K
      what we have learned
      • Enabling collaboration has been very useful to communities
      • Data hosting and administration at NAL facilitates process for many groups
      • You must enforce strict rules and formats
      • Metadata capture is a must; standards must be generated and enforced
      • Users prefer small bits of help info at a time, instead of lengthy manuals
      • The ideal assembly is of high quality and remains stable
      • Investing time and effort on a high quality set of automated gene predictions will pay off
      • Quality of manually annotated set will depend on the coordinator’s “whip”
  16. i5K
      how to join
      • Visit http://arthropodgenomes.org/wiki/i5K to sign up
      • Contact us! Please tell us about your research interests and comment on the status and quality of sequencing / assembly / automated annotation for your genome of interest.
        @monimunozto | mcmunozt @ lbl.gov
      • Check out the i5K Workspace@NAL at https://i5k.nal.usda.gov/
  17. FUTURE PLANS
      educational tools
      We are working with educators to make Web Apollo part of their curricula.
      Settings: Lecture Series. In the classroom. At the lab.
      • Classroom exercises: from genome sequence to hypothesis.
      • Curation group dedicated to producing education materials for non-model organism communities.
      • Our team provides online documentation, hands-on training, and rapid response to users.
  18. ALL ARE WELCOME
      call or email to join the Apollo community
      Open Call for Developers on the First Thursday of each month at 9:00AM (Pacific Time). Message @monimunozto for details.
      Join the conversation by submitting your email at https://lists.lbl.gov/sympa/subscribe/apollo
      http://GenomeArchitect.org
      http://ArthropodGenomes.org/wiki/i5K
  19. Thank you.
      • Berkeley Bioinformatics Open-source Projects (BBOP), Berkeley Lab: Web Apollo and Gene Ontology teams. Suzanna E. Lewis (PI).
      • § Christine G. Elsik (PI). University of Missouri.
      • * Ian Holmes (PI). University of California Berkeley.
      • Arthropod genomics community: i5K Steering Committee (esp. Sue Brown (Kansas State)), Alexie Papanicolaou (CSIRO), Monica Poelchau, Christopher Childers (USDA/NAL), fringy Richards, Dan Hughes, Kim Worley (HGSC-BCM), BGI, Oliver Niehuis (1KITE http://www.1kite.org/), and the Honey Bee Genome Sequencing Consortium.
      • Web Apollo is supported by NIH grants 5R01GM080203 from NIGMS, and 5R01HG004483 from NHGRI, and by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
      • Insect images used with permission: http://AlexanderWild.com
      • For your attention, thank you!
      Teams:
      • Web Apollo: Nathan Dunn, Colin Diesh §, Deepak Unni §
      • Gene Ontology: Chris Mungall, Seth Carbon, Heiko Dietze
      • NAL at USDA: Monica Poelchau, Christopher Childers, NAL team
      • HGSC at BCM: fringy Richards, Dan Hughes, Kim Worley
      Links:
      • Web Apollo: http://GenomeArchitect.org
      • i5K: http://arthropodgenomes.org/wiki/i5K
      • GO: http://GeneOntology.org
  20. Web Apollo Q-ratore
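
The workflow slide (slide 13) lists a "Merge automated & manual annotations" step but does not say how the merge is performed. The following is only a minimal sketch of one plausible rule, assuming that both gene sets share a gene identifier and that a manually curated model simply replaces the automated model with the same identifier. The `GeneModel` class, its fields, and `merge_gene_sets` are hypothetical names introduced for illustration; they are not part of Apollo or the i5K Workspace@NAL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneModel:
    """Hypothetical, simplified gene model used only for this illustration."""
    gene_id: str   # assumed identifier shared by the automated and manual sets
    scaffold: str
    start: int
    end: int
    source: str    # "automated" or "manual"


def merge_gene_sets(automated, manual):
    """Merge two gene sets; a manual model overrides an automated one with the same gene_id.

    This precedence rule is an assumption made for the sketch, not the
    documented behaviour of any i5K pipeline.
    """
    merged = {gene.gene_id: gene for gene in automated}
    for gene in manual:
        merged[gene.gene_id] = gene  # manual curation takes precedence
    # Sort by position so the merged set reads in genome order.
    return sorted(merged.values(), key=lambda g: (g.scaffold, g.start))


automated = [
    GeneModel("g1", "scaffold1", 100, 500, "automated"),
    GeneModel("g2", "scaffold1", 900, 1500, "automated"),
]
manual = [
    GeneModel("g2", "scaffold1", 880, 1500, "manual"),   # corrected start
    GeneModel("g3", "scaffold1", 2000, 2600, "manual"),  # gene missed by prediction
]

for gene in merge_gene_sets(automated, manual):
    print(gene.gene_id, gene.start, gene.end, gene.source)
```

In this toy input, the manual set both corrects an existing model (`g2`) and adds a gene missed by automated prediction (`g3`), which are the two objectives of manual annotation named on slide 6. A real merge would also need to handle models without matching identifiers, for example by coordinate overlap.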
