BLAZING FAST PERFORMANCE
  with Spring Cache, Lucene, GPars, RDF and Grails
WHAT’S A TRIPLE?

 subject
predicate
 object
WHAT’S A TRIPLE?

 subject         “Imatinib”
predicate          type
 object     Pharmaceutical Drug
LINKING OPEN DRUG DATA
    http://esw.w3.org/HCLSIG/LODD
WHAT’S A TRIPLE?
http://www4.wiwiss.fu-berlin.de/drugbank/page/drugs/DB00619


    subject            predicate             object

    DB00619                label           “Imatinib”

    DB00619                type           drugbank:drugs

    DB00619            brandName            “Gleevec”

    DB00619          drugbank:target        targets/17
WHAT’S A TRIPLE?
http://www4.wiwiss.fu-berlin.de/drugbank/page/targets/17


  subject            predicate                object

  DB00619           drugbank:target          targets/17

  targets/17             type            drugbank:targets

  targets/17             label           “Proto-oncogene tyrosine-protein kinase ABL1”


  targets/17          geneName                “ABL1”
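
The two tables above can be read as a small graph: the object of one triple (targets/17) is the subject of another. A minimal plain-Java sketch of that idea follows. The Triple and TripleStore classes and their method names are hypothetical, chosen only for illustration; they are not part of any RDF library, and a real store would index triples rather than scan a list.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory model of a triple; not an RDF library type.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

class TripleStore {
    private final List<Triple> triples;

    TripleStore(List<Triple> triples) {
        this.triples = List.copyOf(triples);
    }

    // Returns the first object stored for the given subject and predicate, if any.
    Optional<String> objectOf(String subject, String predicate) {
        return triples.stream()
                .filter(t -> t.subject.equals(subject) && t.predicate.equals(predicate))
                .map(t -> t.object)
                .findFirst();
    }

    // Sample data copied from the two tables above.
    static TripleStore sample() {
        return new TripleStore(List.of(
                new Triple("DB00619", "label", "Imatinib"),
                new Triple("DB00619", "brandName", "Gleevec"),
                new Triple("DB00619", "drugbank:target", "targets/17"),
                new Triple("targets/17", "geneName", "ABL1")));
    }

    // Follows drug -> target -> geneName, i.e. two hops through the graph.
    static Optional<String> geneTargetedBy(TripleStore store, String drug) {
        return store.objectOf(drug, "drugbank:target")
                .flatMap(target -> store.objectOf(target, "geneName"));
    }
}
```

Calling TripleStore.geneTargetedBy(TripleStore.sample(), "DB00619") walks from the drug to its target and then to the target's gene name.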
RELATIONSHIPS

[Diagram: Imatinib (compound), ABL1 (gene) and Leukemia (disease) as nodes, connected by the predicates target, associatedGene and possibleDrug. Legend: Gene, Disease, Compound, Predicate.]
PERFORMANCE



     Type-ahead search
select id, label
from targets
where label like '%${queryValue}%'

SELECT ?uri ?label WHERE {
  ?uri rdfs:label ?label .
  ?uri rdf:type drugbank:targets .
  FILTER regex(?label, '\\Q${queryValue}\\E', 'i')
}

  40 million records
Index the data (up-front performance hit)

Search (really really really really fast)

BuildConfig.groovy:
     runtime 'org.apache.lucene:lucene-core:3.0.1'

  -OR-

grails install-plugin searchable
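
The trade-off on this slide (pay once at index time, then answer each keystroke with a lookup instead of a scan) can be sketched with nothing but the JDK. This is a toy word-prefix index and not the Lucene API; the TypeAheadIndex class and its methods are hypothetical names for illustration. Note that it matches word prefixes, which is narrower than the substring match of like '%...%'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: a toy prefix index, not the Lucene API.
class TypeAheadIndex {
    // token (lower-cased word) -> ids of the records whose label contains it
    private final NavigableMap<String, Set<String>> postings = new TreeMap<>();

    // Up-front cost: every label is tokenized once, at index time.
    void add(String id, String label) {
        for (String token : label.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // Query time: a sorted-map range lookup instead of a scan over all labels.
    List<String> search(String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        if (p.isEmpty()) {
            return List.of();
        }
        Set<String> ids = new TreeSet<>();
        // Every token that starts with p sorts inside the range [p, p + '\uffff').
        for (Set<String> hit : postings.subMap(p, true, p + Character.MAX_VALUE, false).values()) {
            ids.addAll(hit);
        }
        return new ArrayList<>(ids);
    }
}
```

After adding ("targets/17", "Proto-oncogene tyrosine-protein kinase ABL1"), a search for "kin" or "abl" finds targets/17 without touching any other label.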
DE-DUPE
Lucene Extension

Index the (RDF) data

Search (really really really really fast)
Understands Triples (or structured data)

        subject            predicate               object

        DB00619                label            “Imatinib”

SirenTupleQuery tupleQuery = new SirenTupleQuery()
tupleQuery.add(createCellQuery('label',
  SirenTupleConstraintPosition.PREDICATE), SirenTupleClause.Occur.MUST)
tupleQuery.add(createCellQuery('imatinib',
  SirenTupleConstraintPosition.OBJECT), SirenTupleClause.Occur.MUST)
GRAILS SPRING CACHE

• grails install-plugin springcache

• Add the @Cacheable (and/or @CacheFlush) annotation to
 services / controllers
@Cacheable('somecachename')
def slow(String name) {
    log.info "resolving $name"
    Thread.sleep(2000)
    return "took a long time to resolve ${name}"
}
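
What the annotation buys you is memoization: the first call per argument does the slow work, later calls return the stored result. The plain-Java sketch below shows that behaviour by hand. It is not the springcache plugin; SlowResolver, flush() and the miss counter are hypothetical names added for illustration, and the two-second sleep from the slide is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: hand-rolled memoization, not the springcache plugin.
class SlowResolver {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger misses = new AtomicInteger();

    // Stand-in for the expensive call on the slide (the sleep is omitted here).
    private String resolveUncached(String name) {
        misses.incrementAndGet();
        return "took a long time to resolve " + name;
    }

    // Computes the value on the first call per key and reuses it afterwards.
    String slow(String name) {
        return cache.computeIfAbsent(name, this::resolveUncached);
    }

    // Rough counterpart of a cache flush: forget everything cached so far.
    void flush() {
        cache.clear();
    }

    // Number of times the expensive path actually ran.
    int misses() {
        return misses.get();
    }
}
```

Calling slow("ABL1") twice runs the expensive path once; after flush() the next call runs it again.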
GPARS
• BuildConfig.groovy:
         runtime 'org.codehaus.gpars:gpars:0.10'
         runtime 'org.coconut.forkjoin.jsr166y:jsr166y:070108'

• http://www.gpars.org/guide/index.html

• Note: the Hibernate session is not available on GPars threads; you
 need to obtain one yourself (e.g. DomainClass.withTransaction)

• Data parallelism

• Map-Reduce, Fork-Join, and many more...
GPARS
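import groovyx.gpars.GParsPool

// Sequential: each scoreDoc is handled on the calling thread
void resolve(scoreDocs) {
  scoreDocs.each { scoreDoc ->
    // do something in a single thread
  }
}

// Parallel: withPool creates a thread pool and eachParallel spreads the work across it
void resolveWithPool(scoreDocs) {
  GParsPool.withPool {
    scoreDocs.eachParallel { scoreDoc ->
      // do something in parallel
    }
  }
}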
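
For readers without GPars on the classpath, the same sequential-versus-parallel contrast can be sketched with JDK parallel streams. This swaps in a different mechanism (the common fork-join pool) for GParsPool.withPool and eachParallel, so it is an analogy and not the GPars API. ParallelResolve and resolveOne are hypothetical names; resolveOne stands in for whatever work is done per search hit.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: JDK parallel streams standing in for GParsPool.withPool / eachParallel.
class ParallelResolve {
    // Hypothetical per-document work; on the slide this is the body of the closure.
    static String resolveOne(int docId) {
        return "doc-" + docId;
    }

    // Sequential version, analogous to scoreDocs.each { ... }.
    static List<String> resolve(List<Integer> docIds) {
        return docIds.stream()
                .map(ParallelResolve::resolveOne)
                .collect(Collectors.toList());
    }

    // Parallel version: elements are processed on the common fork-join pool,
    // and collect() still returns the results in encounter order.
    static List<String> resolveParallel(List<Integer> docIds) {
        return docIds.parallelStream()
                .map(ParallelResolve::resolveOne)
                .collect(Collectors.toList());
    }
}
```

The parallel variant only pays off when the per-element work is expensive and independent, which is the case the slide has in mind for resolving search hits.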
OTHER HANDY TOOLS
• Grails Melody (Java Melody) (monitoring)

• Perf4j (performance logging)

• Solr (Lucene search server)

• Grails Console Plugin (web-based console)
