Patrick Beaucamp
           Founder of the Vanilla Project
       Mail : Patrick.beaucamp@bpm-conseil.com




How to Gain Greater Business Intelligence
     with Vanilla from Solr/Lucene




                 LuceneRevolution, Boston        1
Presentation Agenda
Vanilla powered by Lucene
- Report indexing, search interface
- External document management
- Evolution & constraints

Steps to Solr/Lucene Adoption
- Indexing, storage, search
- Embedded Solr/Lucene
- External Solr/Lucene platform

Key Benefits for Vanilla powered by Solr/Lucene
- Cluster architecture
- Cache mechanism
- Support for an enhanced search language


Some Vanilla Features
Flash maps and charts: reports, cubes and dashboards

Vanilla apps: Android and iPhone




Vanilla Powered by Lucene (1/6)
Vanilla is a full Business Intelligence platform that provides:
- Reporting, OLAP, dashboards, KPIs, map visualisation
- ETL, workflow, document management, search engine




Vanilla Powered by Lucene (2/6)
Report Indexing
- The search engine is Apache Lucene (summer 2010)
- External documents and Vanilla reports are indexed
- Different indexing strategies for documents:
    – No indexing
    – Real-time indexing
    – Late indexing

Two modules manage the indexing strategy
- Enterprise Services, to set document properties
- Norparena, to manage indexing
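
The three strategies above can be pictured as a small dispatcher. This is a minimal sketch, not Vanilla's code: `IndexingStrategy`, `Indexer` and `index_fn` are hypothetical names, and "late" indexing is assumed to mean that documents are queued and indexed by a later batch run.

```python
from collections import deque
from enum import Enum


class IndexingStrategy(Enum):
    NONE = "none"            # document is never indexed
    REAL_TIME = "real_time"  # indexed as soon as it is submitted
    LATE = "late"            # queued and indexed by a later batch run


class Indexer:
    """Hypothetical dispatcher; `index_fn` stands in for the real index writer."""

    def __init__(self, index_fn):
        self.index_fn = index_fn
        self.pending = deque()  # documents waiting for the next batch run

    def submit(self, doc_id, strategy):
        if strategy is IndexingStrategy.REAL_TIME:
            self.index_fn(doc_id)
        elif strategy is IndexingStrategy.LATE:
            self.pending.append(doc_id)
        # IndexingStrategy.NONE: the document is stored but not indexed

    def run_batch(self):
        """Index everything queued under the late strategy; return the count."""
        count = 0
        while self.pending:
            self.index_fn(self.pending.popleft())
            count += 1
        return count


indexed = []
indexer = Indexer(indexed.append)
indexer.submit("report-1", IndexingStrategy.REAL_TIME)
indexer.submit("doc-2", IndexingStrategy.LATE)
indexer.submit("doc-3", IndexingStrategy.NONE)
```

After these calls only `report-1` has been indexed; `doc-2` is indexed once `run_batch()` is called, and `doc-3` never is.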



Vanilla Powered by Lucene (3/6)
Search Interface
- Search interface available from the Vanilla Portal
- Search runs against the Lucene index (inside Vanilla)
- Search results are combined with document security
   – The list contains all documents
   – Documents are ordered by popularity




Vanilla Powered by Lucene (4/6)
External document management
- Various document formats are supported (Lucene)
- Additional properties can be set on documents, for later use as search criteria
- Check-in / check-out on documents for versioning
- Search runs on the latest document version
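
One way to read the versioning rule above: every check-in adds a version, and the index only ever holds the newest one. The sketch below illustrates that idea under stated assumptions; `DocumentStore` and its methods are hypothetical and do not reflect Vanilla's document manager.

```python
class DocumentStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self):
        self.versions = {}        # doc_id -> list of contents, oldest first
        self.checked_out = set()  # doc_ids currently locked for editing
        self.index = {}           # doc_id -> content of the latest version only

    def add(self, doc_id, content):
        self.versions[doc_id] = [content]
        self.index[doc_id] = content

    def check_out(self, doc_id):
        if doc_id in self.checked_out:
            raise RuntimeError(f"{doc_id} is already checked out")
        self.checked_out.add(doc_id)
        return self.versions[doc_id][-1]

    def check_in(self, doc_id, content):
        if doc_id not in self.checked_out:
            raise RuntimeError(f"{doc_id} is not checked out")
        self.checked_out.remove(doc_id)
        self.versions[doc_id].append(content)
        self.index[doc_id] = content  # re-index: only the new version is searchable

    def search(self, term):
        return sorted(d for d, text in self.index.items() if term in text)


store = DocumentStore()
store.add("contract", "draft terms")
store.check_out("contract")
store.check_in("contract", "final terms")
```

Older versions stay in `versions` for history, but a search for "draft" no longer matches because the index was overwritten at check-in.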




Vanilla Powered by Lucene (5/6)
Evolution and constraints
- No clustering available for the search engine (embedded API), as opposed to Vanilla Report Services
- Limited query language and keywords (internal search)
- No cache for search result sets, as opposed to Vanilla datasets, which are cached with Memcached

- Customers request compatibility with enterprise search engines → need to set up an external search architecture




Vanilla Powered by Lucene (6/6)
   Embedded Lucene API inside the Vanilla Platform - Video




Steps to Solr/Lucene Adoption (1/9)
Solr/Lucene is the natural evolution of any embedded Lucene platform

Solr version: 3.5

Indexing
A Vanilla Lucene index can be transferred to and read by a Solr/Lucene platform
(a Solr/Lucene index is not usable inside the Vanilla Platform)

Storage
Vanilla search indexes can be managed by a Solr/Lucene platform

Search
The search language is compatible




Steps to Solr/Lucene Adoption (2/9)
Embedded Solr/Lucene inside the Vanilla Platform

No change needed in Vanilla code: use of the SolrJ API

Immediately provides additional features such as new keywords

Potential upgrade to Solr/Lucene Enterprise




Steps to Solr/Lucene Adoption (3/9)
From embedded Lucene to embedded Solr/Lucene inside the Vanilla Platform




Steps to Solr/Lucene Adoption (4/9)
Embedded Solr/Lucene inside the Vanilla Platform - Video




Steps to Solr/Lucene Adoption (5/9)
Solr/Lucene platform with a Vanilla Platform

Changes are needed in Vanilla code to separate the document management API from the indexing & search API → workload of 10 man-days

Document Management API
Easy to move toward CMIS compliance

Indexing & Search API
Solr/Lucene oriented & compatible, but now open to any other search platform
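
The split described above amounts to putting the search engine behind an interface that document management calls into. The following is a minimal sketch of that separation, assuming hypothetical names (`SearchBackend`, `InMemoryBackend`, `DocumentManager`); it is not the Vanilla API, and the in-memory backend merely stands in for a Solr or Lucene implementation.

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Indexing & search API: the only part that knows about the engine."""

    @abstractmethod
    def index(self, doc_id, text):
        ...

    @abstractmethod
    def search(self, query):
        ...


class InMemoryBackend(SearchBackend):
    """Stand-in for a Solr- or Lucene-backed implementation."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, text):
        self._docs[doc_id] = text.lower()

    def search(self, query):
        needle = query.lower()
        return sorted(d for d, text in self._docs.items() if needle in text)


class DocumentManager:
    """Document management API: stores content and delegates indexing."""

    def __init__(self, backend):
        self.backend = backend
        self.contents = {}

    def store(self, doc_id, text):
        self.contents[doc_id] = text
        self.backend.index(doc_id, text)


manager = DocumentManager(InMemoryBackend())
manager.store("kpi-report", "Quarterly KPI dashboard")
hits = manager.backend.search("kpi")
```

Swapping the search platform then means writing another `SearchBackend` subclass; `DocumentManager` stays unchanged.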




Steps to Solr/Lucene Adoption (6/9)
Coding Before

Example of code (API) before the split

 - Direct use of the Lucene API

 - Parse the document content using Apache Tika

 - Generate Lucene queries




Steps to Solr/Lucene Adoption (7/9)
Coding After

Example of code (API) after the split

 - Easy-to-use SolrJ API

 - Distributed search

 - Indexing with automatic parsing (using Apache Tika)
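
The code shown on the original slide is not part of this transcript. As a rough illustration of the last two points, the sketch below builds the HTTP requests rather than using SolrJ: Solr's `shards` parameter fans a query out to several servers, and the `/update/extract` handler (Solr Cell) parses a posted file with Tika. The base URL, shard addresses and document id are made up, and the sketch assumes the extract handler is enabled in `solrconfig.xml`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL and shard addresses, for illustration only.
BASE_URL = "http://localhost:8983/solr"
SHARDS = ["solr1:8983/solr", "solr2:8983/solr"]


def distributed_search_url(base_url, query, shards):
    """Build a /select URL that asks Solr to fan the query out to `shards`."""
    params = {"q": query, "shards": ",".join(shards), "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


def extract_url(base_url, doc_id):
    """Build an /update/extract URL; Solr parses the posted file with Tika."""
    params = {"literal.id": doc_id, "commit": "true"}
    return f"{base_url}/update/extract?{urlencode(params)}"


search_url = distributed_search_url(BASE_URL, "title:sales", SHARDS)
search_params = parse_qs(urlsplit(search_url).query)
index_url = extract_url(BASE_URL, "doc-42")
```

The file itself would be sent as the body of a POST to `index_url`; that step is left out here so the example runs without a Solr server.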




Steps to Solr/Lucene Adoption (8/9)
Solr/Lucene platform with the Vanilla Platform - Screenshot




Steps to Solr/Lucene Adoption (9/9)
Solr/Lucene platform with the Vanilla Platform - Video




Key Benefits for Vanilla Powered by Solr/Lucene (1/4)
Clustering Search Architecture, outside of Vanilla

Search results clustering implementation (CarrotClusteringEngine) is based on the Carrot2 framework.




Key Benefits for Vanilla Powered by Solr/Lucene (2/4)
Additional query language to perform search

Solr uses the Lucene search library and extends it!

- A Real Data Schema, with Numeric Types, Dynamic Fields, Unique Keys
- Powerful Extensions to the Lucene Query Language
- Faceted Search and Filtering
- Geospatial Search
- Advanced, Configurable Text Analysis
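
As an illustration of faceted search and filtering, the sketch below builds a Solr query string with the standard `facet`, `facet.field` and `fq` parameters and converts the flat value/count list that Solr returns for a facet field in JSON into a dictionary. The field names `doc_type` and `year` and the sample counts are invented for the example.

```python
from urllib.parse import urlencode


def facet_query(query, facet_fields, filters=()):
    """Return the query string for a faceted, filtered Solr search."""
    params = [("q", query), ("facet", "true"), ("wt", "json")]
    params += [("facet.field", field) for field in facet_fields]
    params += [("fq", fq) for fq in filters]  # filter queries narrow the result set
    return urlencode(params)


def facet_counts(flat):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


qs = facet_query("budget", ["doc_type"], filters=["year:2012"])
counts = facet_counts(["report", 12, "pdf", 7])
```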




Key Benefits for Vanilla Powered by Solr/Lucene (3/4)
New methods to manage result sets (binary, XML, JSON)

Solr is an enterprise search server with a REST-like API.
You put documents in it (called "indexing") via XML, JSON or binary over HTTP.
You query it via HTTP GET and receive XML, JSON, or binary results.

- Advanced full-text search capabilities
- Optimized for high-volume web traffic
- Standards-based open interfaces: XML, JSON and HTTP
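
To make the JSON result format concrete, here is a small sketch that reads the hit count and document ids from a response body of the shape Solr returns when queried with `wt=json`. The body is a hand-written sample, not output from a real server, and the documents in it are made up.

```python
import json

# Illustrative response body in the shape Solr returns for wt=json;
# the documents and field values are invented for this example.
SAMPLE_BODY = """
{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "report-1", "title": "Sales 2011"},
      {"id": "doc-7", "title": "Sales forecast"}
    ]
  }
}
"""


def parse_select_response(body):
    """Return (total hit count, list of document ids) from a JSON result."""
    response = json.loads(body)["response"]
    return response["numFound"], [doc["id"] for doc in response["docs"]]


total, ids = parse_select_response(SAMPLE_BODY)
```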




Key Benefits for Vanilla Powered by Solr/Lucene (4/4)
Cache Mechanism

Solr caches are associated with an Index Searcher

Three cache implementations:
- solr.LRUCache (LRU = Least Recently Used, in memory)
- solr.FastLRUCache
- solr.LFUCache (LFU = Least Frequently Used)

Many configuration parameters for cache optimisation
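
For readers unfamiliar with the LRU policy named above, the sketch below shows the eviction rule in a few lines. It illustrates the policy only and is not Solr's implementation; the `size` argument loosely mirrors the idea of a configurable maximum number of entries.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache, for illustration only."""

    def __init__(self, size):
        self.size = size
        self._entries = OrderedDict()  # oldest use first, most recent use last

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # a hit makes the entry most recent
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.size:
            self._entries.popitem(last=False)  # evict the least recently used


cache = LRUCache(size=2)
cache.put("q=sales", ["report-1"])
cache.put("q=budget", ["doc-7"])
cache.get("q=sales")             # touch: "q=budget" is now the oldest entry
cache.put("q=kpi", ["kpi-report"])  # evicts "q=budget"
```

An LFU cache would instead track how often each entry is used and evict the one with the fewest hits.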




Next Steps
Upgrade to Solr 4.0

New features for document lifecycle management

Roadmap for better internationalisation:
- 10 languages available (not Japanese)
- Search translation management




Documentation and tutorials are available on our websites:

www.bpm-conseil.com and forge.bpm-conseil.com

Thanks for your attention





How to Gain Greater Business Intelligence from Lucene/Solr

  • 1. Patrick Beaucamp Founder of the Vanilla Project Mail : Patrick.beaucamp@bpm-conseil.com How to Gain Greater Business Intelligence with Vanilla from Solr/Lucene LuceneRevolution, Boston 1
  • 2. Presentation Agenda Vanilla powered by Lucene - Report Indexation, Search Interface - External document management - evolution & constraints Step to Solr/Lucene Adoption - Indexation, Storage, Search - Embeded Solr/Lucene - External Solr/Lucene Platform Keys Benefit for Vanilla powered by Solr/Lucene - Cluster Architecture - Cache Mechanism - Support for enhanced search language LuceneRevolution, Boston 2
  • 3. Some Vanilla Features Flash maps and charts : Reports, Cubes and Dashboard Vanilla Apps : Android and Iphone LuceneRevolution, Boston 3
  • 4. Vanilla Powered by Lucene (1/6) Vanilla is a full Business Intelligence Platform that provide : - Reporting, Olap, Dashboard, Kpi, Maps Visualisation - Etl, Workflow, Document Management search Engine LuceneRevolution, Boston 4
  • 5. Vanilla Powered by Lucene (2/6) Report Indexation - Search engine is Apache Lucene (summer 2010) - External Document & Vanilla Report are indexed - Different Indexation strategy for documents : – No indexation – Real Time indexation – Late Indexation 2 modules to manage indexation strategy - Enterprise Services to set document property - Norparena to Manage Indexation LuceneRevolution, Boston 5
  • 6. Vanilla Powered by Lucene (3/6): Search Interface
      - Search interface available from the Vanilla Portal
      - Search runs against the Lucene index (inside Vanilla)
      - Search results are combined with security on documents:
          – List contains all documents
          – Documents are ordered by popularity
  • 7. Vanilla Powered by Lucene (4/6): External document management
      - Various document formats are available (Lucene)
      - Additional properties can be set on documents, for later use in search criteria
      - Check-in / check-out on documents for versioning
      - Search runs on the latest document version
  • 8. Vanilla Powered by Lucene (5/6): Evolution and constraints
      - No clustering available for the search engine (embedded API), as opposed to Vanilla Report Services
      - Limitations in language and keywords (internal search)
      - No cache to manage search result sets, as opposed to Vanilla datasets, which are powered by Memcached
      - Requests from customers to be compliant with an enterprise search engine
      → Need to set up an external search architecture
  • 9. Vanilla Powered by Lucene (6/6): Embedded Lucene API inside the Vanilla Platform (video)
  • 10. Steps to Solr/Lucene Adoption (1/9)
      Solr/Lucene is the natural evolution of any embedded Lucene platform. Solr version: 3.5
      - Indexation: a Vanilla Lucene index can be transferred to and read by Solr/Lucene (a Solr/Lucene index is not usable inside the Vanilla Platform)
      - Storage: Vanilla search indexes can be managed by a Solr/Lucene platform
      - Search: the search language is compatible
  • 11. Steps to Solr/Lucene Adoption (2/9): Embedded Solr/Lucene inside the Vanilla Platform
      - No need for any change in Vanilla code: use of the SolrJ API
      - Immediately provides additional features such as new keywords
      - Potential upgrade to Solr/Lucene Enterprise
  • 12. Steps to Solr/Lucene Adoption (3/9): From embedded Lucene to embedded Solr/Lucene inside the Vanilla Platform
  • 13. Steps to Solr/Lucene Adoption (4/9): Embedded Solr/Lucene inside the Vanilla Platform (video)
  • 14. Steps to Solr/Lucene Adoption (5/9): Solr/Lucene platform with a Vanilla Platform
      Changes are needed in Vanilla code to separate the document management, indexation and search APIs → 10 man-days of work
      - Document management API: easy to move toward CMIS compliance
      - Indexation & search API: Solr/Lucene oriented and compliant, but now open to any other search platform
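Slide 14 describes splitting document management from indexation and search so that the search side can be swapped for another platform. The sketch below illustrates that separation under stated assumptions. `DocumentStore`, `SearchBackend`, `InMemorySearchBackend`, and `publish` are hypothetical names, not Vanilla's API. The in-memory backend stands in for a real Solr/Lucene client.

```python
from abc import ABC, abstractmethod


class DocumentStore:
    """Document management only: keeps versions, knows nothing about search."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of contents, oldest first

    def check_in(self, doc_id, content):
        """Store a new version and return its version number (1-based)."""
        self._versions.setdefault(doc_id, []).append(content)
        return len(self._versions[doc_id])

    def latest(self, doc_id):
        return self._versions[doc_id][-1]


class SearchBackend(ABC):
    """Indexation and search contract; any search platform could implement it."""

    @abstractmethod
    def index(self, doc_id, text):
        ...

    @abstractmethod
    def search(self, term):
        ...


class InMemorySearchBackend(SearchBackend):
    """Toy substring search, used here in place of a Solr/Lucene client."""

    def __init__(self):
        self._texts = {}

    def index(self, doc_id, text):
        # Re-indexing a document replaces its previous entry.
        self._texts[doc_id] = text.lower()

    def search(self, term):
        needle = term.lower()
        return sorted(d for d, t in self._texts.items() if needle in t)


def publish(store, backend, doc_id, content):
    """Check a document in, then index its latest version."""
    store.check_in(doc_id, content)
    backend.index(doc_id, store.latest(doc_id))
```

Because `publish` depends only on the `SearchBackend` contract, a Solr-backed implementation could replace the in-memory one without touching the document management side.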
  • 15. Steps to Solr/Lucene Adoption (6/9): Coding before
      Example of code (API) before the split:
      - Direct use of the Lucene API
      - Parse the document content using Apache Tika
      - Generate Lucene queries
  • 16. Steps to Solr/Lucene Adoption (7/9): Coding after
      Example of code (API) after the split:
      - Easy-to-use SolrJ API
      - Distributed search
      - Indexation with automatic parsing (using Apache Tika)
  • 17. Steps to Solr/Lucene Adoption (8/9): Solr/Lucene platform with Vanilla Platform (screenshot)
  • 18. Steps to Solr/Lucene Adoption (9/9): Solr/Lucene platform with Vanilla Platform (video)
  • 19. Key Benefits for Vanilla Powered by Solr/Lucene (1/4): Clustering
      - Search architecture, outside of Vanilla
      - The search results clustering implementation (CarrotClusteringEngine) is based on the Carrot2 framework
  • 20. Key Benefits for Vanilla Powered by Solr/Lucene (2/4): Additional query language to perform search
      Solr uses the Lucene search library and extends it:
      - A real data schema, with numeric types, dynamic fields, unique keys
      - Powerful extensions to the Lucene query language
      - Faceted search and filtering
      - Geospatial search
      - Advanced, configurable text analysis
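To make the "faceted search and filtering" item on slide 20 concrete, the sketch below builds a Solr select URL. It uses the standard `q`, `fq`, `facet` and `facet.field` request parameters. The helper name `build_faceted_query`, the base URL and the field names `doc_type` and `author` are assumptions for illustration, since the slides do not show the Vanilla schema. Only the URL is built and no request is sent.

```python
from urllib.parse import urlencode


def build_faceted_query(base_url, text, filters=(), facet_fields=()):
    """Return a select URL with optional filter queries and facet fields."""
    params = [("q", text)]
    # Each fq narrows the result set without affecting relevance scoring.
    params += [("fq", f) for f in filters]
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field per field whose value counts should be returned.
        params += [("facet.field", f) for f in facet_fields]
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and field names, for illustration only.
example_url = build_faceted_query(
    "http://localhost:8983/solr/select",
    "sales report",
    filters=["doc_type:report"],
    facet_fields=["author", "doc_type"],
)
```

Repeated parameters are passed as a list of pairs rather than a dictionary, because `fq` and `facet.field` may each appear several times in one request.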
  • 21. Key Benefits for Vanilla Powered by Solr/Lucene (3/4): New methods to manage result sets (binary, XML, JSON)
      Solr is an enterprise search server with a REST-like API. You put documents in it (called "indexing") via XML, JSON or binary over HTTP. You query it via HTTP GET and receive XML, JSON or binary results.
      - Advanced full-text search capabilities
      - Optimized for high-volume web traffic
      - Standards-based open interfaces: XML, JSON and HTTP
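The query side of slide 21 can be sketched as follows: ask for JSON with `wt=json`, then read the hit count and document ids from the `response` object of the reply. The helper names, the base URL and the `SAMPLE_BODY` payload are made up for illustration. The sample is not output from a real Vanilla index, and it assumes the unique key field is called `id`. No HTTP request is sent.

```python
import json
from urllib.parse import urlencode


def select_url(base_url, query, rows=10):
    """Build an HTTP GET URL that asks Solr for a JSON result set."""
    return base_url + "?" + urlencode({"q": query, "wt": "json", "rows": rows})


def parse_select_response(body):
    """Return (total hit count, ids of the returned documents)."""
    response = json.loads(body)["response"]
    return response["numFound"], [doc["id"] for doc in response["docs"]]


# Hand-written sample in the shape of a Solr JSON reply (illustrative data).
SAMPLE_BODY = """
{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [{"id": "r1", "title": "Sales report"},
             {"id": "r7", "title": "Sales dashboard"}]
  }
}
"""
```

`numFound` is the total number of matches, which can exceed the number of entries in `docs` when `rows` limits the page size.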
  • 22. Key Benefits for Vanilla Powered by Solr/Lucene (4/4): Cache mechanism
      - Solr caches are associated with an index searcher
      - Three cache implementations: solr.LRUCache (LRU = least recently used, in memory), solr.FastLRUCache, solr.LFUCache (least frequently used)
      - Many configuration parameters for cache optimisation
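Slide 22 mentions LRU eviction. The sketch below is a minimal illustration of that policy, not Solr's implementation: when the cache is full, the entry that was read or written longest ago is dropped. The class name, method names and the `size` parameter are chosen for this example only.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self._entries = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self._entries:
            return None
        # A hit counts as a use, so move the entry to the "recent" end.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.size:
            # Drop the entry at the "old" end of the ordering.
            self._entries.popitem(last=False)
```

An LFU policy would instead track a use count per entry and evict the entry with the lowest count, which favours queries that are popular over a long period rather than recently repeated ones.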
  • 23. Next Steps
      - Upgrade to Solr 4.0
      - New features for document life-cycle management
      - Roadmap for better internationalisation:
          – 10 languages available (not Japanese)
          – Search translation management
  • 24. Documentation and tutorials are available on our web sites: www.bpm-conseil.com and forge.bpm-conseil.com. Thanks for your attention.