Faceted search using Solr and Ontopia 2009-11-03 Geir Ove Grønmo, grove@bouvet.no
Agenda: short introductions to Solr and Ontopia; what faceted search is; an integration of the two (a prototype); demos.
Apache Solr: a search engine implemented as an HTTP service on top of Apache Lucene. Searching and indexing (no web crawling); adds support for faceted search (and more); sharding and replication; distributed search; excellent interoperability (i.e. not really Java-specific). Next release: Solr 1.4. Open source (Apache License 2.0): http://lucene.apache.org/solr/
Ontopia: a Topic Maps toolkit. Data representation, persistence and querying; application development; written in Java. Next release: Ontopia 5.1. Open source (Apache License 2.0): http://code.google.com/p/ontopia/
Where the meat is... Solr: fast textual search and faceted search support. Ontopia: rich semantic data and structured search. User interface design: providing a useful interface to the user.
But first, what is faceted search? A technique for refining search results. Integrates textual search and navigation. Allows concept composition, for example: slow + expensive + red + used + car; article + in English + about salmon; people + aged 20-30 + SQL expert; punk rock songs + < 1 minute + in Norwegian + released 1980-1982. Supports exploration and learning. Never returns zero results (only facet values with at least one hit are offered).
How is it done? Given a starting set (usually all documents, or the result of filling in the search input box), do the following: count the number of hits matching each value of each facet field. Which fields to facet on is defined at query time.
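The counting step can be sketched in a few lines of Python. This is only an illustration of the idea, not how Solr implements it; the function name `facet_counts` and the sample documents are made up for the example.

```python
from collections import Counter

def facet_counts(documents, facet_fields):
    """Count, per facet field, how many documents carry each value."""
    counts = {field: Counter() for field in facet_fields}
    for doc in documents:
        for field in facet_fields:
            values = doc.get(field, [])
            # Treat single-valued fields as one-element lists.
            if not isinstance(values, (list, tuple, set)):
                values = [values]
            # A document counts once per distinct value in the field.
            counts[field].update(set(values))
    return counts

# Hypothetical starting set (e.g. the hits of a text search).
docs = [
    {"id": "1", "language": "en", "topics": ["salmon", "fishing"]},
    {"id": "2", "language": "no", "topics": ["salmon"]},
    {"id": "3", "language": "no", "topics": []},
]

# The fields to facet on are chosen at query time.
result = facet_counts(docs, ["language", "topics"])
```

Values that never occur simply get no entry, which is why a user who only clicks offered values cannot end up with an empty result.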
An example without faceted search
Facet types. Standard facets: a list of facet values. Hierarchical facet values: a taxonomy of facet values. Range/query facets: dates, prices, alphabet buckets, intervals (lower and upper bounds).
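Range and alphabet buckets differ from standard facets in that the bucket is computed from the value instead of being the value itself. A minimal sketch, assuming half-open intervals and first-letter buckets (both choices, and the function names, are illustrative):

```python
def range_facet(values, bounds):
    """Count values per half-open interval [lower, upper)."""
    counts = {(lo, hi): 0 for lo, hi in zip(bounds, bounds[1:])}
    for v in values:
        for lo, hi in counts:
            if lo <= v < hi:
                counts[(lo, hi)] += 1
                break  # each value falls into at most one interval
    return counts

def alphabet_facet(names):
    """Count names per upper-cased first letter."""
    counts = {}
    for name in names:
        if not name:
            continue  # skip empty names
        key = name[0].upper()
        counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical sample data.
price_counts = range_facet([5, 12, 19, 20, 45, 99], [0, 20, 50, 100])
letter_counts = alphabet_facet(["Bergen", "bouvet", "Oslo"])
```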
Standard facets
Hierarchical facet values. Note: the facets can also be hierarchical.
Alphabet buckets
Range facets
User interface considerations. Single select: link or radio button. Multi select: checkboxes. Decide which operator to use (AND/OR) within a facet and between facets. How many facet values to display, given limited screen real estate? How to provide an intuitive undo operation?
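One common answer to the operator question is OR within a facet and AND between facets. The sketch below assumes that combination; it is one possible choice, not the one the slides prescribe, and the names are hypothetical.

```python
def matches(doc, selections):
    """OR within a facet, AND between facets.

    selections maps a facet field to the list of values the user ticked.
    """
    for field, wanted in selections.items():
        if not wanted:
            continue  # nothing selected in this facet: no restriction
        values = doc.get(field, [])
        if not isinstance(values, (list, tuple, set)):
            values = [values]
        # The document must have at least one of the selected values.
        if not set(values) & set(wanted):
            return False
    return True

# Hypothetical sample data.
docs = [
    {"id": "1", "color": "red", "condition": "used"},
    {"id": "2", "color": "blue", "condition": "used"},
    {"id": "3", "color": "red", "condition": "new"},
]
selections = {"color": ["red", "blue"], "condition": ["used"]}
hits = [d["id"] for d in docs if matches(d, selections)]
```

Undo then amounts to removing a value from `selections` and filtering again.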
Examples
Scoring. Some types of documents should be ranked higher than others. Solr lets one boost the default score per document and per field. The total score of a document depends on: the boost and score of the fields, adjusted by how relevant a field is relative to the actual query; and the boost of the document.
Sorting. How to sort the list of facets? By relevance. How to sort the values of each facet? By number of hits, or alphabetically. How to sort the search result? By relevance, alphabetically, or by date.
Proposition “Concept composition, using faceted search, and Topic Maps is a perfect match”
Why not use Ontopia only? You can, but it is not optimized for this use case. It lets you implement faceted search, but it’ll be too slow. The reasons are: all the expensive processing has to happen at runtime rather than at indexing time; it involves a lot of traversal; it relies on the underlying fulltext search engine; search has limited cacheability.
Trade-offs. Considerations: search performance, indexing performance, consistency. Ontopia: no indexing overhead; results always up to date. Solr: very fast search; indexing overhead; index must be updated regularly.
Solr – the data model. An index contains documents. Documents have fields. A field can have multiple values. Example: { "id": "1234", "title": "Structure and Interpretation of Computer Programs", "authors": ["Harold Abelson", "Gerald Jay Sussman"] }
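Since Solr is an HTTP service, faceting on fields like these comes down to adding request parameters. The sketch below only builds a request URL using Solr's `q`, `fq`, `wt`, `facet` and `facet.field` parameters; the base URL is an assumed local default, the helper name is made up, and filter values are not escaped.

```python
from urllib.parse import urlencode

def build_facet_query(base_url, text, facet_fields, filters=None):
    """Build a Solr select URL that asks for facet counts."""
    # "*:*" matches all documents when no search text is given.
    params = [("q", text or "*:*"), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    # Each selected facet value becomes a filter query (no escaping here).
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return base_url + "?" + urlencode(params)

# Assumed endpoint of a local Solr instance; adjust to the actual setup.
url = build_facet_query(
    "http://localhost:8983/solr/select",
    "salmon",
    ["authors", "language"],
    {"language": "en"},
)
```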
Ontopia – the data model. A topic map contains topics and information about them: identities; names; associations to other topics; occurrences (read: non-association properties).
Integrating Solr and Ontopia. Proposed solution: Solr indexes constructed from Ontopia queries. For each document type, create a query that extracts data from the topic map into fields in documents, then do faceting on selected fields. Use-case specific: the schema definition should be project specific (to some degree). Perform a full index or an incremental reindex.
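The mapping from query results to documents might look like the following sketch. It assumes each result row has three columns (document id, field name, value); that layout is a guess for illustration, not the prototype's documented format, and the function name is hypothetical.

```python
def rows_to_documents(rows):
    """Group (doc_id, field, value) rows into multi-valued documents."""
    docs = {}
    for doc_id, field, value in rows:
        if field == "id":
            raise ValueError("'id' is reserved for the document identifier")
        doc = docs.setdefault(doc_id, {"id": doc_id})
        values = doc.setdefault(field, [])
        if value not in values:  # drop duplicate rows
            values.append(value)
    return list(docs.values())

# Hypothetical rows, as a topic map query might return them.
rows = [
    ("1234", "title", "Structure and Interpretation of Computer Programs"),
    ("1234", "authors", "Harold Abelson"),
    ("1234", "authors", "Gerald Jay Sussman"),
    ("5678", "title", "Another book"),
]
documents = rows_to_documents(rows)
```

Every field comes out as a list, which fits Solr's model of fields that can hold multiple values.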
Index rule set
Index rule: Organisasjonsenheter (organizational units)
Query result: Organisasjonsenheter
Solr index: Organisasjonsenhet
Index rule: Artikler (articles)
Query result: Artikler
Solr index: Artikler
Demo: a prototype for Bergen kommune
Ideas for the future. The faceted search user interface in Ontopoly could be made declarative. Incremental reindexing requires tracking changes; this is usually done with a timestamp, e.g. by implementing a last-modified field in Ontopoly. Add an optional fourth column for score boost (a float between 0 and 1)? Ontopia extensions for interacting with Solr: JSP tag library; tolog predicates.
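Timestamp-based change tracking can be sketched as follows. The `last_modified` key and both helper names are hypothetical, and deletions would need separate handling that is not shown.

```python
from datetime import datetime, timezone

def changed_since(topics, last_indexed):
    """Return the topics modified after the previous index run."""
    return [t for t in topics if t["last_modified"] > last_indexed]

def next_watermark(changed, last_indexed):
    """The newest timestamp seen, or the old watermark if nothing changed."""
    return max((t["last_modified"] for t in changed), default=last_indexed)

# Hypothetical topics carrying a last-modified timestamp.
topics = [
    {"id": "a", "last_modified": datetime(2009, 11, 1, tzinfo=timezone.utc)},
    {"id": "b", "last_modified": datetime(2009, 11, 3, tzinfo=timezone.utc)},
]
last_run = datetime(2009, 11, 2, tzinfo=timezone.utc)

to_reindex = changed_since(topics, last_run)      # only these are re-sent
new_mark = next_watermark(to_reindex, last_run)   # stored for the next run
```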
More demos. Epicurious recipe search (http://www.epicurious.com/tools/searchresults?search=); Flickr photo search with hierarchical facets (http://people.csail.mit.edu/dfhuynh/projects/hierarchical-facets/test.html); a collection of faceted navigation examples (http://www.flickr.com/photos/morville/collections/72157603789246885/).
More information. 3 Quick Design Patterns for Better Faceted Search (http://www.thingsontop.com/3-quick-patterns-better-facet-design-889.html); How to Make a Faceted Classification and Put It On the Web (http://www.miskatonic.org/library/facet-web-howto.html); Book: Faceted Search (Synthesis Lectures on Information Concepts, Retrieval, and Services), Daniel Tunkelang.
Structured semantics-rich data... is easier to find when using faceted search.
