10. Database Implementations
• Database specific SQL
• May include additional limitations
(e.g. MySQL: MyISAM tables only)
• Functionality defined by the DB engine
11. Third Party Search Engines
• Google indexing / searching of your content
22. Indexing : Steps
• Conversion (to plain text)
• Analysis (clean and convert the text to tokens)
• Index (save the tokens to the index)
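The three steps above can be sketched in plain Java. This is a toy illustration, not Lucene's implementation: the class TinyIndexer and its methods are hypothetical names, the markup stripping is deliberately naive, and the "index" is just an in-memory map from token to document ids.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy indexer (assumption for illustration, not a Lucene class).
final class TinyIndexer {
    // token -> ids of the documents that contain it (an inverted index)
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Step 1, conversion: replace anything that looks like a tag with a space.
    static String toPlainText(String markup) {
        return markup.replaceAll("<[^>]*>", " ");
    }

    // Step 2, analysis: lowercase, split on non-alphanumerics, drop empty tokens.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Step 3, index: record every token against the document id.
    void add(int docId, String markup) {
        for (String token : analyze(toPlainText(markup))) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Ids of the documents containing the (already lowercased) token.
    Set<Integer> lookup(String token) {
        return postings.getOrDefault(token, new TreeSet<>());
    }
}
```

Real analyzers typically do more in step 2 (stop words, stemming), which is why the same analyzer is normally used for indexing and for query parsing.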
23. Indexing : Parts
• Index - either file or memory based
• Document - represents a unique object added to the index
• Field - identifies a chunk of data in the document
35. Searching : QueryParser
• 'webobjects' - contains an exact match - TermQuery
• 'webobjects apple', 'webobjects OR apple' - an OR Query
• +webobjects +apple / webobjects AND apple - an AND Query
• title:webobjects - Contains the term in title field
• title:webobjects -subject:iTunes / title:webobjects AND NOT subject:iTunes
• (webobjects OR apple) AND iTunes
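One way to read the operators above is as set operations on the document ids matched by each term: OR is a union, AND an intersection, AND NOT a difference. The sketch below assumes that model; BooleanSets and its method names are hypothetical, and it ignores scoring entirely.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (assumption for illustration): boolean query
// operators expressed as set operations over matching document ids.
final class BooleanSets {
    // a OR b: documents matching either term
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // a AND b: documents matching both terms
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // a AND NOT b: documents matching a but not b
    static Set<Integer> andNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }
}
```

With this reading, (webobjects OR apple) AND iTunes becomes and(or(webobjects, apple), iTunes), where each argument is the set of document ids containing that term.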
38. Using a QueryParser
QueryParser queryParser = new QueryParser(Version.LUCENE_29,
"content", analyzer);
Query query = queryParser.parse(queryString);
41. “The more times a query term appears in a
document relative to the number of times the term
appears in all the documents in the collection, the
more relevant that document is to the query”
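The quote describes the idea usually called TF-IDF. A minimal sketch, assuming the textbook form (term count times the log of total documents over documents containing the term); the class and parameter names are hypothetical, and Lucene's own scoring formula has more factors than this.

```java
// Hypothetical textbook TF-IDF (assumption for illustration, not Lucene's
// Similarity implementation).
final class TfIdf {
    /**
     * @param termCountInDoc     times the term occurs in this document
     * @param docsContainingTerm number of documents containing the term
     * @param totalDocs          number of documents in the collection
     */
    static double score(int termCountInDoc, int docsContainingTerm, int totalDocs) {
        if (termCountInDoc == 0 || docsContainingTerm == 0) {
            return 0.0; // the term does not match anything
        }
        // Rare terms get a large idf; a term present in every document gets 0.
        double idf = Math.log((double) totalDocs / docsContainingTerm);
        return termCountInDoc * idf;
    }
}
```

Under this sketch a term that occurs three times in a document but in only one document of ten scores higher than a term that occurs three times but appears in half the collection.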
42. Boost
• While Indexing
• Document
• Field
• While Searching
• Query
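In Lucene 2.9-era APIs these boosts are set with setBoost(float) on Document, Field and Query. The sketch below only illustrates the effect, assuming each boost acts as a multiplier on a raw relevance score (the default boost being 1.0); BoostSketch is a hypothetical name, not Lucene's scoring code.

```java
// Hypothetical illustration (assumption): boosts modelled as plain
// multipliers on a raw score. Lucene folds index-time boosts into stored
// norms, so real scores will not match this arithmetic exactly.
final class BoostSketch {
    static double boosted(double rawScore,
                          double documentBoost, // set while indexing
                          double fieldBoost,    // set while indexing
                          double queryBoost) {  // set while searching
        return rawScore * documentBoost * fieldBoost * queryBoost;
    }
}
```

Index-time boosts are baked into the index and require reindexing to change, while a query boost can be adjusted per search.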
46. ERIndexing : Strengths
• Hides some of the complexity of integrating Lucene with WO
• Offers lots of utility and helper methods
• Speaks WebObjects collection classes
• Simplifies index creation
47. ERIndexing : Weaknesses
• Hides some of the complexity of integrating Lucene with WO
• Not fully baked
• Auto indexing may be dangerous