3. Scaling Search in Oak with Solr
adaptTo() 2013 3
§ï§âŻ Following up last yearâs âOak / Solr
integrationâ session
4. Looking at last yearâs agenda
adaptTo() 2013 4
§ï§âŻ What we have:
§ï§âŻ Search Oak content with Solr
independently
§ï§âŻ Oak running with a Solr index
§ï§âŻ Embedded
Â
Â
§ï§âŻ Remote
Â
§ï§âŻ What we miss:
§ï§âŻ Solr based MicroKernel
5. Apache Jackrabbit Oak
adaptTo() 2013 5
§ï§âŻ Scalable content repository
§ï§âŻ JCR compatibility (to some degree)
§ï§âŻ Performance especially for concurrent
access
§ï§âŻ Scalability for huge repositories (> 100M
nodes)
§ï§âŻ Support managed environments (e.g. OSGi)
§ï§âŻ Cloud deployments
6. Apache Solr
adaptTo() 2013 6
§ï§âŻ Enterprise search platform
§ï§âŻ Based on Apache Lucene
§ï§âŻ Full text, faceting, highlighting, etc.
§ï§âŻ Dynamic cluster architecture
§ï§âŻ HTTP API
§ï§âŻ Latest release 4.4.0
9. IndexEditor API
adaptTo() 2013 9
§ï§âŻ an
 Editor
 receives
 info
 about
 changes
 to
 the
Â
content
 tree
Â
§ï§âŻ the
 Editor
 evaluates
 the
 status
 before
 and
 a6er
 a
Â
speciïŹc
 Oak
 commit
Â
§ï§âŻ can
 reject
 or
 accept
 the
 changes
 (by
 even
Â
modifying
 the
 tree
 itself)
Â
§ï§âŻ an
 Editor
 is
 executed
 right
 before
 an
 Oak
 commit
Â
is
 persisted
Â
§ï§âŻ the
 SolrIndexHook
 maps
 changes
 between
Â
statuses
 in
 the
 content
 tree
 to
 Solr
 documents
Â
and
 sends
 them
 to
 the
 Solr
 instance(s)
Â
10. IndexEditor API â creating content
adaptTo() 2013 10
§ï§âŻ NodeState
 before
 =
Â
 builder.getNodeState();
Â
§ï§âŻ builder.child("newnode").setProperty("prop",
Â
"val");
Â
§ï§âŻ NodeState
 a6er
 =
 builder.getNodeState();
Â
§ï§âŻ EditorHook
 hook
 =
 new
 EditorHook(new
Â
SolrIndexEditorProvider(âŠ));
Â
§ï§âŻ NodeState
 indexed
 =
Â
hook.processCommit(before,
 a6er);
Â
11. QueryIndex API
adaptTo() 2013 11
§ï§âŻ evaluate the query (Filter) cost for each
available index to find the âbestâ
§ï§âŻ execute the query (Filter) against a specific
revision and root node state
§ï§âŻ internally
 the
 Filter
 is
 usually
 mapped
 to
 the
Â
underlying
 implementaNon
 counterpart
Â
§ï§âŻ improved
 support
 for
 full
 text
 queries
 in
 Oak
Â
§ï§âŻ eventually view the query âplanâ
12. QueryIndex API
adaptTo() 2013 12
§ï§âŻ mapping Filter restrictions:
§ï§âŻ Property restrictions:
§ï§âŻ Each
 property
 is
 mapped
 as
 a
 ïŹeld
Â
â⯠Can
 use
 term
 queries
 for
 simple
 value
 matching
 or
 range
Â
queries
 for
 âïŹrst
 to
 lastâ
Â
§ï§âŻ Path restrictions
â⯠indexed
 as
 strings
 and
 with
 special
 ïŹelds
 for
 parent
 /
Â
children
 /
 descendant
 matching
Â
§ï§âŻ Full text expressions
â⯠use
 (E)DisMax
 query
 parser
 and/or
 fallback
 ïŹelds
Â
§ï§âŻ NodeType restrictions TBD
13. Configuring the Solr index in Oak
adaptTo() 2013 13
§ï§âŻ create an index configuration under
§ï§âŻ /oak:index/solrIdx
§ï§âŻ jcr:primaryType
 =
 oak:queryIndexDeïŹniNon
Â
§ï§âŻ type
 =
 solr
Â
â⯠plus
 some
 mandatory
 props
 (e.g.
 reindex)
Â
§ï§âŻ AddiNonal
 properNes
 if
 want
 to
 run
 an
Â
embedded
 Solr
 server
 (more
 on
 this
 later)
Â
14. Configuring the Solr index in Oak
adaptTo() 2013 14
§ï§âŻ Pluggable Solr server providers and
configuration
§ï§âŻ to allow different deployment scenarios
§ï§âŻ to allow custom configuration
15. Oak Solr core bundle
adaptTo() 2013 15
§ï§âŻ provides basic API and implementation
to index and search Oak content on Solr
§ï§âŻ Solr implementation of IndexEditor
§ï§âŻ Solr implementation of QueryIndex
§ï§âŻ allows configurable mapping between
§ï§âŻ property types and fields
§ï§âŻ e.g.
 all
 binaries
 should
 be
 indexed
 in
 speciïŹc
 ïŹeld
Â
§ï§âŻ filter restrictions and fields
§ï§âŻ e.g.
 path
 restricNons
 for
 children
 should
 hit
 a
Â
certain
 ïŹeld
Â
16. Oak Solr Embedded bundle
adaptTo() 2013 16
§ï§âŻ provides support for indexing and
searching on an embedded Solr instance
§ï§âŻ running inside the Oak repository
§ï§âŻ configuration can be done
§ï§âŻ via
 the
 repository
Â
â⯠stored
 in
 the
 index
 deïŹniNon
 node
Â
§ï§âŻ via
 OSGi
Â
17. Oak Solr Remote bundle
adaptTo() 2013 17
§ï§âŻ provides support for indexing and
searching on remote Solr instances
§ï§âŻ single Solr instance
§ï§âŻ distributed / replicated Solr cluster
§ï§âŻ SolrCloud deployments
§ï§âŻ configuration is done via OSGi
18. OSGi platform running on Oak with Solr
adaptTo() 2013 18
§ï§âŻ Star
 instance
 with
 Oak
 repository
Â
§ï§âŻ Add
 a
 bunch
 of
 bundles
 for
 Solr
Â
â⯠oak-Ââsolr-Ââcore,
 oak-Ââsolr-Ââremote,
 zookeeper,
Â
servicemix.bundles.solr-Ââsolrj,
 etc.
Â
§ï§âŻ ConïŹgure
 the
 Solr
 instance
Â
§ï§âŻ ConïŹgure
 oak-Ââsolr-Ââremote
 providers
 via
 OSGi
Â
21. What needs to be done
adaptTo() 2013 21
§ï§âŻ Easy OSGi deployment
§ï§âŻ Move common configuration stuff in
oak-solr-core
§ï§âŻ Leverage new full text expression API
§ï§âŻ Solr MK?