Develop a Solr request handler plugin

andrew.janowczyk@searchbox.com

Solr is
◦ Blazing fast open source enterprise search platform
◦ Lucene-based Search Server
◦ Written in Java
◦ Has REST-like HTTP/XML and JSON APIs
◦ Extensive plugin architecture
http://lucene.apache.org/solr/

• Solr allows the development of plugins that provide advanced operations
• Types of plugins:
◦ RequestHandlers: use URL parameters and return their own response
◦ SearchComponents: responses are embedded in other responses (such as /select)
◦ UpdateRequestProcessorFactory: the result is stored in a field along with the document at index time

• A quick tutorial on how to program a RequestHandler to
◦ Be initialized
◦ Parse configuration file arguments
◦ Do something useful (count some words in the query)
◦ Format and return a response
• We’ll name our plugin “DemoPlugin” and show how to register it in solrconfig.xml so that Solr loads it

• In the next slide, we’ll define a list named “words” in solrconfig.xml; each entry in the list is a string named “word”
• We want to load these specific words at startup and then count their occurrences in every subsequent query
• Ex: config file has “body”, “fish”, “dog”
• Query is: dog body body body fish fish fish fish orange
• Result should be:
◦ body=3.0
◦ fish=4.0
◦ dog=1.0
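
The counting rule from this example can be tried outside Solr first. The following is only a minimal sketch in plain Java; the class name WordCountSketch and the method countWords are made up for illustration and are not part of the plugin developed in the following slides.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone sketch of the counting rule; class and method names are
// hypothetical and not part of the plugin shown later.
final class WordCountSketch {

    // Counts how often each configured word occurs in a space-separated query.
    static Map<String, Double> countWords(List<String> words, String query) {
        Map<String, Double> counts = new LinkedHashMap<String, Double>();
        for (String word : words) {
            counts.put(word, 0.0); // every configured word starts at zero
        }
        for (String token : query.split(" ")) {
            if (counts.containsKey(token)) {
                counts.put(token, counts.get(token) + 1.0);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("body", "fish", "dog");
        String query = "dog body body body fish fish fish fish orange";
        System.out.println(countWords(words, query));
        // prints {body=3.0, fish=4.0, dog=1.0}
    }
}
```

Words that are not in the configured list (“orange” here) are simply ignored.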

<requestHandler name="/newendpoint" class="com.searchbox.DemoPlugin">
  <lst name="words">
    <str name="word">body</str>
    <str name="word">fish</str>
    <str name="word">dog</str>
  </lst>
</requestHandler>
These values are loaded from this section in the init method discussed later

• Here we ask Solr to load com.searchbox.DemoPlugin. This class will be the output of our project, packaged as a .jar file
• Copy the .jar file to the lib directory in the Solr installation so that Solr can find it.
• That’s it!

package com.searchbox;

import java.util.HashMap;
import java.util.List;

import org.apache.solr.common.SolrException;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DemoPlugin extends RequestHandlerBase {

    private static final Logger LOGGER = LoggerFactory.getLogger(DemoPlugin.class);

    // counters reported later by getStatistics()
    volatile long numRequests;
    volatile long totalTime;
    volatile long numErrors;

    // words to count, loaded from solrconfig.xml in init()
    List<String> words;

• Initialization is called when the plugin is first loaded
• This most commonly occurs when Solr is started up
• At this point we can load things from file (models, serialized objects, etc.)
• We also have access to the variables set in solrconfig.xml

• We have chosen to pass a list called “words” containing the words we’d like to count: “body”, “fish”, “dog”
• During initialization we need to load this list from solrconfig.xml and store it locally

@Override
public void init(NamedList params) {
    // read the <lst name="words"> block from solrconfig.xml
    NamedList wordList = (NamedList) params.get("words");
    if (wordList == null || wordList.getAll("word").isEmpty()) {
        throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
                "Need to specify at least one word in requestHandler config!");
    }
    words = wordList.getAll("word"); // every <str name="word"> entry
    super.init(params); // pass the rest of the init up
}
Notice that we’ve loaded the list “words”, then collected all of its entries named “word” and put them into the class-level variable words.
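
The config list repeats the same name (“word”) several times, which is why a “get all values for this name” lookup is needed rather than a plain map lookup. The sketch below illustrates that idea in plain Java only; RepeatedKeySketch and its getAll method are hypothetical stand-ins written for this example and are not Solr’s NamedList implementation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a repeated-name lookup. This is NOT Solr's NamedList,
// only a hypothetical model of an ordered list of name/value pairs.
final class RepeatedKeySketch {

    // Returns every value stored under the given name, in original order.
    static List<String> getAll(List<Map.Entry<String, String>> pairs, String name) {
        List<String> values = new ArrayList<String>();
        for (Map.Entry<String, String> pair : pairs) {
            if (name.equals(pair.getKey())) {
                values.add(pair.getValue());
            }
        }
        return values;
    }

    // Builds pairs that mirror the <str name="word"> entries of the config.
    static List<Map.Entry<String, String>> sampleConfig() {
        List<Map.Entry<String, String>> config = new ArrayList<Map.Entry<String, String>>();
        config.add(new SimpleEntry<String, String>("word", "body"));
        config.add(new SimpleEntry<String, String>("word", "fish"));
        config.add(new SimpleEntry<String, String>("word", "dog"));
        return config;
    }

    public static void main(String[] args) {
        System.out.println(getAll(sampleConfig(), "word")); // prints [body, fish, dog]
    }
}
```

A regular Map would keep only one value per name, so the three “word” entries would collapse into one.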

@Override
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    numRequests++; // request counter reported by getStatistics()
    long startTime = System.currentTimeMillis();
    try {
        // word -> number of occurrences in this query
        HashMap<String, Double> counts = new HashMap<String, Double>();
        SolrParams params = req.getParams();
        String q = params.get(CommonParams.Q); // get the q param from url
        for (String string : q.split(" ")) {
            if (words.contains(string)) {
                Double oldcount = counts.containsKey(string) ? counts.get(string) : 0.0;
                counts.put(string, oldcount + 1);
            }
        }
• We start by incrementing a volatile counter of the requests we’ve seen (used later for statistics), and we record the start time so we know how long processing takes.
• Next we initialize the local map that will hold our word counts
• Next we get the “q” parameter from the URL that was sent to us
• We do a very naive split on spaces to break the query into words and iterate over them. If a word is in our “words” list, we keep a running total of the number of times it appears
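
The plain split(" ") produces empty tokens when the query contains consecutive spaces, and calling it on a missing q parameter throws a NullPointerException. A slightly more forgiving tokenizer might look like the sketch below; TokenizeSketch and tokenize are hypothetical names used only for this illustration, and the handler in these slides keeps the simple split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the plugin in these slides: splits a query
// on runs of whitespace and tolerates a missing q parameter.
final class TokenizeSketch {

    static List<String> tokenize(String q) {
        if (q == null || q.trim().isEmpty()) {
            return Collections.emptyList(); // no q parameter, nothing to count
        }
        List<String> tokens = new ArrayList<String>();
        // \s+ treats tabs and repeated spaces as a single separator
        for (String token : q.trim().split("\\s+")) {
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("dog  body\tfish ")); // prints [dog, body, fish]
        System.out.println(tokenize(null));               // prints []
    }
}
```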

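        NamedList<Double> results = new NamedList<Double>();
        for (String word : words) {
            // words that never appeared in the query are reported as 0.0 instead of null
            results.add(word, counts.containsKey(word) ? counts.get(word) : 0.0);
        }
        rsp.add("results", results);
    } catch (Exception e) {
        numErrors++;
        LOGGER.error(e.getMessage(), e); // log the message together with the stack trace
    } finally {
        totalTime += System.currentTimeMillis() - startTime;
    }
}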
• Now that we’ve looked at all of the strings and the counting is done, we need to return the results.
• We create a NamedList of type Double to hold the counts, then iterate through our words, adding each count to it
• Then we add the result list to the Solr response variable rsp
• We also see the catch block that closes the try, which increments the error count and writes the error to the Solr log
• Finally, we add the elapsed time to the total time

@Override
public String getDescription() {
    return "Searchbox DemoPlugin";
}

@Override
public String getVersion() {
    return "1.0";
}

@Override
public String getSource() {
    return "http://www.searchbox.com";
}

@Override
public NamedList<Object> getStatistics() {
    // values are reported as strings, as shown in the admin panel
    NamedList<Object> all = new SimpleOrderedMap<Object>();
    all.add("requests", "" + numRequests);
    all.add("errors", "" + numErrors);
    all.add("totalTime(ms)", "" + totalTime);
    return all;
}
} // end of class DemoPlugin
• For a production-grade plugin, users expect to see certain pieces of information in their Solr admin panel
• Description, version and source are just Strings
• getStatistics() uses the volatile counters we were tracking earlier, puts them into another NamedList and returns them. These appear under the statistics panel in Solr.
• That’s it!
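
One caveat about the counters: in Java, ++ and += on a volatile long are not atomic, so increments can be lost when several requests are handled at the same time. A possible alternative, sketched here under the assumption that exact counts matter, is java.util.concurrent.atomic.AtomicLong. The class RequestStatsSketch and its methods are hypothetical and only illustrate the idea; they are not part of the plugin above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stats holder using AtomicLong instead of volatile long fields.
final class RequestStatsSketch {

    private final AtomicLong numRequests = new AtomicLong();
    private final AtomicLong numErrors = new AtomicLong();
    private final AtomicLong totalTimeMs = new AtomicLong();

    // Records one finished request; safe to call from many threads.
    void record(long elapsedMs, boolean failed) {
        numRequests.incrementAndGet();
        if (failed) {
            numErrors.incrementAndGet();
        }
        totalTimeMs.addAndGet(elapsedMs);
    }

    long requests() { return numRequests.get(); }
    long errors() { return numErrors.get(); }
    long totalTimeMs() { return totalTimeMs.get(); }

    // Simulates concurrent requests to show that no increment is lost.
    static RequestStatsSketch runConcurrently(int threads, final int callsPerThread) {
        final RequestStatsSketch stats = new RequestStatsSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < callsPerThread; j++) {
                        stats.record(1, false);
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        RequestStatsSketch stats = runConcurrently(4, 1000);
        System.out.println("requests=" + stats.requests()); // prints requests=4000
    }
}
```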

http://192.168.56.101:8983/solr/core_name/newendpoint?q=dog%20body%20body%20body%20fish%20fish%20fish%20fish%20orange
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="results">
    <double name="body">3.0</double>
    <double name="fish">4.0</double>
    <double name="dog">1.0</double>
  </lst>
</response>
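
When calling the endpoint from code, the query has to be URL-encoded the same way as in the request above. The sketch below shows one way to build such a URL in plain Java; RequestUrlSketch and buildUrl are hypothetical names, and the host and core name are simply copied from the example request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds the request URL for the new endpoint.
final class RequestUrlSketch {

    static String buildUrl(String endpoint, String query) {
        try {
            // URLEncoder encodes spaces as '+'; switch to %20 to match the example.
            // A literal '+' in the query is already encoded as %2B, so this is safe.
            String encoded = URLEncoder.encode(query, "UTF-8").replace("+", "%20");
            return endpoint + "?q=" + encoded;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String endpoint = "http://192.168.56.101:8983/solr/core_name/newendpoint";
        System.out.println(buildUrl(endpoint, "dog body body body fish fish fish fish orange"));
    }
}
```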

• Because we’ve overridden the getStatistics() method, we can get real-time stats from the admin panel!

Happy Developing!
Full Source Code available at:
http://www.searchbox.com/developing-a-request-handler-for-solr
