


Rapid Prototyping with Solr
presented by Erik Hatcher, Technical Staff, Lucid Imagination




Abstract

   Got data?  Let's make it searchable!  This interactive
   presentation will demonstrate getting documents into
   Solr quickly, provide some tips on adjusting Solr's
   schema to better match your needs, and finally showcase
   your data in a flexible search user interface.  We'll
   see how to rapidly leverage faceting, highlighting,
   spell checking, and debugging.  Even after all that,
   there will be enough time left to outline the next
   steps in developing your search application and taking
   it to production.




Why prototype?
   Demonstrate Solr can handle your needs
   Buy-in
   "Prototyping: faster than teaching a 9-year-old
    Ju-Jitsu"
   It's quick, easy, AND FUN!
   The User Interface is the app



Got Data?
   Files?
       Solr Cell

   Databases?
       Data Import Handler

   Feeds (Atom/RSS/XML)?
       Data Import Handler

   3rd party repositories?
       Lucene Connectors Framework
       custom indexing scripts using a Solr API

   CSV!!!
       CSV update handler
UI
   Solritas (VelocityResponseWriter)
   http://localhost:8983/solr/itas
   Documentation:
       http://wiki.apache.org/solr/VelocityResponseWriter




LucidWorks for Solr
   great starting point
   built-in and pre-configured:
       Clustering
         Carrot2

       Search UI
         Solritas (VelocityResponseWriter)

         Server includes root context, handy for serving static files

       Better stemming
         KStem

       Tomcat, optionally


~/LucidWorks: start.sh
2010-05-21 08:53:49.595::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2010-05-21 08:53:49.764::INFO: jetty-6.1.3
May 21, 2010 8:53:50 AM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
May 21, 2010 8:53:50 AM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: using system property solr.solr.home: /Users/erikhatcher/LucidWorks/lucidworks/jetty/../solr
May 21, 2010 8:53:50 AM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to '/Users/erikhatcher/LucidWorks/lucidworks/jetty/../solr/'
.
.
.
May 21, 2010 8:53:51 AM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@21fb3211 main


Your Data
First Name,Last Name,Company,Title,Work Country
Erik,Hatcher,Lucid Imagination,"Member, Technical Staff", USA
.
.
.
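
Before indexing, it can help to parse the header and first row locally to see what Solr will receive. The sketch below is only an illustration, not part of the talk; it runs Python's standard csv module over the two lines shown above.

```python
import csv
import io

# The header and first data row from the slide above.
sample = (
    "First Name,Last Name,Company,Title,Work Country\n"
    'Erik,Hatcher,Lucid Imagination,"Member, Technical Staff", USA\n'
)

rows = list(csv.reader(io.StringIO(sample)))
header, first = rows[0], rows[1]

# The quoted title keeps its embedded comma, so both rows have five columns.
record = dict(zip(header, first))
print(record["Title"])                # Member, Technical Staff
print(repr(record["Work Country"]))   # ' USA' (the leading space is preserved)
```

Note that the header names contain spaces and do not match any schema field, which is what the first indexing attempt runs into.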




First try

curl "http://localhost:8983/solr/update/csv?stream.file=EuroCon2010.csv"

undefined field First Name




Schema: dynamic field flexibility


<dynamicField name="*_s"   type="string"   indexed="true"   stored="true"/>
<dynamicField name="*_t"   type="text"     indexed="true"   stored="true"/>
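
As a rough illustration of how a field name is matched against these two patterns, here is a small Python sketch. It uses shell-style globbing from the standard library, which only approximates Solr's own matching rules, and the `field_type` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns and types from the two dynamicField declarations above.
DYNAMIC_FIELDS = {"*_s": "string", "*_t": "text"}

def field_type(name):
    """Return the type of the first dynamic field pattern that matches, else None."""
    for pattern, type_name in DYNAMIC_FIELDS.items():
        if fnmatchcase(name, pattern):
            return type_name
    return None

print(field_type("company_s"))   # string
print(field_type("title_t"))     # text
print(field_type("First Name"))  # None
```

A name such as "First Name" matches neither pattern, so the columns need to be renamed with a suffix before they can be indexed.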




Mapping to dynamic fields


curl "http://localhost:8983/solr/update/csv?stream.file=EuroCon2010.csv&fieldnames=first_s,last_s,company_s,title_t,country_s&header=true"

Document [null] missing required field: id
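
When the column list changes often, building the URL in a script is less error-prone than editing it by hand. A minimal sketch, assuming the same host, path, and file name as the command above; the `csv_update_url` helper is hypothetical.

```python
from urllib.parse import urlencode

SOLR_CSV_UPDATE = "http://localhost:8983/solr/update/csv"

def csv_update_url(csv_path, fieldnames, commit=False):
    """Build a CSV update URL that renames the columns via `fieldnames`."""
    params = {
        "stream.file": csv_path,
        "fieldnames": ",".join(fieldnames),
        "header": "true",  # the file's first line is a header row
    }
    if commit:
        params["commit"] = "true"
    # Keep commas literal; urlencode would otherwise emit %2C.
    return SOLR_CSV_UPDATE + "?" + urlencode(params, safe=",")

url = csv_update_url(
    "EuroCon2010.csv",
    ["first_s", "last_s", "company_s", "title_t", "country_s"],
)
print(url)
```

The printed URL can be passed to curl in quotes, exactly like the hand-written one.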




Identifying uniqueKey, or not
curl "http://localhost:8983/solr/update/csv?stream.file=EuroCon2010.csv&fieldnames=first_s,id,company_s,title_t,country_s&header=true"

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">40</int></lst>
</response>




http://localhost:8983/solr/itas




Schema tinkering
   Remove all example field definitions
   Uncomment and adjust catch-all dynamic field:
       <dynamicField name="*" type="string" multiValued="false"/>

   Ensure uniqueKey is appropriate
       Unusual: this example data has no natural unique key, so comment it out:
       <!-- <uniqueKey>id</uniqueKey> -->

   Make every document/field fully searchable!
       <copyField source="*" dest="text"/>

   Then restart!
Issues with no uniqueKey
   Remove from solrconfig.xml references to:
       clustering component
       query elevation component
       data import handler

   Then restart!




Reindexing with cleaner field names
# Delete all documents
curl "http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E&commit=true"

# Index your data
curl "http://localhost:8983/solr/update/csv?commit=true&stream.file=EuroCon2010.csv&fieldnames=first,last,company,title,country&header=true"
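
The percent-escapes in the delete command are just the XML message `<delete><query>*:*</query></delete>` made URL-safe. A small Python sketch of that encoding step, using only the standard library and the same URL as above:

```python
from urllib.parse import quote

delete_all = "<delete><query>*:*</query></delete>"

# Percent-encode the angle brackets; leave '*', ':' and '/' as they appear above.
body = quote(delete_all, safe="*:/")
url = "http://localhost:8983/solr/update?stream.body=" + body + "&commit=true"
print(url)
```

Swapping `*:*` for a narrower query would delete only the matching documents.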




Faceting
 http://localhost:8983/solr/itas?facet.field=country
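
Conceptually, a field facet is a count of documents per distinct value. The sketch below imitates that locally in Python with made-up country values; it is a stand-in for what the facet display shows, not Solr's implementation.

```python
from collections import Counter

# Hypothetical country values, one per indexed document (None = no value).
countries = ["USA", "Germany", "USA", "Great Britain", "United Kingdom", None]

def facet_counts(values, mincount=1):
    """Count documents per distinct value, most frequent first."""
    counts = Counter(v for v in values if v is not None)
    return [(value, n) for value, n in counts.most_common() if n >= mincount]

for value, n in facet_counts(countries):
    print(value, n)
```

Counts like these make inconsistent spellings of the same country easy to spot.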




country normalization
http://localhost:8983/solr/update/csv?commit=true&stream.file=EuroCon2010.csv&fieldnames=first,last,company,title,country&header=true&f.country.map=Great+Britain:United+Kingdom
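
The effect of the `f.country.map` parameter can be pictured as a per-field value substitution applied at indexing time. A minimal Python sketch of that idea, with hypothetical rows and a hypothetical `apply_field_map` helper:

```python
def apply_field_map(row, field, mapping):
    """Return a copy of row with row[field] replaced when it matches a mapping key."""
    value = row.get(field)
    return {**row, field: mapping.get(value, value)}

# Mirrors f.country.map=Great+Britain:United+Kingdom ('+' is a URL-encoded space).
country_map = {"Great Britain": "United Kingdom"}

rows = [  # hypothetical rows
    {"first": "Ann", "country": "Great Britain"},
    {"first": "Erik", "country": "USA"},
]
normalized = [apply_field_map(r, "country", country_map) for r in rows]
print([r["country"] for r in normalized])
```

Values without a mapping entry pass through unchanged.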




UI treatments
   Customize request handler mappings
   Edit templates
       hit display
       header/footer
       style




Customize request handlers
  <requestHandler name="/browse" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="wt">velocity</str>
      <str name="v.template">browse</str>
      <str name="v.layout">layout</str>

      <str name="rows">10</str>
      <str name="fl">*,score</str>

      <str name="defType">lucene</str>
      <str name="q">*:*</str>
      <str name="debugQuery">true</str>
      <str name="hl">on</str>
      <str name="hl.fl">title</str>
      <str name="hl.fragsize">0</str>
      <str name="hl.alternateField">title</str>

      <str name="facet">on</str>
      <str name="facet.mincount">1</str>
      <str name="facet.missing">true</str>
    </lst>
    <lst name="appends">
      <str name="facet.field">country</str>
    </lst>
  </requestHandler>
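
The difference between the `defaults` and `appends` lists is easiest to see with a worked merge. The Python sketch below models it under a simplified reading: request values replace defaults, and appended values are added in every case. The request and the `company` facet are hypothetical.

```python
def effective_params(request, defaults, appends):
    """Merge multi-valued params: request wins over defaults, appends are always added."""
    merged = {name: list(values) for name, values in defaults.items()}
    for name, values in request.items():
        merged[name] = list(values)                  # request overrides defaults
    for name, values in appends.items():
        merged.setdefault(name, []).extend(values)   # appends add to what is there
    return merged

defaults = {"rows": ["10"], "q": ["*:*"], "facet": ["on"]}
appends = {"facet.field": ["country"]}

# Hypothetical request: /browse?q=title:engineer&facet.field=company
request = {"q": ["title:engineer"], "facet.field": ["company"]}
params = effective_params(request, defaults, appends)
print(params["q"], params["rows"], params["facet.field"])
```

Under this model the country facet is shown on every request, even when the caller asks for other facet fields.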

hit.vm

<div class="result-document">
  <p>$doc.getFieldValue('first') $doc.getFieldValue('last')</p>
  <p>$!doc.getFieldValue('title'), $!doc.getFieldValue('company')</p>
  <p>$!doc.getFieldValue('country')</p>
</div>




Voila!




Adding bells and whistles
   jQuery
       <script type="text/javascript" src="/solr/admin/jquery-1.2.3.min.js"></script>

   Let's add a tree map
       <script type="text/javascript" src="/scripts/treemap.js"></script>
       http://plugins.jquery.com/project/Treemap




tree map table
<script type="text/javascript">
  function onLoad() {
     jQuery("#treemap-country").treemap(640,480, {});
  }
</script>
----------------------------
<body onload="onLoad();">
----------------------------
<table id="treemap-country">
#foreach($facet in $response.getFacetField('country').values)
  <tr>
     <td>#if($facet.name)$esc.html($facet.name)#else&lt;Unspecified&gt;#end</td>
     <td>$facet.count</td>
     <td>#if($facet.name)$esc.html($facet.name)#{else}Unspecified#end</td>
  </tr>
#end
</table>

Tree map




Ajax fun: giveaways
   Add "static" templated page
   jQuery Ajax request
   snippet templated output




"static" Solritas page
 solrconfig.xml
 <requestHandler name="/giveaways" class="solr.DumpRequestHandler">
   <lst name="defaults">
     <str name="wt">velocity</str>
     <str name="v.template">giveaways</str>
     <str name="v.layout">layout</str>
   </lst>
 </requestHandler>

 giveaways.vm
 <input type="button" value="Pick a Winner" onClick="javascript:$('#winner').load('/solr/generate_winner?sort=random_' + new Date().getTime() + '+asc');">
 <h2>And the winner is...</h2>
 <center><font size="20"><div id="winner"></div></font></center>




fragment template
solrconfig.xml
<requestHandler name="/generate_winner" class="solr.SearchHandler">
  <!-- sort=random_... required -->
  <lst name="defaults">
    <str name="wt">velocity</str>
    <str name="v.template">winner</str>

    <str name="rows">1</str>
    <str name="fl">first,last</str>

    <str name="defType">lucene</str>
    <str name="q">*:* -company:"Lucid Imagination" -company:"Stone Circle Productions"</str>
  </lst>
</requestHandler>


winner.vm
#set($winner=$response.results.get(0))
$winner.getFieldValue('first') $winner.getFieldValue('last')
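
The giveaway relies on a sort field whose name changes with every click, which presumably acts as a fresh random seed, while the handler's query excludes two companies. The Python sketch below imitates both halves locally; the attendee rows and helper names are hypothetical, and it assumes the schema still defines a `random_*` dynamic field.

```python
import random
import time

EXCLUDED = {"Lucid Imagination", "Stone Circle Productions"}

def winner_url(now_ms=None):
    """Build the fragment URL; the timestamp in the field name varies per request."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # like JavaScript's new Date().getTime()
    return f"/solr/generate_winner?sort=random_{now_ms}+asc"

def pick_winner(attendees, seed):
    """Local stand-in for the query: drop excluded companies, then pick one at random."""
    eligible = [a for a in attendees if a["company"] not in EXCLUDED]
    return random.Random(seed).choice(eligible)

attendees = [  # hypothetical rows
    {"first": "Erik", "last": "Hatcher", "company": "Lucid Imagination"},
    {"first": "Ann", "last": "Example", "company": "Acme"},
]
print(winner_url(1274428800000))
print(pick_winner(attendees, seed=1)["first"])
```

Reusing the same seed reproduces the same pick, which is why the page appends the current time to the field name.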

And the winner is...




Prototyping tools
   CSV update handler
   Schema Browser
   Solritas
   Solr Explorer
       https://issues.apache.org/jira/browse/SOLR-1163

   Solr Flare
       http://wiki.apache.org/solr/Flare



Refine, iterate, integrate
   What's next?
       script full & delta indexing processes
       adjust schema
         define fields, field types, analysis

       tweak configuration
         caches, indexing parameters

       deploy to staging/production environments




Test
   Performance
   Scalability
   Relevance
   Automate all of the above, establish baselines, and avoid
    regressions





More Related Content

What's hot

Scala ActiveRecord
Scala ActiveRecordScala ActiveRecord
Scala ActiveRecord
scalaconfjp
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solr
adunne
 
Web2py Code Lab
Web2py Code LabWeb2py Code Lab
Web2py Code Lab
Colin Su
 
Web2py tutorial to create db driven application.
Web2py tutorial to create db driven application.Web2py tutorial to create db driven application.
Web2py tutorial to create db driven application.
fRui Apps
 
PofEAA and SQLAlchemy
PofEAA and SQLAlchemyPofEAA and SQLAlchemy
PofEAA and SQLAlchemy
Inada Naoki
 
td_mxc_rubyrails_shin
td_mxc_rubyrails_shintd_mxc_rubyrails_shin
td_mxc_rubyrails_shin
tutorialsruby
 

What's hot (20)

Scala ActiveRecord
Scala ActiveRecordScala ActiveRecord
Scala ActiveRecord
 
Building node.js applications with Database Jones
Building node.js applications with Database JonesBuilding node.js applications with Database Jones
Building node.js applications with Database Jones
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solr
 
ORM Injection
ORM InjectionORM Injection
ORM Injection
 
ERRest
ERRestERRest
ERRest
 
Developing for Node.JS with MySQL and NoSQL
Developing for Node.JS with MySQL and NoSQLDeveloping for Node.JS with MySQL and NoSQL
Developing for Node.JS with MySQL and NoSQL
 
ERRest - The Next Steps
ERRest - The Next StepsERRest - The Next Steps
ERRest - The Next Steps
 
Web2py Code Lab
Web2py Code LabWeb2py Code Lab
Web2py Code Lab
 
An introduction to SQLAlchemy
An introduction to SQLAlchemyAn introduction to SQLAlchemy
An introduction to SQLAlchemy
 
Web2py tutorial to create db driven application.
Web2py tutorial to create db driven application.Web2py tutorial to create db driven application.
Web2py tutorial to create db driven application.
 
PofEAA and SQLAlchemy
PofEAA and SQLAlchemyPofEAA and SQLAlchemy
PofEAA and SQLAlchemy
 
Alfredo-PUMEX
Alfredo-PUMEXAlfredo-PUMEX
Alfredo-PUMEX
 
ORM2Pwn: Exploiting injections in Hibernate ORM
ORM2Pwn: Exploiting injections in Hibernate ORMORM2Pwn: Exploiting injections in Hibernate ORM
ORM2Pwn: Exploiting injections in Hibernate ORM
 
RicoLiveGrid
RicoLiveGridRicoLiveGrid
RicoLiveGrid
 
td_mxc_rubyrails_shin
td_mxc_rubyrails_shintd_mxc_rubyrails_shin
td_mxc_rubyrails_shin
 
Let ColdFusion ORM do the work for you!
Let ColdFusion ORM do the work for you!Let ColdFusion ORM do the work for you!
Let ColdFusion ORM do the work for you!
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solr
 
XQuery Design Patterns
XQuery Design PatternsXQuery Design Patterns
XQuery Design Patterns
 
ShmooCon 2009 - (Re)Playing(Blind)Sql
ShmooCon 2009 - (Re)Playing(Blind)SqlShmooCon 2009 - (Re)Playing(Blind)Sql
ShmooCon 2009 - (Re)Playing(Blind)Sql
 
New methods for exploiting ORM injections in Java applications
New methods for exploiting ORM injections in Java applicationsNew methods for exploiting ORM injections in Java applications
New methods for exploiting ORM injections in Java applications
 

Similar to Rapid Prototyping with Solr

Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
Erik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
Erik Hatcher
 
The Magic Revealed: Four Real-World Examples of Using the Client Object Model...
The Magic Revealed: Four Real-World Examples of Using the Client Object Model...The Magic Revealed: Four Real-World Examples of Using the Client Object Model...
The Magic Revealed: Four Real-World Examples of Using the Client Object Model...
SPTechCon
 
Developing your first application using FIWARE
Developing your first application using FIWAREDeveloping your first application using FIWARE
Developing your first application using FIWARE
FIWARE
 
09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)
09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)
09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)
Igor Bronovskyy
 

Similar to Rapid Prototyping with Solr (20)

Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Beyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBeyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and Solr
 
[Coscup 2012] JavascriptMVC
[Coscup 2012] JavascriptMVC[Coscup 2012] JavascriptMVC
[Coscup 2012] JavascriptMVC
 
Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.
 
The Magic Revealed: Four Real-World Examples of Using the Client Object Model...
The Magic Revealed: Four Real-World Examples of Using the Client Object Model...The Magic Revealed: Four Real-World Examples of Using the Client Object Model...
The Magic Revealed: Four Real-World Examples of Using the Client Object Model...
 
112 portfpres.pdf
112 portfpres.pdf112 portfpres.pdf
112 portfpres.pdf
 
Developing your first application using FIWARE
Developing your first application using FIWAREDeveloping your first application using FIWARE
Developing your first application using FIWARE
 
General Principles of Web Security
General Principles of Web SecurityGeneral Principles of Web Security
General Principles of Web Security
 
XPages Blast - ILUG 2010
XPages Blast - ILUG 2010XPages Blast - ILUG 2010
XPages Blast - ILUG 2010
 
Rails and alternative ORMs
Rails and alternative ORMsRails and alternative ORMs
Rails and alternative ORMs
 
09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)
09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)
09 - express nodes on the right angle - vitaliy basyuk - it event 2013 (5)
 
Apache solr liferay
Apache solr liferayApache solr liferay
Apache solr liferay
 
All Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesAll Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for Newbies
 
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
 
Building Your Own IoT Platform using FIWARE GEis
Building Your Own IoT Platform using FIWARE GEisBuilding Your Own IoT Platform using FIWARE GEis
Building Your Own IoT Platform using FIWARE GEis
 
Innovative Specifications for Better Performance Logging and Monitoring
Innovative Specifications for Better Performance Logging and MonitoringInnovative Specifications for Better Performance Logging and Monitoring
Innovative Specifications for Better Performance Logging and Monitoring
 
The top 10 security issues in web applications
The top 10 security issues in web applicationsThe top 10 security issues in web applications
The top 10 security issues in web applications
 
MySQL Without the MySQL -- Oh My!
MySQL Without the MySQL -- Oh My!MySQL Without the MySQL -- Oh My!
MySQL Without the MySQL -- Oh My!
 
Do you know what your drupal is doing? Observe it!
Do you know what your drupal is doing? Observe it!Do you know what your drupal is doing? Observe it!
Do you know what your drupal is doing? Observe it!
 

More from Erik Hatcher

Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
Erik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
Erik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
Erik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
Erik Hatcher
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
Erik Hatcher
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
Erik Hatcher
 

More from Erik Hatcher (20)

Ted Talk
Ted TalkTed Talk
Ted Talk
 
Solr Payloads
Solr PayloadsSolr Payloads
Solr Payloads
 
it's just search
it's just searchit's just search
it's just search
 
Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis Tricks
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered Libraries
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and Tricks
 
Solr 4
Solr 4Solr 4
Solr 4
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Solr Flair
Solr FlairSolr Flair
Solr Flair
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Recently uploaded (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Rapid Prototyping with Solr

  • 1. 1 Rapid Prototyping with Solr presented by Erik Hatcher, Technical Staff, Lucid Imagination 1
  • 2. Abstract Got data?  Let's make it searchable!  This interactive presentation will demonstrate getting documents into Solr quickly, provide some tips in adjusting Solr's schema to match your needs better, and finally showcase your data in a flexible search user interface.  We'll see how to rapidly leverage faceting, highlighting, spell checking, and debugging.  Even after all that, there will be enough time left to outline the next steps in developing your search application and taking it to production. 2 2
  • 3. Why prototype?  Demonstrate Solr can handle your needs  Buy-in  "Prototyping: faster than teaching a 9-year-old Ju-Jitsu"  It's quick, easy, AND FUN!  The User Interface is the app 3 3
  • 4. Got Data?  Files?  Solr Cell  Databases?  Data Import Handler  Feeds (Atom/RSS/XML)?  Data Import Handler  3rd party repositories?  Lucene Connectors Framework  custom indexing scripts using a Solr API  CSV!!!  CSV upload handler 4 4
  • 5. UI  Solritas (VelocityResponseWriter)  http://localhost:8983/solr/itas  Documentation:  http://wiki.apache.org/solr/VelocityResponseWriter 5 5
  • 6. LucidWorks for Solr  great starting point  built-in and pre-configured:  Clustering  Carrot2  Search UI  Solritas (VelocityResponseWriter)  Server includes root context, handy for serving static files  Better stemming  KStem  Tomcat, optionally 6 6
  • 7. ~/LucidWorks: start.sh 2010-05-21 08:53:49.595::INFO: Logging to STDERR via org.mortbay.log.StdErrLog 2010-05-21 08:53:49.764::INFO: jetty-6.1.3 May 21, 2010 8:53:50 AM org.apache.solr.core.SolrResourceLoader locateSolrHome INFO: JNDI not configured for solr (NoInitialContextEx) May 21, 2010 8:53:50 AM org.apache.solr.core.SolrResourceLoader locateSolrHome INFO: using system property solr.solr.home: /Users/erikhatcher/ LucidWorks/lucidworks/jetty/../solr May 21, 2010 8:53:50 AM org.apache.solr.core.SolrResourceLoader <init> INFO: Solr home set to '/Users/erikhatcher/LucidWorks/lucidworks/ jetty/../solr/' . . . May 21, 2010 8:53:51 AM org.apache.solr.core.SolrCore registerSearcher INFO: [] Registered new searcher Searcher@21fb3211 main 7 7
  • 8. Your Data First Name,Last Name,Company,Title,Work Country Erik,Hatcher,Lucid Imagination,"Member, Technical Staff", USA . . . 8 8
  • 10. Schema: dynamic field flexibility <dynamicField name="*_s" type="string" indexed="true" stored="true"/> <dynamicField name="*_t" type="text" indexed="true" stored="true"/> 10 10
  • 11. Mapping to dynamic fields curl "http://localhost:8983/solr/update/csv? stream.file=EuroCon2010.csv&fieldnames=first_s,last_s,company_s,title_t, country_s&header=true" Document [null] missing required field: id 11 11
  • 12. Identifying uniqueKey, or not curl "http://localhost:8983/solr/update/csv? stream.file=EuroCon2010.csv&fieldnames=first_s, id,company_s,title_t,co untry_s&header=true" <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"><int name="status">0</int><int name="QTime">40</int></lst> </response> 12 12
  • 14. Schema tinkering  Removed all example field definitions  Uncomment and adjust catch-all dynamic field:  <dynamicField name="*" type="string" multiValued="false"/>  Ensure uniqueKey is appropriate  Unusual in this data example:  <!-- <uniqueKey>id</uniqueKey> -->  Make every document/field fully searchable!  <copyField source="*" dest="text"/>  Then restart! 14 14
  • 15. Issues with no uniqueKey
      - Remove from solrconfig.xml references to:
          - clustering component
          - query elevation component
          - data import handler
      - Then restart!
  • 16. Reindexing with cleaner field names
      # Delete all documents
      curl "http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E&commit=true"

      # Index your data
      curl "http://localhost:8983/solr/update/csv?commit=true&stream.file=EuroCon2010.csv&fieldnames=first,last,company,title,country&header=true"
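The `stream.body` value in the delete command is the percent-encoded form of `<delete><query>*:*</query></delete>`. A small Python sketch, offered as an illustration rather than as part of the talk, shows how that encoding can be produced; `delete_all_url` is a hypothetical helper name and nothing is sent to Solr.

```python
from urllib.parse import quote


def delete_all_url(base="http://localhost:8983/solr/update"):
    """Build the delete-everything URL from the slide (hypothetical helper)."""
    body = "<delete><query>*:*</query></delete>"
    # Leave '/', '*' and ':' unescaped so the result matches the slide;
    # only the angle brackets are percent-encoded.
    return f"{base}?stream.body={quote(body, safe='/*:')}&commit=true"


print(delete_all_url())
```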
  • 19. UI treatments
      - Customize request handler mappings
      - Edit templates
          - hit display
          - header/footer
          - style
  • 20. Customize request handlers
      <requestHandler name="/browse" class="solr.SearchHandler">
        <lst name="defaults">
          <str name="wt">velocity</str>
          <str name="v.template">browse</str>
          <str name="v.layout">layout</str>
          <str name="rows">10</str>
          <str name="fl">*,score</str>
          <str name="defType">lucene</str>
          <str name="q">*:*</str>
          <str name="debugQuery">true</str>
          <str name="hl">on</str>
          <str name="hl.fl">title</str>
          <str name="hl.fragsize">0</str>
          <str name="hl.alternateField">title</str>
          <str name="facet">on</str>
          <str name="facet.mincount">1</str>
          <str name="facet.missing">true</str>
        </lst>
        <lst name="appends">
          <str name="facet.field">country</str>
        </lst>
      </requestHandler>
  • 21. hit.vm
      <div class="result-document">
        <p>$doc.getFieldValue('first') $doc.getFieldValue('last')</p>
        <p>$!doc.getFieldValue('title'), $!doc.getFieldValue('company')</p>
        <p>$!doc.getFieldValue('country')</p>
      </div>
  • 22. Voila!
  • 23. Adding bells and whistles
      - JQuery
          <script type="text/javascript" src="/solr/admin/jquery-1.2.3.min.js"></script>
      - Let's add a tree map
          <script type="text/javascript" src="/scripts/treemap.js"></script>
          - http://plugins.jquery.com/project/Treemap
  • 24. tree map table
      <script type="text/javascript">
        function onLoad() {
          jQuery("#treemap-country").treemap(640,480, {});
        }
      </script>
      ----------------------------
      <body onload="onLoad();">
      ----------------------------
      <table id="treemap-country">
        #foreach($facet in $response.getFacetField('country').values)
        <tr>
          <td>#if($facet.name)$esc.html($facet.name)#else&lt;Unspecified&gt;#end</td>
          <td>$facet.count</td>
          <td>#if($facet.name)$esc.html($facet.name)#{else}Unspecified#end</td>
        </tr>
        #end
      </table>
  • 25. Tree map
  • 26. Ajax fun: giveaways
      - Add "static" templated page
      - JQuery Ajax request
      - snippet templated output
  • 27. "static" Solritas page
      solrconfig.xml
      <requestHandler name="/giveaways" class="solr.DumpRequestHandler">
        <lst name="defaults">
          <str name="wt">velocity</str>
          <str name="v.template">giveaways</str>
          <str name="v.layout">layout</str>
        </lst>
      </requestHandler>

      giveaways.vm
      <input type="button" value="Pick a Winner"
             onClick="javascript:$('#winner').load('/solr/generate_winner?sort=random_' + new Date().getTime() + '+asc');">
      <h2>And the winner is...</h2>
      <center><font size="20"><div id="winner"></div></font></center>
  • 28. fragment template
      solrconfig.xml
      <requestHandler name="/generate_winner" class="solr.SearchHandler">
        <!-- sort=random_... required -->
        <lst name="defaults">
          <str name="wt">velocity</str>
          <str name="v.template">winner</str>
          <str name="rows">1</str>
          <str name="fl">first,last</str>
          <str name="defType">lucene</str>
          <str name="q">*:* -company:"Lucid Imagination" -company:"Stone Circle Productions"</str>
        </lst>
      </requestHandler>

      winner.vm
      #set($winner=$response.results.get(0))
      $winner.getFieldValue('first') $winner.getFieldValue('last')
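The giveaway button varies the sort field name on every click (`random_` plus the current time in milliseconds), and the handler returns the first row of that ordering. The sketch below rebuilds the same request URL in Python for illustration; `generate_winner_url` is a hypothetical helper, and it assumes the schema still provides a `random_*` dynamic field of a random sort type, which the slides do not show.

```python
import time
from urllib.parse import urlencode


def generate_winner_url(seed=None,
                        base="http://localhost:8983/solr/generate_winner"):
    """Build the /generate_winner request URL (hypothetical helper).

    `seed` plays the role of `new Date().getTime()` in giveaways.vm:
    a different value yields a different random_* sort field name.
    """
    if seed is None:
        seed = int(time.time() * 1000)  # milliseconds, like JavaScript's getTime()
    # urlencode turns the space into '+', matching the '+asc' on the slide.
    return base + "?" + urlencode({"sort": f"random_{seed} asc"})


print(generate_winner_url(42))
```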
  • 29. And the winner is...
  • 30. Prototyping tools
      - CSV update handler
      - Schema Browser
      - Solritas
      - Solr Explorer
          - https://issues.apache.org/jira/browse/SOLR-1163
      - Solr Flare
          - http://wiki.apache.org/solr/Flare
  • 31. Refine, iterate, integrate
      - What's next?
          - script full & delta indexing processes
          - adjust schema
              - define fields, field types, analysis
          - tweak configuration
              - caches, indexing parameters
          - deploy to staging/production environments
  • 32. Test
      - Performance
      - Scalability
      - Relevance
      - Automate all of the above, start baselines and avoid regressions
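One way to read "start baselines and avoid regressions" for relevance is to record the top results for a set of queries and flag queries whose results drift after a schema or configuration change. The following Python sketch illustrates that idea under stated assumptions: the function names, the overlap measure, and the 0.8 threshold are all hypothetical choices, not something the talk prescribes, and the document ids would come from your own Solr queries.

```python
def top_k_overlap(baseline_ids, current_ids, k=10):
    """Fraction of the baseline's top-k documents still in the current top-k."""
    base = baseline_ids[:k]
    if not base:
        return 1.0  # nothing to compare against, so nothing regressed
    current = set(current_ids[:k])
    return sum(1 for doc_id in base if doc_id in current) / len(base)


def regressions(baseline, current, k=10, threshold=0.8):
    """Return the queries whose top-k overlap fell below `threshold`.

    `baseline` and `current` map a query string to its ranked list of ids.
    """
    return [
        query
        for query, ids in baseline.items()
        if top_k_overlap(ids, current.get(query, []), k) < threshold
    ]


# Illustrative data only: ids are made up for the example.
baseline = {"solr": ["a", "b"], "lucene": ["c", "d"]}
current = {"solr": ["a", "b"], "lucene": ["x", "y"]}
print(regressions(baseline, current, k=2))
```

A check like this can run after each reindex; performance and scalability baselines would need their own measurements.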