Apache Solr 1.4
Faster, Easier, and More Versatile than Ever




About Erik

• Member of Technical Staff,
  Lucid Imagination
• Co-author "Lucene in Action"
• Frequent speaker at industry conferences
• Committer: Lucene and Solr
• Apache Lucene PMC member

Lucene



Lucene 2.9
• IndexReader#reopen()
• Faster filter performance, by 300% in some
  cases
• Per-segment FieldCache
• Reusable token streams
• Faster numeric/date range queries, thanks to
  trie
• and tons more; see Lucene 2.9's CHANGES.txt

reopen()




Trie fields

• Trie* fields index each value at multiple precision steps
• Works for numerics & dates
• Result: up to 40x faster than standard range
  queries
• Configurable precision step


Trie Example




Trie-Range Prefix Tree Encoding and relevant nodes for range [215 TO 977]
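
The idea behind the figure can be sketched in a few lines. This is a simplified decimal model, assuming three-digit values and one digit per precision step; Lucene's real encoding works on bits, and the function names here are hypothetical. It shows how the range [215 TO 977] collapses into a small set of prefix terms instead of one term per value.

```python
def trie_range_terms(lo, hi, digits=3, base=10):
    """Cover the inclusive range [lo, hi] with prefix terms.

    Each term is (shift, prefix): it matches every value whose leading
    digits equal `prefix` once the lowest `shift` digits are dropped.
    """
    terms = []
    end = hi + 1  # work on the half-open interval [lo, end)
    for shift in range(digits):
        unit = base ** shift        # width of one term at this precision
        coarser = unit * base       # width of one term at the next precision
        top = shift == digits - 1   # the coarsest level takes whatever is left
        # Lower edge: emit terms until lo lines up with a coarser boundary.
        while lo < end and (top or lo % coarser != 0):
            terms.append((shift, lo // unit))
            lo += unit
        # Upper edge: same idea, walking down from the top of the range.
        while lo < end and end % coarser != 0:
            end -= unit
            terms.append((shift, end // unit))
    return terms


def expand(terms, base=10):
    """Expand prefix terms back into the individual values they match."""
    values = []
    for shift, prefix in terms:
        unit = base ** shift
        values.extend(range(prefix * unit, (prefix + 1) * unit))
    return sorted(values)


terms = trie_range_terms(215, 977)
print(len(terms), "terms cover", 977 - 215 + 1, "values")
# prints: 34 terms cover 763 values
```

The edges of the range are matched at full precision (215..219 and 970..977), the next ring at the tens level (22..29 and 90..96), and the middle at the hundreds level (3..8).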




Solr



New Logo!




Performance Improvements

• Caching
• Concurrent file access
• Per-segment index updates
• Faceting
• DocSet generation, avoids scoring
• Streaming updates for SolrJ

Caching
• LRU cache now based on
  ConcurrentHashMap
• Reads are lockless, writes are partitioned
• Minimizes overhead of synchronization
• Improves filterCache, queryCache, and
  documentCache
• Anecdotal evidence: can double query
  throughput in some circumstances


Concurrent file access


• Defaults to NIOFSDirectory on non-Windows
  platforms for better concurrency
• Pluggable DirectoryProvider



Per-segment caching


• FieldCache per segment
• Improves searching, sorting, and function
  queries




Faceting performance


• Major performance improvements on
  multi-valued fields!
• http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/




SolrJ
• Binary updates (bye bye XML)
• StreamingUpdateSolrServer
  • Streams multiple documents over
    multiple connections
  • Simple test went from 231 docs/sec to
    25000 docs/sec!
• LBHttpSolrServer
  • load balancing / failover

Feature Improvements
•   Rich document indexing

•   DataImportHandler enhancements

•   Smoother replication

•   More choices for logging

•   Multi-select faceting

•   Speedier range queries

•   Duplicate detection

•   New request handler components


Solr Cell

• Content Extraction Library
• Richly extracts text from Microsoft Office,
  PDF, HTML, and many other formats
• Powered by Apache Tika
• http://wiki.apache.org/solr/ExtractingRequestHandler



DataImportHandler
•   SQL deltaImportQuery

•   abort, skip, continue options

•   FieldReader

    •   example: can read XML from CLOB

•   HTML strip transformer

•   Event listener API (import start/end)

•   ContentStreamDataSource

•   MailEntityProcessor

•   LineEntityProcessor, FileListEntityProcessor




Replication
•   Solr 1.3: rsync-based
•   Solr 1.4: Java-based, request handler
•   Copy config files, and optionally rename
•   Basic authentication support
•   Custom Solr index deletion policy enables
    deletion of commit points on various criteria
    such as number of commits, age of commit
    point and optimized status (pluggable)


Replication Configuration
Master
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

Slave
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">
      http://masterhostname:8983/solr/replication
    </str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>

New faceting features


• multi-select
 • tag a filter query (fq)
 • exclude a tagged filter from facet counts
• key - for labeling response structure


q=solr+1.4&facet=true&fq={!tag=proj}project:(lucene OR solr)&facet.field={!ex=proj}project
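
A client would normally URL-encode these parameters rather than write them by hand. The sketch below rebuilds the request above with Python's standard library; the helper name `multiselect_facet_params` is hypothetical, while the `{!tag=...}` and `{!ex=...}` local params are the ones shown on the slide.

```python
from urllib.parse import urlencode, parse_qsl


def multiselect_facet_params(query, field, selected, tag):
    """Filter on `field` while still counting all of its facet values."""
    values = " OR ".join(selected)
    return [
        ("q", query),
        ("facet", "true"),
        # Tag the filter query so it can be referred to by name.
        ("fq", "{!tag=%s}%s:(%s)" % (tag, field, values)),
        # Exclude the tagged filter when computing counts for this field.
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
    ]


params = multiselect_facet_params("solr 1.4", "project", ["lucene", "solr"], "proj")
query_string = urlencode(params)
print(query_string)
```

Because the project filter is excluded from its own facet, the response still lists counts for projects other than lucene and solr, which is what lets a UI offer them as additional checkboxes.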

Deduplication

• Detect duplicates during indexing and
  handle them
• Adds a signature field to the document
  (could be uniqueKey)
• Exact (hash on certain fields) or Fuzzy
  duplicate detection
• http://wiki.apache.org/solr/Deduplication
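
The exact-match case can be illustrated with a small sketch. It assumes an MD5 hash over the chosen fields and an "overwrite the earlier duplicate" policy; both are illustrative choices, and the function names are hypothetical rather than Solr's API.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields into a hex digest used as the signature."""
    digest = hashlib.md5()
    for name in fields:
        digest.update(str(doc.get(name, "")).encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


def index_with_dedup(docs, fields):
    """Keep one document per signature; a later duplicate replaces an earlier one."""
    index = {}
    for doc in docs:
        sig = signature(doc, fields)
        index[sig] = dict(doc, signature=sig)  # signature stored as a field
    return index


docs = [
    {"id": "1", "name": "Solr 1.4", "features": "faceting"},
    {"id": "2", "name": "Solr 1.4", "features": "faceting"},  # same content, new id
    {"id": "3", "name": "Lucene 2.9", "features": "trie"},
]
index = index_with_dedup(docs, fields=["name", "features"])
print(sorted(d["id"] for d in index.values()))
# prints: ['2', '3']
```

Fuzzy detection would swap the exact hash for a signature that tolerates small textual differences; that part is not sketched here.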

commit/rollback



• commitWithin
• rollback
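
As a rough illustration, the XML update messages for these two features can be built as follows. The helper `add_message` and the sample document are hypothetical; the `commitWithin` attribute (in milliseconds) on `<add>` and the `<rollback/>` command are the features named above.

```python
import xml.etree.ElementTree as ET


def add_message(docs, commit_within_ms=None):
    """Build an <add> update message, optionally asking for a commit deadline."""
    add = ET.Element("add")
    if commit_within_ms is not None:
        add.set("commitWithin", str(commit_within_ms))
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Ask for the document to be committed within 10 seconds of being added.
message = add_message([{"id": "42", "name": "Solr 1.4"}], commit_within_ms=10000)
print(message)

# Discard everything added since the last commit.
rollback_message = ET.tostring(ET.Element("rollback"), encoding="unicode")
print(rollback_message)
```

Either message would be posted to the update handler; the posting step is left out to keep the sketch self-contained.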



Analysis
•   CharFilter
•   DoubleMetaphone
•   PersianAnalyzer, ArabicAnalyzer, SmartChineseAnalyzer
•   WordDelimiterFilter
    •   splitOnNumerics
    •   protected words support
    •   stemEnglishPossessive
    •   avoid splitting or dropping international non-letter
        characters such as non-spacing marks
•   PositionFilter support
•   SnowballPorterFilterFactory: protected words support
•   HTMLStripCharFilter
•   CommonGrams




omitTermFreqAndPositions


• Omits term frequencies and positions for
  that specific field
• Saves time and index space for non-text
  fields




Trie Numeric Field Types
<fieldType name="tint" class="solr.TrieIntField"
           precisionStep="8" omitNorms="true"
           positionIncrementGap="0"/>

<fieldType name="tfloat" class="solr.TrieFloatField"
           precisionStep="8" omitNorms="true"
           positionIncrementGap="0"/>

<fieldType name="tlong" class="solr.TrieLongField"
           precisionStep="8" omitNorms="true"
           positionIncrementGap="0"/>

<fieldType name="tdouble" class="solr.TrieDoubleField"
           precisionStep="8" omitNorms="true"
           positionIncrementGap="0"/>

Trie Date Field Types


<fieldType name="date" class="solr.TrieDateField"
           omitNorms="true" precisionStep="0"
           positionIncrementGap="0"/>

<fieldType name="tdate" class="solr.TrieDateField"
           omitNorms="true" precisionStep="6"
           positionIncrementGap="0"/>
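
To get a feel for what precisionStep costs at index time, the sketch below estimates how many terms are indexed per value. It assumes one term per precision step across the value's bit width (32 bits for int/float, 64 bits for long/double/date) and that a precisionStep of 0 means only the full-precision term is indexed; the function name is hypothetical.

```python
import math


def terms_per_value(bits, precision_step):
    """Estimate the number of terms indexed for one value of the given width."""
    if precision_step <= 0 or precision_step >= bits:
        return 1  # only the full-precision term
    return math.ceil(bits / precision_step)


for name, bits, step in [
    ("tint", 32, 8),
    ("tlong", 64, 8),
    ("date", 64, 0),
    ("tdate", 64, 6),
]:
    print(name, terms_per_value(bits, step))
```

A smaller step indexes more terms per value (a larger index) in exchange for range queries that need to visit fewer terms.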




Wildcard handling
•   ReversedWildcardFilterFactory

•   Index-time reversal, query-time handling

•   Uses special marker for end

•   Example:

    •   Document: <field name="text_rev">Solr</field>

    •   Indexed: #rlos

    •   Query: *olr -> #rlo*

    •   # used here to denote marker character
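
The example above can be mimicked in a few lines. This is a minimal sketch, assuming tokens are lowercased before reversal and using `#` as a stand-in for the marker character, as on the slide; the function names are hypothetical.

```python
from fnmatch import fnmatchcase

MARKER = "#"  # stand-in for the special marker character


def index_reversed(token):
    """Index-time: store the lowercased token reversed, behind the marker."""
    return MARKER + token.lower()[::-1]


def rewrite_query(pattern):
    """Query-time: turn a leading wildcard into a trailing one on the reversed form."""
    if pattern.startswith("*") and not pattern.endswith("*"):
        return MARKER + pattern.lower()[::-1]
    return pattern.lower()


indexed = index_reversed("Solr")
query = rewrite_query("*olr")
print(indexed, query, fnmatchcase(indexed, query))
# prints: #rlos #rlo* True
```

The point of the rewrite is that `#rlo*` is a prefix query, which can seek directly into the sorted term dictionary, whereas `*olr` would have to scan every term.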



Function queries

• milliseconds: ms()
• subtraction: sub()
• {!frange l=6 u=9}sqrt(sum(a,b))
 • http://yonik.wordpress.com/2009/07/06/ranges-over-functions-in-solr-1-4/
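
What the frange example selects can be shown on a handful of in-memory documents. This sketch assumes both bounds are inclusive and that `a` and `b` are numeric fields; the sample documents and the `frange` helper are made up for illustration.

```python
import math


def frange(docs, func, lower=None, upper=None):
    """Keep documents whose function value lies within [lower, upper]."""
    matches = []
    for doc in docs:
        value = func(doc)
        if (lower is None or value >= lower) and (upper is None or value <= upper):
            matches.append(doc)
    return matches


docs = [
    {"id": "A", "a": 10, "b": 6},   # sqrt(16)  = 4
    {"id": "B", "a": 30, "b": 19},  # sqrt(49)  = 7
    {"id": "C", "a": 50, "b": 31},  # sqrt(81)  = 9
    {"id": "D", "a": 90, "b": 10},  # sqrt(100) = 10
]

# Equivalent in spirit to {!frange l=6 u=9}sqrt(sum(a,b))
hits = frange(docs, lambda d: math.sqrt(d["a"] + d["b"]), lower=6, upper=9)
print([d["id"] for d in hits])
# prints: ['B', 'C']
```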




Stats Component

• min, max, sum, sumOfSquares, count,
    missing, mean, stddev
• numeric fields only
• http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&rows=0&indent=true
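
For reference, the listed statistics can be computed as in the sketch below. It assumes `stddev` is the sample standard deviation (dividing by count - 1) and that `missing` counts documents without a value for the field; the helper name and sample data are hypothetical.

```python
import math


def field_stats(docs, field):
    """Compute the statistics listed above for one numeric field."""
    values = [doc[field] for doc in docs if doc.get(field) is not None]
    count = len(values)
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    stats = {
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "sum": total,
        "sumOfSquares": sum_of_squares,
        "count": count,
        "missing": len(docs) - count,
        "mean": total / count if count else None,
        "stddev": None,
    }
    if count > 1:
        # Assumption: sample standard deviation, derived from the running sums.
        variance = (count * sum_of_squares - total * total) / (count * (count - 1))
        stats["stddev"] = math.sqrt(variance)
    return stats


docs = [{"price": p} for p in (2, 4, 4, 4, 5, 5, 7, 9)] + [{"name": "no price"}]
stats = field_stats(docs, "price")
print(stats)
```

Keeping only `sum`, `sumOfSquares`, and `count` is enough to derive the mean and standard deviation, which is why those running totals appear in the list.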




Terms Component


• Returns indexed terms and their document frequencies
  for a field; useful for auto-suggest, etc.
• http://localhost:8983/solr/terms?terms.fl=name&terms.lower=a&terms.sort=index
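
The kind of answer such a request produces can be imitated on a toy index. This sketch assumes whitespace tokenization with lowercasing and returns terms in index (lexicographic) order starting at the lower bound; the `terms` helper and sample documents are hypothetical.

```python
from collections import Counter


def terms(docs, field, lower="", limit=10):
    """Return (term, document frequency) pairs at or after `lower`, in index order."""
    docfreq = Counter()
    for doc in docs:
        # A term counts once per document, however often it occurs in it.
        for term in set(doc.get(field, "").lower().split()):
            docfreq[term] += 1
    in_range = sorted(t for t in docfreq if t >= lower)
    return [(t, docfreq[t]) for t in in_range[:limit]]


docs = [
    {"name": "Apple iPod"},
    {"name": "apple adapter"},
    {"name": "Belkin adapter"},
]
print(terms(docs, "name", lower="a"))
# prints: [('adapter', 2), ('apple', 2), ('belkin', 1), ('ipod', 1)]
```

For auto-suggest, the user's partial input would serve as the lower bound, and the document frequency gives a natural way to rank the suggestions.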




Term Vector Component


• Returns term info per document (tf,
    positions)
• http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&qt=tvrh&tv=true&tv.tf=true&tv.df=true&tv.positions&tv.offsets=true




Field & Document Request Handlers




• Provide capabilities similar to Solr's admin
  analysis tool, but with responses in Solr's
  flexible formats




Clustering

• Implemented with Carrot2
• Uses Carrot2 to dynamically cluster the
  top N search results
• Like dynamically discovered facets
• http://wiki.apache.org/solr/ClusteringComponent



Example Clustering Output
<lst name="cluster">
  <lst name="labels">
    <str name="label">Car Power Adapter</str>
  </lst>
  <lst name="docs">
    <str name="doc">F8V7067-APL-KIT</str>
    <str name="doc">IW-02</str>
  </lst>
</lst>
<lst name="cluster">
  <lst name="labels">
    <str name="label">Display</str>
  </lst>
  <lst name="docs">
    <str name="doc">MA147LL/A</str>
    <str name="doc">VA902B</str>
  </lst>
</lst>
<lst name="cluster">
  <lst name="labels">
    <str name="label">Hard Drive</str>
  </lst>
  <lst name="docs">
    <str name="doc">SP2514N</str>
    <str name="doc">6H500F0</str>
  </lst>
</lst>
<lst name="cluster">
  <lst name="labels">
    <str name="label">Retail</str>
  </lst>
  <lst name="docs">
    <str name="doc">TWINX2048-3200PRO</str>
    <str name="doc">VS1GB400C3</str>
  </lst>
</lst>

Distributed Search


• facet.sort=lex
• timeout support: shard-socket-timeout &
  shard-connection-timeout




VelocityResponseWriter

• celeritas: swiftness, speed (Latin), origin of
  the symbol "c" for the speed of light
• solritas: Velocity template rendering of Solr
  responses
• Useful for rapid prototyping and more


AJAX-Solr


• Newest Solr/AJAX library
• Improves upon the old SolrJS library that
  was to be in Solr 1.4
• http://github.com/evolvingweb/AJAX-Solr/


Odds & Ends
•   omitHeader
•   logging switched to SLF4J
•   spellcheck rebuild on optimize, if configured
•   maxChars on copyField
•   Nested query support in function query parser
•   expungeDeletes
•   XInclude support
•   multicore merge
•   binary field (Base64)
•   Highlighter:
    •   field globbing: hl.fl=*_text
    •   now supports range/wildcard/fuzzy/prefix queries
•   FieldCache stats exposed to stats.jsp and JMX
•   plugins: enable flag



Upgrading from 1.3
• omitTermFreqAndPositions
 • set version from 1.1 to 1.2 in schema.xml
• default query parser syntax no longer
  supports ;sort options, use &sort= instead,
  or switch defType=lucenePlusSort
• Potential analysis differences when using
  WordDelimiterFilterFactory (SOLR-1078)
• Reindexing not required, but can't hurt.

Book




lucidimagination.com

Building a data driven search application with LucidWorks SiLKLucidworks (Archived)
 
Introducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarIntroducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarLucidworks (Archived)
 
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucidworks (Archived)
 

Apache Solr 1.4 – Faster, Easier, and More Versatile than Ever

  • 1. Apache Solr 1.4: Faster, Easier, and More Versatile than Ever
  • 2. About Erik • Member of Technical Staff, Lucid Imagination • Co-author "Lucene in Action" • Frequent speaker at industry conferences • Committer: Lucene and Solr • Apache Lucene PMC member
  • 3. Lucene
  • 4. Lucene 2.9 • IndexReader#reopen() • Faster filter performance, by 300% in some cases • Per-segment FieldCache • Reusable token streams • Faster numeric/date range queries, thanks to trie • and tons more, see Lucene 2.9's CHANGES.txt
  • 6. Trie fields • Trie* fields index each value at multiple precision steps • Works for numerics & dates • Result: up to 40x faster than standard range queries • Configurable precision step
  • 7. Trie Example • Figure: Trie-Range Prefix Tree Encoding and relevant nodes for range [215 TO 977]
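The idea behind the trie example can be sketched in a few lines. This is a simplified illustration only: it works on decimal digits to match the [215 TO 977] figure, whereas Lucene's trie encoding works on bits grouped by the precision step. The names `split_range` and `expand` are hypothetical helpers, not Lucene or Solr APIs.

```python
def split_range(lo, hi, base=10):
    """Cover the inclusive range [lo, hi] with a small set of trie nodes.

    Each node is (prefix, shift): it stands for every value whose digits,
    after dropping the lowest `shift` digits, equal `prefix`.
    """
    if lo < 0 or hi < 0:
        raise ValueError("this sketch only handles non-negative integers")
    nodes = []
    shift = 0
    while lo <= hi:
        # Peel values off the lower edge until it starts a full block.
        while lo <= hi and lo % base != 0:
            nodes.append((lo, shift))
            lo += 1
        # Peel values off the upper edge until it ends a full block.
        while lo <= hi and hi % base != base - 1:
            nodes.append((hi, shift))
            hi -= 1
        if lo > hi:
            break
        # Only whole blocks remain: move one level up the tree.
        lo //= base
        hi //= base
        shift += 1
    return nodes


def expand(node, base=10):
    """Return the range of concrete values that one trie node stands for."""
    prefix, shift = node
    start = prefix * base ** shift
    return range(start, start + base ** shift)


nodes = split_range(215, 977)
print(len(nodes), "terms to match instead of", 977 - 215 + 1)
```

The range query only has to visit the few nodes returned by `split_range` rather than every individual value in the range, which is where the speed-up for range queries comes from.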
  • 8. Solr
  • 10. Performance Improvements • Caching • Concurrent file access • Per-segment index updates • Faceting • DocSet generation, avoids scoring • Streaming updates for SolrJ
  • 11. Caching • LRU cache now based on ConcurrentHashMap • Reads are lockless, writes are partitioned • Minimizes overhead of synchronization • Improves filterCache, queryResultCache, and documentCache • Anecdotal evidence: can double query throughput in some circumstances
  • 12. Concurrent file access • Defaults to NIOFSDirectory on non-Windows platforms for better concurrency • Pluggable DirectoryProvider
  • 13. Per-segment caching • FieldCache per segment • Improves searching, sorting, and function queries
  • 14. Faceting performance • Major performance improvements on multi-valued fields! • http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/
  • 15. SolrJ • Binary updates (bye bye XML) • StreamingUpdateSolrServer • Streams multiple documents over multiple connections • Simple test went from 231 docs/sec to 25000 docs/sec! • LBHttpSolrServer • load balancing / failover
  • 16. Feature Improvements • Rich document indexing • DataImportHandler enhancements • Smoother replication • More choices for logging • Multi-select faceting • Speedier range queries • Duplicate detection • New request handler components
  • 17. Solr Cell • Content Extraction Library • Richly extracts text from Microsoft Office, PDF, HTML, and many other formats • Powered by Apache Tika • http://wiki.apache.org/solr/ExtractingRequestHandler
  • 18. DataImportHandler • SQL deltaImportQuery • abort, skip, continue options • FieldReader • example: can read XML from CLOB • HTML strip transformer • Event listener API (import start/end) • ContentStreamDataSource • MailEntityProcessor • LineEntityProcessor, FileListEntityProcessor
  • 19. Replication • Solr 1.3: rsync-based • Solr 1.4: Java-based, request handler • Copy config files, and optionally rename • Basic authentication support • Custom Solr index deletion policy enables deletion of commit points on various criteria such as number of commits, age of commit point and optimized status (pluggable)
  • 20. Replication Configuration
      Master:
        <requestHandler name="/replication" class="solr.ReplicationHandler">
          <lst name="master">
            <str name="replicateAfter">commit</str>
            <str name="confFiles">schema.xml,stopwords.txt</str>
          </lst>
        </requestHandler>
      Slave:
        <requestHandler name="/replication" class="solr.ReplicationHandler">
          <lst name="slave">
            <str name="masterUrl">http://masterhostname:8983/solr/replication</str>
            <str name="pollInterval">00:00:60</str>
          </lst>
        </requestHandler>
  • 21. New faceting features • multi-select • tag a filter query (fq) • exclude a tagged filter from facet counts • key - for labeling response structure
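A minimal sketch of how the tag/exclude/key pieces fit together in a request, built with the standard library only. The field name `doctype`, the tag name `sel`, and the helper `multi_select_params` are assumptions for illustration; the `{!tag=...}`, `{!ex=...}` and `key=` local params are the Solr syntax the slide refers to.

```python
from urllib.parse import urlencode


def multi_select_params(query, field, selected, tag="sel"):
    """Build request parameters for multi-select faceting on one field.

    The filter query is tagged, and the facet on the same field excludes
    that tag, so its counts are computed as if the filter were not applied.
    """
    return [
        ("q", query),
        ("fq", "{!tag=%s}%s:%s" % (tag, field, selected)),
        ("facet", "true"),
        # key= relabels this facet in the response structure.
        ("facet.field", "{!ex=%s key=%s_all}%s" % (tag, field, field)),
    ]


params = multi_select_params("*:*", "doctype", "pdf")
query_string = urlencode(params)
print(query_string)
```

With this shape, the user can select `pdf` and still see counts for the other `doctype` values, which is what makes multi-select facet UIs possible.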
  • 23. Deduplication • Detect duplicates during indexing and handle them • Adds a signature field to the document (could be uniqueKey) • Exact (hash on certain fields) or Fuzzy duplicate detection • http://wiki.apache.org/solr/Deduplication
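The exact-match variant (hash on certain fields) can be illustrated as follows. This is a conceptual sketch, not Solr's implementation: the function names, the use of MD5, and the "latest document wins" handling are all assumptions made for the example.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields of a document into a hex signature."""
    digest = hashlib.md5()
    for name in fields:
        digest.update(name.encode("utf-8"))
        digest.update(b"\x00")  # separator so field boundaries stay distinct
        digest.update(str(doc.get(name, "")).encode("utf-8"))
        digest.update(b"\x00")
    return digest.hexdigest()


def index_with_dedup(docs, fields):
    """Keep one document per signature; a later duplicate replaces an earlier one."""
    by_signature = {}
    for doc in docs:
        sig = signature(doc, fields)
        by_signature[sig] = dict(doc, signature=sig)
    return list(by_signature.values())


docs = [
    {"id": "1", "name": "iPod", "features": "white"},
    {"id": "2", "name": "iPod", "features": "white"},
    {"id": "3", "name": "iPod", "features": "black"},
]
unique = index_with_dedup(docs, ["name", "features"])
print([d["id"] for d in unique])
```

Only the fields named in `fields` contribute to the signature, so documents 1 and 2 collapse into one entry even though their ids differ.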
  • 25. Analysis • CharFilter • DoubleMetaphone • PersianAnalyzer, ArabicAnalyzer, SmartChineseAnalyzer • WordDelimiterFilter • splitOnNumerics • protected words support • stemEnglishPossessive • avoid splitting or dropping international non-letter characters such as non-spacing marks • PositionFilter support • SnowballPorterFilterFactory - protected words support • HTMLStripCharFilter • CommonGrams
  • 26. omitTermFreqAndPositions • Omits number of terms in that specific field & list of positions • Saves time and index space for non-text fields
  • 27. Trie Numeric Field Types
      <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
      <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
      <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
      <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
  • 28. Trie Date Field Types
      <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
      <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
  • 29. Wildcard handling • ReversedWildcardFilterFactory • Index-time reversal, query-time handling • Uses special marker for end • Example: • Document: <field name="text_rev">Solr</field> • Indexed: #rlos • Query: *olr -> #rlo* • # used here to denote marker character
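The slide's example can be reproduced with a small sketch. It only mimics the idea (reverse at index time, turn a leading wildcard into a prefix query at query time); the function names are hypothetical, `#` stands in for the real marker character as on the slide, and lowercasing is assumed to come from the rest of the analysis chain.

```python
MARKER = "#"  # placeholder for the real marker character, as on the slide


def index_reversed(token):
    """Index-time side: lowercase, reverse, and prepend the marker."""
    return MARKER + token.lower()[::-1]


def rewrite_leading_wildcard(query):
    """Query-time side: turn '*suffix' into a prefix query on the reversed form."""
    rest = query[1:]
    if query.startswith("*") and "*" not in rest and "?" not in rest:
        return MARKER + rest.lower()[::-1] + "*"
    return query  # anything else is left untouched in this sketch


def prefix_matches(pattern, term):
    """Tiny stand-in for a prefix query: 'abc*' matches terms starting with 'abc'."""
    return pattern.endswith("*") and term.startswith(pattern[:-1])


indexed = index_reversed("Solr")
rewritten = rewrite_leading_wildcard("*olr")
print(indexed, rewritten, prefix_matches(rewritten, indexed))
```

The point of the reversal is that a leading-wildcard query, which would otherwise have to scan every term, becomes an ordinary prefix lookup.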
  • 30. Function queries • milliseconds: ms() • subtraction: sub() • {!frange l=6 u=9}sqrt(sum(a,b)) • http://yonik.wordpress.com/2009/07/06/ranges-over-functions-in-solr-1-4/
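To make the frange example concrete, here is a sketch of what `{!frange l=6 u=9}sqrt(sum(a,b))` selects, evaluated over a few in-memory documents. The `frange` helper and the sample documents are hypothetical, and both bounds are treated as inclusive.

```python
import math


def frange(docs, func, lower, upper):
    """Keep the documents whose function value lies in [lower, upper]."""
    return [doc for doc in docs if lower <= func(doc) <= upper]


docs = [
    {"id": 1, "a": 20, "b": 16},  # sqrt(36) = 6
    {"id": 2, "a": 1, "b": 3},    # sqrt(4) = 2
    {"id": 3, "a": 50, "b": 50},  # sqrt(100) = 10
]
hits = frange(docs, lambda d: math.sqrt(d["a"] + d["b"]), 6, 9)
print([d["id"] for d in hits])
```

In Solr the same expression would typically be sent as a filter query, so the function acts as a filter over computed values rather than as a scoring function.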
  • 31. Stats Component • min, max, sum, sumOfSquares, count, missing, mean, stddev • numeric fields only • http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&rows=0&indent=true
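The statistics listed above can be illustrated on a plain list of values, with `None` standing for a document that has no value in the field. This is a sketch of the meaning of each figure, not the component's code; in particular, computing stddev as the sample standard deviation is an assumption here.

```python
import math


def field_stats(values):
    """Compute Stats Component style figures for one numeric field."""
    present = [v for v in values if v is not None]
    count = len(present)
    total = sum(present)
    sum_sq = sum(v * v for v in present)
    if count > 1:
        # Sample variance from the running sums; clamp tiny negative rounding errors.
        variance = max(0.0, (count * sum_sq - total * total) / (count * (count - 1)))
        stddev = math.sqrt(variance)
    else:
        stddev = 0.0
    return {
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "sum": total,
        "sumOfSquares": sum_sq,
        "count": count,
        "missing": len(values) - count,
        "mean": total / count if count else None,
        "stddev": stddev,
    }


print(field_stats([1, 2, 3, None]))
```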
  • 32. Terms Component • Return indexed terms+docfreq in a field, use for auto-suggest, etc • http://localhost:8983/solr/terms?terms.fl=name&terms.lower=a&terms.sort=index
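What the example request asks for (terms of the `name` field from a lower bound onward, in index order, each with its document frequency) can be sketched over a tiny in-memory collection. The `terms` helper, the whitespace tokenization, and the sample documents are assumptions for illustration.

```python
from collections import Counter


def terms(docs, field, lower="", limit=10):
    """Return (term, docfreq) pairs for `field`, in index (lexicographic) order.

    Only terms greater than or equal to `lower` are returned.
    """
    docfreq = Counter()
    for doc in docs:
        # A set, so each document counts a term at most once.
        docfreq.update(set(doc.get(field, "").lower().split()))
    matching = sorted(term for term in docfreq if term >= lower)
    return [(term, docfreq[term]) for term in matching[:limit]]


docs = [
    {"name": "Apple iPod"},
    {"name": "Apple adapter"},
    {"name": "Belkin cable"},
]
print(terms(docs, "name", lower="a"))
```

For auto-suggest, the lower bound would be the text typed so far, so the first few results are the indexed terms that follow it in order.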
  • 33. Term Vector Component • Returns term info per document (tf, positions) • http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&qt=tvrh&tv=true&tv.tf=true&tv.df=true&tv.positions&tv.offsets=true
  • 34. Field & Document Request Handlers • Provide capabilities similar to Solr's admin analysis tool, but with responses in Solr's flexible formats
  • 35. Clustering • Implemented with Carrot2 • Uses Carrot2 to dynamically cluster the top N search results • Like dynamically discovered facets • http://wiki.apache.org/solr/ClusteringComponent
  • 36. Example Clustering Output
      <lst name="cluster">
        <lst name="labels">
          <str name="label">Car Power Adapter</str>
        </lst>
        <lst name="docs">
          <str name="doc">F8V7067-APL-KIT</str>
          <str name="doc">IW-02</str>
        </lst>
      </lst>
      <lst name="cluster">
        <lst name="labels">
          <str name="label">Display</str>
        </lst>
        <lst name="docs">
          <str name="doc">MA147LL/A</str>
          <str name="doc">VA902B</str>
        </lst>
      </lst>
      <lst name="cluster">
        <lst name="labels">
          <str name="label">Hard Drive</str>
        </lst>
        <lst name="docs">
          <str name="doc">SP2514N</str>
          <str name="doc">6H500F0</str>
        </lst>
      </lst>
      <lst name="cluster">
        <lst name="labels">
          <str name="label">Retail</str>
        </lst>
        <lst name="docs">
          <str name="doc">TWINX2048-3200PRO</str>
          <str name="doc">VS1GB400C3</str>
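A client could turn output of this shape into a label-to-documents mapping along these lines. The sketch uses only the standard library; the `clusters_by_label` helper is hypothetical, and the wrapping `<arr name="clusters">` element in the sample is an assumption added so that the fragment is well-formed.

```python
import xml.etree.ElementTree as ET

# Two clusters from the slide, wrapped in an assumed root element.
SAMPLE = """
<arr name="clusters">
  <lst name="cluster">
    <lst name="labels"><str name="label">Car Power Adapter</str></lst>
    <lst name="docs">
      <str name="doc">F8V7067-APL-KIT</str>
      <str name="doc">IW-02</str>
    </lst>
  </lst>
  <lst name="cluster">
    <lst name="labels"><str name="label">Display</str></lst>
    <lst name="docs">
      <str name="doc">MA147LL/A</str>
      <str name="doc">VA902B</str>
    </lst>
  </lst>
</arr>
"""


def clusters_by_label(xml_text):
    """Map each cluster label to the list of document ids in that cluster."""
    root = ET.fromstring(xml_text)
    result = {}
    for cluster in root.iter("lst"):
        if cluster.get("name") != "cluster":
            continue
        labels = [s.text for s in cluster.findall("lst[@name='labels']/str")]
        docs = [s.text for s in cluster.findall("lst[@name='docs']/str")]
        result[" / ".join(labels)] = docs
    return result


print(clusters_by_label(SAMPLE))
```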
  • 37. Distributed Search • facet.sort=lex • timeout support: shard-socket-timeout & shard-connection-timeout
  • 38. VelocityResponseWriter • celeritas: swiftness, speed (Latin), origin of the symbol "c" for the speed of light • solritas: Velocity template rendering of Solr responses • Useful for rapid prototyping and more
  • 39. AJAX-Solr • Newest Solr/AJAX library • Improves upon the old SolrJS library that was to be in Solr 1.4 • http://github.com/evolvingweb/AJAX-Solr/
  • 40. Odds & Ends • omitHeader • logging switched to SLF4J • spellcheck rebuild on optimize, if configured • maxChars on copyField • Nested query support in function query parser • expungeDeletes • XInclude support • binary field (Base64) • Highlighter: • field globbing: hl.fl=*_text • now supports range/wildcard/fuzzy/prefix • FieldCache stats exposed to stats.jsp and JMX • multicore merge • plugins: enable flag
  • 41. Upgrading from 1.3 • omitTermFreqAndPositions • set version from 1.1 to 1.2 in schema.xml • default query parser syntax no longer supports ;sort options, use &sort= instead, or switch defType=lucenePlusSort • Potential analysis differences when using WordDelimiterFilterFactory (SOLR-1078) • Reindexing not required, but can't hurt.
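For client code that still builds queries in the old `q=...;sort` style, a small migration helper might look like the sketch below. The function name is hypothetical and the split is deliberately naive: it cuts at the last semicolon and does not account for semicolons inside quoted phrases.

```python
def split_legacy_sort(q):
    """Split an old-style 'query;sort spec' string into separate q and sort params."""
    query, separator, sort = q.rpartition(";")
    if not separator:
        return {"q": q}  # no semicolon: nothing to migrate
    params = {"q": query.strip()}
    if sort.strip():
        params["sort"] = sort.strip()
    return params


print(split_legacy_sort("ipod;price desc"))
print(split_legacy_sort("ipod"))
```

The resulting dictionary can be passed to whatever builds the request, so the sort ends up in `&sort=` as the upgrade note recommends.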
  • 42. Book