This document provides an overview of Apache Solr and the EXT:solr extension for TYPO3 CMS. It discusses the history and development of EXT:solr, its current status, how it works, and some caveats. Key points include that EXT:solr allows indexing of content to enable powerful search, it supports versions 4.5-6.1 of TYPO3 CMS, and additional features are provided through add-ons. The roadmap is to transition EXT:solr to Extbase and FLUID for future versions.
4. About me
Olivier Dobberkau
CEO of dkd Internet Service GmbH
Research and Development
over 10 years of TYPO3 CMS
Member of the T3A EAB
olivier.dobberkau@dkd.de
Twitter: T3RevNeverEnd
7. History of EXT:solr
● Indexed Search gave us some pain
● First prototype 2009
● What you get in one or two days of work
● Started Funding of Development
● over 70 Sponsors
● It's possible to offer services around it
● Support and Consulting available
8. Current Status
Version 2.8.2 was released November 2012
Introduced the Add-ons for additional features
Supported TYPO3 CMS Versions
4.5, 4.6 & 4.7
Supported Solr Server
3.6.2 (Time flies when you are having fun!)
9. The last TER Release
TER: 2.8.3
Introduce support for TYPO3 CMS Versions 4.5 - 6.1
Loads of bug-fixes
Maintenance Release
10. Next Major Version
EXT:solr 3.x will be the next version
Release will hopefully be soon(tm)
Will have no new features on the TYPO3 side
Support for TYPO3 CMS 4.5 - 6.1
Add Apache Solr 4.4 as a Server
11. Roadmap for EXT:solr 4.x
● Backend parts of the EXT all in Extbase
● Templates go FLUID
● Frontend goes Extbase
● 4.x will be 6.2 only!
● Effort estimated 2 to 4 man months
12. The EXT:solr ecosystem
The base is EXT:solr
Features are added through Add-ons
● EXT:solrfile (File-Indexing for CMS 4.5 - 4.7)
● EXT:solrdam (File-Indexing with DAM)
● EXT:solrfal (File-Indexing for CMS 6.1 & 6.2)
● EXT:solrmlt (More like this)
● EXT:solrgrouping
● EXT:tika (Extracting Service)
13. EXT:solr
So what does it do?
● Indexing
● Querying
● Results Listing
● Logging / Analysis
14. Indexing
● Indexing of pages
● Indexing of TCA records
● Indexing of Files (Add-On)
● Index Queue
○ List of all items to be indexed
○ Every time an item is touched/changed, an update is sent to the Solr server
○ No need for a crawler / instant results
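The Index Queue idea can be pictured with a minimal Python sketch. This is only an illustration of the concept, not EXT:solr's actual implementation: the class `IndexQueue`, its method names, and the `send` callback are hypothetical, and the "Solr server" is replaced by a plain list.

```python
from collections import OrderedDict


class IndexQueue:
    """Toy index queue: remembers which records changed and still need indexing."""

    def __init__(self):
        # Keyed by (table, uid) so a record touched twice is queued only once.
        self._pending = OrderedDict()

    def mark_changed(self, table, uid):
        """Record that an item was created or edited (hypothetical hook)."""
        key = (table, uid)
        self._pending.pop(key, None)  # a re-touched record moves to the end
        self._pending[key] = True

    def pending(self):
        """Return the queued (table, uid) pairs in processing order."""
        return list(self._pending)

    def flush(self, send):
        """Hand every queued item to `send`, then drop it from the queue."""
        sent = 0
        for key in list(self._pending):
            send(*key)
            del self._pending[key]
            sent += 1
        return sent


queue = IndexQueue()
queue.mark_changed("pages", 12)
queue.mark_changed("tt_news", 7)
queue.mark_changed("pages", 12)  # edited again: still a single queue entry

indexed = []  # stand-in for the Solr server (assumption for this sketch)
count = queue.flush(lambda table, uid: indexed.append((table, uid)))
```

Because only touched records enter the queue, nothing has to be crawled: the work is proportional to what changed, which is what makes near-instant results possible.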
15. Indexing
● Indexing is very easy and can be achieved through simple TypoScript configuration
● Additionally you can use Apache Nutch to index non-TYPO3 websites
● Support for more than 30 languages
16. Querying
● Easy to set up
● Apply Lucene query language if you want to search for specific items (e.g. only news)
● You can tell Solr to boost results if query terms are in the fields you are searching
● Use elevation to rank terms
● Correct Stemming available
● Range queries (Intelligent dates)
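As a rough illustration of the querying options above, the following Python sketch assembles the request parameters for a Solr select query. The parameters `q`, `fq`, `qf` and `defType=edismax` are standard Solr parameters; the field names (`type`, `title`, `content`, `created`), the record type `tt_news` and the helper `build_select_params` are assumptions made for this example and depend on the schema in use.

```python
from urllib.parse import urlencode, parse_qs


def build_select_params(terms, record_type=None, boosts=None, date_range=None):
    """Build a URL-encoded parameter string for a Solr select request."""
    params = [("q", terms), ("defType", "edismax")]
    if boosts:
        # Field boosts: matches in "title" can weigh more than in "content".
        qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
        params.append(("qf", qf))
    if record_type:
        # Filter query in Lucene syntax, e.g. restrict results to news records.
        params.append(("fq", f"type:{record_type}"))
    if date_range:
        # Range query; Solr date math such as NOW-1YEAR is resolved by Solr.
        start, end = date_range
        params.append(("fq", f"created:[{start} TO {end}]"))
    return urlencode(params)


query_string = build_select_params(
    "solr typo3",
    record_type="tt_news",
    boosts={"title": 5, "content": 1},
    date_range=("NOW-1YEAR", "NOW"),
)
decoded = parse_qs(query_string)  # decode again to inspect what would be sent
print(decoded["fq"])
```

The resulting string would be appended to the select URL of a Solr core; the host and path are deliberately left out here because they depend on the installation.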
17. Results Listing
● Results can be fully individualized
○ Templates for different result types
● Sorting of the Results List
○ Relevance
○ Date
○ Title
○ any other field
● Can be toggled
18. Result Listings
● Facets
○ Filter the results based on attributes
○ Hierarchical facets
● Suggestions / Autocomplete
● Stopwords
● Protected words
● Did you mean?
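What a facet does can be shown with a small in-memory Python sketch. It does not talk to Solr; the documents, the `type` attribute and both helper functions are made up for illustration. Solr computes the same kind of counts on the server side.

```python
from collections import Counter

# Hypothetical search results; "type" is the attribute used as a facet.
documents = [
    {"title": "Solr for TYPO3", "type": "pages"},
    {"title": "Release notes", "type": "tt_news"},
    {"title": "Search tips", "type": "tt_news"},
]


def facet_counts(docs, field):
    """Count how many results carry each value of `field`."""
    return Counter(doc[field] for doc in docs if field in doc)


def apply_filter(docs, field, value):
    """Narrow the result list to documents where `field` equals `value`."""
    return [doc for doc in docs if doc.get(field) == value]


counts = facet_counts(documents, "type")
news_only = apply_filter(documents, "type", "tt_news")
print(counts, [doc["title"] for doc in news_only])
```

The counts are what a result page shows next to each facet option; clicking an option corresponds to applying the filter.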
19. Logging / Analysis
● Built in query logging
● Can be used with your favorite Analytics suite
● Feature rich analysis & debugging options
20. Caveats
● Junk in / Junk out
● Get your data right
● A String is not Text
○ Be aware of the difference between Strings and Text
○ Protect proper names from stemming
○ Example
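A simplified Python simulation of the difference: a string field is compared as one exact value, while a text field is split into tokens, lowercased and stemmed before matching. The stemmer below is a deliberately crude stand-in (it only strips a trailing "s"), and the protected word list is hypothetical; real Solr analyzers are far more elaborate.

```python
import re

PROTECTED_WORDS = {"jobs"}  # hypothetical list of names that must not be stemmed


def naive_stem(token):
    """Crude stand-in for a stemmer: strips one trailing 's'."""
    if token in PROTECTED_WORDS:
        return token
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def analyze_text(value):
    """Tokenize, lowercase and stem, roughly what a text field does."""
    return [naive_stem(token) for token in re.findall(r"\w+", value.lower())]


def matches_string_field(stored, query):
    """A string field only matches the complete, unmodified value."""
    return stored == query


def matches_text_field(stored, query):
    """A text field matches when every analyzed query token was indexed."""
    stored_tokens = set(analyze_text(stored))
    return all(token in stored_tokens for token in analyze_text(query))


title = "Indexing Files with Apache Solr"
print(matches_string_field(title, "solr"))     # exact comparison fails
print(matches_text_field(title, "solr file"))  # "files" was stemmed to "file"
print(analyze_text("Steve Jobs"))              # the protected name stays intact
```

Without the protected word entry, the surname would be reduced to "job" and match unrelated documents, which is the reason for protecting proper names from stemming.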
21. Caveats
● Synonyms are nice, but don't abuse them
● Don't confuse Solr with a Database
○ %WORD% does not work
● Search with “WORD” if you want your query to remain untouched
● * works only at the end of a word
○ cat* will find catapult, cats, catastrophe etc.
○ *cat will yield no results
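One way to picture why a trailing wildcard is cheap: a search index keeps its terms in sorted order, so all terms starting with "cat" sit next to each other and can be read in one pass. The Python sketch below illustrates that idea with a plain sorted list; the term list and the function `prefix_lookup` are invented for this example and are not Solr code.

```python
from bisect import bisect_left

# Hypothetical term dictionary, kept in sorted order like an index would.
terms = sorted(["bobcat", "cat", "catapult", "catastrophe", "cats", "dog"])


def prefix_lookup(sorted_terms, prefix):
    """Return all terms starting with `prefix` by scanning one contiguous range."""
    start = bisect_left(sorted_terms, prefix)  # first term >= prefix
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match either
        matches.append(term)
    return matches


found = prefix_lookup(terms, "cat")
print(found)
```

A term such as "bobcat" ends in "cat" but is sorted under "b", so a leading wildcard has no contiguous range to scan in this structure.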
22. Caveats
● Beware of indexing time
○ Pages index slower than TCA records
○ Files might be too big for initial settings
23. Some web resources
● You will find a lot of information about the Apache Solr Extension: www.typo3-solr.com
● http://forge.typo3.org/projects/show/extension-solr
● Mailing List / Newsgroup / Forums
● Afraid of Solr? Try www.hosted-solr.com
24. Books & Documentation
● Taming Text
● Apache Solr Cookbook
● Administering Solr
● Apache Solr 4.x
● WIKI of Apache Solr
https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide