SlideShare uma empresa Scribd logo
1 de 25
Baixar para ler offline
Apache Solr 101
Killing the Vampires of Search
Cluj, 2013
Olivier Dobberkau
Some Vampirology first
●
●
●
●
●

Nosferatu
Dracula
van Helsing
Selene
Edward & Bella

http://en.wikipedia.org/wiki/Vampire_film
Agenda
●
●
●
●
●
●

About me
History of EXT:solr
Current status
Solr Basics
Caveats
Books & Documents
About me
Olivier Dobberkau
CEO of dkd Internet Service GmbH
Research and Development
over 10 years of TYPO3 CMS
Member of the T3A EAB
olivier.dobberkau@dkd.de
Twitter: T3RevNeverEnd
Scratching ...
.. the TYPO3 CMS search itch
History of EXT:solr
We all know when a solution fails ...
History of EXT:solr
●
●
●
●
●
●
●

Indexed Search gave us some pain
First prototype 2009
What you get in one or two days of work
Started Funding of Development
over 70 Sponsors
Its possible to offer services around it
Support and Consulting available
Current Status
Version 2.8.2 was released November 2012
Introduced the Add-ons for additional features
Supported TYPO3 CMS Versions
4.5, 4.6 & 4.7
Supported Solr Server
3.6.2 (Time flies when you are having fun!)
The last TER Release
TER: 2.8.3
Introduce support for TYPO3 CMS Versions 4.5
- 6.1
Loads of bug-fixes
Maintenance Release
Next Major Version
EXT:solr 3.x will be the next version
Release will be hopefully soon(tm)
Will have no new features on the TYPO3 side
Support for TYPO3 CMS 4.5 - 6.1
Add Apache Solr 4.4 as a Server
Roadmap for EXT:solr 4.x
●
●
●
●
●

Backend parts of the EXT all in Extbase
Templates go FLUID
Frontend goes Extbase
4.x will be 6.2 only!
Effort estimated 2 to 4 man months
The EXT:solr ecosystem
The base is EXT:solr
Features are added thru Add-ons
● EXT:solrfile (File-Indexing for CMS 4.5 - 4.7)
● EXT:solrdam (File-Indexing with DAM)
● EXT:solrfal (File-Indexing for CMS 6.1 & 6.2)
● EXT:solrmlt (More like this)
● EXT:solrgrouping
● EXT:tika (Extracting Service)
EXT:solr
So what does it do?
● Indexing
● Querying
● Results Listing
● Logging / Analysis
Indexing
●
●
●
●

Indexing of pages
Indexing of TCA records
Indexing of Files (Add-On)
Index Queue
○ List of all to be indexed items
○ Every time an items is touched/changed an update
is sent to the solr server
○ No need for a crawler / instant results
Indexing
● Indexing is very easy and can be achieved
thru simple typoscript configuration
● Additionally you can use Apache Nutch to
index non TYPO3 websites
● Support for more than 30 Languages
Querying
● Easy to set up
● Apply Lucene query language if you want to
search for specific items (only news i.e)
● You can tell solr to boost results if query
terms are in the fields you are searching
● Use elevation to rank terms
● Correct Stemming available
● Range queries (Intelligent dates)
Results Listing
● Results can be fully individualized
○ Templates for different results types

● Sorting of the Results List
○
○
○
○

Relevance
Date
Title
any other field

● Can be toggled
Result Listings
● Facettes
○ Filter the results based of attributes
○ Hierarchical Facettes

●
●
●
●

Suggestions / Autocomplete
Stopwords
Protected words
Did you mean?
Logging / Analysis
● Built in query logging
● Can be used with your favorite Analytics
suite
● Feature rich analysis & debugging options
Caveats
● Junk in / Junk out
● Get your data right
● A String is not Text
○ Be aware of the difference between Strings and Text
○ Protect proper names from stemming
○ Example
Caveats
● Synonyms are nice, but don't abuse them
● Don't confuse Solr with a Database
○ %WORD% does not work

● Search with “WORD” if you want your query
to remain untouched
● * work only at the end of a word
○ cat* will find catapult, cats, catastrophe etc
○ *cat will yield with no results
Caveats
● Beware of indexing time
○ Pages index slower than TCA records
○ Files might be too big for initial settings
Some web resources
● You will find a lot of infos around the Apache
Solr Extension: www.typo3-solr.com
● http://forge.typo3.
org/projects/show/extension-solr
● Mailing List / Newsgroup / Forums
● Afraid of Solr? try www.hosted-solr.com
Books & Documentation
● Taming Text
● Apache Solr Cookbook
● Administering Solr
● Apache Solr 4.x
● WIKI of Apache Solr
https://cwiki.apache.
org/confluence/display/solr/Apache+Solr+Refer
ence+Guide
Merci!
Thank you!

Mais conteúdo relacionado

Semelhante a Killing the Vampires of Search with Apache Solr 101

2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for NewbiesTYPO3 CertiFUNcation
 
Status & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMSStatus & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMSOlivier Dobberkau
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solrKnoldus Inc.
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithNETWAYS
 
Destination Documentation: How Not to Get Lost in Your Org
Destination Documentation: How Not to Get Lost in Your OrgDestination Documentation: How Not to Get Lost in Your Org
Destination Documentation: How Not to Get Lost in Your Orgcsupilowski
 
Using Search API, Search API Solr and Facets in Drupal 8
Using Search API, Search API Solr and Facets in Drupal 8Using Search API, Search API Solr and Facets in Drupal 8
Using Search API, Search API Solr and Facets in Drupal 8Websolutions Agency
 
OpenSearch.pdf
OpenSearch.pdfOpenSearch.pdf
OpenSearch.pdfAbhi Jain
 
What Goes In Must Come Out: Egress-Assess and Data Exfiltration
What Goes In Must Come Out: Egress-Assess and Data ExfiltrationWhat Goes In Must Come Out: Egress-Assess and Data Exfiltration
What Goes In Must Come Out: Egress-Assess and Data ExfiltrationCTruncer
 
Turbo charge your logs
Turbo charge your logsTurbo charge your logs
Turbo charge your logsJeremy Cook
 
Solr in drupal 7 index and search more entities
Solr in drupal 7  index and search more entitiesSolr in drupal 7  index and search more entities
Solr in drupal 7 index and search more entitiesBiglazy
 
Query and audit logging in cassandra
Query and audit logging in cassandraQuery and audit logging in cassandra
Query and audit logging in cassandraVinay Kumar Chella
 
Ukoug webinar - testing PLSQL APIs with utPLSQL v3
Ukoug webinar - testing PLSQL APIs with utPLSQL v3Ukoug webinar - testing PLSQL APIs with utPLSQL v3
Ukoug webinar - testing PLSQL APIs with utPLSQL v3Jacek Gebal
 
Get the most out of Solr search with PHP
Get the most out of Solr search with PHPGet the most out of Solr search with PHP
Get the most out of Solr search with PHPPaul Borgermans
 
The Professional Programmer
The Professional ProgrammerThe Professional Programmer
The Professional ProgrammerDave Cross
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbWei Shan Ang
 
Apache Solr - An Experience Report
Apache Solr - An Experience ReportApache Solr - An Experience Report
Apache Solr - An Experience ReportNetcetera
 
Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6DEEPAK KHETAWAT
 
Presentation of OpenNLP
Presentation of OpenNLPPresentation of OpenNLP
Presentation of OpenNLPRobert Viseur
 
Improve your SQL workload with observability
Improve your SQL workload with observabilityImprove your SQL workload with observability
Improve your SQL workload with observabilityOVHcloud
 

Semelhante a Killing the Vampires of Search with Apache Solr 101 (20)

2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
 
Status & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMSStatus & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMS
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solr
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles Judith
 
Destination Documentation: How Not to Get Lost in Your Org
Destination Documentation: How Not to Get Lost in Your OrgDestination Documentation: How Not to Get Lost in Your Org
Destination Documentation: How Not to Get Lost in Your Org
 
Using Search API, Search API Solr and Facets in Drupal 8
Using Search API, Search API Solr and Facets in Drupal 8Using Search API, Search API Solr and Facets in Drupal 8
Using Search API, Search API Solr and Facets in Drupal 8
 
OpenSearch.pdf
OpenSearch.pdfOpenSearch.pdf
OpenSearch.pdf
 
What Goes In Must Come Out: Egress-Assess and Data Exfiltration
What Goes In Must Come Out: Egress-Assess and Data ExfiltrationWhat Goes In Must Come Out: Egress-Assess and Data Exfiltration
What Goes In Must Come Out: Egress-Assess and Data Exfiltration
 
Turbo charge your logs
Turbo charge your logsTurbo charge your logs
Turbo charge your logs
 
Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'
 
Solr in drupal 7 index and search more entities
Solr in drupal 7  index and search more entitiesSolr in drupal 7  index and search more entities
Solr in drupal 7 index and search more entities
 
Query and audit logging in cassandra
Query and audit logging in cassandraQuery and audit logging in cassandra
Query and audit logging in cassandra
 
Ukoug webinar - testing PLSQL APIs with utPLSQL v3
Ukoug webinar - testing PLSQL APIs with utPLSQL v3Ukoug webinar - testing PLSQL APIs with utPLSQL v3
Ukoug webinar - testing PLSQL APIs with utPLSQL v3
 
Get the most out of Solr search with PHP
Get the most out of Solr search with PHPGet the most out of Solr search with PHP
Get the most out of Solr search with PHP
 
The Professional Programmer
The Professional ProgrammerThe Professional Programmer
The Professional Programmer
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodb
 
Apache Solr - An Experience Report
Apache Solr - An Experience ReportApache Solr - An Experience Report
Apache Solr - An Experience Report
 
Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6
 
Presentation of OpenNLP
Presentation of OpenNLPPresentation of OpenNLP
Presentation of OpenNLP
 
Improve your SQL workload with observability
Improve your SQL workload with observabilityImprove your SQL workload with observability
Improve your SQL workload with observability
 

Mais de Olivier Dobberkau

Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3Olivier Dobberkau
 
Apache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engineApache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engineOlivier Dobberkau
 
With a little help from my friends (english)
With a little help  from my friends (english)With a little help  from my friends (english)
With a little help from my friends (english)Olivier Dobberkau
 
With a little help from my friends
With a little help from my friendsWith a little help from my friends
With a little help from my friendsOlivier Dobberkau
 
Sonnenschein für ihre Website
Sonnenschein für ihre WebsiteSonnenschein für ihre Website
Sonnenschein für ihre WebsiteOlivier Dobberkau
 
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted SolrTYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted SolrOlivier Dobberkau
 
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...Olivier Dobberkau
 
ForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and valueForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and valueOlivier Dobberkau
 
ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014Olivier Dobberkau
 
Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014Olivier Dobberkau
 
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp MallorcaOutside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp MallorcaOlivier Dobberkau
 
The future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy FranceThe future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy FranceOlivier Dobberkau
 
Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?Olivier Dobberkau
 
Everything you always wanted to know about search in typo3
Everything you always wanted to know about search in typo3Everything you always wanted to know about search in typo3
Everything you always wanted to know about search in typo3Olivier Dobberkau
 
Alles was-sie-ueber-suche-wissen-wollten
Alles was-sie-ueber-suche-wissen-wolltenAlles was-sie-ueber-suche-wissen-wollten
Alles was-sie-ueber-suche-wissen-wolltenOlivier Dobberkau
 

Mais de Olivier Dobberkau (20)

Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
 
Apache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engineApache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engine
 
TYPO3 v8 LTS in the cloud
TYPO3 v8 LTS in the cloudTYPO3 v8 LTS in the cloud
TYPO3 v8 LTS in the cloud
 
With a little help from my friends (english)
With a little help  from my friends (english)With a little help  from my friends (english)
With a little help from my friends (english)
 
With a little help from my friends
With a little help from my friendsWith a little help from my friends
With a little help from my friends
 
TYPO3 & You
TYPO3 & YouTYPO3 & You
TYPO3 & You
 
Sonnenschein für ihre Website
Sonnenschein für ihre WebsiteSonnenschein für ihre Website
Sonnenschein für ihre Website
 
Apache Solr Revisited 2015
Apache Solr Revisited 2015Apache Solr Revisited 2015
Apache Solr Revisited 2015
 
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted SolrTYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
 
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
 
TYPO3 and CMIS
TYPO3 and CMISTYPO3 and CMIS
TYPO3 and CMIS
 
ForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and valueForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and value
 
ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014
 
Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014
 
EXPLAIN #t3a
EXPLAIN #t3aEXPLAIN #t3a
EXPLAIN #t3a
 
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp MallorcaOutside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
 
The future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy FranceThe future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy France
 
Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?
 
Everything you always wanted to know about search in typo3
Everything you always wanted to know about search in typo3Everything you always wanted to know about search in typo3
Everything you always wanted to know about search in typo3
 
Alles was-sie-ueber-suche-wissen-wollten
Alles was-sie-ueber-suche-wissen-wolltenAlles was-sie-ueber-suche-wissen-wollten
Alles was-sie-ueber-suche-wissen-wollten
 

Último

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

Killing the Vampires of Search with Apache Solr 101

  • 1. Apache Solr 101 Killing the Vampires of Search Cluj, 2013 Olivier Dobberkau
  • 2. Some Vampirology first ● ● ● ● ● Nosferatu Dracula van Helsing Selene Edward & Bella http://en.wikipedia.org/wiki/Vampire_film
  • 3. Agenda ● ● ● ● ● ● About me History of EXT:solr Current status Solr Basics Caveats Books & Documents
  • 4. About me Olivier Dobberkau CEO of dkd Internet Service GmbH Research and Development over 10 years of TYPO3 CMS Member of the T3A EAB olivier.dobberkau@dkd.de Twitter: T3RevNeverEnd
  • 5. Scratching ... .. the TYPO3 CMS search itch
  • 6. History of EXT:solr We all know when a solution fails ...
  • 7. History of EXT:solr ● ● ● ● ● ● ● Indexed Search gave us some pain First prototype 2009 What you get in one or two days of work Started Funding of Development over 70 Sponsors Its possible to offer services around it Support and Consulting available
  • 8. Current Status Version 2.8.2 was released November 2012 Introduced the Add-ons for additional features Supported TYPO3 CMS Versions 4.5, 4.6 & 4.7 Supported Solr Server 3.6.2 (Time flies when you are having fun!)
  • 9. The last TER Release TER: 2.8.3 Introduce support for TYPO3 CMS Versions 4.5 - 6.1 Loads of bug-fixes Maintenance Release
  • 10. Next Major Version EXT:solr 3.x will be the next version Release will be hopefully soon(tm) Will have no new features on the TYPO3 side Support for TYPO3 CMS 4.5 - 6.1 Add Apache Solr 4.4 as a Server
  • 11. Roadmap for EXT:solr 4.x ● ● ● ● ● Backend parts of the EXT all in Extbase Templates go FLUID Frontend goes Extbase 4.x will be 6.2 only! Effort estimated 2 to 4 man months
  • 12. The EXT:solr ecosystem The base is EXT:solr Features are added thru Add-ons ● EXT:solrfile (File-Indexing for CMS 4.5 - 4.7) ● EXT:solrdam (File-Indexing with DAM) ● EXT:solrfal (File-Indexing for CMS 6.1 & 6.2) ● EXT:solrmlt (More like this) ● EXT:solrgrouping ● EXT:tika (Extracting Service)
  • 13. EXT:solr So what does it do? ● Indexing ● Querying ● Results Listing ● Logging / Analysis
  • 14. Indexing ● ● ● ● Indexing of pages Indexing of TCA records Indexing of Files (Add-On) Index Queue ○ List of all to be indexed items ○ Every time an items is touched/changed an update is sent to the solr server ○ No need for a crawler / instant results
  • 15. Indexing ● Indexing is very easy and can be achieved thru simple typoscript configuration ● Additionally you can use Apache Nutch to index non TYPO3 websites ● Support for more than 30 Languages
  • 16. Querying ● Easy to set up ● Apply Lucene query language if you want to search for specific items (only news i.e) ● You can tell solr to boost results if query terms are in the fields you are searching ● Use elevation to rank terms ● Correct Stemming available ● Range queries (Intelligent dates)
  • 17. Results Listing ● Results can be fully individualized ○ Templates for different results types ● Sorting of the Results List ○ ○ ○ ○ Relevance Date Title any other field ● Can be toggled
  • 18. Result Listings ● Facettes ○ Filter the results based of attributes ○ Hierarchical Facettes ● ● ● ● Suggestions / Autocomplete Stopwords Protected words Did you mean?
  • 19. Logging / Analysis ● Built in query logging ● Can be used with your favorite Analytics suite ● Feature rich analysis & debugging options
  • 20. Caveats ● Junk in / Junk out ● Get your data right ● A String is not Text ○ Be aware of the difference between Strings and Text ○ Protect proper names from stemming ○ Example
  • 21. Caveats ● Synonyms are nice, but don't abuse them ● Don't confuse Solr with a Database ○ %WORD% does not work ● Search with “WORD” if you want your query to remain untouched ● * work only at the end of a word ○ cat* will find catapult, cats, catastrophe etc ○ *cat will yield with no results
  • 22. Caveats ● Beware of indexing time ○ Pages index slower than TCA records ○ Files might be too big for initial settings
  • 23. Some web resources ● You will find a lot of infos around the Apache Solr Extension: www.typo3-solr.com ● http://forge.typo3. org/projects/show/extension-solr ● Mailing List / Newsgroup / Forums ● Afraid of Solr? try www.hosted-solr.com
  • 24. Books & Documentation ● Taming Text ● Apache Solr Cookbook ● Administering Solr ● Apache Solr 4.x ● WIKI of Apache Solr https://cwiki.apache. org/confluence/display/solr/Apache+Solr+Refer ence+Guide