
Solr Architecture

  • 1. By: Ramez Ibrahim AL Fayez
  • 2. Agenda
      - Introductions
      - What is Solr?
      - Main Solr Features and Attributes
      - Content, Query, Facet, API, Scalability
      - Interface and useful commands
      - Live Demo
  • 3. Introduction
      - Search has become mission critical for most enterprises
          - Intranet
          - Web presence
          - E-commerce
      - Exponential growth of data
      - Cost of not finding information
          - Knowledge (sharing)
          - Time
          - Money
      - Information black hole
  • 4. What is Solr?
      Official definition:
      “Solr is an open source enterprise search platform based on the Lucene Java search library, with an HTTP interface using XML, JSON or other formats. It provides hit highlighting, faceted search, caching, replication, a web administration interface and many more features. It runs in a Java servlet container such as Apache Tomcat.”
      Source: http://lucene.apache.org/solr
  • 5. What is Solr?
      - In 2004, Yonik Seeley created Solr at CNET Networks as an in-house project to add search capability to the company website.
      - Open source and free of license fees (distributed under the Apache License 2.0)
      - Built on top of the Apache Lucene library, adding enterprise search server features and capabilities
      - Web-based application that processes requests and returns responses via HTTP and APIs
  • 6. Why choose Solr?
      - Customizable
      - High-quality, easily modifiable relevancy
      - Very fast query and indexing performance
      - Open source software, free to use
      - Highly flexible data processing and transformation
      - Easy scalability and great performance
      - Modern solution architecture based on XML and Java
      - Well integrated with the Big Data ecosystem, such as Hadoop (also Nutch and Tika)
  • 7. Solr’s Main Features
      - Full-text search
      - Field search
      - Number and date searching
      - Facets
      - Spelling assistance (“Did you mean…?”)
      - Related hits
      - Query completion
      - Admin GUI
      - Data Import Handler
          - Indexes databases, mail, RSS, XML, etc.
      - Rich document support
          - PDF, MS Office, images, etc.
      - Replication for high query volume
      - Distributed search for large indexes
          - Production systems with 1B+ documents
      - Very extensible and customizable
      - Embedded in commercial search products from LucidWorks, DataStax, Cloudera, Hortonworks, Amazon CloudSearch and Riak
  • 8. Main Attributes
      - Index(ing)
      - Inverted index
      - Document
      - Field
      - Stored and/or indexed fields
      - Analysis
      - Tokenization
      - Filters
      - Terms
      - Query
      - Filter
      - Function
      - Facet
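The analysis attributes listed above (tokenization, filters, terms) can be illustrated with a minimal sketch. This is not Solr's analyzer code: the tokenizer regex, the two filters, and the stop-word list are assumptions chosen only to show how text could be turned into index terms step by step.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "of", "and"}


def tokenize(text):
    """Split text into word tokens on non-alphanumeric characters."""
    return re.findall(r"[A-Za-z0-9]+", text)


def lowercase_filter(tokens):
    """Normalize every token to lower case."""
    return [t.lower() for t in tokens]


def stop_filter(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(text):
    """Run the tokenizer, then each filter in order, and return the terms."""
    tokens = tokenize(text)
    for token_filter in (lowercase_filter, stop_filter):
        tokens = token_filter(tokens)
    return tokens


terms = analyze("Solr is an Open Source search platform")
print(terms)
```

The order of the filters matters here: lower-casing runs first so that the stop-word check can compare against lower-case entries.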
  • 9. Content
      - Out-of-the-box support for JSON
      - Solr handles CSV, XML, and rich content out of the box, with no plugins to install
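As a rough illustration of feeding content to Solr as JSON, the sketch below converts CSV rows into a list of documents and builds an HTTP POST for an update endpoint. The host, port, collection name `demo`, and the sample rows are assumptions; the request is only constructed, never sent.

```python
import csv
import io
import json
import urllib.request

# Hypothetical sample data and collection name, used only for illustration.
CSV_TEXT = "id,title,category\n1,Solr in Action,books\n2,Lucene Basics,books\n"
UPDATE_URL = "http://localhost:8983/solr/demo/update?commit=true"


def csv_to_docs(csv_text):
    """Parse CSV text into a list of dicts, one per document."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_update_request(docs, url=UPDATE_URL):
    """Build (but do not send) an HTTP POST carrying the documents as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


docs = csv_to_docs(CSV_TEXT)
request = build_update_request(docs)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the example runs without a Solr server.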
  • 10. Indexing and Ranking
      - Solr uses an inverted index
      - For ranking, Solr uses TF-IDF through its Similarity component (the default before Solr 6, which switched to BM25)
      - Similarity combines the Boolean model (BM) and the Vector Space Model (VSM)
      - Users can also re-rank the results of a query
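The ideas on this slide can be sketched in a few lines. The code below builds a toy inverted index and ranks documents with a textbook TF-IDF sum. It is not Lucene's actual scoring formula, and the corpus and function names are assumptions. Documents that match no query term never enter the result, which mirrors the Boolean selection step; the matching documents are then ordered by weight, which mirrors the vector-space step.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus: document id -> list of already analyzed terms.
DOCS = {
    1: ["solr", "search", "server"],
    2: ["lucene", "search", "library"],
    3: ["solr", "solr", "facet"],
}


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def tf_idf_scores(index, num_docs, query_terms):
    """Score matching documents by summing tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in query_terms:
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = build_inverted_index(DOCS)
ranking = tf_idf_scores(index, len(DOCS), ["solr", "facet"])
print(ranking)
```

Document 3 ranks first because it contains "solr" twice and is the only document containing the rarer term "facet".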
  • 11. Query
      - Common parameters
          - start, rows, fl, fq, sort
          - Example: ?q=*:*&start=0&rows=10&fl=title&fq=collection:popular&sort=title asc
      - Slightly more advanced
          - &facet
          - &qf
          - Example: &qf=keyword^4 content1^8 content2^3 content3^2 stem1^1.5 stem2^1.2 stem3^0.5
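A query string like the ones on this slide is easier to get right when the parameters are percent-encoded programmatically. The sketch below does this with the standard library. The host, port, and collection name `demo` are assumptions, and the second call assumes the eDisMax query parser, which is where `qf` field boosts apply.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical host and collection name; the parameters come from the slide.
BASE_URL = "http://localhost:8983/solr/demo/select"


def build_select_url(base_url, **params):
    """Return a select URL with the parameters percent-encoded."""
    return base_url + "?" + urlencode(params)


url = build_select_url(
    BASE_URL,
    q="*:*",
    start=0,
    rows=10,
    fl="title",
    fq="collection:popular",
    sort="title asc",
)
print(url)

# Field boosts via qf; the search term "solr" is a made-up example.
boosted = build_select_url(
    BASE_URL, q="solr", defType="edismax", qf="keyword^4 content1^8"
)
print(boosted)

# Decode the query string again to confirm the round trip.
decoded = parse_qs(urlsplit(url).query)
print(decoded["sort"])
```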
  • 12. Facet
      “Faceted search is the dynamic clustering of items or search results into categories that let users drill into search results (or even skip searching entirely) by any value in any field.”
      - Navigation/discovery technique
      - Tally of docs for each distinct field value
      - Parameters
          - &facet=true
          - &facet.field=category
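The "tally of docs for each distinct field value" can be sketched without a search server. The result documents and field values below are made up, and the code only mimics what a facet on `category` reports. It does not reproduce Solr's implementation.

```python
from collections import Counter

# Hypothetical search results, each with a "category" field.
RESULTS = [
    {"id": "1", "category": "books"},
    {"id": "2", "category": "music"},
    {"id": "3", "category": "books"},
    {"id": "4", "category": "film"},
]


def facet_counts(docs, field):
    """Tally documents per distinct value of the field, highest count first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return counts.most_common()


def drill_down(docs, field, value):
    """Keep only the documents whose field equals the chosen facet value."""
    return [doc for doc in docs if doc.get(field) == value]


print(facet_counts(RESULTS, "category"))
print(drill_down(RESULTS, "category", "books"))
```

Drilling down corresponds to adding a filter on the chosen value and running the search again on the narrowed set.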
  • 13. API
      - REST API for adding field types and dynamic fields
      - Managing request handlers through the API
      - Improved APIs for managing collections
      - Implicit registration of the replication, Real Time Get and administration handlers
      - Out-of-the-box support for JSON
      - Solr handles CSV, XML, and rich content out of the box, with no plugins to install
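To give a feel for the REST API mentioned on this slide, the sketch below assembles a Schema API request body that adds one field and one dynamic field. The host, collection name `demo`, field names, and the assumption that the `text_general` and `string` field types exist in the target schema are all illustrative. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical host, collection, and field names, used only for illustration.
SCHEMA_URL = "http://localhost:8983/solr/demo/schema"


def schema_commands():
    """Return a command body that adds one field and one dynamic field."""
    return {
        "add-field": {"name": "title", "type": "text_general", "stored": True},
        "add-dynamic-field": {"name": "*_s", "type": "string", "stored": True},
    }


def build_schema_request(commands, url=SCHEMA_URL):
    """Build (but do not send) the POST request for the schema changes."""
    body = json.dumps(commands).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


request = build_schema_request(schema_commands())
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```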
  • 14. Scalability
      - Architecture goals:
          - More queries per second (qps)
          - Faster query execution
          - Bigger indexes
          - Faster indexing
      - Scaling options
          - Multicore
          - Replication
          - Sharding
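Two of the scaling options can be sketched conceptually: sharding splits the index by routing each document to one partition, and replication spreads queries over copies of the same partition. The code below is a simplified model, not Solr's router. CRC32 stands in for whatever hash a real deployment uses, and the replica names are made up.

```python
import itertools
import zlib


def shard_for(doc_id, num_shards):
    """Pick a shard by hashing the document id (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def partition(doc_ids, num_shards):
    """Group document ids by the shard they hash to."""
    shards = {shard: [] for shard in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


def replica_picker(replicas):
    """Return a function that hands out replicas in round-robin order."""
    cycle = itertools.cycle(replicas)
    return lambda: next(cycle)


shards = partition([f"doc-{n}" for n in range(10)], num_shards=3)
print(shards)

# Hypothetical replica names for one shard.
next_replica = replica_picker(["replica-a", "replica-b"])
print([next_replica() for _ in range(4)])
```

Sharding addresses the "bigger indexes" and "faster indexing" goals, while replication mainly addresses "more queries per second".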
  • 15. Useful commands
      - ./bin/solr {start|stop}
      - ./bin/solr create -c <COLL_NAME>
      - ./bin/post -c <COLL_NAME> <files to index>
      - ./bin/solr delete -c <COLL_NAME>