IMPLEMENTING SITE SEARCH IN CQ5/AEM
DEVANG SHAH, I-CUBED


Session Outline
• Importance of site search functionality
• CQ5 internal search workings & limitations
• Integrating CQ5 with external search engines & challenges
• Indexing patterns for integrating with external search engines
• Q&A


Site search is one of the core
functions of most websites



Browse vs. search: alternative ways
of letting visitors find the
information they need quickly and
easily



“90 percent of companies report
that search is the No.1 means of
navigation on their site”
-- Forrester Research

“82 percent of visitors use site
search to find the information they
need”
-- Juniper Research

Advances in search features allow
site visitors to:
• Auto-complete and auto-correct search terms
• Build advanced queries
• Filter results by facets
• Refine search results by location,
  preferences, previous history, etc.

“Visitors who used site search
were ‘more likely to convert from
browsers to buyers.’”
-- Juniper Research
• Jackrabbit internally uses Lucene to index repository content
• Whenever content is modified, the Lucene index is updated along with the repository
• Index location, under <crx-quickstart>/repository:
  • repository/index
  • workspaces/crx.default/index
• Index configuration:
  • <SearchIndex> block in repository.xml and workspace.xml
  • tika-config.xml in the workspaces folder
• Changes in the new version of Jackrabbit (3.x / Oak)
• Jackrabbit
  • JCR 1.0: supports XPath and JCR-SQL
  • JCR 2.0: supports JCR-SQL2; XPath is deprecated in JCR 2.0, but Jackrabbit still supports it
  • Both SQL and XPath queries are translated to the same search tree
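
To illustrate the point that SQL and XPath are two syntaxes for the same search, here is a minimal sketch that builds one full-text query in both forms. The helper names and the `/content/mysite` path are hypothetical, and the escaping only covers single quotes in the literal, not the full-text search syntax itself.

```python
def escape_literal(term: str) -> str:
    """Double single quotes so the term can sit inside a quoted literal."""
    return term.replace("'", "''")


def xpath_fulltext(root: str, node_type: str, term: str) -> str:
    """Build a JCR XPath full-text query for nodes of node_type under root."""
    return (
        f"/jcr:root{root}//element(*, {node_type})"
        f"[jcr:contains(., '{escape_literal(term)}')]"
    )


def sql2_fulltext(root: str, node_type: str, term: str) -> str:
    """Build the JCR-SQL2 form of the same full-text query."""
    return (
        f"SELECT * FROM [{node_type}] AS n "
        f"WHERE ISDESCENDANTNODE(n, '{root}') "
        f"AND CONTAINS(n.*, '{escape_literal(term)}')"
    )


if __name__ == "__main__":
    # '/content/mysite' is an assumed site root, used only for illustration.
    print(xpath_fulltext("/content/mysite", "cq:Page", "search"))
    print(sql2_fulltext("/content/mysite", "cq:Page", "search"))
```

Either string can be passed to the JCR query manager with the matching language identifier; which one to prefer is mostly a matter of team convention.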

• QueryBuilder is an API for building queries for a query engine
• CQ provides several OOTB components and extensions that leverage the QueryBuilder API for full-text or predicate-based searches
• The OOTB Search component supports full-text queries and enhanced search features: similar pages, facets, pagination, etc.
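
A predicate-based QueryBuilder search can be sketched as a set of key/value pairs sent to the QueryBuilder JSON servlet. The sketch below only builds the request URL; the host and content path are assumptions, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def querybuilder_url(host: str, path: str, term: str, limit: int = 10) -> str:
    """Build a URL for the QueryBuilder JSON servlet from a predicate map."""
    predicates = {
        "path": path,            # restrict the search to a subtree
        "type": "cq:Page",       # only return page nodes
        "fulltext": term,        # full-text predicate
        "p.limit": str(limit),   # page size of the result set
    }
    return f"{host}/bin/querybuilder.json?{urlencode(predicates)}"


if __name__ == "__main__":
    # Host and path are placeholders for a local instance.
    print(querybuilder_url("http://localhost:4502", "/content/mysite", "site search"))
```

The same predicate map can be handed to the Java QueryBuilder API inside a component, which is why it is convenient to prototype queries against the servlet first.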


Use Case: Non-CQ Content Sources

Use Case: Author vs. Visitor Search Patterns
• CQ generates one index per server
• Author and visitor search patterns and requirements are typically different

Performance & Architecture Considerations
• Larger sites with more than one source of content and assets
• Difficult to index non-CQ content
• A large number of queries and search variations makes it difficult to utilize the CQ caching architecture
• The Jackrabbit layer on top of Lucene may slow down search and query performance
• Scaling of the search architecture depends on the CQ architecture

Customizations
• Utilizing different content parsers, index tuning, etc. (mitigated in 5.6.1)
• Can I use a newer version of Lucene?
• How can I extend the Jackrabbit search implementation?


External Search Platforms
• Search providers with crawlers (examples):
  ▪ Google Search Appliance
  ▪ Microsoft FAST
• Non-crawler search providers (examples):
  ▪ Endeca
  ▪ Lucene/Solr



• Enables independent scaling of the search platform
• Supports more than one content source
• Configuration and customization of the search application are decoupled from the CQ5 application
• May provide more advanced search features (faceted search, geospatial search, personalization, etc.)


Challenges building & managing search indexes
• Building the site index: crawl, or query and inject?
• How often should the index be rebuilt?
• How to ensure that content and metadata in the content sources and the search index are always in sync?
• With multiple data sources, how to manage duplicates, index structure and a common metadata model?


Challenges querying & building search results
• Should the search results page be hosted on the provider’s platform or within CQ?
• Does the search provider offer an extended API to query and build search results within the application?


Integration Notes:
• GSA, FAST site crawler, Endeca’s plugin for CRX indexing, Solr via open-source crawlers (Nutch, etc.)
• May require a custom service that returns data (for example, for Solr or Endeca)


Pros:
⁻ Ease of implementation
⁻ Indexes the rendered version of the pages

Cons:
⁻ Lag between content publishing and the index update may result in out-of-sync search results. Also, what happens to deleted content?
⁻ Larger index crawl and build times
⁻ Search index doesn’t have the complete set of metadata


Example – CQ / FAST connector (available via service pack)



Pros:
⁻ Search index always in sync with the content repository
⁻ Ability to send metadata with content
⁻ Customizable data formats; allows partial indexing of a page

Cons:
⁻ Requires custom development effort
⁻ Indexes content instead of the rendered version of the pages
⁻ System performance / event handling
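
As a rough sketch of the push pattern described above, the function below maps a replication event to a Solr JSON update command. The event shape (`action`, `path`, `properties`) is a hypothetical simplification of what a custom event handler would extract; only the `add`/`doc` and `delete`/`id` command structure comes from Solr's JSON update format.

```python
import json


def to_update_command(event: dict) -> dict:
    """Translate a (hypothetical) replication event into a Solr update command."""
    action = event["action"]
    if action == "activate":
        # Send the page path as the document id, plus any metadata supplied.
        doc = {"id": event["path"], **event.get("properties", {})}
        return {"add": {"doc": doc}}
    if action in ("deactivate", "delete"):
        # Removed or unpublished content must leave the index as well.
        return {"delete": {"id": event["path"]}}
    raise ValueError(f"unsupported replication action: {action}")


if __name__ == "__main__":
    event = {
        "action": "activate",
        "path": "/content/mysite/en/products",
        "properties": {"title": "Products", "tags": ["catalog"]},
    }
    # The JSON body would be posted to the search engine's update endpoint.
    print(json.dumps(to_update_command(event)))
```

Handling deactivation in the same handler is what keeps the index in sync for deleted content, at the cost of doing work on every replication event.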


Pros:
⁻ Search index (mostly) in sync with the content repository
⁻ Ability to send metadata with content
⁻ Customizable data formats; allows partial indexing of a page
⁻ Minimal replication event processing

Cons:
⁻ Requires custom development effort
⁻ Search index may get out of sync with the content repository (but only for a shorter duration)
⁻ Indexes content instead of the rendered version of the pages
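
The slides do not spell out the mechanism behind "minimal replication event processing". One plausible reading, sketched here as an assumption, is that the event handler only records changed paths and a scheduled job later processes them in a batch. The event shape and function name are hypothetical.

```python
from typing import Dict, Iterable, List


def coalesce(events: Iterable[dict]) -> List[dict]:
    """Keep only the most recent queued event for each content path."""
    latest: Dict[str, dict] = {}
    for event in events:
        # A later event for the same path replaces the earlier one.
        latest[event["path"]] = event
    return list(latest.values())


if __name__ == "__main__":
    queued = [
        {"path": "/content/mysite/en/a", "action": "activate"},
        {"path": "/content/mysite/en/b", "action": "activate"},
        {"path": "/content/mysite/en/a", "action": "delete"},
    ]
    # A scheduled job would drain the queue and index this reduced batch.
    for event in coalesce(queued):
        print(event["action"], event["path"])
```

Batching this way trades a short window of staleness for far fewer index updates, which matches the "mostly in sync" wording above.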


Handling initial content load & index creation
• With a content push approach, how will the initial index be generated? An initial baseline may need to be created via a site crawl or a custom service
• With a content pull approach, how will the index reflect deleted or moved site pages?
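
One simple way to answer the deleted-or-moved question for a pull approach is to compare the ids already in the index with the paths currently published. This is a minimal sketch under that assumption; how the two lists are obtained is left out, and the function name is hypothetical.

```python
from typing import Dict, Iterable, List


def plan_sync(indexed_ids: Iterable[str], live_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Work out which index entries are stale and which pages are missing."""
    indexed, live = set(indexed_ids), set(live_paths)
    return {
        "delete": sorted(indexed - live),  # in the index but no longer published
        "add": sorted(live - indexed),     # published but not yet indexed
    }


if __name__ == "__main__":
    plan = plan_sync(
        indexed_ids=["/content/mysite/en/old", "/content/mysite/en/home"],
        live_paths=["/content/mysite/en/home", "/content/mysite/en/new"],
    )
    print(plan)
```

A moved page shows up as a delete of the old path plus an add of the new one, so no separate "move" case is needed. The same comparison against an empty index yields the initial baseline.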


Permission-sensitive site pages & assets
• Option 1: Export ACLs to the search provider (example: CQ/FAST connector)
• Option 2: Check user permissions via CQ at run time (similar to how CQ handles delivery of content in the case of closed user groups)
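
Option 2 can be sketched as a post-filter over the hits returned by the search provider. The `can_read` callback stands in for whatever permission check CQ exposes for the current user; it and the sample ACL table are assumptions for illustration only.

```python
from typing import Callable, Iterable, List


def filter_readable(hits: Iterable[str], can_read: Callable[[str], bool], limit: int) -> List[str]:
    """Return up to `limit` hit paths that the current user may read."""
    visible: List[str] = []
    for path in hits:
        if can_read(path):
            visible.append(path)
            if len(visible) == limit:
                break  # stop early once the page of results is full
    return visible


if __name__ == "__main__":
    # Hypothetical permission table for a visitor outside the closed user group.
    readable = {"/content/mysite/en/home": True, "/content/mysite/en/members": False}
    hits = ["/content/mysite/en/members", "/content/mysite/en/home"]
    print(filter_readable(hits, lambda p: readable.get(p, False), limit=10))
```

Because hidden hits are dropped after the search, the provider has to return more results than one page needs, and total counts and facets may still reflect content the user cannot see.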



Referenced assets, content pages and promos
• Option: Query referenced pages and index them. This may cause performance (and recursive indexing) issues, though.
• Option: Selective content indexing (index parts of a page instead of the entire page)
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Implementing Site Search in CQ5/AEM

  • 1. IMPLEMENTING SITE SEARCH IN CQ5/AEM DEVANG SHAH, I-CUBED
  • 2. Session Outline
      • Importance of site search functionality
      • CQ5 internal search workings & limitations
      • Integrating CQ5 with external search engines & challenges
      • Indexing patterns for integrating with external search engines
      • Q&A
  • 3. Site search is one of the core functions of most websites
      • Browse vs. search: alternate methods of letting visitors find the information they need quickly and easily
      • “90 percent of companies report that search is the No. 1 means of navigation on their site” -- Forrester Research
      • “82 percent of visitors use site search to find the information they need” -- Juniper Research
      • Advances in search features allow site visitors to:
          ▪ Auto-complete/auto-correct search terms
          ▪ Build advanced queries
          ▪ Filter results by facets
          ▪ Refine search results by location, preferences, previous history, etc.
      • Visitors who used site search were “more likely to convert from browsers to buyers” -- Juniper Research
  • 4. Jackrabbit internally uses Lucene to index repository content
      • Whenever content is modified, the Lucene index is updated along with the repository
      • Index location, under <crx-quickstart>/repository:
          ▪ repository/index
          ▪ workspaces/crx.default/index
      • Index configuration:
          ▪ <SearchIndex> block in repository.xml & workspace.xml
          ▪ tika-config.xml in the workspaces folder
      • Changes in new version of Jackrabbit (3.x / Oak)
  • 5. Jackrabbit
      • JCR Spec 1.0: support for XPath & JCR SQL1
      • JCR Spec 2.0: support for JCR SQL2. XPath is deprecated in JCR 2.0, but Jackrabbit still supports it
      • Both SQL & XPath queries are translated to the same search tree
      • Query Builder is an API to build queries for a query engine
      • CQ provides several OOTB components & extensions that leverage the QueryBuilder API for full-text or predicate-based searches
      • The OOTB Search component supports full-text queries and enhanced search features: similar pages, facets, pagination, etc.
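As a rough illustration of a predicate-based QueryBuilder search, the sketch below assembles a request URL for the QueryBuilder JSON servlet. The host, the content root `/content/mysite`, and the helper name `build_querybuilder_url` are assumptions made for the example; only the predicate names (`path`, `type`, `fulltext`, `p.limit`, `p.offset`) are standard QueryBuilder predicates.

```python
from urllib.parse import urlencode


def build_querybuilder_url(host, term, root_path, limit=10, offset=0):
    """Return a QueryBuilder JSON servlet URL for a full-text page search."""
    predicates = {
        "path": root_path,   # restrict the search to one content subtree
        "type": "cq:Page",   # match pages only
        "fulltext": term,    # full-text predicate
        "p.limit": limit,    # page size
        "p.offset": offset,  # index of the first hit, used for pagination
    }
    return f"{host}/bin/querybuilder.json?{urlencode(predicates)}"


# Hypothetical author instance and site root, for illustration only.
url = build_querybuilder_url("http://localhost:4502", "site search", "/content/mysite")
print(url)
```

Each distinct combination of predicates yields a distinct URL, which is one reason many query variations are hard to serve from a page cache.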
  • 6.
      • Use case: non-CQ content sources
          ▪ Larger sites with more than one source of content and assets
          ▪ Difficult to index non-CQ content
      • Use case: author vs. visitor search patterns
          ▪ CQ generates one index per server
          ▪ Author and visitor search patterns and requirements are typically different
      • Performance & architecture considerations
          ▪ ‘n’ number of queries and search variations make it difficult to utilize the CQ caching architecture
          ▪ The Jackrabbit layer on top of Lucene may slow down search and query performance
          ▪ Scaling of the search architecture depends on the CQ architecture
      • Customizations
          ▪ Utilizing different content parsers, index tuning, etc. (mitigated in 5.6.1)
          ▪ Can I use a newer version of Lucene?
          ▪ How can I extend the Jackrabbit search implementation?
  • 7. External Search Platforms
      • Search providers with crawlers (examples):
          ▪ Google Search Appliance
          ▪ Microsoft FAST
      • Non-crawler search providers (examples):
          ▪ Endeca
          ▪ Lucene/Solr
      • Enables independent scaling of the search platform
      • Supports more than one content source
      • Configuration & customization of the search application is decoupled from the CQ5 application
      • May provide more advanced search features (faceted search, geospatial search, personalization, etc.)
  • 8.
      • Challenges building & managing search indexes
          ▪ Building the site index: crawl, or query & inject?
          ▪ How often should the index be rebuilt?
          ▪ How to ensure that content & metadata are always in sync between the content sources and the search index?
          ▪ With multiple data sources, how to manage duplicates, index structure, and a common metadata model?
      • Challenges querying & building search results
          ▪ Should the search results page be hosted on the provider’s platform or within CQ?
          ▪ Does the search provider offer an extended API to query and build search results within the application?
  • 9.
  • 10.
      • Integration notes:
          ▪ GSA, FAST site crawler, Endeca’s plugin for CRX indexing, Solr via open-source crawlers (Nutch, etc.)
          ▪ May require a custom service which returns data (for example, for Solr or Endeca)
      • Pros:
          ▪ Ease of implementation
          ▪ Indexes the rendered version of the pages
      • Cons:
          ▪ Lag between content publishing and the index update may leave search results out of sync. Also, what happens to deleted content?
          ▪ Larger index crawl and build times
          ▪ Search index doesn’t have the complete set of metadata
  • 11.
  • 12.
      • Example: CQ / FAST connector (available via service pack)
      • Pros:
          ⁻ Search index always in sync with the content repository
          ⁻ Ability to send metadata with content
          ⁻ Customizable data formats; allows partial indexing of a page
      • Cons:
          ⁻ Will require custom development effort
          ⁻ Indexes content instead of the rendered version of the pages
          ⁻ System performance / event handling
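A minimal sketch of the push idea, assuming the target is Solr and that each replication event arrives as a simple `{"action", "path"}` record: an activation becomes an `add` command carrying content plus metadata, and a deactivation or deletion becomes a `delete`. The event shape, the field names in the document, and the `load_page` callback are hypothetical; the `add`/`delete` envelope follows Solr's JSON update format. This is not the CQ / FAST connector's own implementation.

```python
import json


def to_index_command(event, load_page):
    """Translate one replication event into a Solr-style JSON update command."""
    action, path = event["action"], event["path"]
    if action == "ACTIVATE":
        page = load_page(path)  # hypothetical callback that reads the page content
        doc = {
            "id": path,                    # the repository path doubles as the document id
            "title": page["title"],
            "text": page["text"],
            "tags": page.get("tags", []),  # metadata travels with the content
        }
        return {"add": {"doc": doc}}
    if action in ("DEACTIVATE", "DELETE"):
        return {"delete": {"id": path}}
    return None  # other event types do not affect the index


# Stand-in for reading a page from the repository.
def fake_load_page(path):
    return {"title": "Home", "text": "Welcome to the site"}


command = to_index_command({"action": "ACTIVATE", "path": "/content/mysite/en"}, fake_load_page)
print(json.dumps(command, indent=2))
```

Because every event is translated as it happens, the index stays in step with the repository, at the price of doing this work inside event handling.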
  • 13.
  • 14.
  • 15.
      • Pros:
          ⁻ Search index (mostly) in sync with the content repository
          ⁻ Ability to send metadata with content
          ⁻ Customizable data formats; allows partial indexing of a page
          ⁻ Minimal replication event processing
      • Cons:
          ⁻ Will require custom development effort
          ⁻ Search index may get out of sync with the content repository (but for a shorter duration only)
          ⁻ Indexes content instead of the rendered version of the pages
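The transcript does not name the pattern these pros and cons belong to. Assuming they describe an approach where the replication handler only records which paths changed and a separate job later drains that record in batches, the sketch below shows how such a queue could keep event handling minimal. The class name `IndexQueue` and its methods are hypothetical.

```python
from collections import OrderedDict


class IndexQueue:
    """Records changed paths cheaply; a scheduled job drains them later."""

    def __init__(self):
        self._pending = OrderedDict()  # path -> latest action, in arrival order

    def record(self, action, path):
        # Only the most recent action per path matters, so older entries
        # for the same path are replaced and moved to the end.
        self._pending.pop(path, None)
        self._pending[path] = action

    def drain(self):
        # Hand the whole batch to the indexer and start over.
        batch = list(self._pending.items())
        self._pending.clear()
        return batch


queue = IndexQueue()
queue.record("ACTIVATE", "/content/mysite/en/a")
queue.record("ACTIVATE", "/content/mysite/en/b")
queue.record("DELETE", "/content/mysite/en/a")  # supersedes the earlier activation
batch = queue.drain()
print(batch)
```

Between two drains the index lags behind the repository, which matches the "(mostly) in sync" trade-off listed above.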
  • 16.
      • Handling initial content load & index creation
          ▪ With the content push approach, how will the initial index be generated? May need to create an initial baseline via site crawl or a custom service
          ▪ With the content pull approach, how will the index reflect deleted or moved site pages?
      • Permission-sensitive site pages & assets
          ▪ Option 1: Export ACLs to the search provider (example: CQ/FAST connector)
          ▪ Option 2: Check user permission via CQ at run time (similar to how CQ handles delivery of content in case of closed user groups)
      • Referenced assets, content pages and promos
          ▪ Option: Query referenced pages and index them. May cause performance (& recursive index) issues, though.
          ▪ Option: Selective content indexing (index parts of a page instead of the entire page)
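Option 2 for permission-sensitive content can be pictured as a filter applied to the raw hits before the results page is built. The sketch below assumes the search provider returns hits as dictionaries with a `path` key and that a `can_read(user, path)` callback asks CQ whether the user may see that path; both are hypothetical stand-ins, not a CQ API.

```python
def filter_readable(hits, can_read, user, page_size):
    """Keep only hits the user may read, stopping once one page is full."""
    visible = []
    for hit in hits:
        if can_read(user, hit["path"]):
            visible.append(hit)
            if len(visible) == page_size:
                break  # avoid permission checks for hits that will not be shown
    return visible


# Stand-in ACL table; a real check would be delegated to CQ at run time.
ACL = {
    "/content/mysite/en/public": {"anonymous", "alice"},
    "/content/mysite/en/members": {"alice"},
}


def can_read(user, path):
    return user in ACL.get(path, set())


hits = [{"path": "/content/mysite/en/members"}, {"path": "/content/mysite/en/public"}]
anonymous_hits = filter_readable(hits, can_read, "anonymous", page_size=10)
print(anonymous_hits)
```

One consequence of filtering after the query is that hit counts and facet totals reported by the search provider may include documents the user cannot see.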