Search, APIs, Capability Management and the Sensis Journey

Craig Rees

•    Project background

•    Platform selection

•    Search capability

•    Relevance

•    Architecture

•    Quality management

•    Hurdles

•    What’s next


    Today’s menu
• Sensis helps Australians find, buy and sell

• From print directories to a cross-platform lead generator

• Sensis publishes over 1.8 million business listings

• Two of the top 10 most-visited websites in Australia (WhitePages.com.au and YellowPages.com.au)


Sensis
Business objectives
•   Drive presence in the local search marketplace
•   Open up the largest database of business listings in Australia
•   Reduce the effort required from local search developers
•   Free to use; we are after the reporting

Technology objectives
•   Develop a total search platform
•   Relevancy testing as part of the development lifecycle
•   A framework to identify problem spaces
•   Manageable platform
•   Continuous deployments


Project background
Developer portal
•   Support for the search capability team

•   Structured vs unstructured data

•   Deterministic vs black box

•   Non-proprietary code base

•   Community backing




    Platform selection
Lvl 5  Optimized
• A/B testing
• Machine learning
• External collaboration
• Multiple contexts

Lvl 4  Managed
• Online dashboards
• Test environments
• Dynamic search refinements
• Targets and metrics

Lvl 3  Monitored
• Defined team
• Regular monitoring
• Static autosuggest
• Basic linguistics

Lvl 2  Ad hoc
• Ad hoc processes
• Part-time team
• Static dictionaries
• Individual-led innovation

Lvl 1  Unmanaged
• No resources
• No reporting
• Out-of-the-box features




The Sensis Search capability maturity model
*Courtesy of Pete Crawford & Craig Lonsdale
[Diagram: dimensions of search context]
• Location
• Intent (Name, Type, Product, Spatial)
• Chronology
• Social Graph
• Device
• Individual



Context is key
[Architecture diagram; components shown:]
• Business Data (two sources)
• Historical search Data
• Ontologies
• Geo Service
• MongoDB
• Index
• Solr: Name Query Handler, Type Query Handler
• API
• Mashery: Search Service, Reporting Service
• Publisher
• Reporting Events




Our architecture
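
The diagram shows two Solr query handlers, one for name queries and one for type queries, but the slides do not say how a query is routed between them. The following is only a minimal sketch of one way such routing could work. The vocabulary `KNOWN_TYPES`, the handler paths `/name` and `/type`, and the base URL are all hypothetical and do not come from the talk.

```python
from urllib.parse import urlencode

# Hypothetical vocabulary of business types. A real system would more likely
# draw on the ontologies shown in the diagram.
KNOWN_TYPES = {"plumber", "electrician", "restaurant", "florist"}

# Hypothetical Solr request handler paths, one per intent.
HANDLERS = {"name": "/name", "type": "/type"}


def classify_intent(query: str) -> str:
    """Return "type" if every token is a known business type, else "name"."""
    tokens = query.lower().split()
    if tokens and all(token in KNOWN_TYPES for token in tokens):
        return "type"
    return "name"


def build_solr_url(base: str, query: str) -> str:
    """Build a request URL for the handler that matches the query's intent."""
    handler = HANDLERS[classify_intent(query)]
    return f"{base}{handler}?{urlencode({'q': query, 'wt': 'json'})}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr/listings"  # assumed core location
    print(build_solr_url(base, "plumber"))
    print(build_solr_url(base, "Acme Plumbing"))
```

Keeping the intent decision outside the handlers means each handler can be tuned on its own, which fits the goal of a manageable platform.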
[Same architecture diagram as above]




Data staging
[Same architecture diagram as above]




Search
[Same architecture diagram as above]




API
[Same architecture diagram as above]




API proxy
[Timeline: Yesterday, Today, Tomorrow]
• Moved from a black box solution to a manageable platform
• Deliver search improvements without major code changes
• Understand how results were calculated
• Identify problems scientifically
• Continuously tune and test relevance




  Evolution of search management
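
One way to "understand how results were calculated" in Solr is its standard `debugQuery=true` request parameter, which adds a per-document score explanation to the response. The slides do not say whether Sensis used it. The sketch below only builds such a request URL; the core URL and the field list are assumptions.

```python
from urllib.parse import urlencode


def explain_url(select_url: str, query: str, rows: int = 5) -> str:
    """Build a Solr select URL that asks for score explanations."""
    params = {
        "q": query,
        "rows": rows,
        "fl": "id,score",      # assumed field names
        "debugQuery": "true",  # standard Solr parameter
        "wt": "json",
    }
    return f"{select_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed core location; adjust to the actual deployment.
    print(explain_url("http://localhost:8983/solr/listings/select", "plumber"))
```

In the JSON response, the explanations appear under `debug` and then `explain`, keyed by document id.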
• Path analysis used to identify problem spaces
• Specific gold sets for each problem space:
    • Intent
    • Spelling & stemming
    • Location
    • Phrase parsing
• “Gold sets” used to define an overall quality score (TREC)
• Features signed off only when they have a positive impact on the quality score



Problem spaces, quality management & tuning
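
The sign-off rule above can be sketched in a few lines. The slides mention TREC but do not name the measure used, so this sketch assumes mean average precision, a common TREC-style metric. The names `GoldSet`, `quality_score` and `sign_off`, and the sample data, are illustrative only.

```python
from typing import Callable, Dict, List, Set

GoldSet = Dict[str, Set[str]]          # query -> ids judged relevant
SearchFn = Callable[[str], List[str]]  # query -> ranked result ids


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one query; assumes no duplicate ids in ranked."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def quality_score(search: SearchFn, gold: GoldSet) -> float:
    """Mean average precision of a search function over a gold set."""
    if not gold:
        return 0.0
    scores = [average_precision(search(q), rel) for q, rel in gold.items()]
    return sum(scores) / len(scores)


def sign_off(baseline: SearchFn, candidate: SearchFn, gold: GoldSet) -> bool:
    """Accept a feature only if it raises the quality score."""
    return quality_score(candidate, gold) > quality_score(baseline, gold)


if __name__ == "__main__":
    # Made-up gold set and rankings, for illustration only.
    gold = {"plumber melbourne": {"b1", "b2"}, "pizza": {"b3"}}
    old = {"plumber melbourne": ["b9", "b1", "b2"], "pizza": ["b3"]}
    new = {"plumber melbourne": ["b1", "b2", "b9"], "pizza": ["b3"]}
    print(sign_off(old.__getitem__, new.__getitem__, gold))
```

Running one gold set per problem space (intent, spelling, location, phrase parsing) would show which space a change helps or hurts.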
Search quality analysis and testing
Results examiner
Score analysis
Tuning
Lather, rinse, repeat
• Data redundancy and homogeneity
• Solr ranking of rare terms
• Intent differentiation
• Contextual synonyms
                   • Contextual synonyms




Hurdles along the way
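
The slides do not explain the rare-terms hurdle. One likely factor is inverse document frequency (IDF): in classic Lucene TF-IDF scoring, a term that appears in few documents gets a much higher weight than a common one. The sketch below assumes the classic Lucene formula, idf = 1 + ln(numDocs / (docFreq + 1)). The index size echoes the 1.8 million listings mentioned earlier, and the document frequencies are made up.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Classic Lucene-style inverse document frequency."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


if __name__ == "__main__":
    num_docs = 1_800_000  # roughly the number of listings cited in the talk
    for term, doc_freq in [("rare term", 5), ("common term", 50_000)]:
        # Document frequencies here are illustrative, not measured.
        print(term, round(classic_idf(doc_freq, num_docs), 2))
```

A misspelled or unusual token in a business name can therefore outweigh the terms that carry the user's actual intent.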
•   Query engine
•   Facets / autosuggest
•   Real-time tuning
•   Machine learning
•   Multi-term queries
•   Scoring thresholds
•   Content value




Where next?
Email: craig.rees@sensis.com.au
www: developers.sensis.com.au
Twitter: @SensisAPI, @ablebagel




Questions?
