SlideShare uma empresa Scribd logo
1 de 18
Baixar para ler offline
Search + Big Data:
It’s (still) All About the User
Grant Ingersoll, Chief Scientist – Lucid Imagination
           grant@lucidimagination.com
                 October 19, 2011
Promise and Reality

  “Data is increasingly digital air: the oxygen we
 breathe and the carbon dioxide that we exhale. It
can be a source of both sustenance and pollution.”
            Six Provocations for Big Data
          by Danah Boyd and Kate Crawford



  “The truth is, I spend most of my time trying to
reduce the size of my data so it can be analyzed.”
      Hilary Mason, Chief Scientist, Bitly @ Strata
Pragmatism
Evolution
                           Documents
                           • Models
                           • Feature Selection




                                                 User
                                                 Interaction
       Content
                                                 • Clicks
       Relationships                             • Ratings/
       • Page Rank, etc.                          Reviews
       • Organization                            • Learning to
                                                  Rank
                                                 • Social Graph




                               Queries
                               • Phrases
                               • NLP
Minding the Intersection


                   Search




       Analytics            Discovery
Benefits
§  End users
   •  Better relevance/conversion
   •  Serendipity
   •  Better/faster insight


§  Business:
   •    ROI
   •    Awareness across organization
   •    Enablement
   •    Agility
Needs
§  Fast, efficient, scalable search
§  Large scale, cost effective storage
§  Processing Power:
   •  Large scale distributed for whole data consumption
   •  Streaming/In Memory for real time needs
   •  Ability to learn


§  Willingness to ask questions
The Good News
Search
§  Good scalable, search a given
   •  Talks: Chitouras, Sturlese, Binns, Miller


§  Custom Relevancy via function queries, boosts
§  Explore other relevance models
   •  Talks: Muir, Pugh
   •  Lucene/Solr trunk has pluggable scoring (BM25, etc.)


§  NRT for timeliness
   •  Talks: Busch
Discovery
Facets
  •  Talks: Yonik
  •  Classification, Taxonomy
Clustering
  •  Talk: Frank S.
Suggestions
  •  Auto-suggest, Spelling,
     More Like This,
     Related Searches, search trails
Visualization
Analytics
Analytics for End Users
Offline                         Online
   •    Popularity/Click          •  Trends/Stats
   •    Link Analysis
   •    Search Trails             •  Social/Personal
   •    Recommendations
   •    Spellchecking weights     •  Location
   •    Collocations


                                         STORM
Analytics for Internal Users
Offline                         Online
   •    Top X                     •  Trends
   •    Zero results
   •    MRR, MAP                  •  Operational alerts
   •    User segmentation            (QPS,
   •    Location, conversions        DPS, etc)
   •    Ad hoc Analysis


                                   GIRAPH
What’s Missing?
§  The glue is up to you (us?)
   •    Lucene Index -> Pig/Others
   •    Mahout -> Pig/Others
   •    Mahout -> Lucene/Solr
   •    Logs -> Pig/Others


§  Nice to have:
   •  More in-index functionality (that performs)
         §  Aggregations
         §  Arbitrary stats
         §  Complex Joins
What’s Next?

“I can have all the data I want to have – but I still
  have to communicate it to our players. It has to
  get into their minds. And they have to utilize it. ”
        Brad Stevens, Head Basketball Coach,
     Butler University in Oct. ‘11 McKinsey Quarterly
Thanks!


§  http://www.lucidimagination.com

§  @gsingers

§  grant@lucidimagination.com

§  stump@lucene-eurocon.com
Lucene Ecosystem




               Spark   Storm
              Giraph
Lucene Ecosystem




               Spark   Storm
              Giraph

Mais conteúdo relacionado

Semelhante a Search + Big Data: It's (still) All About the User- Grant Ingersoll

IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting FindingsIxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting FindingsJieyun Yang
 
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14Robert Stribley
 
Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14Robert Stribley
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive
 
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...Yandex
 
Building Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media AnalysisBuilding Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media AnalysisOpen Analytics
 
Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16Robert Stribley
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information RetrievalRoi Blanco
 
Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16Robert Stribley
 
Warming Up to Analytics
Warming Up to AnalyticsWarming Up to Analytics
Warming Up to AnalyticsLewandog, Inc,
 
Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16Robert Stribley
 
Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17Robert Stribley
 
Share point 2013 the way to go...
Share point 2013 the way to go...Share point 2013 the way to go...
Share point 2013 the way to go...K.Mohamed Faizal
 
Alla ricerca della User Story perduta
Alla ricerca della User Story perdutaAlla ricerca della User Story perduta
Alla ricerca della User Story perdutaEdoardo Schepis
 
Alla ricerca della user story perduta
Alla ricerca della user story perdutaAlla ricerca della user story perduta
Alla ricerca della user story perdutaBetter Software
 
ASA conference Feb 2013
ASA conference Feb 2013ASA conference Feb 2013
ASA conference Feb 2013mrkwr
 
IA breakfast briefing apr12 upload
IA breakfast briefing apr12 uploadIA breakfast briefing apr12 upload
IA breakfast briefing apr12 uploadRoss Philip
 
UXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San FranciscoUXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San FranciscoChris Farnum
 
Evaluating search engines
Evaluating search enginesEvaluating search engines
Evaluating search enginesPhil Bradley
 

Semelhante a Search + Big Data: It's (still) All About the User- Grant Ingersoll (20)

Duncan product tank
Duncan product tankDuncan product tank
Duncan product tank
 
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting FindingsIxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
 
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
 
Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
 
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
 
Building Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media AnalysisBuilding Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media Analysis
 
Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
 
Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16
 
Warming Up to Analytics
Warming Up to AnalyticsWarming Up to Analytics
Warming Up to Analytics
 
Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16
 
Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17
 
Share point 2013 the way to go...
Share point 2013 the way to go...Share point 2013 the way to go...
Share point 2013 the way to go...
 
Alla ricerca della User Story perduta
Alla ricerca della User Story perdutaAlla ricerca della User Story perduta
Alla ricerca della User Story perduta
 
Alla ricerca della user story perduta
Alla ricerca della user story perdutaAlla ricerca della user story perduta
Alla ricerca della user story perduta
 
ASA conference Feb 2013
ASA conference Feb 2013ASA conference Feb 2013
ASA conference Feb 2013
 
IA breakfast briefing apr12 upload
IA breakfast briefing apr12 uploadIA breakfast briefing apr12 upload
IA breakfast briefing apr12 upload
 
UXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San FranciscoUXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San Francisco
 
Evaluating search engines
Evaluating search enginesEvaluating search engines
Evaluating search engines
 

Mais de lucenerevolution

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucenelucenerevolution
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! lucenerevolution
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solrlucenerevolution
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloudlucenerevolution
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusterslucenerevolution
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiledlucenerevolution
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs lucenerevolution
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Stormlucenerevolution
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?lucenerevolution
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APIlucenerevolution
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenallucenerevolution
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside downlucenerevolution
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...lucenerevolution
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - finallucenerevolution
 

Mais de lucenerevolution (20)

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucene
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here!
 
Search at Twitter
Search at TwitterSearch at Twitter
Search at Twitter
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
 

Último

IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 

Último (20)

IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 

Search + Big Data: It's (still) All About the User- Grant Ingersoll

  • 1. Search + Big Data: It’s (still) All About the User Grant Ingersoll, Chief Scientist – Lucid Imagination grant@lucidimagination.com October 19, 2011
  • 2. Promise and Reality “Data is increasingly digital air: the oxygen we breathe and the carbon dioxide that we exhale. It can be a source of both sustenance and pollution.” Six Provocations for Big Data by Danah Boyd and Kate Crawford “The truth is, I spend most of my time trying to reduce the size of my data so it can be analyzed.” Hilary Mason, Chief Scientist, Bitly @ Strata
  • 4. Evolution Documents • Models • Feature Selection User Interaction Content • Clicks Relationships • Ratings/ • Page Rank, etc. Reviews • Organization • Learning to Rank • Social Graph Queries • Phrases • NLP
  • 5. Minding the Intersection Search Analytics Discovery
  • 6. Benefits §  End users •  Better relevance/conversion •  Serendipity •  Better/faster insight §  Business: •  ROI •  Awareness across organization •  Enablement •  Agility
  • 7. Needs §  Fast, efficient, scalable search §  Large scale, cost effective storage §  Processing Power: •  Large scale distributed for whole data consumption •  Streaming/In Memory for real time needs •  Ability to learn §  Willingness to ask questions
  • 9. Search §  Good scalable, search a given •  Talks: Chitouras, Sturlese, Binns, Miller §  Custom Relevancy via function queries, boosts §  Explore other relevance models •  Talks: Muir, Pugh •  Lucene/Solr trunk has pluggable scoring (BM25, etc.) §  NRT for timeliness •  Talks: Busch
  • 10. Discovery Facets •  Talks: Yonik •  Classification, Taxonomy Clustering •  Talk: Frank S. Suggestions •  Auto-suggest, Spelling, More Like This, Related Searches, search trails Visualization
  • 12. Analytics for End Users Offline Online •  Popularity/Click •  Trends/Stats •  Link Analysis •  Search Trails •  Social/Personal •  Recommendations •  Spellchecking weights •  Location •  Collocations STORM
  • 13. Analytics for Internal Users Offline Online •  Top X •  Trends •  Zero results •  MRR, MAP •  Operational alerts •  User segmentation (QPS, •  Location, conversions DPS, etc) •  Ad hoc Analysis GIRAPH
  • 14. What’s Missing? §  The glue is up to you (us?) •  Lucene Index -> Pig/Others •  Mahout -> Pig/Others •  Mahout -> Lucene/Solr •  Logs -> Pig/Others §  Nice to have: •  More in-index functionality (that performs) §  Aggregations §  Arbitrary stats §  Complex Joins
  • 15. What’s Next? “I can have all the data I want to have – but I still have to communicate it to our players. It has to get into their minds. And they have to utilize it. ” Brad Stevens, Head Basketball Coach, Butler University in Oct. ‘11 McKinsey Quarterly
  • 16. Thanks! §  http://www.lucidimagination.com §  @gsingers §  grant@lucidimagination.com §  stump@lucene-eurocon.com
  • 17. Lucene Ecosystem Spark Storm Giraph
  • 18. Lucene Ecosystem Spark Storm Giraph