GContext: A context-based query
 construction service for Google
   Ioannis Apostolatos and Ioannis Papadakis
                    Ionian University, Greece
Presentation outline
   Introduction
   Rationale
   Proposed approach
   Usage scenarios
   Discussion
Introduction
   On the web, information about virtually anything can
    be found, provided that a searcher knows where to
    look
   Searchers largely rely on large-scale web search
    engines (SEs) for help in locating
    useful resources
   The quality of the search results depends on the
    ability of the searchers to accurately express their
    information needs as keywords in the search
    engine's input box
   How do SEs help their users create successful
    queries?
Rationale
   The query construction phase of a search session is
    crucial to the fulfillment of the searchers' information
    needs
   During the query construction phase, a searcher has
    to express their information needs according to the
    specific dialect (i.e. keyword-based) of the
    underlying SE
   The searcher has to 'guess' the words that the SE
    has chosen to index the web resources that
    correspond to such needs
Rationale
   Spoken languages have certain features that should
    be taken into consideration:
       Polysemy of words
           Polysemy occurs when a word has more than one sense
           A query that consists of an ambiguous word without further
            information that correctly disambiguates it may result in a
            search results list with completely useless information
       Synonymy of words
           Synonymy occurs when two or more words share the same
            meaning
           The probability of two people using the same term to
            describe the same thing is less than 20%
Proposed approach
   A query construction/refinement service on top of
    Google SE that is powered by the LOD cloud and
    especially DBpedia
   The proposed service is a two-step process:
     1.       Initially, it provides autosuggest functionality by reacting to
              the searcher's keystrokes
                Prefix search is performed on an index comprising
                 words and/or phrases that originate from Wikipedia and are made
                 available through DBpedia ('article titles' dataset)
                Such functionality facilitates query disambiguation, since
                 the titles of Wikipedia's disambiguated articles follow a pattern
                 that prefix search brings to the surface
                    i.e. <ambiguous word> (disambiguation info), e.g. bass (fish)
                DBpedia's suggestions are appended to Google's original
                 suggestions
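
The autosuggest step above can be sketched as a prefix search over a sorted list of article titles. This is only a minimal illustration, not the service's actual implementation: the slides do not say how the index is stored, so the sorted in-memory list, the case-insensitive matching, and the names `build_index`, `prefix_search` and `merge_suggestions` are all assumptions.

```python
from bisect import bisect_left


def build_index(titles):
    """Sort titles by lowercase form so that prefix matches are contiguous."""
    return sorted((t.lower(), t) for t in titles)


def prefix_search(index, prefix, limit=10):
    """Return up to `limit` titles that start with `prefix` (case-insensitive)."""
    key = prefix.lower()
    # The 1-tuple (key,) sorts before every (key..., title) pair,
    # so bisect_left lands on the first candidate match.
    start = bisect_left(index, (key,))
    matches = []
    for lowered, original in index[start:]:
        if not lowered.startswith(key):
            break  # sorted order: no later entry can match
        matches.append(original)
        if len(matches) == limit:
            break
    return matches


def merge_suggestions(google, dbpedia):
    """Append DBpedia suggestions to Google's, skipping case-insensitive duplicates."""
    seen = {s.lower() for s in google}
    merged = list(google)
    for s in dbpedia:
        if s.lower() not in seen:
            seen.add(s.lower())
            merged.append(s)
    return merged


# Hypothetical sample of the 'article titles' dataset.
index = build_index([
    "Bass (fish)", "Bass (guitar)", "Bass (sound)", "Basset Hound", "Jaguar",
])
suggestions = merge_suggestions(["bass", "bass guitar"], prefix_search(index, "bass ("))
```

Because the disambiguated titles share the `<ambiguous word> (` prefix, typing the ambiguous word followed by a space and an opening parenthesis narrows the list to exactly those senses.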
Proposed approach
   The proposed service is a two-step process
    (continued):
     2.    Upon selection of a suggestion, the searcher is offered
           the chance to refine the initial query through the
           appropriate interactions that are provided by the
           service (i.e. query replacements and refinements)
         Query replacements and refinements derive from the results of
          SPARQL queries that are addressed to DBpedia's endpoint
   Every interaction results in the construction of an
    appropriate query that is addressed to Google's
    Custom Search, which, in turn, provides the
    corresponding search results
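
As a rough sketch of this last step, the selected suggestion and any chosen refinements could be joined into one keyword query and sent to Google's Custom Search JSON API. The endpoint and its `key`, `cx` and `q` parameters belong to that API; everything else here is an assumption for illustration, including the simple space-joined query, the function names, and the placeholder credentials.

```python
from urllib.parse import urlencode

# Google Custom Search JSON API endpoint.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_query(selection, refinements=()):
    """Join the selected suggestion and the chosen refinement keywords (assumed format)."""
    terms = [selection, *refinements]
    return " ".join(t.strip() for t in terms if t.strip())


def build_search_url(query, api_key, engine_id):
    """Build the request URL; `api_key` and `engine_id` are caller-supplied placeholders."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return f"{CSE_ENDPOINT}?{urlencode(params)}"


query = build_query("Jaguar", ["Archie Comics"])
url = build_search_url(query, api_key="YOUR_API_KEY", engine_id="YOUR_ENGINE_ID")
```

Each interaction (a replacement or an added refinement) would simply rebuild the query and the URL.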
Proposed approach – under the hood:
Query replacements
    Words or phrases that correspond to alternatives to
     the suggestion the user has chosen from the search
     box
    They are actually the Wikipedia redirects to the
     article that the user selected from the search
     box
        The SPARQL query revolves around the
         <http://dbpedia.org/ontology/wikiPageRedirects>
         predicate
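
A query of this kind might look like the sketch below. The predicate comes from the slide; the exact query shape, the derivation of the resource URI from the article title, the conversion of redirect URIs back into readable phrases, and all function names are assumptions. The parsing function expects the standard SPARQL JSON results layout (`results` / `bindings`).

```python
from urllib.parse import unquote

RESOURCE_NS = "http://dbpedia.org/resource/"
REDIRECTS = "http://dbpedia.org/ontology/wikiPageRedirects"


def resource_uri(title):
    """DBpedia resource URIs use the article title with underscores for spaces."""
    return RESOURCE_NS + title.replace(" ", "_")


def redirects_query(title):
    """SPARQL query for every resource that redirects to the selected article."""
    return (
        "SELECT DISTINCT ?alt WHERE {\n"
        f"  ?alt <{REDIRECTS}> <{resource_uri(title)}> .\n"
        "}"
    )


def alternatives_from_results(results):
    """Turn redirect URIs from a SPARQL JSON result into readable phrases."""
    phrases = []
    for binding in results["results"]["bindings"]:
        uri = binding["alt"]["value"]
        if uri.startswith(RESOURCE_NS):
            phrases.append(unquote(uri[len(RESOURCE_NS):]).replace("_", " "))
    return phrases


# Hand-written sample response, for illustration only.
sample = {"results": {"bindings": [
    {"alt": {"type": "uri", "value": "http://dbpedia.org/resource/Panthera_onca"}},
]}}
alternatives = alternatives_from_results(sample)
```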
Proposed approach – under the hood:
Query refinements

    Query refinements are keywords that a user can add to the initial query
     in order to semantically refine it. They are organized in three groups:
        Categories
        WordNet categories and
        Context words
    The 'Categories' group is populated with the categories of the
     Wikipedia article that the user selected from the search box
        The corresponding SPARQL query revolves around the
         <http://purl.org/dc/terms/subject> predicate
    The 'WordNet categories' group is populated with the WordNet
     categories of the Wikipedia article that the user selected from the search
     box
        The corresponding SPARQL query revolves around the
         <http://dbpedia.org/property/wordnet_type> predicate
    The 'Context words' group is populated with information derived from
     the infobox of the corresponding Wikipedia article
        The corresponding SPARQL query revolves around the
         <http://dbpedia.org/property/.*> predicates along with numerous 'FILTER' clauses
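
The three refinement groups could be retrieved with queries along the following lines. The predicates are the ones named on the slide; the query shapes and function names are assumptions, and since the slides do not list the actual FILTER clauses, the single regex filter on the predicate namespace is only a stand-in for them.

```python
RESOURCE_NS = "http://dbpedia.org/resource/"
SUBJECT = "http://purl.org/dc/terms/subject"
WORDNET_TYPE = "http://dbpedia.org/property/wordnet_type"
PROPERTY_NS = "http://dbpedia.org/property/"


def resource_uri(title):
    """DBpedia resource URIs use the article title with underscores for spaces."""
    return RESOURCE_NS + title.replace(" ", "_")


def predicate_query(title, predicate):
    """Values of one fixed predicate for the selected article."""
    return (
        "SELECT DISTINCT ?value WHERE {\n"
        f"  <{resource_uri(title)}> <{predicate}> ?value .\n"
        "}"
    )


def context_words_query(title):
    """Infobox properties: any predicate in the dbpedia.org/property/ namespace.

    Only one illustrative FILTER is shown; the real service uses several.
    """
    return (
        "SELECT DISTINCT ?p ?value WHERE {\n"
        f"  <{resource_uri(title)}> ?p ?value .\n"
        f'  FILTER regex(str(?p), "^{PROPERTY_NS}")\n'
        "}"
    )


def refinement_queries(title):
    """One query per refinement group, keyed by the group name from the slide."""
    return {
        "Categories": predicate_query(title, SUBJECT),
        "WordNet categories": predicate_query(title, WORDNET_TYPE),
        "Context words": context_words_query(title),
    }


queries = refinement_queries("Jaguar")
```

The regex filter in the last query is the kind of clause that tends to be slow on public endpoints, so results for popular articles may be worth caching.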
Usage scenarios: Autosuggestions

Dealing with ambiguous queries: Jaguar the hero from Archie Comics
Usage scenarios: Autosuggestions

Dealing with ambiguous queries: Jaguar the hero from Archie Comics
Usage scenarios: Autosuggestions
Dealing with ambiguous queries: Jaguar the hero from Archie Comics
Usage scenarios: Query replacements
Usage scenarios: Query refinements
Usage scenarios: Query refinements
Usage scenarios: Query refinements
Discussion
    So, can we compete with Google? Certainly not:
        Linked data is full of 'noise'
            Things could improve if we all put some effort into it:
             http://pedantic-web.org/
        SPARQL endpoints are often too slow to respond
            Unions are expensive
            “FILTER regex” clauses take forever to resolve
                Perhaps the database community will provide solutions that speed
                 things up
        Size matters
            Google's index is far larger and fresher
        And much more…
Discussion

    Then, why bother?
        We believe that GContext can be seamlessly integrated
         with any major search engine that provides access to its
         search box
    What about the 'Knowledge Graph'?
        Too early to jump to any conclusions. It was announced
         on May 16th and is so far only partially deployed
        Evidence that we are on the right track:
            “… go deeper and broader” i.e. infoboxes from DBpedia
            “… Find the right thing” i.e. PageRedirects from DBpedia
Discussion

    Thank you very much.

        Questions?

Editor's notes

  1. Impact of large-scale web search engines on information seeking. According to Alexa: Google, Facebook, YouTube, Yahoo!, Baidu, Wikipedia, Windows Live, Twitter, QQ, Amazon