Searching
Political Data
   by Strategy
 Roberto Cornacchia
    Jaap Kamps
    Wouter Alink
 Arjen P. de Vries
 info@spinque.com
Search by Strategy
 An iterative 2-stage search process
   Express domain knowledge as high-level
    search strategies
   Generate search engine from the strategy
     A dynamic REST API
     UI controls for unspecified parameters
 Separate search strategy definition (the
  how) from actual searching and browsing
  of data collections (the what)
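As a rough illustration of how a strategy might leave parameters open for the generated API and UI, here is a minimal sketch. The `Block` class, the block names, and the parameter names are hypothetical (only `topic` and `emphasis_stemming` echo parameter names visible in the demo URLs); this is not Spinque's actual strategy format.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One step of a strategy; a parameter set to None is left unspecified."""
    name: str
    params: dict = field(default_factory=dict)


def unspecified_parameters(strategy):
    """Collect the parameters the strategy leaves open. In this sketch they
    are what a generated API would accept and a UI would show as controls."""
    return [
        f"{block.name}.{key}"
        for block in strategy
        for key, value in block.params.items()
        if value is None
    ]


# Hypothetical strategy: all block and parameter names are illustrative.
strategy = [
    Block("select_utterances", {"collection": "proceedings"}),
    Block("rank_by_text", {"topic": None, "emphasis_stemming": None}),
    Block("group_by", {"unit": "person"}),
]

open_params = unspecified_parameters(strategy)
print(open_params)
```

The design point is that the strategy author fixes what they know (collection, result unit) and everything left unset is decided at search time.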
https://devel.spinque.com/ExPoSeApp-20130116/?config=demo#dashboard/demo04:/p/topic/Mokken
Search by Strategy captures:
 Arbitrary retrieval unit types (not just
  documents)
   E.g., expert finding, entity search
 “Semantic” search
   The building blocks operate on scored triples
 Semi-structured search
   Data objects may be structured in hierarchies
 Exploratory search
   Use facets as preferences
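To make "building blocks operate on scored triples" more concrete, the sketch below shows one possible block that moves scores from one retrieval unit type to another (utterances to persons). The `Triple` type, the `spoken_by` predicate, the data, and the way scores are combined (sum of products) are all assumptions made for illustration, not the system's actual model.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    score: float


def traverse(triples, sources, predicate):
    """Follow `predicate` edges from scored source nodes. A reached node gets
    the sum, over incoming edges, of source score times triple score
    (an assumed combination rule)."""
    result = {}
    for t in triples:
        if t.predicate == predicate and t.subject in sources:
            result[t.obj] = result.get(t.obj, 0.0) + sources[t.subject] * t.score
    return result


# Made-up graph: three utterances attributed to two persons.
triples = [
    Triple("utt1", "spoken_by", "person A", 1.0),
    Triple("utt2", "spoken_by", "person A", 1.0),
    Triple("utt3", "spoken_by", "person B", 1.0),
]

# Made-up relevance scores from an earlier text-ranking step.
utterance_scores = {"utt1": 0.5, "utt2": 0.25, "utt3": 1.0}

person_scores = traverse(triples, utterance_scores, "spoken_by")
print(person_scores)
```

Because the input and output are both scored node sets, such blocks can be chained, which is how a strategy could return persons or parties instead of documents.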
Exposé
 Searching the parliamentary proceedings
  of the Dutch parliament
   Complete transcripts of everything said in
    parliament
   Organized by parliamentary session
    Detailing who says what, in what role and
    context
Exposé
 Original data is PDF, transformed into
  XML by award-winning project Political
  Mashup
   http://politicalmashup.nl/
In Politics…
 The essence is not only what is said, but also
  by whom, to whom, and why
 Concrete example:
    Wilders says “knettergek” in parliament (in
    2007) – is this remarkable?
“Knettergek” case
The word “knettergek” has been used many
 times in parliament…

… but never to address a member of the
 government
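The question behind this case, how often a word occurs and to whom it was addressed, can be sketched as a filter over utterance records. The records, the field names, and the role labels below are invented for illustration and do not come from the actual proceedings.

```python
# Made-up records for illustration; field names and roles are assumptions.
utterances = [
    {"speaker": "member A", "addressee_role": "member of parliament",
     "text": "Dat plan is knettergek."},
    {"speaker": "member B", "addressee_role": "chair",
     "text": "Knettergek, voorzitter!"},
    {"speaker": "member C", "addressee_role": "member of government",
     "text": "Dank voor uw antwoord."},
]


def mentions(records, word, addressee_role=None):
    """Utterances containing `word`, optionally limited to one addressee role."""
    hits = [r for r in records if word.lower() in r["text"].lower()]
    if addressee_role is not None:
        hits = [r for r in hits if r["addressee_role"] == addressee_role]
    return hits


all_hits = mentions(utterances, "knettergek")
to_government = mentions(utterances, "knettergek", "member of government")
print(len(all_hits), len(to_government))
```

In this toy data the word occurs more than once, but the filtered list is empty, which mirrors the shape of the observation on the slide.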
Varying result types
Utterances
Person / Party / …
Flexibility
 Concrete case:
   Maarten: “I cannot find Prof. Mokken, who I
    know has been spoken about in parliament
    multiple times!”
Flexibility
 Default indexing uses stemming and
  normalization
 But… searching for people’s names (and,
  for that matter, other domain-specific
  terminology) can be negatively affected
  by stemming
     “Mokken” transformed into “mok”, leading us to
      geographic locations “Mook” and “De Mok”, but not
      to the famous professor!
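The effect can be reproduced with a toy normalizer. The sketch below is not the stemmer used by the system; `toy_stem` is a deliberately crude stand-in, and reading `emphasis_stemming` (a parameter name visible in the demo URL) as a weight between stemmed and exact matching is an assumption.

```python
import re


def toy_stem(token):
    """Toy normalizer, NOT a real Dutch stemmer: lowercase, strip a final
    'en', undouble a final consonant, undouble long vowels."""
    t = token.lower()
    if t.endswith("en") and len(t) > 4:
        t = t[:-2]                                    # mokken -> mokk
    t = re.sub(r"([^aeiou])\1$", r"\1", t)            # mokk -> mok
    t = re.sub(r"aa|ee|oo|uu", lambda m: m.group(0)[0], t)  # mook -> mok
    return t


def score(query, name, emphasis_stemming):
    """Blend stemmed and exact matching; 1.0 means stemmed only,
    0.0 means exact only (assumed meaning of the parameter)."""
    words = name.split()
    exact = float(query.lower() in [w.lower() for w in words])
    stemmed = float(toy_stem(query) in [toy_stem(w) for w in words])
    return emphasis_stemming * stemmed + (1 - emphasis_stemming) * exact


names = ["Prof. Mokken", "Mook", "De Mok"]
with_stemming = {n: score("Mokken", n, 1.0) for n in names}
without_stemming = {n: score("Mokken", n, 0.0) for n in names}
print(with_stemming)
print(without_stemming)
```

With full emphasis on stemming, all three names match equally and the professor is lost among the place names; with the emphasis turned down to zero, only the exact name remains.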
https://devel.spinque.com/ExPoSeApp-20130116/?config=demo#dashboard/demo05:/p/topic/mokken/p/emphasis_stemming/0
Joins to the rescue!
Which house speakers from the Rotterdam harbour say what about “Amsterdam”?
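A question like this combines two sources through the person they share. The sketch below joins biographies with utterances on the person; the data, the tuple layout, and the simple substring matching are assumptions for illustration, not the system's actual join operator.

```python
# Made-up data for illustration: (person, text) pairs from two sources.
biographies = [
    ("speaker A", "Worked for years in the Rotterdam harbour."),
    ("speaker B", "Studied law in Leiden."),
]
utterances = [
    ("speaker A", "Amsterdam needs more housing."),
    ("speaker B", "Amsterdam should invest in public transport."),
    ("speaker A", "The budget is too tight."),
]


def join_on_person(biographies, utterances, bio_term, utterance_term):
    """Utterances mentioning `utterance_term`, spoken by people whose
    biography mentions `bio_term`."""
    people = {p for p, bio in biographies if bio_term.lower() in bio.lower()}
    return [
        (p, text)
        for p, text in utterances
        if p in people and utterance_term.lower() in text.lower()
    ]


result = join_on_person(biographies, utterances, "Rotterdam harbour", "Amsterdam")
print(result)
```

Neither source can answer the question alone: the biographies know where people come from, the proceedings know what they said.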
Semantic Search
[Diagram labels: biographies, describes, person, utterance]
Advantages
 Define and execute custom-built search
  strategies
   Specialized to the task, or even to the search
    at hand
 Search multiple data sources at once
 Explore and refine results interactively
 “Search provenance”
   Complete transparency on how search results
    were obtained
Position Statement
 Search professionals think in terms of
  search strategies already
 Let them design their own strategies, and
  thereby tailor their search engines
 So that they learn to trust what we claim
  to be effective information retrieval
  techniques!
