SlideShare uma empresa Scribd logo
1 de 31
Collection Ranking & Selection
for Federated Entity Search
Krisztian Balog, Robert Neumayer, and Kjetil Nørvåg
Norwegian University of Science and Technology




19th International Symposium on String Processing and Information Retrieval (SPIRE 2012)
Cartagena de Indias, Colombia, October 2012
Motivation
- Entities are ubiquitous
  Many information needs revolve around entities
  (people, products, organisations, places, etc.)
- Growing amount of structured data
  Again, organised around entities
- Entities are often searched by their name
  If we know (heard of) it, we just ask for it by the name
Top searches of 2011*
       Web search                  Mobile search
       1. iPhone                   1. iPhone 5
       2. Casey Anthony            2. Powerball
       3. Kim Kardashian           3. MLB
       4. Katy Perry               4. Scrabble cheat
       5. Jennifer Lopez           5. Casey Anthony
       6. Lindsay Lohan            6. Hurricane Irene 2011
       7. American Idol            7. Kim Kardashian
       8. Jennifer Aniston         8. Translator
       9. Japan earthquake         9. Amy Winehouse
       10. Osama bin Laden         10. May 21, 2011 Rapture

* http://yearinreview.yahoo.com/
Top searches of 2011*
       Web search                  Mobile search
       1. iPhone                   1. iPhone 5
       2. Casey Anthony            2. Powerball
    “users have learned that search engine relevance
       3. Kim Kardashian          3. MLB
    decreases with longer queries and have grown
       4. Katy Perry              4. Scrabble cheat
    accustomed to reducing their query Anthony
       5. Jennifer Lopez          5. Casey (at least
    initially) to the name of an entity”
       6. Lindsay Lohan           6. Hurricane Irene 2011
       7. American Idol            7. Kim Kardashian
       8. Jennifer Aniston
                                      [Blanco et al., 2011]
                                   8. Translator
       9. Japan earthquake         9. Amy Winehouse
       10. Osama bin Laden         10. May 21, 2011 Rapture

* http://yearinreview.yahoo.com/
The Web of Data
           # Linking Open Data datasets
  300



  200



  100



    0
    2007   2008       2009        2010    2011
LOD in 2007
                                      Fresh-       ECS
                                                  South-           SIOC
                                       meat
                                                  ampton
                        BBC
                       Later +                                                     NEW!
                                  Sem-                               NEW!
                        TOTP      Web-
                                                                                   SW
                                 Central                           Onto-        Conference
                                                                   world          Corpus
       Music-           Magna-                   FOAF
       brainz            tune
                                                                                     Open-
                                                                                     Guides

                                                                     Revyu
                                                                                              RDF Book
 Jamendo             Geo-                                                                      Mashup
                    names
                                           DBpedia
                                                                                 DBLP
                                                                                 Berlin
        US
                        World                                      NEW!
      Census
                        Fact-
       Data             book                   NEW!                  lingvoj

                                 Euro-                    flickr
                                  stat                   wrappr
                     Wiki-                 Open                                  DBLP
                   company                  Cyc                                Hannover
           Gov-
           Track                                      Project
                                  W3C                 Guten-
                                 WordNet               berg
LOD in 2011
                                                                                                                                                                                                                  Linked
                                                                                                                                                                                                      LOV          User            Slideshare         tags2con
                                                                                                                                                                                    Audio
                                                                                                                                                                                                                 Feedback             2RDF            delicious
                                                                                                                                                                 Moseley          Scrobbler                                                                             Bricklink        Sussex
                                                                                                                                                                  Folk             (DBTune)                                                                                              Reading            St.
                                                                                                                                                 GTAA
                                                                                                                                Magna-                                                                                                                                                    Lists          Andrews
                                                                                                                                                                                                                          Klapp-
                                                                                                                                 tune                                                                                     stuhl-                                                                         Resource           NTU
                                                                                                                 DB                                                                                                        club                                                                            Lists          Resource
                                                                                                               Tropes                                                                                       Lotico                         Semantic        yovisto
                                                                                                                                          John                     Music                                                                                                     Man-                                           Lists
                                                                                                                                                                                        Music                                               Tweet                           chester
                                                                                           Hellenic                                       Peel                     Brainz                                                                                                                                                                       NDL
                                                                                                                                         (DBTune)                  (Data                Brainz                                                                              Reading
                                                                                                                                                                                                                                                                                                                                              subjects
                                                                                            FBD                                                                                        (zitgist)                                                                             Lists                     Open
                                                                                                           EUTC                                                  Incubator)                                                Linked
                                                                          Hellenic                                                                                                                                                                                                                    Library                    Open                           t4gm
                                                                                                          Produc-                                                                                                         Crunch-
                                                                            PD                                               Surge                                                                       RDF                                                                                                                                                     info
                                                                                                           tions
                                                                                                                                                 Discogs                                                                    base                                                                                                Library
                                                                                                                             Radio                                                                                                          Ontos          Source Code
                                                             Crime                                                                                                                                      ohloh                                                                          Plymouth                                 (Talis)
                                                                                                                                                   (Data                                                                                    News                                                                                                                            LEM
                                                                                                                                                                                                                                                            Ecosystem                   Reading                                                   RAMEAU
                                                            Reports                           business                                           Incubator)
                                                                                  Crime       data.gov.                                                                                                                                     Portal         Linked Data                    Lists                                                     SH
                                                              UK                                                                                                      Music             Jamendo
                                                                                   (En-          uk
                                                                                                                                                                     Brainz             (DBtune)                                                                                                              LinkedL
                                                Ox                               AKTing)                      FanHubz                                                                                                 gnoss                                                                                                                                                                  ntnusc
                                                                                                                                                                    (DBTune)                                                                                                    SSW                             CCN
                                               Points                                                                                                                                                                                                                                                                             Thesau-
                                                                                                                                Last.FM                                                              Poké-                                                                     Thesaur
                                                                      Popula-                                                    artists                                                             pédia                                     Didactal                          us                                                rus W                                    LIBRIS
                                                                     tion (En-                                                  (DBTune)                Last.FM                                                                                   ia                                               theses.                                                LCSH                                         Rådata
                                  reegle                                                research          patents                                                                                                                                                                                                       MARC
                                                                      AKTing)                                                                           (rdfize)                                                                                                    my                                fr                                                                                                nå!
                                                                                        data.gov.         data.go                                                                                                                                                                                                       Codes
                       Ren.
                                                     NHS                                   uk              v.uk                                                                                                 Good-                                             Experi-
                                                                                                                                                                           Classical                                                                                                                                     List
                      Energy                         (En-                                                                                                                                                        win               flickr                          ment
                                                                                                                                                                             (DB             Pokedex                                                                                                                                                                                                             Norwe-
                      Genera-                       AKTing)                Mortality                                           BBC                                                                              Family            wrappr                                            Sudoc                                               PSH
                                                                                                                                                                            Tune)                                                                                                                                                                                                                                 gian
                                                                            (En-
                       tors                                                                                                  Program                                                                                                                                                                                                                                                                              MeSH
                                                                           AKTing)                                                                                                                                                              semantic
                                                                                                                               mes                  BBC                                                                                                                                                IdRef                                                               GND
                                                            CO2                         educatio          OpenEI                                                                                                                                web.org               SW
                                         Energy                                                                                                                                                                                                                                                        Sudoc                                           ndlna
                                                          Emission                      n.data.g                                                    Music                                                                                                            Dog                                                                                                                              VIAF
               EEA                        (En-                                                                                                                     Chronic-                             Linked
                                                            (En-                         ov.uk                                                                                                                            Portu-                                     Food                                                             UB
                                         AKTing)                                                                                                                     ling              Event             MDB
                                                          AKTing)                                                                                                                                                         guese                                                                                                      Mann-                                                                              Europeana
                                                                                                                                       BBC                         America             Media
                                                                                                                                                                                                                         DBpedia                                                                                     Calames         heim
                                                                           Ord-                                       Recht-          Wildlife                                                                                                                                                                                                                      Deutsche
               Open                                                                                                                                                                                                                          Revyu                                     DDC
                                                                                                     Openly           spraak.         Finder                                                                                                                                                                                                                           Bio-              lobid
              Election                                                    nance
                                 legislation                                                          Local              nl                                                                                                                                     RDF                                                                                                  graphie
                                                                                                                                                                                                                                                                                                                                                                                    Resources                 NSZL               Swedish
                Data                                                      Survey                                                                     Tele-                                                                                                                                            data                                            Ulm
     EU                                                                                                                                                                     New                                                                                 Book
              Project           data.gov.uk                                                                                                         graphis                                                                                                                                           bnf.fr                                                                                                 Catalog              Open
    Insti-                                                                                                                                                                  York
                                                                                                                                                                                              URI                                 Open                         Mashup                                                                                                                                                            Cultural
   tutions                                                                                                                                                                 Times                                 Greek                                                                                                            P20
                                                        UK Post-                                                                                                                             Burner                               Calais                                                                                                                                                                                         Heritage
                                                         codes                                                                                                                                                  DBpedia                                                                                                                                     ECS             Wiki
                                                                                        statistics                                                                                                                                                                                                                                                                                                  lobid
                GovWILD                                                                 data.gov.                                 Taxon                                                                                                        iServe                                                                                                      South-                                  Organi-
                                                                                                              LOIUS                                                                                                                                                                 BNB
Brazilian
                                                                                           uk                                    Concept                                                                                                                                                                                 ECS                               ampton                                  sations
                                                                                                                                                         Geo                 World                                                                                 OS                                 BibBase                                                                                                          STW            GESIS
  Poli-                           ESD                                                                                                                                                                                                                                                                                   South-                ECS
                                                                                                                                                        Names                Fact-                                                                                                                                                          (RKB
 ticians                         stan-         reference                                                                                                                                                                                                                                                                ampton
                                                                     data.gov.uk                                                                                             book              Freebase                                                                                                                                   Explorer)                                  Budapest
                                 dards         data.gov.                                                               NASA                                                                                                                                                                                             EPrints
                                                   uk                 intervals                                                                                                                                                                      Project                                                                                                        OAI
                 Lichfield                                                                     transport               (Data                                                                                              DBpedia                                       data
                                                                                                                                                                                                                                                     Guten-                                                                                                                                              Pisa
                  Spen-                                                                        data.gov.               Incu-                                                                                                                                            dcs                                                                                                                                             RESEX         Scholaro-
    ISTAT          ding                                                                                                bator)               Fishes                                                                                                    berg                              DBLP                 DBLP
                                                                                                  uk                                                            Geo
                                                                                                                                                                                                                                                                                                                                                                                                                                       meter
   Immi-                        Scotland                                                                                                   of Texas                                                                                                                                      (FU                 (L3S)
                                Pupils &                                                                                                                                       Uberblic                                                                                                                                      DBLP
   gration                                                                                                                                                     Species                                                                                                                 Berlin)                                                                                      IRIT
                                 Exams                                                                        Euro-                                                                                    dbpedia                                                     data-                                                     (RKB
                                               London                                                                                                                                                                                        TCM                                                                                                    ACM
                                                                                                               stat                                                                                      lite                                                      open-                                                   Explorer)                                                                                            NVD
                                               Gazette                                                        (FUB)                                                                                                                          Gene                                                                                                                                                       IBM
                      Traffic                                                                                                     Geo                                                                                                                              ac-uk
                     Scotland                                      TWC LOGD                Eurostat                                                                                                                       Daily               DIT
                                                                                                                                 Linked                                                                                                                                                                    UN/
   Data                                                                                                                                             UMBEL                                                                 Med                                                                ERA
                                                                                                                                  Data                                                                                                                                                                   LOCODE                                                                                                    DEPLOY
   Gov.ie                         CORDIS                                                                                                                             YAGO                                                                                                                                                                                                           New-
                                                                                                                                                                                          lingvoj                                                           Disea-
                                   (RKB                                                                                                                                                                                                                     some               SIDER                                                                         RAE2001                castle                                        LOCAH
                CORDIS           Explorer)                                                                   Linked                                                                                                                                                                                                   Eurécom
                                                                                   Eurostat                                                                                                                                                 Drug                                                                                   CiteSeer                                                            Roma
                 (FUB)                                                                                    Sensor Data
                                                   GovTrack                       (Ontology                (Kno.e.sis)                                    Open                                                                              Bank                                                      Pfam                                                                                                         Course-
                                                                                   Central)                                          riese                                                           Enipedia
                                                                                                                                                           Cyc              Lexvo                                      LinkedCT                                                                                                                                                                                     ware
                                    Linked                                                                                                                                                                                                                                          PDB
                                                                                                                                                                                                                                                           UniProt                                                         VIVO
             EURES                 EDGAR                                                                                                                                                                                                                                                                                                                                         dotAC
                                                                     US SEC                                                                                                                                                                                                                                               Indiana                ePrints                                                IEEE
                                  (Ontology                                                                                                                                             totl.net
                                                                   (rdfabout)
                                   Central)                                                                                                         WordNet                                                                                                                                                                                                                                                             RISKS
                                                                                                                                                     (VUA)                                                        Taxono               UniProt
                                                                                       US Census               EUNIS              Twarql                                                                                                                                                             HGNC
                                                Semantic                                                                                                               Cornetto                                                        (Bio2RDF)
                                                                                       (rdfabout)                                                                                                                   my                                                                                                                   VIVO
                      FTS                         XBRL                                                                                                                                                                                                         PRO-            ProDom                                 STITCH            Cornell                LAAS
                                                                                                                                                                                                                                                               SITE                                                                                                                        KISTI                NSF
                                  Scotland
                                    Geo-                        GeoWord                                                                                                                       LODE
                                   graphy                         Net                                                                                WordNet           WordNet                                                                                                                                                                                            JISC
                                                                                                                                                      (W3C)              (RKB
                                                                                                           Climbing
                                                                                                                                 Linked                                                                                       Affy-                                                                                                KEGG
                                                                                         SMC                                                                           Explorer)                              SISVU                                                                                            Pub                                    VIVO UF
                                                    Piedmont                                                                    GeoData                                                                                       metrix                                                                                               Drug
                                                                                                                                                                                                                                                                                                                                                                                                 ECCO-
                                    Finnish                                            Journals                                                                                                                                                 PubMed                    Gene                SGD             Chem
                                    Munici-
                                                    Accomo-               El                                                                                                           AGROV                                                                             Ontology                                                                                                                 TCP                           Media
                                                     dations                                                                                         Alpine                                                                                                                                                                                                             bible
                                    palities                           Viajero                                                                                                          OC
                                                                                                                                                      Ski                                                                                                                                                                                                             ontology
                                                                       Tourism                                                                                                                                                                                                                                                                 KEGG
                                                                                        Ocean
                                                                                                                                                     Austria
                                                                                                                                                                                                                                                                                                                                              Enzyme                                     PBAC                           Geographic
                                                                                                                        Metoffice                                  GEMET                                             ChEMBL
                                                           Italian                     Drilling                                                                                                                                        OMIM                                                                                KEGG
                                                                                                                         Weather                                                    Open
                                                            public                     Codices            AEMET                                                                                      Linked                                                                         MGI                                   Pathway
                                                           schools                                                      Forecasts
                                                                                                                                                                                    Data
                                                                                                                                                                                                      Open                                                  InterPro                                    GeneID                                                                                                          Publications
                                                                                                                                                 EARTh                             Thesau-                                                                                                                                                                           KEGG
                                                                         Turismo
                                                                                                                                                                                     rus             Colors                                                                                                                                                         Reaction
                                                                            de
                                                                        Zaragoza                                                                                Product                                                Smart                                                                                                                           KEGG
                                                                                                                                                                                                                                                                                                                                                                                                       User-generated content
                                                                                                                                  Weather                         DB                                                    Link                                                                  Medi                                                     Glycan
                                                                                               Janus                              Stations                                    Product                                                                                                         Care                                       KEGG
                                                                                                AMP                                                                                                                                 UniParc             UniRef              UniSTS                                                                                                                                     Government
                                                                                                                                                                               Types                Italian
                                                                                                                                                                                                                                                                                                                        Homolo           Com-
                                                                                                                    Yahoo!                       Airports                                          Museums                                                                                                                               pound
                                                                                                                                                                              Ontology                          Google
                                                                                                                                                                                                                                                                                                                         Gene
                                                                                                                     Geo                                                                                          Art
                                                                                                                    Planet        National
                                                                                                                                                                                                                wrapper
                                                                                                                                                                                                                                                                                                         Chem2                                                                                                        Cross-domain
                                                                                                                                   Radio-                                                                                                                                                               Bio2RDF
                                                                                                                                  activity                                                                                                                                                UniPath
                                                                                                                                     JP                        Sears                Open                                            Linked                               OGOLOD            way
                                                                                                                                                                                                                                                                                                                                                                                                                       Life sciences
                                                                                                                                                                                   Corpo-           Amster-                                          Reactome
                                                                                                                                                                                                     dam              medu-          Open
                                                                                                                                                                                    rates                                          Numbers
                                                                                                                                                                                                    Museum            cator
                                                                                                                                                                                                                                                                                                                                                                                           As of September 2011
Ad-hoc entity search
At the 2010/11 Semantic Search Challenge

 - Task
   Given a keyword query, targeting a particular entity,
   provide a ranked list of relevant entities (i.e., URIs)
 - Queries
   Sampled from web search engine logs (142 in total)
 - Data collection
   Billion Triple Challenge 2009 (BTC) dataset
 - Relevance judgments
   On a 3-point scale, collected using crowdsourcing
In this talk
- Address the ad-hoc entity retrieval task in a
  distributed setting
  - The Web of Data is inherently distributed
  - Some data sources may not be crawleable at all
- Specifically, our focus is on the collection ranking
  and collection selection steps
Federated search
A typical broker-based architecture

1   Collection ranking   Central broker


                             Summary A

                             Summary B
                             Summary C




                  Q      1         A

                                   C

                                   B
Federated search
A typical broker-based architecture

1   Collection ranking     Central broker
2   Collection selection
                               Summary A

                               Summary B
                               Summary C           Collection A



                                               Q
                    Q      1         A
                                                   Collection B
                                     C
                                           2
                                     B
                                               Q
                                                   Collection C
Federated search
A typical broker-based architecture

1   Collection ranking     Central broker
2   Collection selection
3   Result merging             Summary A

                               Summary B
                               Summary C           Collection A



                                               Q
                    Q      1         A
                                                   Collection B
                                     C
                                           2
                                     B
                                               Q
                                           3       Collection C
Next: baseline models
for collection ranking and selection

1   Collection ranking     Central broker
2   Collection selection
3   Result merging             Summary A

                               Summary B
                               Summary C           Collection A



                                               Q
                     Q     1         A
                                                   Collection B
                                     C
                                           2
                                     B
                                               Q
                                           3       Collection C
Collection ranking
Collection-centric (CC)

 - Lexicon-based method
   Treat and score each collection as if it was a single,
   large document


                                              Collection A
             Q                                Collection C

                                              Collection B

                           ...                   ...
                                      Y
                  P (c|q) / P (c) ·         P (t|✓c )
                                      t2q
Collection ranking
Entity-centric (EC)

 - Document-surrogate method
   Model and query individual documents (entities) and
   aggregate their relevance scores

                                            Collection A

                                            Collection C
            Q
                                            Collection B

                                                ...
                         ...
                                 X
                 P (c|q) /                 P (e|q)
                             e2c,r(e,q)<
Collection selection
Top-K selection

 - Choosing a fixed cutoff (K) ahead of time
   K is usually set between 5 and 20

                         A

                         D

                         C

                         E

                         B

                         F
Our method: AENN
“All that an Entity Needs is a Name”

 - Exploit that entities are searched by their name
 - The central broker maintains a complete dictionary
   of entity names (and corresponding identifiers)
 - Utilise this information in the collection selection step
   to dynamically adjust the #collections selected
AENN collection ranking
- Key observation
  Different methods—collection-centric (CC) vs. entity-
  centric (EC)—work best for different queries




- Idea
  Combination should give better results than any of the
  two methods alone
        AENN(c, q) = (1   ) · CC(c, q) +   · EC(c, q)
AENN collection ranking
Example


              CC         EC         AENN

          A   0.35   A   0.65   A    0.5

          D   0.3    B   0.3    B    0.2

          C   0.2    C   0.05   D    0.15

          E   0.15              C    0.125

          B   0.1               E    0.075

          F   0.05              F    0.025
AENN collection selection
- Key observation
  CC has higher recall, while EC has better precision
- Idea
  Use the collection rankings generated by EC and/or
  CC to dynamically adjust the set of collections selected
  - Precision-oriented selection
  - Recall-oriented selection
  - Balanced selection
AENN collection selection
Precision-oriented selection (AENN(p))

 - Only select collections returned by EC


              CC         EC         AENN     AENN(p)

         A    0.35   A   0.65   A    0.5     A   0.5

         D    0.3    B   0.3    B    0.2     B   0.2

         C    0.2    C   0.05   D    0.15    C   0.125

          E   0.15              C    0.125

         B    0.1               E    0.075

          F   0.05              F    0.025
AENN collection selection
Recall-oriented selection (AENN(r))

 - Include collections from CC until all from EC are
   covered. This defines the cutoff point for AENN

               CC         EC         AENN     AENN(r)

           A   0.35   A   0.65   A    0.5     A   0.5

           D   0.3    B   0.3    B    0.2     B   0.2

           C   0.2    C   0.05   D    0.15    D   0.15

           E   0.15              C    0.125   C   0.125

           B   0.1               E    0.075   E   0.075

           F   0.05              F    0.025
AENN collection selection
Balanced selection (AENN(b))

 - Include collections from AENN until all from EC
   are covered

             CC         EC         AENN     AENN(b)

         A   0.35   A   0.65   A    0.5     A   0.5

         D   0.3    B   0.3    B    0.2     B   0.2

         C   0.2    C   0.05   D    0.15    D   0.15

         E   0.15              C    0.125   C   0.125

         B   0.1               E    0.075

         F   0.05              F    0.025
AENN collection selection
Comparison of approaches


              AENN(p)     AENN(r)     AENN(b)

              A   0.5     A   0.5     A   0.5

              B   0.2     B   0.2     B   0.2

              C   0.125   D   0.15    D   0.15

                          C   0.125   C   0.125

                          E   0.075
Experimental setup
Based on the 2010/11 Semantic Search Challenge

 - Distributed environment
   Top 100 largest second-level domains from BTC
   - Three sets with different handling of DBpedia
 - Relevance
   Considered the #relevant entities from each collection
 - Metrics
   - Collection ranking: Standard IR metrics (MAP, MRR, nDCG)
   - Collection selection: Analogues of precision and recall, plus
     the avg. #coll. selected
Test collections
                                          BTC
                                 BTC             DBpedia
                                        DBpedia
#Entities                         68.8M    60.5M    8.8M
#Collections                       100       99      100
#Queries                           136      116      130
Avg. #rel. entities/query          14.9      4.8    10.1
Avg. #rel. entities/collection      3.4      2.8     9.4
Results
Collection ranking (BTC)

                   CC         EC   AENN
           0.75



           0.50
     MAP




           0.25



             0
                  Name-only        Full content
Results
      Different collection selection strategies (BTCDBpedia)

              Precision                             Recall                          Avg. #coll. selected
0.9                                   1.0                                  50



0.6                                   0.9                                  33



0.3                                   0.7                                  17



 0                                    0.6                                  0
      1   3    10       20   50 100         1   3   10       20   50 100        1      3   10       20   50   100
                    K                                    K                                      K

                                 AENN(p)            AENN(r)            AENN(b)
Results
      Collection selection (DBpedia)

              Precision                             Recall                           Avg. #coll. selected
0.5                                   1.0                                  100



0.3                                   0.7                                   67



0.2                                   0.3                                   33



 0                                     0                                    0
      1   3    10       20   50 100         1   3   10       20   50 100         1      3   10       20   50   100
                    K                                    K                                       K

                                  CC-N              EC-C             AENN(b)
Summary
- Addressed the task of ad-hoc entity retrieval in a
  distributed setting
- Introduced AENN, a novel collection ranking and
  selection method based on a lean name-based
  entity representation
- Showed experimentally that AENN can outperform
  standard baselines that consider all entity content
- Further, AENN can be geared towards high precision,
  high recall, or a balanced setting
Questions?
Resources are available at http://bit.ly/OzfYK2




                                               Contact
                                        @krisztianbalog
                                    krisztianbalog.com

Mais conteúdo relacionado

Destaque

New Product Deep Dive
New Product Deep DiveNew Product Deep Dive
New Product Deep DiveCody Bernard
 
エーピーコミュニケーションズ セキュリティ分析チームのご紹介
エーピーコミュニケーションズ セキュリティ分析チームのご紹介 エーピーコミュニケーションズ セキュリティ分析チームのご紹介
エーピーコミュニケーションズ セキュリティ分析チームのご紹介 APCommunications-recruit
 
Ο Δικαιωματούλης στην Κύπρο-Α4
Ο Δικαιωματούλης στην Κύπρο-Α4Ο Δικαιωματούλης στην Κύπρο-Α4
Ο Δικαιωματούλης στην Κύπρο-Α4Stella Mavidou
 
Future of Mass Media
Future of Mass MediaFuture of Mass Media
Future of Mass Mediasagecast
 
Ya Es Una Realidad
Ya Es Una RealidadYa Es Una Realidad
Ya Es Una Realidadedier araque
 
English Grammar - Past Tense 1
English Grammar - Past Tense 1English Grammar - Past Tense 1
English Grammar - Past Tense 1Ajarn Ken
 
ENGAGE attack of giant viruses
ENGAGE attack of giant virusesENGAGE attack of giant viruses
ENGAGE attack of giant virusesAlexandra Okada
 
OPNFVをインストールしてみた
OPNFVをインストールしてみたOPNFVをインストールしてみた
OPNFVをインストールしてみたMibu Ryota
 
Conceptual frameworks in language course design
Conceptual frameworks in language course designConceptual frameworks in language course design
Conceptual frameworks in language course designCarlos Mayora
 
Introduction to Optical Backbone Networks
Introduction to Optical Backbone NetworksIntroduction to Optical Backbone Networks
Introduction to Optical Backbone NetworksAnuradha Udunuwara
 
Mirantis超簡単Fuel Openstack インストール
Mirantis超簡単Fuel Openstack インストールMirantis超簡単Fuel Openstack インストール
Mirantis超簡単Fuel Openstack インストールKamon Nobuchika
 
Riding promotion team
Riding promotion teamRiding promotion team
Riding promotion teamJungkoo Kim
 
College Financing
College Financing College Financing
College Financing Lisa Allard
 

Destaque (17)

New Product Deep Dive
New Product Deep DiveNew Product Deep Dive
New Product Deep Dive
 
エーピーコミュニケーションズ セキュリティ分析チームのご紹介
エーピーコミュニケーションズ セキュリティ分析チームのご紹介 エーピーコミュニケーションズ セキュリティ分析チームのご紹介
エーピーコミュニケーションズ セキュリティ分析チームのご紹介
 
Ο Δικαιωματούλης στην Κύπρο-Α4
Ο Δικαιωματούλης στην Κύπρο-Α4Ο Δικαιωματούλης στην Κύπρο-Α4
Ο Δικαιωματούλης στην Κύπρο-Α4
 
Logos flash cs5
Logos flash cs5Logos flash cs5
Logos flash cs5
 
LSAMP Winter 2014 Presentation HK
LSAMP Winter 2014 Presentation HKLSAMP Winter 2014 Presentation HK
LSAMP Winter 2014 Presentation HK
 
Future of Mass Media
Future of Mass MediaFuture of Mass Media
Future of Mass Media
 
Ya Es Una Realidad
Ya Es Una RealidadYa Es Una Realidad
Ya Es Una Realidad
 
English Grammar - Past Tense 1
English Grammar - Past Tense 1English Grammar - Past Tense 1
English Grammar - Past Tense 1
 
ENGAGE attack of giant viruses
ENGAGE attack of giant virusesENGAGE attack of giant viruses
ENGAGE attack of giant viruses
 
Poster Urban Inquiries
Poster Urban InquiriesPoster Urban Inquiries
Poster Urban Inquiries
 
OPNFVをインストールしてみた
OPNFVをインストールしてみたOPNFVをインストールしてみた
OPNFVをインストールしてみた
 
Conceptual frameworks in language course design
Conceptual frameworks in language course designConceptual frameworks in language course design
Conceptual frameworks in language course design
 
Introduction to Optical Backbone Networks
Introduction to Optical Backbone NetworksIntroduction to Optical Backbone Networks
Introduction to Optical Backbone Networks
 
Mirantis超簡単Fuel Openstack インストール
Mirantis超簡単Fuel Openstack インストールMirantis超簡単Fuel Openstack インストール
Mirantis超簡単Fuel Openstack インストール
 
Revisao de direito_administrativo_2_para AV1
Revisao de direito_administrativo_2_para AV1 Revisao de direito_administrativo_2_para AV1
Revisao de direito_administrativo_2_para AV1
 
Riding promotion team
Riding promotion teamRiding promotion team
Riding promotion team
 
College Financing
College Financing College Financing
College Financing
 

Semelhante a Collection Ranking and Selection for Federated Entity Search

Cshals Tech Talk
Cshals Tech TalkCshals Tech Talk
Cshals Tech Talkvisha1gupta
 
Case acquisition from text: Ontology-based information extraction with SCOOBI...
Case acquisition from text: Ontology-based information extraction with SCOOBI...Case acquisition from text: Ontology-based information extraction with SCOOBI...
Case acquisition from text: Ontology-based information extraction with SCOOBI...Thomas Roth-Berghofer
 
Anywhere, anytime, any place - embrace the Martini Principle
Anywhere, anytime, any place - embrace the Martini PrincipleAnywhere, anytime, any place - embrace the Martini Principle
Anywhere, anytime, any place - embrace the Martini PrincipleHarold Teunissen
 
Provenance-awareness: A pre-requisite to explanation-awareness
Provenance-awareness: A pre-requisite to explanation-awarenessProvenance-awareness: A pre-requisite to explanation-awareness
Provenance-awareness: A pre-requisite to explanation-awarenessThomas Roth-Berghofer
 
Mathematical Semantics of Statistical Data
Mathematical Semantics of Statistical DataMathematical Semantics of Statistical Data
Mathematical Semantics of Statistical DataChristoph Lange
 
LinkedDataと地理空間情報
LinkedDataと地理空間情報LinkedDataと地理空間情報
LinkedDataと地理空間情報Fumihiro Kato
 
Serendipity in Linked Open Data
Serendipity in Linked Open DataSerendipity in Linked Open Data
Serendipity in Linked Open Datai_serena
 
Semantic Representations for Research
Semantic Representations for ResearchSemantic Representations for Research
Semantic Representations for ResearchRinke Hoekstra
 
Why Linked Data?
Why Linked Data?Why Linked Data?
Why Linked Data?Paul Miller
 
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF:  Lifting NLP Extraction Results to the Linked Data CloudNERD meets NIF:  Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data CloudGiuseppe Rizzo
 

Semelhante a Collection Ranking and Selection for Federated Entity Search (12)

The RDFa, seo wave
The RDFa, seo waveThe RDFa, seo wave
The RDFa, seo wave
 
Cshals Tech Talk
Cshals Tech TalkCshals Tech Talk
Cshals Tech Talk
 
Case acquisition from text: Ontology-based information extraction with SCOOBI...
Case acquisition from text: Ontology-based information extraction with SCOOBI...Case acquisition from text: Ontology-based information extraction with SCOOBI...
Case acquisition from text: Ontology-based information extraction with SCOOBI...
 
Anywhere, anytime, any place - embrace the Martini Principle
Anywhere, anytime, any place - embrace the Martini PrincipleAnywhere, anytime, any place - embrace the Martini Principle
Anywhere, anytime, any place - embrace the Martini Principle
 
Provenance-awareness: A pre-requisite to explanation-awareness
Provenance-awareness: A pre-requisite to explanation-awarenessProvenance-awareness: A pre-requisite to explanation-awareness
Provenance-awareness: A pre-requisite to explanation-awareness
 
Mathematical Semantics of Statistical Data
Mathematical Semantics of Statistical DataMathematical Semantics of Statistical Data
Mathematical Semantics of Statistical Data
 
LinkedDataと地理空間情報
LinkedDataと地理空間情報LinkedDataと地理空間情報
LinkedDataと地理空間情報
 
The NERD project
The NERD projectThe NERD project
The NERD project
 
Serendipity in Linked Open Data
Serendipity in Linked Open DataSerendipity in Linked Open Data
Serendipity in Linked Open Data
 
Semantic Representations for Research
Semantic Representations for ResearchSemantic Representations for Research
Semantic Representations for Research
 
Why Linked Data?
Why Linked Data?Why Linked Data?
Why Linked Data?
 
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF:  Lifting NLP Extraction Results to the Linked Data CloudNERD meets NIF:  Lifting NLP Extraction Results to the Linked Data Cloud
NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud
 

Mais de krisztianbalog

Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...krisztianbalog
 
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...krisztianbalog
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?krisztianbalog
 
Personal Knowledge Graphs
Personal Knowledge GraphsPersonal Knowledge Graphs
Personal Knowledge Graphskrisztianbalog
 
Entities for Augmented Intelligence
Entities for Augmented IntelligenceEntities for Augmented Intelligence
Entities for Augmented Intelligencekrisztianbalog
 
On Entities and Evaluation
On Entities and EvaluationOn Entities and Evaluation
On Entities and Evaluationkrisztianbalog
 
Table Retrieval and Generation
Table Retrieval and GenerationTable Retrieval and Generation
Table Retrieval and Generationkrisztianbalog
 
Entity Search: The Last Decade and the Next
Entity Search: The Last Decade and the NextEntity Search: The Last Decade and the Next
Entity Search: The Last Decade and the Nextkrisztianbalog
 
Overview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search EditionOverview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search Editionkrisztianbalog
 
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF LabOverview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Labkrisztianbalog
 
Evaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented SearchEvaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented Searchkrisztianbalog
 
Entity Retrieval (tutorial organized by Radialpoint in Montreal)
Entity Retrieval (tutorial organized by Radialpoint in Montreal)Entity Retrieval (tutorial organized by Radialpoint in Montreal)
Entity Retrieval (tutorial organized by Radialpoint in Montreal)krisztianbalog
 
Entity Retrieval (WSDM 2014 tutorial)
Entity Retrieval (WSDM 2014 tutorial)Entity Retrieval (WSDM 2014 tutorial)
Entity Retrieval (WSDM 2014 tutorial)krisztianbalog
 
Time-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation SystemsTime-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation Systemskrisztianbalog
 
Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)krisztianbalog
 
Multi-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation RecommendationMulti-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation Recommendationkrisztianbalog
 
Entity Retrieval (WWW 2013 tutorial)
Entity Retrieval (WWW 2013 tutorial)Entity Retrieval (WWW 2013 tutorial)
Entity Retrieval (WWW 2013 tutorial)krisztianbalog
 
Semistructured Data Seach
Semistructured Data SeachSemistructured Data Seach
Semistructured Data Seachkrisztianbalog
 

Mais de krisztianbalog (19)

Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
 
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...Conversational AI from an Information Retrieval Perspective: Remaining Challe...
Conversational AI from an Information Retrieval Perspective: Remaining Challe...
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
 
Personal Knowledge Graphs
Personal Knowledge GraphsPersonal Knowledge Graphs
Personal Knowledge Graphs
 
Entities for Augmented Intelligence
Entities for Augmented IntelligenceEntities for Augmented Intelligence
Entities for Augmented Intelligence
 
On Entities and Evaluation
On Entities and EvaluationOn Entities and Evaluation
On Entities and Evaluation
 
Table Retrieval and Generation
Table Retrieval and GenerationTable Retrieval and Generation
Table Retrieval and Generation
 
Entity Search: The Last Decade and the Next
Entity Search: The Last Decade and the NextEntity Search: The Last Decade and the Next
Entity Search: The Last Decade and the Next
 
Overview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search EditionOverview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search Edition
 
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF LabOverview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
Overview of the Living Labs for IR Evaluation (LL4IR) CLEF Lab
 
Entity Linking
Entity LinkingEntity Linking
Entity Linking
 
Evaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented SearchEvaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented Search
 
Entity Retrieval (tutorial organized by Radialpoint in Montreal)
Entity Retrieval (tutorial organized by Radialpoint in Montreal)Entity Retrieval (tutorial organized by Radialpoint in Montreal)
Entity Retrieval (tutorial organized by Radialpoint in Montreal)
 
Entity Retrieval (WSDM 2014 tutorial)
Entity Retrieval (WSDM 2014 tutorial)Entity Retrieval (WSDM 2014 tutorial)
Entity Retrieval (WSDM 2014 tutorial)
 
Time-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation SystemsTime-aware Evaluation of Cumulative Citation Recommendation Systems
Time-aware Evaluation of Cumulative Citation Recommendation Systems
 
Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)
 
Multi-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation RecommendationMulti-step Classification Approaches to Cumulative Citation Recommendation
Multi-step Classification Approaches to Cumulative Citation Recommendation
 
Entity Retrieval (WWW 2013 tutorial)
Entity Retrieval (WWW 2013 tutorial)Entity Retrieval (WWW 2013 tutorial)
Entity Retrieval (WWW 2013 tutorial)
 
Semistructured Data Seach
Semistructured Data SeachSemistructured Data Seach
Semistructured Data Seach
 

Último

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 

Último (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 

Collection Ranking and Selection for Federated Entity Search

  • 1. Collection Ranking & Selection for Federated Entity Search Krisztian Balog, Robert Neumayer, and Kjetil Nørvåg Norwegian University of Science and Technology 19th International Symposium on String Processing and Information Retrieval (SPIRE 2012) Cartagena de Indias, Colombia, October 2012
  • 2. Motivation - Entities are ubiquitous Many information needs revolve around entities (people, products, organisations, places, etc.) - Growing amount of structured data Again, organised around entities - Entities are often searched by their name If we know (heard of) it, we just ask for it by the name
  • 3. Top searches of 2011* Web search Mobile search 1. iPhone 1. iPhone 5 2. Casey Anthony 2. Powerball 3. Kim Kardashian 3. MLB 4. Katy Perry 4. Scrabble cheat 5. Jennifer Lopez 5. Casey Anthony 6. Lindsay Lohan 6. Hurricane Irene 2011 7. American Idol 7. Kim Kardashian 8. Jennifer Aniston 8. Translator 9. Japan earthquake 9. Amy Winehouse 10. Osama bin Laden 10. May 21, 2011 Rapture * http://yearinreview.yahoo.com/
  • 4. Top searches of 2011* Web search Mobile search 1. iPhone 1. iPhone 5 2. Casey Anthony 2. Powerball “users have learned that search engine relevance 3. Kim Kardashian 3. MLB decreases with longer queries and have grown 4. Katy Perry 4. Scrabble cheat accustomed to reducing their query Anthony 5. Jennifer Lopez 5. Casey (at least initially) to the name of an entity” 6. Lindsay Lohan 6. Hurricane Irene 2011 7. American Idol 7. Kim Kardashian 8. Jennifer Aniston [Blanco et al., 2011] 8. Translator 9. Japan earthquake 9. Amy Winehouse 10. Osama bin Laden 10. May 21, 2011 Rapture * http://yearinreview.yahoo.com/
  • 5. The Web of Data # Linking Open Data datasets 300 200 100 0 2007 2008 2009 2010 2011
  • 6. LOD in 2007 Fresh- ECS South- SIOC meat ampton BBC Later + NEW! Sem- NEW! TOTP Web- SW Central Onto- Conference world Corpus Music- Magna- FOAF brainz tune Open- Guides Revyu RDF Book Jamendo Geo- Mashup names DBpedia DBLP Berlin US World NEW! Census Fact- Data book NEW! lingvoj Euro- flickr stat wrappr Wiki- Open DBLP company Cyc Hannover Gov- Track Project W3C Guten- WordNet berg
  • 7. LOD in 2011 Linked LOV User Slideshare tags2con Audio Feedback 2RDF delicious Moseley Scrobbler Bricklink Sussex Folk (DBTune) Reading St. GTAA Magna- Lists Andrews Klapp- tune stuhl- Resource NTU DB club Lists Resource Tropes Lotico Semantic yovisto John Music Man- Lists Music Tweet chester Hellenic Peel Brainz NDL (DBTune) (Data Brainz Reading subjects FBD (zitgist) Lists Open EUTC Incubator) Linked Hellenic Library Open t4gm Produc- Crunch- PD Surge RDF info tions Discogs base Library Radio Ontos Source Code Crime ohloh Plymouth (Talis) (Data News LEM Ecosystem Reading RAMEAU Reports business Incubator) Crime data.gov. Portal Linked Data Lists SH UK Music Jamendo (En- uk Brainz (DBtune) LinkedL Ox AKTing) FanHubz gnoss ntnusc (DBTune) SSW CCN Points Thesau- Last.FM Poké- Thesaur Popula- artists pédia Didactal us rus W LIBRIS tion (En- (DBTune) Last.FM ia theses. LCSH Rådata reegle research patents MARC AKTing) (rdfize) my fr nå! data.gov. data.go Codes Ren. NHS uk v.uk Good- Experi- Classical List Energy (En- win flickr ment (DB Pokedex Norwe- Genera- AKTing) Mortality BBC Family wrappr Sudoc PSH Tune) gian (En- tors Program MeSH AKTing) semantic mes BBC IdRef GND CO2 educatio OpenEI web.org SW Energy Sudoc ndlna Emission n.data.g Music Dog VIAF EEA (En- Chronic- Linked (En- ov.uk Portu- Food UB AKTing) ling Event MDB AKTing) guese Mann- Europeana BBC America Media DBpedia Calames heim Ord- Recht- Wildlife Deutsche Open Revyu DDC Openly spraak. Finder Bio- lobid Election nance legislation Local nl RDF graphie Resources NSZL Swedish Data Survey Tele- data Ulm EU New Book Project data.gov.uk graphis bnf.fr Catalog Open Insti- York URI Open Mashup Cultural tutions Times Greek P20 UK Post- Burner Calais Heritage codes DBpedia ECS Wiki statistics lobid GovWILD data.gov. Taxon iServe South- Organi- LOIUS BNB Brazilian uk Concept ECS ampton sations Geo World OS BibBase STW GESIS Poli- ESD South- ECS Names Fact- (RKB ticians stan- reference ampton data.gov.uk book Freebase Explorer) Budapest dards data.gov. NASA EPrints uk intervals Project OAI Lichfield transport (Data DBpedia data Guten- Pisa Spen- data.gov. Incu- dcs RESEX Scholaro- ISTAT ding bator) Fishes berg DBLP DBLP uk Geo meter Immi- Scotland of Texas (FU (L3S) Pupils & Uberblic DBLP gration Species Berlin) IRIT Exams Euro- dbpedia data- (RKB London TCM ACM stat lite open- Explorer) NVD Gazette (FUB) Gene IBM Traffic Geo ac-uk Scotland TWC LOGD Eurostat Daily DIT Linked UN/ Data UMBEL Med ERA Data LOCODE DEPLOY Gov.ie CORDIS YAGO New- lingvoj Disea- (RKB some SIDER RAE2001 castle LOCAH CORDIS Explorer) Linked Eurécom Eurostat Drug CiteSeer Roma (FUB) Sensor Data GovTrack (Ontology (Kno.e.sis) Open Bank Pfam Course- Central) riese Enipedia Cyc Lexvo LinkedCT ware Linked PDB UniProt VIVO EURES EDGAR dotAC US SEC Indiana ePrints IEEE (Ontology totl.net (rdfabout) Central) WordNet RISKS (VUA) Taxono UniProt US Census EUNIS Twarql HGNC Semantic Cornetto (Bio2RDF) (rdfabout) my VIVO FTS XBRL PRO- ProDom STITCH Cornell LAAS SITE KISTI NSF Scotland Geo- GeoWord LODE graphy Net WordNet WordNet JISC (W3C) (RKB Climbing Linked Affy- KEGG SMC Explorer) SISVU Pub VIVO UF Piedmont GeoData metrix Drug ECCO- Finnish Journals PubMed Gene SGD Chem Munici- Accomo- El AGROV Ontology TCP Media dations Alpine bible palities Viajero OC Ski ontology Tourism KEGG Ocean Austria Enzyme PBAC Geographic Metoffice GEMET ChEMBL Italian Drilling OMIM KEGG Weather Open public Codices AEMET Linked MGI Pathway schools Forecasts Data Open InterPro GeneID Publications EARTh Thesau- KEGG Turismo rus Colors Reaction de Zaragoza Product Smart KEGG User-generated content Weather DB Link Medi Glycan Janus Stations Product Care KEGG AMP UniParc UniRef UniSTS Government Types Italian Homolo Com- Yahoo! Airports Museums pound Ontology Google Gene Geo Art Planet National wrapper Chem2 Cross-domain Radio- Bio2RDF activity UniPath JP Sears Open Linked OGOLOD way Life sciences Corpo- Amster- Reactome dam medu- Open rates Numbers Museum cator As of September 2011
  • 8. Ad-hoc entity search At the 2010/11 Semantic Search Challenge - Task Given a keyword query, targeting a particular entity, provide a ranked list of relevant entities (i.e., URIs) - Queries Sampled from web search engine logs (142 in total) - Data collection Billion Triple Challenge 2009 (BTC) dataset - Relevance judgments On a 3-point scale, collected using crowdsourcing
  • 9. In this talk - Address the ad-hoc entity retrieval task in a distributed setting - The Web of Data is inherently distributed - Some data sources may not be crawleable at all - Specifically, our focus is on the collection ranking and collection selection steps
  • 10. Federated search A typical broker-based architecture 1 Collection ranking Central broker Summary A Summary B Summary C Q 1 A C B
  • 11. Federated search A typical broker-based architecture 1 Collection ranking Central broker 2 Collection selection Summary A Summary B Summary C Collection A Q Q 1 A Collection B C 2 B Q Collection C
  • 12. Federated search A typical broker-based architecture 1 Collection ranking Central broker 2 Collection selection 3 Result merging Summary A Summary B Summary C Collection A Q Q 1 A Collection B C 2 B Q 3 Collection C
  • 13. Next: baseline models for collection ranking and selection 1 Collection ranking Central broker 2 Collection selection 3 Result merging Summary A Summary B Summary C Collection A Q Q 1 A Collection B C 2 B Q 3 Collection C
  • 14. Collection ranking Collection-centric (CC) - Lexicon-based method Treat and score each collection as if it was a single, large document Collection A Q Collection C Collection B ... ... Y P (c|q) / P (c) · P (t|✓c ) t2q
  • 15. Collection ranking Entity-centric (EC) - Document-surrogate method Model and query individual documents (entities) and aggregate their relevance scores Collection A Collection C Q Collection B ... ... X P (c|q) / P (e|q) e2c,r(e,q)<
  • 16. Collection selection Top-K selection - Choosing a fixed cutoff (K) ahead of time K is usually set between 5 and 20 A D C E B F
  • 17. Our method: AENN “All that an Entity Needs is a Name” - Exploit that entities are searched by their name - The central broker maintains a complete dictionary of entity names (and corresponding identifiers) - Utilise this information in the collection selection step to dynamically adjust the #collections selected
  • 18. AENN collection ranking - Key observation Different methods—collection-centric (CC) vs. entity- centric (EC)—work best for different queries - Idea Combination should give better results than any of the two methods alone AENN(c, q) = (1 ) · CC(c, q) + · EC(c, q)
  • 19. AENN collection ranking Example CC EC AENN A 0.35 A 0.65 A 0.5 D 0.3 B 0.3 B 0.2 C 0.2 C 0.05 D 0.15 E 0.15 C 0.125 B 0.1 E 0.075 F 0.05 F 0.025
  • 20. AENN collection selection - Key observation CC has higher recall, while EC has better precision - Idea Use the collection rankings generated by EC and/or CC to dynamically adjust the set of collections selected - Precision-oriented selection - Recall-oriented selection - Balanced selection
  • 21. AENN collection selection Precision-oriented selection (AENN(p)) - Only select collections returned by EC CC EC AENN AENN(p) A 0.35 A 0.65 A 0.5 A 0.5 D 0.3 B 0.3 B 0.2 B 0.2 C 0.2 C 0.05 D 0.15 C 0.125 E 0.15 C 0.125 B 0.1 E 0.075 F 0.05 F 0.025
  • 22. AENN collection selection Recall-oriented selection (AENN(r)) - Include collections from CC until all from EC are covered. This defines the cutoff point for AENN CC EC AENN AENN(r) A 0.35 A 0.65 A 0.5 A 0.5 D 0.3 B 0.3 B 0.2 B 0.2 C 0.2 C 0.05 D 0.15 D 0.15 E 0.15 C 0.125 C 0.125 B 0.1 E 0.075 E 0.075 F 0.05 F 0.025
  • 23. AENN collection selection Balanced selection (AENN(b)) - Include collections from AENN until all from EC are covered CC EC AENN AENN(b) A 0.35 A 0.65 A 0.5 A 0.5 D 0.3 B 0.3 B 0.2 B 0.2 C 0.2 C 0.05 D 0.15 D 0.15 E 0.15 C 0.125 C 0.125 B 0.1 E 0.075 F 0.05 F 0.025
  • 24. AENN collection selection Comparison of approaches AENN(p) AENN(r) AENN(b) A 0.5 A 0.5 A 0.5 B 0.2 B 0.2 B 0.2 C 0.125 D 0.15 D 0.15 C 0.125 C 0.125 E 0.075
  • 25. Experimental setup Based on the 2010/11 Semantic Search Challenge - Distributed environment Top 100 largest second-level domains from BTC - Three sets with different handling of DBpedia - Relevance Considered the #relevant entities from each collection - Metrics - Collection ranking: Standard IR metrics (MAP, MRR, nDCG) - Collection selection: Analogues of precision and recall, plus the avg. #coll. selected
  • 26. Test collections BTC BTC DBpedia DBpedia #Entities 68.8M 60.5M 8.8M #Collections 100 99 100 #Queries 136 116 130 Avg. #rel. entities/query 14.9 4.8 10.1 Avg. #rel. entities/collection 3.4 2.8 9.4
  • 27. Results Collection ranking (BTC) CC EC AENN 0.75 0.50 MAP 0.25 0 Name-only Full content
  • 28. Results Different collection selection strategies (BTCDBpedia) Precision Recall Avg. #coll. selected 0.9 1.0 50 0.6 0.9 33 0.3 0.7 17 0 0.6 0 1 3 10 20 50 100 1 3 10 20 50 100 1 3 10 20 50 100 K K K AENN(p) AENN(r) AENN(b)
  • 29. Results Collection selection (DBpedia) Precision Recall Avg. #coll. selected 0.5 1.0 100 0.3 0.7 67 0.2 0.3 33 0 0 0 1 3 10 20 50 100 1 3 10 20 50 100 1 3 10 20 50 100 K K K CC-N EC-C AENN(b)
  • 30. Summary - Addressed the task of ad-hoc entity retrieval in a distributed setting - Introduced AENN, a novel collection ranking and selection method based on a lean name-based entity representation - Showed experimentally that AENN can outperform standard baselines that consider all entity content - Further, AENN can be geared towards high precision, high recall, or a balanced setting
  • 31. Questions? Resources are available at http://bit.ly/OzfYK2 Contact @krisztianbalog krisztianbalog.com

Notas do Editor

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. This is how the Linking Open Data cloud looked like sometime in 2007\n
  7. And this is how it looks like more-or-less now.\n
  8. Main lessons learned: fielded variants of standard document retrieval methods work well on this task\nAll approaches, however, assumed that a centralised index encompassing all entities is available\n
  9. \n
  10. As I said earlier, existing approaches build document-based representations of entities. Therefore, federated search techniques for document retrieval can easily be applied.\n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. Acronym comes from the very insightful observation that ...\n\nPrevious methods rely on sampling\n\n\n
  18. Note that the central broker contains only the names of entities\n
  19. Ok, let&amp;#x2019;s familiarise ourselves with these acronyms, because we&amp;#x2019;ll be using them a lot in the following slides.\n
  20. \n
  21. Recall that EC operates on a local ranking of entities; we expect this to be the final list of results\nConsequently, only collection that contribute to this list will be selected.\n
  22. We start at the top of the CC ranking and keep including collections until we have all from EC covered. \n
  23. We start at the top of the CC ranking and keep including collections until we have all from EC covered. \n
  24. \n
  25. No standard test collection exists for this task.\n
  26. \n
  27. \n
  28. \n
  29. CC-N: lower bound\nEC-C: upper bound (essentially, a centralised index of all entities with full content)\n
  30. \n
  31. \n