Introduction Definition of the problem Techniques for Expertise Retrieval Tests Conclusions
Expert Finding in Social Networks
Matteo Silvestri Giuliano Vesci
Politecnico di Milano
25-07-2012
Silvestri, Vesci: Expert Finding in Social Networks 1 / 25
Outline
1 Introduction
2 Definition of the problem
3 Techniques for Expertise Retrieval
4 Tests
5 Conclusions
Expert Finding in Social Networks
Several problems require looking for expert users inside online social networks. For example, finding:
friends who are experts in cinema
friends who know about a particular disease
friends able to use a particular technology
The task of finding experts able to answer specific information needs is called expert finding.
In particular, we studied this problem in the human computation setting of CrowdSearcher, an approach that bridges conventional search experiences and crowdsourcing.
Problem: assign CrowdSearcher tasks to expert users
Research Questions
Can the analysis of social actions (e.g. posts, tweets,
interaction with social groups, etc.) help in providing a better
characterization of users for search tasks?
Is the combined use of social network information useful to
better characterize a user?
Among the available approaches to expert finding, which one
is better suited in the context of social networks?
Are social networks oriented toward specific domains of
expertise?
Goal: methodologies and tools for selecting the best experts from a set of trusted users in (multiple) social networks.
Definition of the problem
Automatically reach experts for crowdsourced queries:
Given a query q and a set CE = {ce_1, ce_2, ..., ce_m} of social users that are candidate experts, find an ordered subset S(CE) ⊂ CE of n users with the highest scores score(q, ce_i).
[Figure: the query q and the candidate set CE feed the scoring function score(q, ce_i), which yields the ranked subset S(CE)]
Estimating the scoring function score(q, ce_i) is the main task of this work.
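The selection step in the problem statement can be sketched in a few lines of Python. This is only an illustration: `top_n_experts`, `toy_score` and the candidate names are hypothetical, and the scoring function is a stand-in for whichever model estimates score(q, ce_i).

```python
from typing import Callable, List, Tuple


def top_n_experts(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float],
    n: int,
) -> List[Tuple[str, float]]:
    """Return the n candidates with the highest score(q, ce), best first."""
    scored = [(ce, score(query, ce)) for ce in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def toy_score(query: str, candidate: str) -> float:
    # Hypothetical fixed scores, standing in for a real scoring model.
    table = {"alice": 0.9, "bob": 0.4, "carol": 0.7}
    return table[candidate]


ranking = top_n_experts("php strings", ["alice", "bob", "carol"], toy_score, n=2)
```

Keeping the scoring function as a parameter mirrors the slide: the ranking step is fixed, and the models differ only in how they estimate the score.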
Social Network Characterization
Two types of social information characterize users:
explicit information: static profiles
implicit information: social dynamic activities
Social network users can perform several activities and publish informative material, which we call resources.
The idea is to collect evidence of expertise from multiple resources associated with a candidate.
Resources Levels
Resources are related to the user through a path in the graph.
We consider resources connected to a user through a path of
length <= 2.
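[Figure: example resource graph around the user ALICE. Nodes: ALICE, Post@09.00, Post@09.05, Post@09.10, a Facebook Group, Post@08.00, Post@08.05. Edge labels: owns, owns / creates, annotates (likes), relatesTo (belongs), contains, creates. Resources are grouped into Level 0, Level 1 and Level 2.]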
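The path-length rule can be illustrated with a breadth-first visit of the graph. This is a minimal sketch under stated assumptions: the adjacency list below is a hypothetical simplification of the example figure, `Comment@08.06` is an invented node added only to show the cut-off, and the sketch reports graph distances without asserting how they map to the slide's Level 0/1/2 labels.

```python
from collections import deque
from typing import Dict, List


def resources_within(
    graph: Dict[str, List[str]], user: str, max_distance: int = 2
) -> Dict[str, int]:
    """Breadth-first search: map every node reachable from `user` through
    a path of length <= max_distance to its distance from the user."""
    distance = {user: 0}
    queue = deque([user])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the allowed path length
        for neighbour in graph.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[user]  # the user is not a resource
    return distance


# Hypothetical adjacency list loosely based on the example figure.
graph = {
    "ALICE": ["Post@09.00", "Post@09.05", "Facebook Group"],
    "Facebook Group": ["Post@08.00", "Post@08.05"],
    "Post@08.05": ["Comment@08.06"],  # distance 3: must be excluded
}
reachable = resources_within(graph, "ALICE")
```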
Analysis
Resources have to be analyzed to infer expertise information.
[Figure: analysis pipeline with the stages Crawling (API), Url Extraction, Language Detection, Text Preprocessing, Named Entity Extraction]
Crawling: extraction of the resources' textual content through the social networks' APIs
Url Extraction: the content of any linked external website is extracted and appended to the resource's text
Language Detection: mixing different languages in the same index is not recommended in information retrieval systems, so we detect the language of each resource
Named Entity Extraction: extraction of entities such as people, cities and movies
Textual Preprocessing: we remove stop-words (common words), filter out HTML tags, and perform stemming
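The textual preprocessing step can be sketched as follows. This is an illustration, not the authors' pipeline: the stop-word list is a tiny hypothetical sample, the tag filter is a simple regular expression, and `naive_stem` is a crude suffix stripper standing in for a real stemming algorithm.

```python
import re
from typing import List

# Tiny illustrative stop-word list (an assumption, not the one used in the work).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "is", "are", "and"}


def strip_html(text: str) -> str:
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def naive_stem(token: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> List[str]:
    """Filter tags, lowercase, tokenize, drop stop-words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", strip_html(text).lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


tokens = preprocess("<p>The players are <b>playing</b> in the finals</p>")
```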
Model 1: Resource Based
[Figure: Query, Resource, Candidate chain; the Query-Resource link is labeled score(q, r) and the Resource-Candidate link weight(r, c)]
It is based on resources, considered as documents in a classic Vector Space
Model. Resources are represented both as term vectors and entity id vectors
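1 First, the similarity between the query and resources is computed:
\[ \mathrm{score}(q, r) = \alpha \cdot \sum_{t \in q} \mathrm{tf}(t, r) \cdot \mathrm{idf}(t)^2 + \beta \cdot \sum_{e \in q} \mathrm{tf}(e, r) \cdot \mathrm{idf}(e)^2 \cdot \mathrm{eConf}(e, r) \]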
2 Then, users related to the best resources are extracted as possible experts:
\[ \mathrm{score}(q, ce) = \sum_{r_i \in S(R)} \frac{\mathrm{score}(q, r_i)}{\max_{r_j \in S(R)} \mathrm{score}(q, r_j)} \cdot \mathrm{weight}(r_i, ce) \]
Varying α and β, we obtain three matching methods:
Mixed: α > 0, β > 0
TextOnly: α = 1, β = 0
EntityOnly: α = 0, β = 1
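The two steps of the resource based model can be sketched in Python. This is a minimal sketch under assumptions: the dictionary layout (`tf`, `entity_tf`, `e_conf`), the single `idf` table shared by terms and entity ids, the use of the entity's own IDF in the entity term, and all the numbers are hypothetical; S(R) is taken to be the `window` best-scoring resources.

```python
from typing import Dict, List, Tuple


def resource_score(query_terms: List[str], query_entities: List[str],
                   resource: dict, idf: Dict[str, float],
                   alpha: float, beta: float) -> float:
    """Step 1: weighted sum of the text match and the entity match."""
    text_part = sum(resource["tf"].get(t, 0) * idf.get(t, 0.0) ** 2
                    for t in query_terms)
    entity_part = sum(resource["entity_tf"].get(e, 0) * idf.get(e, 0.0) ** 2
                      * resource["e_conf"].get(e, 0.0)
                      for e in query_entities)
    return alpha * text_part + beta * entity_part


def candidate_scores(resource_scores: Dict[str, float],
                     weights: Dict[Tuple[str, str], float],
                     window: int) -> Dict[str, float]:
    """Step 2: normalize the top `window` resource scores by the best one
    and add them to each related candidate, scaled by weight(r, ce)."""
    top = sorted(resource_scores.items(), key=lambda kv: kv[1], reverse=True)[:window]
    if not top or top[0][1] <= 0:
        return {}
    best = top[0][1]
    top_scores = dict(top)
    totals: Dict[str, float] = {}
    for (rid, ce), w in weights.items():
        if rid in top_scores:
            totals[ce] = totals.get(ce, 0.0) + top_scores[rid] / best * w
    return totals


# Hypothetical toy data, TextOnly setting (alpha = 1, beta = 0).
idf = {"php": 2.0, "string": 1.0}
resources = {
    "r1": {"tf": {"php": 1, "string": 2}, "entity_tf": {}, "e_conf": {}},
    "r2": {"tf": {"string": 3}, "entity_tf": {}, "e_conf": {}},
    "r3": {"tf": {}, "entity_tf": {}, "e_conf": {}},
}
scores = {rid: resource_score(["php", "string"], [], r, idf, alpha=1.0, beta=0.0)
          for rid, r in resources.items()}
weights = {("r1", "alice"): 1.0, ("r2", "alice"): 0.2,
           ("r2", "bob"): 1.0, ("r3", "bob"): 1.0}
experts = candidate_scores(scores, weights, window=2)
```

With a window of 2, resource `r3` falls outside S(R), so it contributes nothing to `bob`.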
Model 2: User Based
[Figure: model diagram with the labels Query, Domain, Candidate, Expertise, Entity, Resource, and the scores score(q, ce), s(d, e), s(d, r), s(d, ce)]
We refer to about 70 Freebase domains such as sports, location,
education, book, comics, videogames, tv.
For each entity e in a resource, a score s(d, e) is computed, denoting how
much the entity is related to a domain of expertise d:
\[ s(d, e) = \frac{\sum_{j \in I(d)} \frac{1}{\log_2(1+j)}}{\sum_{i=1}^{v} \frac{1}{\log_2(1+i)}}, \]
Then a similar score is computed for each resource s(d, r), given all the
entities in the resource related to the domain d:
\[ s(d, r) = \sum_{e \in E(r)} s(d, e) \cdot \mathrm{rel}(e, r), \]
where rel(e, r) is a measure of relevance of the entity in the resource
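The entity and resource scores can be sketched numerically. The slide does not define I(d) or v, so this sketch makes a loud assumption: I(d) is read as the set of 1-based positions, in some ranked list of length v attached to the entity, that belong to domain d. Under that reading the score is a rank-discounted fraction between 0 and 1. Function names and values are hypothetical.

```python
import math
from typing import Dict, Iterable


def domain_entity_score(positions_in_domain: Iterable[int], v: int) -> float:
    """s(d, e): discounted weight of the positions in I(d), normalized by
    the discounted weight of all v positions (assumed interpretation)."""
    numerator = sum(1.0 / math.log2(1 + j) for j in positions_in_domain)
    denominator = sum(1.0 / math.log2(1 + i) for i in range(1, v + 1))
    return numerator / denominator if denominator else 0.0


def domain_resource_score(entity_scores: Dict[str, float],
                          relevance: Dict[str, float]) -> float:
    """s(d, r): sum of s(d, e) * rel(e, r) over the entities of the resource."""
    return sum(s * relevance.get(e, 0.0) for e, s in entity_scores.items())


# Hypothetical values: two entities of one resource, scored for one domain.
s_d_e = {"e1": 0.5, "e2": 0.25}
rel = {"e1": 1.0, "e2": 0.5}
s_d_r = domain_resource_score(s_d_e, rel)
```

Because of the logarithmic discount, a domain that appears at the top of the list contributes more than one that appears near the bottom.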
Model 2: User Based - User/Domain Matrix
Finally, the score s(d, ce) is computed for each (candidate expert, domain) pair, to build a model of the users as a CE × D matrix:
\[ s(d, ce) = \frac{\sum_{r \in S(R, ce)} \mathrm{weight}(r, ce) \cdot s(d, r)}{\sum_{r \in S(R, ce)} \mathrm{weight}(r, ce)} \]
                     Sport  Music  TV    Education  Movies  ...
Candidate Expert 1   .033   .012   .068  .037       .034    ...
Candidate Expert 2   .057   .056   .000  .019       .018    ...
Candidate Expert 3   .086   .044   .000  .059       .074    ...
...                  ...    ...    ...   ...        ...     ...
For each query, s(d, q) is computed in the same way as for resources.
Looking at the matrix of expertise, the score for a user is computed as:
\[ \mathrm{score}(q, ce) = \mathrm{expertise}(q) \cdot \mathrm{expertise}(ce) = \sum_{d \in D(q)} s(d, q) \cdot s(d, ce) \]
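The user/domain matrix and the final dot product can be sketched as follows. The candidate rows reuse the values of the example matrix on this slide; the query vector, the function names and the (weight, s(d, r)) pairs are hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_domain_score(resources: List[Tuple[float, float]]) -> float:
    """s(d, ce): weighted average of s(d, r) over the candidate's resources,
    given as (weight(r, ce), s(d, r)) pairs."""
    total_weight = sum(w for w, _ in resources)
    if total_weight == 0:
        return 0.0
    return sum(w * s for w, s in resources) / total_weight


def expert_score(query_expertise: Dict[str, float],
                 candidate_expertise: Dict[str, float]) -> float:
    """score(q, ce): dot product restricted to the domains of the query."""
    return sum(s_q * candidate_expertise.get(d, 0.0)
               for d, s_q in query_expertise.items())


# Rows of the example user/domain matrix.
matrix = {
    "Candidate Expert 1": {"Sport": .033, "Music": .012, "TV": .068,
                           "Education": .037, "Movies": .034},
    "Candidate Expert 2": {"Sport": .057, "Music": .056, "TV": .000,
                           "Education": .019, "Movies": .018},
    "Candidate Expert 3": {"Sport": .086, "Music": .044, "TV": .000,
                           "Education": .059, "Movies": .074},
}
# Hypothetical query expertise vector s(d, q).
query = {"Movies": 0.8, "TV": 0.2}
ranking = sorted(matrix, key=lambda ce: expert_score(query, matrix[ce]), reverse=True)
```

Since the matrix is built ahead of time, answering a query only needs one small dot product per candidate.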
Experimental Setup
Dataset built through a recruitment campaign:
                     Facebook  Twitter  LinkedIn
#Users               39        23       28
#English Resources   107,956   33,022   11,486
#Italian Resources   124,537   14,038   4,133
#Total Resources     232,493   47,060   15,619
Test suite of 30 information needs, or queries, involving
various domains:
Which php function can I use to obtain the length of a string?
Can you list some restaurant in Milan?
Ground truth: graded relevance judgments of users' expertise are obtained from the users themselves through an online questionnaire
Tests - Resource based configurations comparison
Model type: Resource Based
Level  Matching     MAP    MRR    NDCG   NDCG@10
0      text only    .2034  .6264  .2963  .3183
0      entity only  .0454  .2500  .0731  .0821
0      mixed        .2026  .6014  .2832  .3020
1      text only    .3330  .8048  .4348  .4542
1      entity only  .2767  .8050  .3807  .4059
1      mixed        .3150  .8000  .4272  .4335
2      text only    .2932  .8111  .4338  .4448
2      entity only  .3363  .8122  .4485  .4292
2      mixed        .3245  .8444  .4454  .4581
The data shown in the table were obtained considering:
English resources
as relevant users for each query, the ones above the average
top 50 resources
entityConf(e, r) = 1 + tagMeScore(e, r)
for the mixed matching method: α = 1, β = 2
weight(e, r) = 1 ∀r ∈ Lv0, Lv1; weight(e, r) = 0.2 ∀r ∈ Lv2
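For readers unfamiliar with the metrics in the table, the per-query quantities behind MAP, MRR and NDCG@k can be sketched as below; MAP and MRR are the means of the first two over all queries. The slides do not say which DCG variant was used, so the linear-gain, log2-discount form here is an assumption, and the user ids and gains are hypothetical.

```python
import math
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@rank taken at the rank of each relevant user."""
    hits, total = 0, 0.0
    for rank, user in enumerate(ranked, start=1):
        if user in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant user, or 0 if none is retrieved."""
    for rank, user in enumerate(ranked, start=1):
        if user in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: List[str], gains: Dict[str, float], k: int) -> float:
    """DCG of the top k results divided by the DCG of an ideal ordering."""
    dcg = sum(gains.get(user, 0.0) / math.log2(rank + 1)
              for rank, user in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


# Hypothetical ranking for one query with graded judgments.
ranked = ["u1", "u2", "u3", "u4"]
gains = {"u2": 3.0, "u4": 1.0}
ap = average_precision(ranked, set(gains))
rr = reciprocal_rank(ranked, set(gains))
ndcg = ndcg_at_k(ranked, gains, k=10)
```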
Tests - Resources window
Another experiment was made by varying the number of resources considered in the score. We call that number the window size.
For simplicity, we only considered the Lv2-Mixed and Lv1-TextOnly configurations.
Considering more resources increases system quality up to the 3-4% mark. Beyond that, the curves stabilize: increasing the window size does not lead to significantly better results.
Tests - User based
Model type: User Based
Level  MAP    MRR    NDCG   NDCG@10
0      .3685  .7603  .4907  .4332
1      .3546  .7306  .4990  .4526
2      .3424  .8178  .4770  .4288
Table: Overall comparison, User Based
The data shown in the table were obtained considering:
English resources
as relevant users for each query, the ones above the average
top 20 users
weight(e, r) = 1 ∀r ∈ Lv0, Lv1; weight(e, r) = 0.2 ∀r ∈ Lv2
Tests - User/Resource based models comparison
The two models presented are evaluated in terms of result quality and performance.
We considered the best configuration for both: Lv2-Mixed for the resource based model and Lv1 for the user based model.
The index size is shown on a logarithmic scale: indexing expertise as a pre-built user-domain matrix provides evident advantages.
For the resource based model, the query time is linear in the window size, while it is constant for the user based one.
Tests - Verticalization
An additional experiment considers only the resources of a single domain and channel.
For simplicity, we only considered the Lv2-Mixed configuration, with the window size fixed to 50.
Domain / Channel   FB     TW     Lin
computer eng.      .2112  .5858  .4472
location           .1852  .3549  .2033
movies & tv        .2794  .4296  .1578
music              .2868  .4229  .2672
science            .1827  .4260  .3827
sport              .2856  .4225  .1933
tech. & games      .2297  .4186  .2052
All domains        .2526  .4296  .2670
Table: MAP
Domain / Channel   FB     TW     Lin
computer eng.      .5038  .7014  .4904
location           .4423  .4172  .3517
movies & tv        .4460  .4960  .2028
music              .3957  .4631  .4226
science            .3004  .4366  .4977
sport              .5497  .4092  .3298
tech. & games      .3641  .4545  .2352
All domains        .4415  .4791  .3473
Table: NDCG@10
Conclusions
We classified resources in two main classes: static resources and dynamic resources
We adopted and extended two models of expert finding
The analysis of social activities can help to better characterize the expertise of users
The adoption of multiple social networks can greatly improve the representation of a user for expert finding purposes, but, for specific domains, it is better to focus on a single platform.
Open questions
Exploiting the social graph to improve expert retrieval
Domain-specific queries require a less general approach
Example: geolocalized queries!
Questions & Answers
Silvestri, Vesci: Expert Finding in Social Networks 25 / 25

More Related Content

What's hot

A Two Step Ranking Solution for Twitter User Engagement
A Two Step Ranking Solution for Twitter User Engagement�A Two Step Ranking Solution for Twitter User Engagement�
A Two Step Ranking Solution for Twitter User EngagementBehnoush Abdollahi
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question AnsweringSujit Pal
 
Memory Networks, Neural Turing Machines, and Question Answering
Memory Networks, Neural Turing Machines, and Question AnsweringMemory Networks, Neural Turing Machines, and Question Answering
Memory Networks, Neural Turing Machines, and Question AnsweringAkram El-Korashy
 
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Alessandro Suglia
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningLior Rokach
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Avkash Chauhan
 
IRJET - Automated Essay Grading System using Deep Learning
IRJET -  	  Automated Essay Grading System using Deep LearningIRJET -  	  Automated Essay Grading System using Deep Learning
IRJET - Automated Essay Grading System using Deep LearningIRJET Journal
 

What's hot (8)

A Two Step Ranking Solution for Twitter User Engagement
A Two Step Ranking Solution for Twitter User Engagement�A Two Step Ranking Solution for Twitter User Engagement�
A Two Step Ranking Solution for Twitter User Engagement
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
 
Memory Networks, Neural Turing Machines, and Question Answering
Memory Networks, Neural Turing Machines, and Question AnsweringMemory Networks, Neural Turing Machines, and Question Answering
Memory Networks, Neural Turing Machines, and Question Answering
 
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
 
IRJET - Automated Essay Grading System using Deep Learning
IRJET -  	  Automated Essay Grading System using Deep LearningIRJET -  	  Automated Essay Grading System using Deep Learning
IRJET - Automated Essay Grading System using Deep Learning
 
Chapter 11b
Chapter 11bChapter 11b
Chapter 11b
 

Similar to Expert Finding in Social Networks

Automated evaluation of crowdsourced annotations in the cultural heritage domain
Automated evaluation of crowdsourced annotations in the cultural heritage domainAutomated evaluation of crowdsourced annotations in the cultural heritage domain
Automated evaluation of crowdsourced annotations in the cultural heritage domaindreamgirl314
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentAmrapali Zaveri, PhD
 
How to build recommender system
How to build recommender systemHow to build recommender system
How to build recommender systemMitko Gurbanski
 
DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...
DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...
DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...DataMind-slides
 
Empirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender SystemsEmpirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender SystemsUniversity of Bergen
 
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques  Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques ijsc
 
Transferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTransferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTao Xie
 
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting RatingsSemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings Matthew Rowe
 
Dad (Data Analysis And Design)
Dad (Data Analysis And Design)Dad (Data Analysis And Design)
Dad (Data Analysis And Design)Jill Lyons
 
11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...
11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...
11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...eMadrid network
 
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxRokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxJadna Almeida
 
Rokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxRokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxJadna Almeida
 
Iterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionIterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionClaudio Greco
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfhemangppatel
 
Methodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesMethodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesijsc
 
Big-Data Analytics for Media Management
Big-Data Analytics for Media ManagementBig-Data Analytics for Media Management
Big-Data Analytics for Media Managementtechkrish
 

Similar to Expert Finding in Social Networks (20)

My experiment
My experimentMy experiment
My experiment
 
Automated evaluation of crowdsourced annotations in the cultural heritage domain
Automated evaluation of crowdsourced annotations in the cultural heritage domainAutomated evaluation of crowdsourced annotations in the cultural heritage domain
Automated evaluation of crowdsourced annotations in the cultural heritage domain
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality Assessment
 
How to build recommender system
How to build recommender systemHow to build recommender system
How to build recommender system
 
DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...
DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...
DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetu...
 
Empirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender SystemsEmpirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender Systems
 
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques  Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
 
Transferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTransferring Software Testing Tools to Practice
Transferring Software Testing Tools to Practice
 
2-IJCSE-00536
2-IJCSE-005362-IJCSE-00536
2-IJCSE-00536
 
2-IJCSE-00536
2-IJCSE-005362-IJCSE-00536
2-IJCSE-00536
 
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting RatingsSemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
 
Dad (Data Analysis And Design)
Dad (Data Analysis And Design)Dad (Data Analysis And Design)
Dad (Data Analysis And Design)
 
11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...
11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...
11_04_2019 EDUCON eMadrid special session on "Moods in MOOCs: analysing emoti...
 
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxRokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptx
 
Rokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxRokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptx
 
Iterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionIterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer Prediction
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdf
 
OpenSciMatch
OpenSciMatchOpenSciMatch
OpenSciMatch
 
Methodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesMethodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniques
 
Big-Data Analytics for Media Management
Big-Data Analytics for Media ManagementBig-Data Analytics for Media Management
Big-Data Analytics for Media Management
 

Recently uploaded

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Recently uploaded (20)

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Vector Databases 101 - An introduction to the world of Vector Databases

Expert Finding in Social Networks

  • 1. Introduction Definition of the problem Techniques for Expertise Retrieval Tests Conclusions Expert Finding in Social Networks Matteo Silvestri Giuliano Vesci Politecnico di Milano 25-07-2012 Silvestri, Vesci: Expert Finding in Social Networks 1 / 25
  • 2. Outline
       1 Introduction
       2 Definition of the problem
       3 Techniques for Expertise Retrieval
       4 Tests
       5 Conclusions
  • 3. Expert Finding in Social Networks
       Several problems require looking for expert users inside online social networks. For example:
       - friends who are experts in cinema
       - friends who know about a particular disease
       - friends able to use a particular technology
       The task of finding experts able to answer specific information needs is called expert finding.
       In particular, we studied this problem in the human computation setting of CrowdSearcher, an approach that bridges conventional search experiences and crowdsourcing.
       Problem: assign CrowdSearcher tasks to expert users.
  • 4. Research Questions
       - Can the analysis of social actions (e.g. posts, tweets, interaction with social groups) help in providing a better characterization of users for search tasks?
       - Is the combined use of information from several social networks useful to better characterize a user?
       - Among the available approaches to expert finding, which one is better suited to the context of social networks?
       - Are social networks oriented toward specific domains of expertise?
       Goal: methodologies and tools for selecting the best experts in a set of trusted users in (multiple) social networks.
  • 6. Definition of the problem
       Automatically reach experts for crowdsourced queries: given a query $q$ and a set $CE = (ce_1, ce_2, \ldots, ce_m)$ of social users that are candidate experts, find an ordered subset $S(CE) \subset CE$ of $n$ users with the highest scores $score(q, ce_i)$.
       [Diagram: q, CE, score(q, ce_i), S(CE)]
       Estimating the scoring function $score(q, ce_i)$ is the main task of this work.
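The selection step in the problem definition can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_score` is a hypothetical stand-in for the scoring function (the slides leave its estimation to the two models presented later), and the candidate keyword sets are invented for the example.

```python
from typing import Callable, Dict, List, Set, Tuple


def top_n_experts(
    query: str,
    candidates: Dict[str, Set[str]],
    score: Callable[[str, Set[str]], float],
    n: int,
) -> List[Tuple[str, float]]:
    """Return the n candidates with the highest score(q, ce), best first."""
    scored = [(name, score(query, data)) for name, data in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def toy_score(query: str, keywords: Set[str]) -> float:
    """Hypothetical scorer: fraction of query words found in the candidate's keywords."""
    words = query.lower().split()
    return sum(w in keywords for w in words) / len(words)


# Invented candidate data, for illustration only.
candidates = {
    "alice": {"php", "string"},
    "bob": {"milan", "restaurant"},
    "carol": {"php"},
}
ranking = top_n_experts("php string length", candidates, toy_score, n=2)
best_names = [name for name, _ in ranking]
```

Any scoring function with the same signature can be plugged in, which is why the ranking step is kept separate from the scoring step.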
  • 7. Social Network Characterization
       Two types of social information characterize users:
       - explicit information: static profiles
       - implicit information: dynamic social activities
       Social network users can perform several activities and publish informative material, which we call resources.
       The idea is to collect evidence of expertise from multiple resources associated with a candidate.
  • 8. Resources Levels
       Resources are related to the user through a path in the graph. We consider resources connected to a user through a path of length ≤ 2.
       [Figure: example graph for user ALICE with nodes Post@08.00, Post@08.05, Post@09.00, Post@09.05, Post@09.10 and a Facebook Group; edge labels: owns, creates, annotates (likes), relatesTo (belongs), contains; columns: Level 0, Level 1, Level 2]
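Collecting the resources within a bounded path length can be sketched as a breadth-first search. The graph below is hypothetical: its edges are only loosely inspired by the slide's figure, and using the hop count from the user as the resource's distance is an assumption, since the slide does not spell out how hops map to its Level 0/1/2 labels.

```python
from collections import deque
from typing import Dict, List


def resources_within(graph: Dict[str, List[str]], user: str, max_hops: int = 2) -> Dict[str, int]:
    """Map every node reachable from `user` in at most `max_hops` hops to its hop distance."""
    dist = {user: 0}
    queue = deque([user])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the allowed path length
        for neighbour in graph.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    dist.pop(user)  # the user is not a resource
    return dist


# Hypothetical adjacency list (edge types such as owns/creates/contains are omitted).
graph = {
    "alice": ["post_0900", "post_0905", "post_0910", "group"],
    "group": ["post_0800", "post_0805"],
    "post_0800": ["comment_x"],  # three hops from alice, so it is excluded
}
reachable = resources_within(graph, "alice")
```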
  • 10. Analysis
       Resources have to be analyzed to infer expertise information.
       [Pipeline diagram: Crawling (API), URL Extraction, Text Preprocessing, Language Detection, Named Entity Extraction]
       - Crawling: extraction of the resources' textual content through the social networks' APIs.
       - URL Extraction: extract the content of any linked external websites and append it to the resource's text.
       - Language Detection: mixing different languages in the same index is not recommended in information retrieval systems, so we detect the language of each resource.
       - Named Entity Extraction: extraction of entities such as people, cities and movies.
       - Text Preprocessing: we remove stop-words (common words), filter out HTML tags and perform stemming.
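The text preprocessing step (HTML tag filtering, stop-word removal, stemming) can be sketched with the standard library alone. This is an illustration under stated assumptions: the stop-word list is a tiny invented sample, and `crude_stem` is a deliberately simplistic suffix stripper standing in for a real stemmer, which the slides do not name. Language detection and named entity extraction need external tools and are not shown.

```python
import html
import re
from typing import List

# Tiny illustrative stop-word list; a real system would use a full per-language list.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are", "for", "on", "with"}
TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def crude_stem(token: str) -> str:
    """Toy stemmer: strip one common English suffix, keeping at least 3 characters."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> List[str]:
    """Strip HTML tags, lowercase, tokenize, drop stop-words, then stem."""
    text = TAG_RE.sub(" ", text)
    text = html.unescape(text)
    tokens = TOKEN_RE.findall(text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


tokens = preprocess("<p>Listing the best restaurants in Milan</p>")
```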
  • 11. Model 1: Resource Based
       [Diagram: Query, Resource, Candidate; edge labels score(q, r) and weight(r, c)]
       The model is based on resources, treated as documents in a classic Vector Space Model. Resources are represented both as term vectors and as entity-id vectors.
       1 First, the similarity between the query and each resource is computed:
         $$score(q, r) = \alpha \sum_{t \in q} tf(t, r) \cdot idf(t)^2 + \beta \sum_{e \in q} tf(e, r) \cdot idf(e)^2 \cdot eConf(e, r)$$
       2 Then, the users related to the best resources are extracted as possible experts:
         $$score(q, ce) = \sum_{r_i \in S(R)} \frac{score(q, r_i)}{\max_{r_j \in S(R)} score(q, r_j)} \cdot weight(r_i, ce)$$
       Varying $\alpha$ and $\beta$, we obtain three matching methods:
       - Mixed: $\alpha > 0$, $\beta > 0$
       - TextOnly: $\alpha = 1$, $\beta = 0$
       - EntityOnly: $\alpha = 0$, $\beta = 1$
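The two steps of the resource-based model can be sketched as below. Several details are assumptions not given in the slides: the exact idf formula (a smoothed logarithmic variant is used here), the use of raw counts for tf, a default entity confidence of 1.0, and the dictionary layout of a resource (`terms`, `entities`, `econf`, `weights`). The window parameter plays the role of the top-resources set S(R).

```python
import math
from collections import Counter
from typing import Dict, List, Set


def idf(item: str, docs: List[Set[str]]) -> float:
    """Assumed idf variant: 1 + ln(N / (1 + df)). The slides do not define idf."""
    df = sum(1 for d in docs if item in d)
    return 1.0 + math.log(len(docs) / (1 + df))


def score_resource(q_terms, q_entities, res, term_docs, entity_docs, alpha, beta) -> float:
    """score(q, r): weighted sum of a term part and an entity part."""
    tf_t, tf_e = Counter(res["terms"]), Counter(res["entities"])
    text = sum(tf_t[t] * idf(t, term_docs) ** 2 for t in q_terms)
    ent = sum(
        tf_e[e] * idf(e, entity_docs) ** 2 * res.get("econf", {}).get(e, 1.0)
        for e in q_entities
    )
    return alpha * text + beta * ent


def score_candidates(q_terms, q_entities, resources, window=50, alpha=1.0, beta=2.0) -> Dict[str, float]:
    """score(q, ce): sum of max-normalized resource scores times weight(r, ce)."""
    term_docs = [set(r["terms"]) for r in resources]
    entity_docs = [set(r["entities"]) for r in resources]
    scored = [
        (score_resource(q_terms, q_entities, r, term_docs, entity_docs, alpha, beta), r)
        for r in resources
    ]
    scored = sorted(scored, key=lambda p: p[0], reverse=True)[:window]
    totals: Dict[str, float] = {}
    if not scored or scored[0][0] <= 0:
        return totals
    top = scored[0][0]
    for s, r in scored:
        for ce, w in r["weights"].items():
            totals[ce] = totals.get(ce, 0.0) + (s / top) * w
    return totals


# Invented resources, for illustration only.
resources = [
    {"terms": ["php", "string", "length"], "entities": ["PHP"],
     "econf": {"PHP": 1.5}, "weights": {"alice": 1.0}},
    {"terms": ["milan", "restaurant"], "entities": ["Milan"], "weights": {"bob": 1.0}},
    {"terms": ["php"], "entities": [], "weights": {"carol": 0.2}},
]
totals = score_candidates(["php", "string"], ["PHP"], resources)
```

Setting `beta=0` or `alpha=0` in the call reproduces the TextOnly and EntityOnly matching methods respectively.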
  • 12. Model 2: User Based
       [Diagram: Query, Domain, Candidate Expertise, Resource, Entity; edge labels score(q, ce), s(d, ce), s(d, r), s(d, e)]
       We refer to about 70 Freebase domains such as sports, location, education, book, comics, videogames, tv.
       For each entity $e$ in a resource, a score $s(d, e)$ is computed, denoting how much the entity is related to a domain of expertise $d$:
         $$s(d, e) = \frac{\sum_{j \in I(d)} \frac{1}{\log_2(1+j)}}{\sum_{i=1}^{v} \frac{1}{\log_2(1+i)}}$$
       Then a similar score $s(d, r)$ is computed for each resource, given all the entities in the resource related to the domain $d$:
         $$s(d, r) = \sum_{e \in E(r)} s(d, e) \cdot rel(e, r)$$
       where $rel(e, r)$ is a measure of the relevance of the entity in the resource.
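The two domain scores can be sketched as follows. The slides do not define I(d) or v, so this sketch rests on an assumed reading: each entity comes with a ranked list of v domain labels, and I(d) is the set of 1-based rank positions at which domain d appears. Under that reading the score is a rank-discounted share, and the shares of all domains sum to 1.

```python
import math
from typing import Dict, List


def domain_score_entity(ranked_domains: List[str], domain: str) -> float:
    """s(d, e) under the assumed reading: discounted share of ranks labelled `domain`."""
    v = len(ranked_domains)
    if v == 0:
        return 0.0
    norm = sum(1.0 / math.log2(1 + i) for i in range(1, v + 1))
    hit = sum(
        1.0 / math.log2(1 + j)
        for j, d in enumerate(ranked_domains, start=1)
        if d == domain
    )
    return hit / norm


def domain_score_resource(entities: List[Dict], domain: str) -> float:
    """s(d, r): sum over the resource's entities of s(d, e) * rel(e, r)."""
    return sum(
        domain_score_entity(e["ranked_domains"], domain) * e["rel"] for e in entities
    )


# Invented entity annotations, for illustration only.
entity = ["sports", "sports", "music"]
s_sports = domain_score_entity(entity, "sports")
s_music = domain_score_entity(entity, "music")
resource = [
    {"ranked_domains": entity, "rel": 0.5},
    {"ranked_domains": ["music"], "rel": 1.0},
]
r_music = domain_score_resource(resource, "music")
```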
  • 13. Model 2: User Based - User/Domain Matrix
       Finally, the score $s(d, ce)$ is computed for each candidate expert-domain pair, to build a model of the users as a $CE \times D$ matrix:
         $$s(d, ce) = \frac{\sum_{r \in S(R, ce)} weight(r, ce) \cdot s(d, r)}{\sum_{r \in S(R, ce)} weight(r, ce)}$$

                               Sport   Music   TV      Education   Movies   ...
       Candidate Expert 1      .033    .012    .068    .037        .034     ...
       Candidate Expert 2      .057    .056    .000    .019        .018     ...
       Candidate Expert 3      .086    .044    .000    .059        .074     ...
       ...                     ...     ...     ...     ...         ...      ...

       For each query, $s(d, q)$ is computed in the same way as for resources.
       Looking at the matrix of expertise, the score for a user is computed as:
         $$score(q, ce) = expertise(q) \bullet expertise(ce) = \sum_{d \in D(q)} s(d, q) \cdot s(d, ce)$$
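Building a row of the user/domain matrix and matching a query against it can be sketched as below. The matrix rows reuse the three candidates and the first two domain columns shown on the slide; the query vector and the per-resource scores are invented for the example, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_domain_scores(resources: List[Tuple[float, Dict[str, float]]]) -> Dict[str, float]:
    """s(d, ce): weighted average of s(d, r) over the candidate's resources.

    `resources` is a list of (weight(r, ce), {domain: s(d, r)}) pairs.
    """
    total_weight = sum(w for w, _ in resources)
    if total_weight == 0:
        return {}
    acc: Dict[str, float] = {}
    for w, domain_scores in resources:
        for d, s in domain_scores.items():
            acc[d] = acc.get(d, 0.0) + w * s
    return {d: v / total_weight for d, v in acc.items()}


def match(query_domains: Dict[str, float], candidate_domains: Dict[str, float]) -> float:
    """score(q, ce): dot product restricted to the domains of the query."""
    return sum(s * candidate_domains.get(d, 0.0) for d, s in query_domains.items())


# Rows taken from the slide's example matrix (Sport and Music columns only).
matrix = {
    "Candidate Expert 1": {"Sport": 0.033, "Music": 0.012},
    "Candidate Expert 2": {"Sport": 0.057, "Music": 0.056},
    "Candidate Expert 3": {"Sport": 0.086, "Music": 0.044},
}
query = {"Sport": 0.7, "Music": 0.3}  # hypothetical s(d, q) values
ranked = sorted(matrix, key=lambda ce: match(query, matrix[ce]), reverse=True)

# A row built from two invented resources, one of them down-weighted to 0.2.
row = candidate_domain_scores([(1.0, {"Sport": 0.4}), (0.2, {"Sport": 0.1, "Music": 0.6})])
```

Because the matrix is built ahead of time, answering a query only requires one dot product per candidate.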
  • 15. Experimental Setup
       Dataset built through a recruitment campaign:

                              Facebook   Twitter   LinkedIn
       #Users                       39        23         28
       #English Resources      107,956    33,022     11,486
       #Italian Resources      124,537    14,038      4,133
       #Total Resources        232,493    47,060     15,619

       Test suite of 30 information needs, or queries, involving various domains:
       - Which php function can I use to obtain the length of a string?
       - Can you list some restaurant in Milan?
       Ground truth: graded relevance judgments of users' expertise are obtained from the users themselves through an online questionnaire.
  • 16. Tests - Resource based configurations comparison

       Model type       Level   Matching      MAP     MRR     NDCG    NDCG@10
       Resource Based   0       text only     .2034   .6264   .2963   .3183
                        0       entity only   .0454   .2500   .0731   .0821
                        0       mixed         .2026   .6014   .2832   .3020
                        1       text only     .3330   .8048   .4348   .4542
                        1       entity only   .2767   .8050   .3807   .4059
                        1       mixed         .3150   .8000   .4272   .4335
                        2       text only     .2932   .8111   .4338   .4448
                        2       entity only   .3363   .8122   .4485   .4292
                        2       mixed         .3245   .8444   .4454   .4581

       The data shown in the table were obtained considering:
       - English resources
       - as relevant users, those above the average for each query
       - entityConf(e, r) = 1 + tagMeScore(e, r)
       - the top 50 resources
       - for the mixed matching method: α = 1, β = 2
       - weight(e, r) = 1 ∀r ∈ Lv0, Lv1; weight(e, r) = 0.2 ∀r ∈ Lv2
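The metrics in the table can be computed per query as sketched below; MAP and MRR are then the means of the per-query average precision and reciprocal rank. The slides do not give the exact formulas used, so the NDCG here assumes linear gain with a log2 discount, and the relevant set is derived with the "above the average" rule stated in the notes. The grades are invented.

```python
import math
from typing import Dict, List, Optional, Set


def relevant_set(grades: Dict[str, float]) -> Set[str]:
    """Users whose graded relevance is above the per-query average."""
    avg = sum(grades.values()) / len(grades)
    return {u for u, g in grades.items() if g > avg}


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    hits, total = 0, 0.0
    for i, user in enumerate(ranking, start=1):
        if user in relevant:
            hits += 1
            total += hits / i  # precision at each relevant position
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranking: List[str], relevant: Set[str]) -> float:
    for i, user in enumerate(ranking, start=1):
        if user in relevant:
            return 1.0 / i
    return 0.0


def ndcg(ranking: List[str], grades: Dict[str, float], k: Optional[int] = None) -> float:
    """NDCG with linear gain and log2 discount (an assumed variant)."""
    k = len(ranking) if k is None else k
    dcg = sum(grades.get(u, 0) / math.log2(i + 1) for i, u in enumerate(ranking[:k], start=1))
    ideal = sorted(grades.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Invented self-assessed grades for one query.
grades = {"a": 3, "b": 0, "c": 2, "d": 1}
relevant = relevant_set(grades)           # average grade is 1.5
system_ranking = ["b", "a", "c", "d"]
ap = average_precision(system_ranking, relevant)
rr = reciprocal_rank(system_ranking, relevant)
```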
  • 17. Tests - Resources window
       Another experiment was made by varying the number of resources considered in the score. We call that number the window size.
       For simplicity, we only considered the Lv2-Mixed and Lv1-TextOnly configurations.
       Considering more resources improves system quality up to about 3-4%. After that, the curves stabilize: increasing the window size does not lead to significantly better results.
  • 18. Tests - User based

       Model type   Level   MAP     MRR     NDCG    NDCG@10
       User Based   0       .3685   .7603   .4907   .4332
                    1       .3546   .7306   .4990   .4526
                    2       .3424   .8178   .4770   .4288
       Table: Overall comparison, user based

       The data shown in the table were obtained considering:
       - English resources
       - as relevant users, those above the average for each query
       - the top 20 users
       - weight(e, r) = 1 ∀r ∈ Lv0, Lv1; weight(e, r) = 0.2 ∀r ∈ Lv2
  • 19. Tests - User/Resource based models comparison
       The two models presented are evaluated in terms of result quality and performance.
       We considered the best configuration for both: Lv2-Mixed for the resource based model and Lv1 for the user based model.
  • 20. Tests - User/Resource based models comparison
       The index size is shown on a logarithmic scale: indexing expertise as a pre-built user-domain matrix provides evident advantages.
       For the resource based model, the query time is linear in the window size, while it is constant for the user based one.
  • 21. Tests - Verticalization
       An additional, interesting experiment considers only the resources of a single domain and channel.
       For simplicity, we only considered the Lv2-Mixed configuration, with the window size fixed to 50.

       Table: MAP
       Domain          Facebook   Twitter   LinkedIn
       computer eng.   .2112      .5858     .4472
       location        .1852      .3549     .2033
       movies & tv     .2794      .4296     .1578
       music           .2868      .4229     .2672
       science         .1827      .4260     .3827
       sport           .2856      .4225     .1933
       tech. & games   .2297      .4186     .2052
       All domains     .2526      .4296     .2670

       Table: NDCG@10
       Domain          Facebook   Twitter   LinkedIn
       computer eng.   .5038      .7014     .4904
       location        .4423      .4172     .3517
       movies & tv     .4460      .4960     .2028
       music           .3957      .4631     .4226
       science         .3004      .4366     .4977
       sport           .5497      .4092     .3298
       tech. & games   .3641      .4545     .2352
       All domains     .4415      .4791     .3473
  • 23. Conclusions
       - We classified resources into two main classes: static resources and dynamic resources.
       - We adopted and extended two expert-finding models.
       - The analysis of social activities can help to better characterize the expertise of users.
       - The adoption of multiple social networks can greatly improve the representation of a user for expert finding purposes but, for specific domains, it is better to focus on a single platform.
  • 24. Open questions
       - Exploiting the social graph to improve expert retrieval.
       - Domain-specific queries require a less general approach. Example: geolocated queries.
  • 25. Questions & Answers