Maximizing the Diversity of Exposure
in a Social Network
Cigdem Aslay
Helsinki Algorithms Seminar
October 4, 2018
Maximizing the Diversity of Exposure in a Social Network
C. Aslay, A. Matakos, E. Galbrun, and A. Gionis. IEEE ICDM 2018.
https://arxiv.org/pdf/1809.04393.pdf
Outline
• Motivation
• Algorithmic Personalization and Filter Bubbles
• Information Propagation in Online Social Networks
• Diversity Exposure Maximization Problem
• Scalable Approximation Algorithm
• Experimental Results
• Future Work and Open Problems
Selective Exposure in Online Social Networks
• Online social networking platforms are “relevance maximizers”
• Relevant (=biased) content recommendation
• Relevant (=biased) posts from friends in social feed
• Content different from your viewpoint is less likely to reach you
Filter bubble*: lack of exposure to diverse viewpoints
resulting from algorithmic personalisation
*The term was coined by internet activist Eli Pariser in 2010.
Image from Garimella et al., KDD 2018 Tutorial on Polarization
“Filter bubbles are a serious problem with news.”
Bill Gates, 21 February 2017
“The internet has exacerbated phenomenon of
people having conversations in their own silos.”
“If you’re liberal, then you’re on MSNBC. If you’re a
conservative, you’re on Fox News.”
Barack Obama, 24 April 2017
“The two most discussed concerns this past year
were about diversity of viewpoints we see (filter
bubbles) and accuracy of information (fake news).”
Mark Zuckerberg, 16 February 2017
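Information Propagation in Online Social Networks
• People are connected (friends, fans, followers, ...)
• They perform actions (post, like, retweet, ...)
• Actions propagate through the network like a virus
[Figure: example of a post ("nice read", "indeed!") propagating between users between 09:00 and 09:30]
[Figure: Polarization; Filter Bubbles]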
Bursting Filter Bubbles
• Goal: we want users to be exposed to diverse content
• A user’s diversity exposure level depends on her political
leaning and the political leaning of the articles she consumes
• How: recommend articles to users of a social network
• Articles may be shared among users, creating possible
information cascades
Information propagation model
defined on the social graph
Bursting Filter Bubbles
Independent Cascade (IC) Model
• For each article i, each arc (u,v) is associated
with a propagation probability $p^i_{u,v}$
• A node u activated at time t on article i tries to
activate each inactive neighbour v, succeeding
with probability $p^i_{u,v}$
‣ Recommend articles matching users’ predisposition?
• Ensures higher spread but yields a minimal increase in diversity
‣ Recommend articles radically opposed to users’ predisposition?
• High local diversity but hinders the spread of the articles
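The Independent Cascade model described on this slide can be sketched for a single article as follows. This is a minimal illustration, not the authors' implementation: the function name `simulate_ic`, the adjacency format, and the toy graph are assumptions made here.

```python
import random
from collections import deque


def simulate_ic(out_edges, seeds, rng):
    """Run one Independent Cascade simulation for a single article.

    out_edges maps a node u to a list of (v, p) pairs, where p is the
    propagation probability of arc (u, v) for this article (hypothetical format).
    Returns the set of nodes activated (exposed) in this run.
    """
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v, p in out_edges.get(u, []):
            # A newly activated u gets a single chance on each inactive neighbour v.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active


# Toy graph with deterministic probabilities so the outcome is predictable.
toy_graph = {"a": [("b", 1.0), ("c", 0.0)], "b": [("d", 1.0)]}
exposed = simulate_ic(toy_graph, ["a"], random.Random(0))
print(sorted(exposed))  # ['a', 'b', 'd']
```

Averaging the outcome of many such runs gives a Monte Carlo estimate of how far an article spreads from the users it was recommended to.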
Diversity Exposure Maximization
• Given
• directed social graph G = (V,E)
• users’ leaning scores s(v), defined in [-1,1]
• set I of articles, each with leaning score s(i), defined in [-1,1]
• IC propagation parameters for each article
• users’ attention bound $k_v > 0$
• total assignment size constraint k > 0
• Find a feasible assignment A of items to users that has the maximum
expected diversity exposure score E[F(A)]:
$$\sum_{v \in V} \left( \max_{i \in E(v)} \{s(i), s(v)\} - \min_{i \in E(v)} \{s(i), s(v)\} \right)$$
E(v) : expected set of items that v is exposed to resulting from assignment A
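For one fixed outcome of the propagation, the score above is the sum over users of the range spanned by the user's own leaning and the leanings of the items she was exposed to. A minimal sketch, in which the function names and the dictionary-based inputs are illustrative assumptions:

```python
def exposure_level(user_leaning, item_leanings):
    """Range spanned by the user's own leaning and the leanings of exposed items."""
    values = [user_leaning, *item_leanings]
    return max(values) - min(values)


def diversity_exposure_score(user_leaning, item_leaning, exposed):
    """Sum of exposure levels over all users for one fixed exposure outcome.

    exposed maps a user to the set of items that reached her (hypothetical format);
    users missing from the mapping are treated as exposed to nothing.
    """
    return sum(
        exposure_level(user_leaning[v], [item_leaning[i] for i in exposed.get(v, ())])
        for v in user_leaning
    )


user_leaning = {"u1": -0.5, "u2": 0.75}
item_leaning = {"left": -1.0, "right": 1.0}
exposed = {"u1": {"left", "right"}, "u2": set()}
score = diversity_exposure_score(user_leaning, item_leaning, exposed)
print(score)  # 2.0: u1 spans [-1, 1], u2 sees nothing and contributes 0
```

A user who is exposed to no item contributes 0, and the maximum contribution per user is 2, reached when she sees items at both ends of the spectrum.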
Theoretical Analysis
• Diversity exposure score is monotone and submodular
• Monotonicity: expected diversity exposure score cannot decrease as
the assignment size increases
• Submodularity: marginal increase in expected diversity exposure score
shrinks as the assignment size increases
• Diversity exposure maximization is NP-Hard
• Reduction from the NP-Hard influence maximization problem (= select
k nodes that maximize expected spread)
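• Restricted special case with one article i s.t. |s(i) - s(v)| = 1 for every user v
(each exposed user then contributes exactly 1, so the score equals the expected spread)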
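Stated formally, the two properties on this slide read as follows, where $\sigma(A)$ is shorthand introduced here for $E[F(A)]$, $A \subseteq B$ are assignments, and $x \notin B$ is a (user,article) pair:

```latex
\begin{align*}
\text{Monotonicity:}\quad  & \sigma(A) \le \sigma(B), \\
\text{Submodularity:}\quad & \sigma(A \cup \{x\}) - \sigma(A) \;\ge\; \sigma(B \cup \{x\}) - \sigma(B).
\end{align*}
```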
Theoretical Analysis
• The family of feasible solutions forms a matroid defined on the ground set
of (user,article) pairs
• (Matroid: structure that abstracts and generalizes the notion of
linear independence in vector spaces)
• Assignment size constraint: uniform matroid
• User attention bound constraint: partition matroid
• Intersection of these two matroids: still a matroid (a truncated partition
matroid), although matroid intersections are not matroids in general
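The two constraints amount to a simple independence test on a set of (user,article) pairs. A minimal sketch, assuming the assignment is a Python set of pairs and the attention bounds are given as a dictionary (both are choices made here for illustration):

```python
from collections import Counter


def is_feasible(assignment, attention_bound, k):
    """Check both constraints on a set of (user, article) pairs.

    Uniform matroid: at most k pairs in total.
    Partition matroid: at most attention_bound[user] pairs per user.
    """
    if len(assignment) > k:
        return False
    per_user = Counter(user for user, _ in assignment)
    return all(count <= attention_bound[user] for user, count in per_user.items())


bounds = {"u1": 1, "u2": 2}
print(is_feasible({("u1", "a")}, bounds, k=2))               # True
print(is_feasible({("u1", "a"), ("u1", "b")}, bounds, k=2))  # False: u1 exceeds its bound
```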
Theoretical Analysis
• Monotone submodular function maximization subject to a matroid
constraint
• Greedy algorithm provides a 1/2-approximation*
• Select the feasible (user,article) pair giving the highest increase in
overall diversity-exposure score at each iteration
• Requires checking reachability for each article, and computing the expected score exactly is #P-hard!
• Use r MC simulations at each iteration: O(n * m * k * |I|^2 * r)
• Extend recently developed techniques for scalable influence maximization
to solve a more general problem
* Fisher et al., "An analysis of approximations for maximizing submodular set functions", Polyhedral Combinatorics, 1978.
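The greedy selection rule can be sketched generically as below. The objective and feasibility test are passed in as functions; the toy coverage objective and the "one article per user, two pairs in total" constraint are stand-ins chosen here for illustration, not the diversity-exposure score itself.

```python
def greedy(ground_set, objective, is_feasible):
    """Repeatedly add the feasible pair with the largest marginal gain.

    Stops when no feasible pair yields a positive gain, which is a safe
    stopping rule for a monotone objective.
    """
    chosen = set()
    while True:
        base = objective(chosen)
        best_pair, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for pair in sorted(ground_set - chosen):
            candidate = chosen | {pair}
            if not is_feasible(candidate):
                continue
            gain = objective(candidate) - base
            if gain > best_gain:
                best_pair, best_gain = pair, gain
        if best_pair is None:
            return chosen
        chosen.add(best_pair)


# Toy monotone submodular objective: number of elements covered (illustrative data).
covers = {("u1", "a"): {1, 2, 3}, ("u1", "b"): {3, 4}, ("u2", "a"): {4, 5}}


def coverage(chosen):
    return len(set().union(*(covers[p] for p in chosen)))


def feasible(chosen):
    # At most two pairs overall and at most one article per user.
    return len(chosen) <= 2 and len({u for u, _ in chosen}) == len(chosen)


result = greedy(set(covers), coverage, feasible)
print(sorted(result))  # [('u1', 'a'), ('u2', 'a')]
```

In the setting of the talk, each call to the objective would itself require Monte Carlo simulation, which is what makes this plain greedy expensive.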
Scalable Approximation
• Possible worlds model: G as a random directed edge-coloured multigraph
• Multiplicity of each edge: |I|
• Color-reachability: reachability only over edges of the same colour
G = (V,E,p), g ~ G
$$\Pr(g) = \prod_{i \in I} \prod_{(u,v)_i \in g} p^i_{uv} \prod_{(u,v)_i \in E \setminus g} \left(1 - p^i_{uv}\right)$$
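As a small worked instance of the product above: every coloured edge contributes its probability if it is kept in g and one minus its probability otherwise. The edge encoding as `(u, v, colour)` triples and the function name are assumptions made for this sketch.

```python
def world_probability(edge_probs, live_edges):
    """Probability of the possible world whose kept edges are exactly live_edges.

    edge_probs maps a coloured edge (u, v, colour) to its propagation probability.
    """
    prob = 1.0
    for edge, p in edge_probs.items():
        prob *= p if edge in live_edges else 1.0 - p
    return prob


edge_probs = {("a", "b", "red"): 0.5, ("a", "b", "blue"): 0.25}
print(world_probability(edge_probs, {("a", "b", "red")}))  # 0.5 * (1 - 0.25) = 0.375
```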
Scalable Approximation
• Generalize the reverse-reachability notion of influence maximization*
* Borgs et al., "Maximizing social influence in nearly optimal time.", SODA 2014.
Random Reverse Co-exposure Sets:
• Sample a possible world g from G: remove every edge $(u,v)_i$ with
probability $1 - p^i_{uv}$
• Pick a target node v from G uniformly at random
• RC-set of v, $R_v$ = {(user,article) pairs that can color-reach v via out-links in
g}
[Figure: example possible world with target node v and nodes u1, ..., u5 connected by blue and red edges]
$R_v$ = {(u1, blue), (u1, red), (u2, blue), (u2, red), (u3, blue), (u4, red), (u5, red)}
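One way to generate such a set without materialising the whole possible world is a reverse breadth-first search per colour that flips each incoming edge only when it is first examined. The sketch below assumes a hypothetical `in_edges[colour][v]` adjacency format and, following the slide's example, does not add the target's own pairs to the set.

```python
import random
from collections import deque


def rc_set_for(target, in_edges, articles, rng):
    """Collect (user, article) pairs that colour-reach target in a sampled world.

    in_edges[i][x] lists (u, p) for every arc (u, x) with probability p on article i.
    Each edge is kept with probability p, i.e. removed with probability 1 - p.
    """
    rc_set = set()
    for i in articles:
        visited = {target}
        queue = deque([target])
        while queue:
            x = queue.popleft()
            for u, p in in_edges.get(i, {}).get(x, []):
                if u not in visited and rng.random() < p:
                    visited.add(u)
                    rc_set.add((u, i))
                    queue.append(u)
    return rc_set


def random_rc_set(nodes, in_edges, articles, rng):
    """Pick a target uniformly at random and return it with its RC-set."""
    target = rng.choice(nodes)
    return target, rc_set_for(target, in_edges, articles, rng)


# Deterministic toy example: probabilities are 0 or 1.
in_edges = {
    "blue": {"v": [("u1", 1.0)], "u1": [("u2", 1.0)]},
    "red": {"v": [("u1", 1.0), ("u3", 0.0)]},
}
rc = rc_set_for("v", in_edges, ["blue", "red"], random.Random(0))
print(sorted(rc))  # [('u1', 'blue'), ('u1', 'red'), ('u2', 'blue')]
```

Sampling edges lazily is equivalent to sampling the full world first, because each edge is flipped at most once and unexplored edges never affect the result.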
Random Reverse Co-exposure Sets
• Unbiased estimation from the weighted frequency of pairs appearing in
a sample of random RC-sets:
• Weight of A on a random RC-set Rv = diversity exposure level of v
resulting from the pairs in A ∩ Rv
• Expected diversity exposure score E[F(A)] of A = n * expected
weight of A on a random Rv
• Estimate E[F(A)] by estimating the total weight w(A) of A on a
random sample of RC-sets
• A (user,article) pair that has high weight in a sample of random RC-sets
would provide high diversity exposure
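The estimator described above can be sketched as follows: compute the weight of A on each sampled RC-set, average, and scale by n. The data layout (a list of `(target, rc_set)` tuples and leaning dictionaries) is an assumption made for this illustration.

```python
def weight_on_rc_set(assignment, target, rc_set, user_leaning, item_leaning):
    """Diversity exposure level of target resulting from the pairs in A ∩ R_v."""
    reaching = [item_leaning[i] for (u, i) in assignment if (u, i) in rc_set]
    values = [user_leaning[target], *reaching]
    return max(values) - min(values)


def estimate_score(assignment, rc_samples, user_leaning, item_leaning):
    """Estimate E[F(A)] as n times the average weight of A over the sampled RC-sets."""
    n = len(user_leaning)
    total = sum(
        weight_on_rc_set(assignment, target, rc_set, user_leaning, item_leaning)
        for target, rc_set in rc_samples
    )
    return n * total / len(rc_samples)


user_leaning = {"v": 0.0, "u1": -1.0}
item_leaning = {"blue": -1.0, "red": 1.0}
rc_samples = [("v", {("u1", "red")}), ("u1", set())]
assignment = {("u1", "red")}
estimate = estimate_score(assignment, rc_samples, user_leaning, item_leaning)
print(estimate)  # 2 * (1.0 + 0.0) / 2 = 1.0
```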
Two-Phase Iterative Diversity-Exposure
Maximization (TDEM)
• Generate a sample of random RC-sets
• Apply greedy to find an assignment of size k that has the maximum
estimated weight on the random sample of RC-sets
• How many random RC-sets are enough?
• We want an approximate greedy solution $\tilde{A}_G$ s.t. w.p. at least $1 - \delta$
$$E[F(\tilde{A}_G)] \ge \left(\frac{1}{2} - \epsilon\right) \cdot OPT$$
• So we want to have
$$\Pr\left[\, \left|E[F(A)] - n \cdot w(A)\right| \ge \frac{\epsilon}{2} \cdot OPT \,\right] \le \frac{\delta}{\binom{nh}{k}}$$
Sample size is a function of OPT!
Two-Phase Iterative Diversity-Exposure
Maximization (TDEM)
Phase 1: Parameter Estimation
Determination of Sample Size
• Requires the value of OPT, which is unknown and NP-hard to compute
• Estimate a tight lower bound LB on OPT
• Perform a statistical test* B(x) on $O(\log_2 n + 1)$ values of x = n, n/2, …, 2
• If OPT < x, B(x) = false w.h.p.
• Adaptively sample $\theta_x$ random RC-sets until the stopping condition, i.e.,
B(x) = true, is satisfied
• Compute the lower bound on the sample size using LB = x
* Tang et al., "Influence maximization in near-linear time: A martingale approach.", SIGMOD 2015.
Two-Phase Iterative Diversity-Exposure
Maximization (TDEM)
• Derive the lower bound on the sample size, replacing OPT with LB
• Discard the previously generated $\theta_x$ RC-sets? No!
• For each possible assignment and a sequence of random RC-sets
$R_1, R_2, \ldots$, define $M_1, M_2, \ldots$ where
$$M_j = \sum_{z=1}^{j} (w_z - w)$$
with $w_z$ the weight of the assignment on $R_z$ and $w$ its expected value
• Show that $M_1, M_2, \ldots$ is a martingale, i.e., $E[M_j \mid M_1, \ldots, M_{j-1}] = M_{j-1}$
• Use martingale inequalities to find a lower bound on the sample size
• No independence assumption is needed, which allows re-using RC-sets
and improves run-time
Two-Phase Iterative Diversity-Exposure
Maximization (TDEM)
[Pipeline: Phase 1: Parameter Estimation → Phase 2: Pair Selection → $\tilde{A}_G$]
• Running time linear in the total size of the RC-sets sample!
• Run-time analysis based on an “almost” weighted maximum
coverage problem
• Competitive with running IMM* for the restricted special case
where |I| = 1 and |s(i) - s(v)| = 1
* Tang et al., "Influence maximization in near-linear time: A martingale approach.", SIGMOD 2015.
Experiments
• Twitter Datasets*
* Garimella et al., "Balancing information exposure in social networks", NIPS 2017.
• Node leanings via estimated probabilities of users to retweet content from
either of the opposing sides
• Leaning-aware influence parameters
• Leanings of 25 items distributed between -1 and 1
Experiments
Algorithms tested
https://github.com/aslayci/TDEM
• TDEM : our algorithm
• FAR : recommends articles to high-degree nodes opposing
their predisposition
• CLOSE : recommends articles to high-degree nodes
matching their predisposition
• WEIGHT : recommends articles based on highest
degree(u) × |s(u) − s(i)|
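The WEIGHT baseline's ranking rule can be sketched as below. The slide gives only the scoring expression, so the function name, the tie-breaking order, and the fact that this sketch ignores per-user attention bounds are all assumptions made here.

```python
def weight_heuristic(degree, user_leaning, item_leaning, k):
    """Return the k (user, article) pairs with the highest degree(u) * |s(u) - s(i)|."""
    scored = [
        (degree[u] * abs(user_leaning[u] - item_leaning[i]), u, i)
        for u in user_leaning
        for i in item_leaning
    ]
    # Highest score first; ties broken by user name, then article name.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(u, i) for _, u, i in scored[:k]]


degree = {"u1": 10, "u2": 3}
user_leaning = {"u1": 0.5, "u2": -1.0}
item_leaning = {"left": -1.0, "right": 1.0}
picked = weight_heuristic(degree, user_leaning, item_leaning, k=2)
print(picked)  # [('u1', 'left'), ('u2', 'right')]
```

The rule favours well-connected users and articles far from their leaning, but it does not account for whether the article will actually spread beyond the recommended user.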
Experiments
Results
• At least 50% gain in expected diversity exposure over the best-performing
degree heuristic, sometimes reaching up to 90%!
Future Work
• Leaning-aware information propagation models defined over
a multi-dimensional political spectrum
• Diversity-exposure measures defined on refined leaning models
• Adaptation of scalable approximation algorithms to new scoring
functions (possibly non-monotone and non-submodular)
• Objective political advertising mechanisms
Thank you!
