A Network Pruning Based Approach for
Subset-Specific Influential Detection
Praphul Chandra, Arun Kalyanasundaram
Hewlett Packard Labs, Bangalore, India
ACM Web Science 2012
Can they really influence our decisions?
Who else can influence our decisions?
How do we exploit this spread of influence?
Viral Marketing
Influential Detection
• Identify a set of nodes (or individuals) to seed with some
information so as to maximize the spread of the seeded
information in the network. [Domingos, et al. 2001][1]
Influential Detection - Other Applications
Water Distribution Networks [Leskovec, et al. 2007][2]
Preventing the spread of diseases [Christakis, Fowler 2007][3]
Influential Detection - A simple heuristic
[Figure: example network on nodes a to g with the most influential node highlighted.]
Finding the most influential node using the highest-degree heuristic.
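
The highest-degree heuristic can be sketched in a few lines. This is a minimal illustration, assuming an undirected network given as an edge list; the function name `highest_degree_node` and the toy edge list are hypothetical and are not taken from the slide's figure.

```python
from collections import defaultdict


def highest_degree_node(edges):
    """Return the node with the most neighbours in an undirected edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(neighbours), key=lambda n: len(neighbours[n]))


# Hypothetical toy network (an assumption, not the figure's actual edges).
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f"), ("e", "g")]
print(highest_degree_node(toy_edges))
```

The heuristic ignores edge probabilities and overlap between neighbourhoods, which is why it is only a baseline.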
Our Problem - Subset Specific Influential Detection
• Aim : Maximize the spread of influence on a subset of nodes in
the network instead of the whole network.
Subset Specific Influential Detection - Examples
• Small businesses: locality-based marketing
• Political campaigns: focus on supporters / detractors
• Targeted advertisements: demographics (nationality, age, gender, etc.)
Subset Specific Influential Detection - Our Motivation
• Increase in size / density of networks.
• Opportunity to improve the efficiency of traditional approaches.
• Current state of the art “adapts” existing algorithms on influential
detection to the subset specific version. [Kempe, et al. 2003][4]
[Aggarwal, et al. 2011][5]
• We address subset-specific top-k influential detection as a standalone
problem.
Subset Specific Influential Detection - A simple heuristic
[Figure: example network on nodes a to g, marking the subset of nodes on which to maximize influence spread and the subset-specific most influential node.]
Finding the subset-specific most influential node using the highest relevant degree heuristic.
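
A possible reading of the highest relevant degree heuristic is sketched below. It assumes that a node's "relevant degree" is the number of its neighbours that lie in the target subset; that definition, the function name and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict


def highest_relevant_degree_node(edges, subset):
    """Return the node with the most neighbours inside `subset`."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    subset = set(subset)
    # Relevant degree (assumed definition): neighbours that belong to the subset.
    return max(sorted(neighbours), key=lambda n: len(neighbours[n] & subset))


# Hypothetical toy network (an assumption, not the figure's actual edges).
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f"), ("e", "g")]
print(highest_relevant_degree_node(toy_edges, {"f", "g"}))
```

When the subset is the whole node set, this reduces to the plain highest-degree heuristic.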
Our Contribution - A Summary
• An efficient algorithm for subset specific top-k influential detection.
• Performance vs. efficiency trade-off using a tunable parameter - γ.
• Analytical framework: For an iteratively pruned network.
• A lower bound to evaluate the influence spread.
• Proof of sub-modularity of the influence spread function.
Background - Models of Information Diffusion
• Aim: Capture the dynamics of diffusion in social networks.
[Granovetter, Mark 1978][6]
• For example: Independent Cascade Model (ICM) [Goldenberg, et al. 2001][7]
• Node u activates its neighbor v with an independent probability, puv.
• Stochastic.
• In general puv = p, the propagation probability.
Activation of a node v by a node u can be seen as the outcome of a coin flip with bias puv
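
The coin-flip view of the ICM can be illustrated with a short simulation. This is a sketch under assumptions: the graph is directed and stored as a dictionary from each node to a list of `(neighbour, puv)` pairs, and each newly activated node gets exactly one chance to activate each inactive neighbour. The name `simulate_icm` is hypothetical.

```python
import random


def simulate_icm(out_edges, seeds, rng):
    """Run one Independent Cascade and return the set of active nodes.

    out_edges maps a node to a list of (neighbour, p_uv) pairs.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p_uv in out_edges.get(u, []):
                # One biased coin flip per (newly active u, inactive v) pair.
                if v not in active and rng.random() < p_uv:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


rng = random.Random(0)
chain = {"a": [("b", 1.0)], "b": [("c", 1.0)]}
print(sorted(simulate_icm(chain, ["a"], rng)))
```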
Independent Cascade Model - Activation Graphs
[Figure: a network on nodes a to g with edge weights between 0.01 and 0.3, shown alongside two sampled activation graphs (Activation Graph 1 and Activation Graph 2).]
• Activation Graph
• Generated by sampling edges based on puv (edge weight).
• Allows us to evaluate the expected influence spread [Kempe, et al.
2003].
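
Sampling one activation graph amounts to keeping each edge independently with probability puv. The sketch below assumes edges are given as `(u, v, puv)` triples; the function name is hypothetical.

```python
import random


def sample_activation_graph(weighted_edges, rng):
    """Keep each edge (u, v) independently with probability p_uv."""
    return [(u, v) for u, v, p_uv in weighted_edges if rng.random() < p_uv]


rng = random.Random(42)
# Hypothetical weighted edges; the weights only mimic the range used on the slide.
weighted = [("a", "b", 0.1), ("b", "c", 0.2), ("c", "d", 0.3)]
print(sample_activation_graph(weighted, rng))
```

Each call yields one possible outcome of the diffusion process; repeating it N times gives the N activation graphs used to estimate the expected spread.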
Evaluating Influence Spread In ICM [Kempe, et al. 2003]
• Expected influence spread due to a node u :
• Mean number of nodes reachable from u in N activation graphs.
[Figure: the weighted network on nodes a to g and N sampled activation graphs (N outcomes); the most influential node is highlighted.]
Activation graph 1: Ra = 3, Rb = 3, Rc = 3, Rd = 3, Re = 0, Rf = 1, Rg = 1
Activation graph N: Ra = 2, Rb = 2, Rc = 0, Rd = 2, Re = 2, Rf = 2, Rg = 2
Ru : Number of nodes reachable from u, not including u.
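
The estimate described above (mean Ru over N activation graphs) can be sketched as a Monte Carlo loop. The sketch assumes directed edges given as `(u, v, puv)` triples; `reachable_count` and `expected_spread` are hypothetical helper names.

```python
import random
from collections import deque


def reachable_count(adjacency, source):
    """Number of nodes reachable from source, not including source (R_u)."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) - 1


def expected_spread(weighted_edges, source, n_samples, rng):
    """Mean R_u over n_samples independently sampled activation graphs."""
    total = 0
    for _ in range(n_samples):
        adjacency = {}
        for u, v, p_uv in weighted_edges:
            if rng.random() < p_uv:  # edge survives in this activation graph
                adjacency.setdefault(u, []).append(v)
        total += reachable_count(adjacency, source)
    return total / n_samples


rng = random.Random(7)
certain_chain = [("a", "b", 1.0), ("b", "c", 1.0)]
print(expected_spread(certain_chain, "a", 100, rng))
```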
Previous Work - Greedy Algorithm [Kempe, et al. 2003]
• σ(A): Influence spread, due to a seed set A.
• δu: Marginal contribution of u, which is σ(A ∪ {u}) − σ(A)
• Approach : Iteratively choose a node u with highest δu.
• Performance guarantee : at least (1 − 1/e) ≈ 63% of the optimal solution.
• Running time scales exponentially with network size.
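
A compact sketch of the greedy selection follows. To keep it deterministic, it assumes the activation graphs have already been sampled and are passed in as adjacency dictionaries, and it counts seeds as part of σ(A); both choices, along with the function names and the toy graph, are assumptions made for illustration.

```python
def sigma(activation_graphs, seeds):
    """Mean number of active nodes (seeds included) over the sampled graphs."""
    total = 0
    for adjacency in activation_graphs:
        seen = set(seeds)
        stack = list(seeds)
        while stack:
            u = stack.pop()
            for v in adjacency.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        total += len(seen)
    return total / len(activation_graphs)


def greedy_top_k(nodes, activation_graphs, k):
    """Repeatedly add the node with the largest marginal contribution."""
    seeds = []
    for _ in range(k):
        base = sigma(activation_graphs, seeds)
        candidates = [n for n in sorted(nodes) if n not in seeds]
        best = max(candidates,
                   key=lambda n: sigma(activation_graphs, seeds + [n]) - base)
        seeds.append(best)
    return seeds


# Hypothetical single activation graph: v reaches w, x, y; u reaches x, y; a reaches b.
graphs = [{"v": ["w", "x", "y"], "u": ["x", "y"], "a": ["b"]}]
nodes = ["a", "b", "u", "v", "w", "x", "y"]
print(greedy_top_k(nodes, graphs, 2))
```

In this toy example u has a larger initial contribution than a, but once v is chosen most of u's reach is already covered, so a is picked second.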
Greedy Algorithm - Pictorial Representation
[Figure: example network, Iteration 1, with the top-k influential nodes highlighted.]
Node v is chosen as the most influential node, since δv > δu > δa > ...
[Figure: example network, Iteration 2.]
After Iteration 1, δu drops below δa, so a is chosen next.
Our Approach
Problem Statement
Given a graph, G(V , E) and a destination set D0 ⊆ V , find the top-k nodes in
V which maximize the spread of influence on D0.
[Figure: two example networks on nodes a to g with the Destination Set (D0) marked. In one, the subset-specific most influential node does NOT lie in D0; in the other, it does lie in D0.]
Salient features:
• Top-k nodes may or may not be in D0.
• When D0 = V , it reduces to the general form.
Trivial Extension - Subset Adapted Greedy
• Expected influence spread on D0 due to a node u :
• Mean number of nodes in D0 reachable from u in N activation graphs.
[Figure: the weighted network on nodes a to g with the Destination Set (D0) marked, and N sampled activation graphs (N outcomes); the subset-specific most influential node is highlighted.]
Activation graph 1: Ra = 2, Rb = 2, Rc = 2, Rd = 3, Re = 0, Rf = 0, Rg = 0
Activation graph N: Ra = 1, Rb = 1, Rc = 0, Rd = 2, Re = 0, Rf = 0, Rg = 0
Ru : Number of nodes in D0 reachable from u, not including u.
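
The subset-adapted estimate differs from the general one only in what is counted: reachable nodes outside D0 are ignored. A minimal sketch, assuming pre-sampled activation graphs stored as adjacency dictionaries; the name `subset_spread` and the toy graphs are hypothetical.

```python
def subset_spread(activation_graphs, source, destination):
    """Mean number of destination nodes reachable from source, excluding source."""
    destination = set(destination)
    total = 0
    for adjacency in activation_graphs:
        seen = {source}
        stack = [source]
        while stack:
            u = stack.pop()
            for v in adjacency.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        seen.discard(source)
        # Only nodes inside the destination set D0 contribute to R_u.
        total += len(seen & destination)
    return total / len(activation_graphs)


# Two hypothetical activation graphs and a hypothetical destination set.
sampled = [{"d": ["a"], "a": ["b", "c"]}, {"d": ["a", "e"]}]
d0 = {"a", "b", "c"}
print(subset_spread(sampled, "d", d0))
```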
Iterative Pruning Approach - Central Idea
Central Idea
• Identify a set of nodes, ψ which are considered “influenced”.
• De-prioritize the spread of influence to all nodes in ψ.
When to consider a node as influenced
• Based on a node’s susceptibility to influence.
For Example : [S. Aral, D. Walker 2011][8]
• In our approach, we introduce a threshold parameter γu to model
the susceptibility of a node u.
Iterative Pruning Approach - In Three Steps
1. Compute Lu(A) ∈ [0, 1] : Likelihood that a node u would be influenced due
to a seed set, A.
• Lu(A) is the expectation that a node u will be active due to A.
• Expected influence spread: $\sigma(A) = \sum_{u \in V} L_u(A)$
2. Set a threshold γu : Add a node u to ψ, when Lu ≥ γu.
• Sociological perspective of γ : Susceptibility or Ease of Influencing.
• Incorporates potential influence that can reach from all over the network.
[Figure: the weighted network on nodes a to g, marking the Destination Set (D0), the subset-specific influential node, and the influenced set (ψ).]
Example values: La = 0.05, γa = 0.05; Lb = 0.25, γb = 0.2; Lc = 0.15, γc = 0.2
3. Pruning Process : Remove all paths that lead ONLY to nodes in ψ.
• Significantly improves the efficiency. Details to follow.
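
Step 2 can be illustrated with the likelihood and threshold values shown for nodes a, b and c. The sketch assumes the values of Lu are already computed; the function name `influenced_set` is hypothetical.

```python
def influenced_set(likelihood, threshold):
    """Return psi: every node u whose likelihood L_u reaches its threshold gamma_u."""
    return {u for u, l_u in likelihood.items() if l_u >= threshold[u]}


# Values taken from the example on the slide.
likelihood = {"a": 0.05, "b": 0.25, "c": 0.15}
threshold = {"a": 0.05, "b": 0.2, "c": 0.2}
print(sorted(influenced_set(likelihood, threshold)))
```

With these values, a and b meet their thresholds and join ψ, while c (0.15 < 0.2) does not.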
Iterative Pruning Approach - Pruning Process
• ψ : The set of nodes considered influenced.
• Two level pruning process:
1. For each node in ψ, remove all its adjacent edges.
2. Recursively remove all paths that do NOT lead to any node in D0 \ ψ.
• How does pruning help?
• Improves efficiency by reducing the density of the underlying graph.
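
One way to read the two-level pruning is sketched below. It is an interpretation rather than the authors' implementation: edges are assumed to be directed, and "a path that leads to a node" is taken to mean that the head of an edge can still reach some destination node that is not yet in ψ. The function name `prune` and the toy graph are hypothetical.

```python
def prune(edges, influenced, destination):
    """Two-level pruning of a directed edge list (assumed interpretation)."""
    influenced = set(influenced)
    # Level 1: drop every edge adjacent to an influenced node.
    level1 = [(u, v) for u, v in edges
              if u not in influenced and v not in influenced]

    # Level 2: keep only edges whose head can still reach a remaining target.
    targets = set(destination) - influenced
    predecessors = {}
    for u, v in level1:
        predecessors.setdefault(v, []).append(u)
    useful = set(targets)
    stack = list(targets)
    while stack:  # backward search from the remaining targets
        v = stack.pop()
        for u in predecessors.get(v, ()):
            if u not in useful:
                useful.add(u)
                stack.append(u)
    return [(u, v) for u, v in level1 if v in useful]


# Hypothetical example: destination set {c, e}, with e already influenced.
toy = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e"), ("c", "f")]
print(prune(toy, influenced={"e"}, destination={"c", "e"}))
```

Here level 1 removes the edge into e, and level 2 removes the edges a→d and c→f because neither leads to the only remaining target, c.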
[Figure: the network on nodes a to g before pruning, after Level 1, and after Level 2, marking the Destination Set (D0), the subset-specific influential node, and the influenced set (ψ).]
Experiments
• Datasets: Two real world co-authorship networks
1. High Energy Physics - Theory (HEPT) section of e-print arXiv
Dense network: 15233 nodes / 58891 edges
2. Conference on Software Maintenance and Re-engineering (SMRE)
Sparse network : 1336 nodes / 2200 edges
• Comparison with state of the art:
• Subset Adapted Greedy
• Subset Adapted CELF (Cost Effective Lazy Forward) [Leskovec, et
al. 2007]
• System parameter - γ : {p^4, p^3, p^2, p, 2p, 4p}
where p is the propagation probability in ICM.
Results: [ Dataset 1 ] Dense Network
• Iterative Pruning (γ = p^4) vs. Subset Adapted Greedy:
• 96% improvement in efficiency.
• 10% drop in performance (influence spread).
• Iterative Pruning with CELF (γ = p^4) vs. Subset Adapted CELF:
• 52% improvement in efficiency.
• 10% drop in performance.
Results: [ Dataset 2 ] Sparse Network
• Iterative Pruning (γ = p^4) vs. Subset Adapted Greedy:
• 73% improvement in efficiency.
• 21% drop in performance (influence spread).
• Iterative Pruning with CELF (γ = p^4) vs. Subset Adapted CELF:
• 38% improvement in efficiency.
• 21% drop in performance.
Key Inferences
• Low values of γ are highly efficient but at the cost of performance
loss.
• Choose a low value of γ for dense networks and a high value of γ for
sparse networks, in order to achieve a desirable performance.
• The relatively low efficiency gains with CELF arise because the pruning
operation simultaneously reduces the marginal contribution of several
nodes.
Analytical Framework
• Known:
Influence spread function σ(A) is sub-modular when the underlying graph
G(V , E) is static across iterations. [Kempe, et al. 2003]
• Is σ<Gi>(A) sub-modular when the underlying graph Gi(V, Ei) is
iteratively pruned, where Gi is the graph after the i-th iteration?
Yes. Details in our paper.
• Can we estimate σ(A) from σ<Gi>(A)?
No, but we derive the following lower bound:
$$\sigma(A) \ge \sigma^{\langle G_i \rangle}(A) + \sum_{j=1}^{i-1} \sum_{u \in \psi_j \setminus \psi_{j+1}} L_u(A)$$
where ψj is the set of influenced nodes after the j-th iteration.
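
For reference, the sub-modularity property discussed in this framework is the standard diminishing-returns condition. The statement below uses S, T and v as notation introduced here for illustration; it is the textbook definition, not a formula copied from the slides.

```latex
% Sub-modularity (diminishing returns) of a set function \sigma on V:
% adding a node v to a smaller seed set helps at least as much as
% adding it to a larger one.
\[
  \sigma(S \cup \{v\}) - \sigma(S) \;\ge\; \sigma(T \cup \{v\}) - \sigma(T)
  \qquad \text{for all } S \subseteq T \subseteq V,\; v \in V \setminus T .
\]
```

The left-hand side is the marginal contribution δv with respect to S. Sub-modularity, together with monotonicity, is what underlies the (1 − 1/e) guarantee of the greedy algorithm in [Kempe, et al. 2003].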
Summary
• Iterative network pruning algorithm for subset specific top-k influential
detection.
• Evaluation of our algorithm on two real world datasets showed significant
efficiency gains with an acceptable drop in performance.
• A tunable parameter γ for performance vs. efficiency trade-off.
• Analytical framework to show the sub-modularity of influence spread
function when the underlying graph is iteratively pruned thus enabling
the evaluation of performance guarantees.
Scope for Future Work
• Design of more efficient algorithms.
• Evaluation with real world distributions of γ (susceptibility).
• Extension to non-progressive models of diffusion.
References
[1] P. Domingos and M. Richardson, “Mining the network value of customers,” in Proceedings of the seventh
ACM SIGKDD international conference on Knowledge discovery and data mining, ser. KDD ’01. ACM, 2001,
pp. 57–66. [Online]. Available: http://doi.acm.org/10.1145/502512.502525
[2] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance, “Cost-effective outbreak
detection in networks,” in Proceedings of the thirteenth ACM SIGKDD international conference on Knowledge
discovery and data mining, ser. KDD ’07. ACM, 2007, pp. 420–429. [Online]. Available:
http://doi.acm.org/10.1145/1281192.1281239
[3] N. A. Christakis and J. H. Fowler, “The spread of obesity in a large social network over 32 years,” The New
England Journal of Medicine, vol. 357, no. 4, pp. 370–379, July 2007. [Online]. Available:
http://health-equity.pitt.edu/767/
[4] D. Kempe, J. Kleinberg, and E. Tardos, “Maximizing the spread of influence through a social network,” in
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, ser.
KDD ’03. ACM, 2003, pp. 137–146. [Online]. Available: http://doi.acm.org/10.1145/956750.956769
[5] C. C. Aggarwal, A. Khan, and X. Yan, “On flow authority discovery in social networks,” in Proceedings of the
eleventh SIAM international conference on Data mining, ser. SDM ’11. SIAM / Omnipress, 2011, pp.
522–533.
[6] M. Granovetter, “Threshold Models of Collective Behavior,” American Journal of Sociology, vol. 83, no. 6, pp.
1420–1443, 1978. [Online]. Available: http://dx.doi.org/10.2307/2778111
[7] J. Goldenberg, B. Libai, and E. Muller, “Talk of the Network: A Complex Systems Look at the Underlying
Process of Word-of-Mouth,” Marketing Letters, vol. 12, no. 3, pp. 211–223, Aug. 2001. [Online]. Available:
http://www.ingentaconnect.com/content/klu/mark/2001/00000012/00000003/00350022
[8] S. Aral and D. Walker, “Creating Social Contagion Through Viral Product Design: A Randomized Trial of
Questions?
P. Chandra and A. Kalyanasundaram, “A Network Pruning Based Approach
for Subset Specific Influential Detection”, in 4th Annual ACM conference on
Web Science (WebSci 2012), Evanston, Illinois, USA, Jun. 2012.
Thank You

Mais conteúdo relacionado

Mais procurados

[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...
[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...
[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...
Taiji Suzuki
 
Practical-bayesian-optimization-of-machine-learning-algorithms_ver2
Practical-bayesian-optimization-of-machine-learning-algorithms_ver2Practical-bayesian-optimization-of-machine-learning-algorithms_ver2
Practical-bayesian-optimization-of-machine-learning-algorithms_ver2
Rohit Kumar Gupta
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MLAI2
 

Mais procurados (17)

Domain Transfer and Adaptation Survey
Domain Transfer and Adaptation SurveyDomain Transfer and Adaptation Survey
Domain Transfer and Adaptation Survey
 
Enhancing the performance of kmeans algorithm
Enhancing the performance of kmeans algorithmEnhancing the performance of kmeans algorithm
Enhancing the performance of kmeans algorithm
 
Nonlinear dimension reduction
Nonlinear dimension reductionNonlinear dimension reduction
Nonlinear dimension reduction
 
Higher Order Fused Regularization for Supervised Learning with Grouped Parame...
Higher Order Fused Regularization for Supervised Learning with Grouped Parame...Higher Order Fused Regularization for Supervised Learning with Grouped Parame...
Higher Order Fused Regularization for Supervised Learning with Grouped Parame...
 
Learning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat MinimaLearning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat Minima
 
[Vldb 2013] skyline operator on anti correlated distributions
[Vldb 2013] skyline operator on anti correlated distributions[Vldb 2013] skyline operator on anti correlated distributions
[Vldb 2013] skyline operator on anti correlated distributions
 
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...
 
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for ...
 
Recursive Neural Networks
Recursive Neural NetworksRecursive Neural Networks
Recursive Neural Networks
 
Circuitanlys2
Circuitanlys2Circuitanlys2
Circuitanlys2
 
[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...
[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...
[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien...
 
Practical-bayesian-optimization-of-machine-learning-algorithms_ver2
Practical-bayesian-optimization-of-machine-learning-algorithms_ver2Practical-bayesian-optimization-of-machine-learning-algorithms_ver2
Practical-bayesian-optimization-of-machine-learning-algorithms_ver2
 
Spectral clustering - Houston ML Meetup
Spectral clustering - Houston ML MeetupSpectral clustering - Houston ML Meetup
Spectral clustering - Houston ML Meetup
 
Db Scan
Db ScanDb Scan
Db Scan
 
1 chayes
1 chayes1 chayes
1 chayes
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
 
Hands-on Tutorial of Machine Learning in Python
Hands-on Tutorial of Machine Learning in PythonHands-on Tutorial of Machine Learning in Python
Hands-on Tutorial of Machine Learning in Python
 

Semelhante a A network pruning based approach for subset specific influential detection

A Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence MaximizationA Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence Maximization
Surendra Gadwal
 
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Sigma web solutions pvt. ltd.
 
Scalable membership management
Scalable membership management Scalable membership management
Scalable membership management
Vinay Setty
 
Machine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepMachine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by step
SanjanaSaxena17
 
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
Wei Lu
 
A generic method for modeling accelerated life testing data
A generic method for modeling accelerated life testing dataA generic method for modeling accelerated life testing data
A generic method for modeling accelerated life testing data
ASQ Reliability Division
 
Automatic Visualization
Automatic VisualizationAutomatic Visualization
Automatic Visualization
Sri Ambati
 

Semelhante a A network pruning based approach for subset specific influential detection (20)

Analysis and reactive measures on the blackhole attack
Analysis and reactive measures on the blackhole attackAnalysis and reactive measures on the blackhole attack
Analysis and reactive measures on the blackhole attack
 
A Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence MaximizationA Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence Maximization
 
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
 
Least Cost Influence in Multiplex Social Networks
Least Cost Influence in Multiplex Social NetworksLeast Cost Influence in Multiplex Social Networks
Least Cost Influence in Multiplex Social Networks
 
Scalable membership management
Scalable membership management Scalable membership management
Scalable membership management
 
Deep learning from scratch
Deep learning from scratch Deep learning from scratch
Deep learning from scratch
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Approximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsApproximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming Applications
 
Machine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepMachine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by step
 
Adaptive filters and band reject filters
Adaptive filters and band reject filtersAdaptive filters and band reject filters
Adaptive filters and band reject filters
 
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
 
2. filtering basics
2. filtering basics2. filtering basics
2. filtering basics
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 
SVD and the Netflix Dataset
SVD and the Netflix DatasetSVD and the Netflix Dataset
SVD and the Netflix Dataset
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
08 neural networks
08 neural networks08 neural networks
08 neural networks
 
A generic method for modeling accelerated life testing data
A generic method for modeling accelerated life testing dataA generic method for modeling accelerated life testing data
A generic method for modeling accelerated life testing data
 
Automatic Visualization
Automatic VisualizationAutomatic Visualization
Automatic Visualization
 
Deep learning study 2
Deep learning study 2Deep learning study 2
Deep learning study 2
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

A network pruning based approach for subset specific influential detection

  • 1. A Network Pruning Based Approach for Subset-Specific Influential Detection Praphul Chandra, Arun Kalyanasundaram Hewlett Packard Labs, Bangalore, India ACM Web Science 2012
  • 2.
  • 3. Can they really influence our decisions?
  • 4. Can they really influence our decisions?
  • 5. Who else can influence our decisions?
  • 6. Who else can influence our decisions?
  • 7. Who else can influence our decisions?
  • 8. How do we exploit this spread of influence?
  • 9. How do we exploit this spread of influence? Viral Marketing
  • 10. How do we exploit this spread of influence? Viral Marketing Influential Detection • Identify a set of nodes (or individuals) to seed with some information so as to maximize the spread of the seeded information in the network. [Domingos, et al. 2001][1]
  • 12. Influential Detection - Other Applications • Water Distribution Networks [Leskovec, et al. 2007][2] • Preventing the spread of diseases [Christakis, Fowler 2007][3]
  • 13. Influential Detection - A simple heuristic • Finding the most influential node using the highest degree heuristic. [Figure: example network with nodes a to g; one node is marked Most Influential.]
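The highest degree heuristic on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the edge list is a hypothetical toy network (the edges of the slide's figure are not recoverable from the transcript), and the function name `degree_ranking` is an assumption.

```python
# Hypothetical undirected toy network; NOT the network drawn on the slide.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f"), ("e", "g")]


def degree_ranking(edges):
    """Return nodes sorted by degree, highest first (ties broken alphabetically)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree, key=lambda n: (-degree[n], n))


ranking = degree_ranking(edges)
print("Most influential by degree:", ranking[0])
```

The heuristic only looks at direct neighbors, which is why it is cheap but can miss nodes whose influence travels over longer paths.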
  • 14. Our Problem - Subset Specific Influential Detection • Aim : Maximize the spread of influence on a subset of nodes in the network instead of the whole network.
  • 16. Subset Specific Influential Detection - Examples • Small Businesses - Locality based marketing • Political Campaign [Focus on Supporters / Detractors] • Targeted advertisements - Demographics [Nationality, Age, Gender, etc.]
  • 17. Subset Specific Influential Detection - Our Motivation • Increase in size / density of networks. • Opportunity to improve the efficiency of traditional approaches. • Current state of the art “adapts” existing algorithms on influential detection to the subset specific version. [Kempe, et al. 2003][4] [Aggarwal, et al. 2011][5] • We address the subset specific top-k influential detection problem standalone.
  • 18. Subset Specific Influential Detection - A simple heuristic • Finding the subset specific influential using the highest relevant degree heuristic. [Figure: example network with nodes a to g, showing the subset of nodes on which to maximize influence spread and the node marked Subset Specific Most Influential.]
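One way to read "relevant degree" is the number of a node's neighbors that belong to the target subset. The sketch below makes that assumption explicit; the toy edges, the subset, and the name `relevant_degree_ranking` are hypothetical and not taken from the slides.

```python
# Hypothetical undirected toy network and target subset (not from the slides).
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f"), ("e", "g")]
destination = {"d", "f", "g"}


def relevant_degree_ranking(edges, destination):
    """Rank nodes by how many of their neighbors lie in the destination subset."""
    relevant = {}
    for u, v in edges:
        relevant.setdefault(u, 0)
        relevant.setdefault(v, 0)
        if v in destination:
            relevant[u] += 1
        if u in destination:
            relevant[v] += 1
    return sorted(relevant, key=lambda n: (-relevant[n], n))


subset_ranking = relevant_degree_ranking(edges, destination)
print("Subset specific most influential:", subset_ranking[0])
```

In this toy network, nodes a and e have the same plain degree, but only e is adjacent to three subset members, so the subset-specific ranking prefers e.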
  • 19. Our Contribution - A Summary • An efficient algorithm for subset specific top-k influential detection. • Performance vs. efficiency trade-off using a tunable parameter - γ. • Analytical framework: For an iteratively pruned network. • A lower bound to evaluate the influence spread. • Proof of sub-modularity of the influence spread function.
  • 21. Background - Models of Information Diffusion • Aim: Capture the dynamics of diffusion in social networks. [Granovetter, Mark 1978][6] • For Example: Independent Cascade Model (ICM) [Goldenberg, et al. 2001][7] • Node u activates its neighbor v with an independent probability, puv. • Stochastic. • In general puv = p, the propagation probability. • Activation of a node v by a node u can be seen as the outcome of a coin flip with bias puv.
  • 22. Independent Cascade Model - Activation Graphs • Activation Graph • Generated by sampling edges based on puv (edge weight). • Allows us to evaluate the expected influence spread [Kempe, et al. 2003]. [Figure: a weighted network with nodes a to g and edge weights 0.1, 0.2, 0.01, 0.1, 0.05, 0.3, 0.15, alongside two sampled graphs labeled Activation Graph 1 and Activation Graph 2.]
  • 23. Evaluating Influence Spread In ICM [Kempe, et al. 2003] • Expected influence spread due to a node u: • Mean number of nodes reachable from u in N activation graphs. • Ru: Number of nodes reachable from u, not including u. [Figure: the weighted network with nodes a to g and N sampled activation graphs (N outcomes). Activation graph 1: Ra = 3, Rb = 3, Rc = 3, Rd = 3, Re = 0, Rf = 1, Rg = 1. Activation graph N: Ra = 2, Rb = 2, Rc = 0, Rd = 2, Re = 2, Rf = 2, Rg = 2. One node is marked Most Influential.]
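The evaluation described on slides 22 and 23 can be sketched as a small Monte Carlo routine: sample N activation graphs by keeping each edge with probability puv, then average Ru. The edge probabilities below are a hypothetical directed toy network, and the function names are assumptions made for illustration.

```python
import random
from collections import deque


def sample_activation_graph(weighted_edges, rng):
    """Keep each directed edge (u, v) independently with probability p_uv."""
    live = {}
    for (u, v), p in weighted_edges.items():
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def reachable_count(live, source):
    """R_u: number of nodes reachable from source, not including source."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in live.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) - 1


def expected_spread(weighted_edges, source, n_samples=1000, seed=0):
    """Mean of R_u over n_samples sampled activation graphs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        live = sample_activation_graph(weighted_edges, rng)
        total += reachable_count(live, source)
    return total / n_samples


# Hypothetical toy probabilities (the slide's topology is not recoverable).
toy = {("a", "b"): 0.1, ("a", "c"): 0.2, ("b", "d"): 0.3}
print(expected_spread(toy, "a"))
```

A fixed seed keeps the estimate reproducible; a larger `n_samples` reduces the variance of the estimate at proportional cost.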
  • 24. Previous Work - Greedy Algorithm [Kempe, et al. 2003] • σ(A): Influence spread due to a seed set A. • δu: Marginal contribution of u, which is σ(A ∪ {u}) − σ(A). • Approach: Iteratively choose a node u with the highest δu. • Performance guarantee: at least 63% (1 − 1/e) of the optimal solution. • Running time scales exponentially with network size.
  • 26. Greedy Algorithm - Pictorial Representation • Iteration 1: Node v is chosen as the most influential node, since δv > δu > δa > ... • Iteration 2: After Iteration 1, δu drops below δa. Hence a is chosen next. [Figure: an example network shown before and after each iteration, with the Top-k Influential nodes marked.]
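The greedy selection from slides 24 to 26 can be sketched as below. This is an illustrative sketch under stated assumptions, not the implementation from [Kempe, et al. 2003]: σ(A) is estimated on a fixed list of pre-sampled activation graphs, seeds are counted as activated, and the toy edges (all with probability 1.0 so the outcome is deterministic) are hypothetical.

```python
import random


def sample_live_graphs(weighted_edges, n_samples, seed=0):
    """Pre-sample activation graphs so every sigma(A) uses the same outcomes."""
    rng = random.Random(seed)
    graphs = []
    for _ in range(n_samples):
        live = {}
        for (u, v), p in weighted_edges.items():
            if rng.random() < p:
                live.setdefault(u, []).append(v)
        graphs.append(live)
    return graphs


def spread(graphs, seeds):
    """sigma(A): mean number of activated nodes, seeds included."""
    total = 0
    for live in graphs:
        seen = set(seeds)
        stack = list(seeds)
        while stack:
            node = stack.pop()
            for nxt in live.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        total += len(seen)
    return total / len(graphs)


def greedy_top_k(nodes, graphs, k):
    """Repeatedly add the node with the largest marginal contribution delta_u."""
    chosen = []
    for _ in range(k):
        base = spread(graphs, chosen)
        candidates = [n for n in nodes if n not in chosen]
        best = max(candidates, key=lambda n: spread(graphs, chosen + [n]) - base)
        chosen.append(best)
    return chosen


# Hypothetical toy network: v reaches w, x, y; u reaches x, y; a reaches b.
toy = {("v", "w"): 1.0, ("v", "x"): 1.0, ("v", "y"): 1.0,
       ("u", "x"): 1.0, ("u", "y"): 1.0, ("a", "b"): 1.0}
nodes = ["a", "b", "u", "v", "w", "x", "y"]
top2 = greedy_top_k(nodes, sample_live_graphs(toy, n_samples=5), k=2)
print(top2)
```

In this toy case u starts with a larger marginal contribution than a, but once v is chosen the nodes u can reach are already covered, so a is picked second, which mirrors the behavior described on the slide.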
  • 28. Problem Statement • Given a graph G(V, E) and a destination set D0 ⊆ V, find the top-k nodes in V which maximize the spread of influence on D0. • Salient features: • Top-k nodes may or may not be in D0. • When D0 = V, it reduces to the general form. [Figure: two example networks with nodes a to g and a marked Destination Set (D0); in one the subset specific most influential node does NOT lie in D0, in the other it does lie in D0.]
  • 29. Trivial Extension - Subset Adapted Greedy • Expected influence spread on D0 due to a node u: • Mean number of nodes in D0 reachable from u in N activation graphs. • Ru: Number of nodes in D0 reachable from u, not including u. [Figure: the weighted network with nodes a to g, the Destination Set (D0), and N sampled activation graphs (N outcomes). Activation graph 1: Ra = 2, Rb = 2, Rc = 2, Rd = 3, Re = 0, Rf = 0, Rg = 0. Activation graph N: Ra = 1, Rb = 1, Rc = 0, Rd = 2, Re = 0, Rf = 0, Rg = 0. One node is marked Subset Specific Most Influential.]
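The subset adapted measure differs from the general one only in what is counted: reachable nodes are tallied only if they belong to D0. A minimal sketch, assuming a hypothetical directed toy network and the illustrative name `subset_spread`:

```python
import random


def subset_spread(weighted_edges, source, destination, n_samples=1000, seed=0):
    """Mean number of destination nodes reachable from source (source excluded)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        # Sample one activation graph: keep each edge with probability p_uv.
        live = {}
        for (u, v), p in weighted_edges.items():
            if rng.random() < p:
                live.setdefault(u, []).append(v)
        # Collect everything reachable from the source in this sample.
        seen, stack = {source}, [source]
        while stack:
            node = stack.pop()
            for nxt in live.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # Count only reached nodes that lie in the destination set D0.
        total += len((seen - {source}) & destination)
    return total / n_samples


# Hypothetical toy inputs (not the network from the slide).
toy = {("a", "b"): 0.2, ("b", "c"): 0.3, ("a", "d"): 0.1}
d0 = {"c", "d"}
print(subset_spread(toy, "a", d0))
```

Setting `destination` to the full node set recovers the general reachability count, matching the slide's remark that the problem reduces to the general form when D0 = V.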
  • 30. Iterative Pruning Approach - Central Idea • Identify a set of nodes, ψ, which are considered “influenced”. • De-prioritize the spread of influence to all nodes in ψ.
  • 34. When to consider a node as influenced • Based on a node’s susceptibility to influence. For Example : [S. Aral, D. Walker 2011][8] • In our approach, we introduce a threshold parameter γu to model the susceptibility of a node u.
  • 38. Iterative Pruning Approach - In Three Steps 1. Compute Lu(A) ∈ [0, 1]: Likelihood that a node u would be influenced due to a seed set A. • Lu(A) is the expectation that a node u will be active due to A. • Expected Influence Spread: $\sigma(A) = \sum_{u \in V} L_u(A)$ 2. Set a threshold γu: Add a node u to ψ when Lu ≥ γu. • Sociological perspective of γ: Susceptibility or Ease of Influencing. • Incorporates potential influence that can reach from all over the network. • Example values from the figure: La = 0.05, γa = 0.05; Lb = 0.25, γb = 0.2; Lc = 0.15, γc = 0.2. 3. Pruning Process: Remove all paths that lead ONLY to nodes in ψ. • Significantly improves the efficiency. Details to follow. [Figure: the weighted network with nodes a to g, with the Destination Set (D0), the Subset Specific Influential node, and the Influenced set (ψ) marked.]
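Step 2 can be checked against the three example values shown on the slide. The sketch below applies the rule Lu ≥ γu to those values; the dictionary layout and the name `influenced_set` are assumptions made for illustration.

```python
# Example likelihoods L_u and thresholds gamma_u as listed on the slide.
likelihood = {"a": 0.05, "b": 0.25, "c": 0.15}
threshold = {"a": 0.05, "b": 0.2, "c": 0.2}


def influenced_set(likelihood, threshold):
    """Return psi: every node u whose likelihood meets its threshold (L_u >= gamma_u)."""
    return {u for u in likelihood if likelihood[u] >= threshold[u]}


psi = influenced_set(likelihood, threshold)
print(sorted(psi))
```

With these values a and b meet their thresholds and join ψ, while c (0.15 against a threshold of 0.2) does not.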
  • 39. Iterative Pruning Approach - Pruning Process • ψ: The set of nodes considered influenced. • Two level pruning process: 1. For each node in ψ, remove all its adjacent edges. 2. Recursively remove all paths that do NOT lead to any node in D0 \ ψ. • How does pruning help? • Improves efficiency by reducing the density of the underlying graph. [Figure: an example network with nodes a to g shown before pruning, after Level 1, and after Level 2, with the Destination Set (D0), the Subset Specific Influential node, and the Influenced set (ψ) marked.]
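The two pruning levels can be sketched on a directed edge list as below. This is one possible reading, not the authors' implementation: Level 2 is interpreted here as keeping only edges whose head is, or can still reach, a destination node that is not yet in ψ. The toy edges and the name `prune` are hypothetical.

```python
def prune(edges, destination, psi):
    """Apply the two pruning levels and return the surviving directed edges."""
    # Level 1: remove every edge adjacent to a node already considered influenced.
    level1 = [(u, v) for u, v in edges if u not in psi and v not in psi]

    # Level 2: find nodes that can still reach a not-yet-influenced destination
    # node, by walking the remaining edges backwards from those targets.
    targets = destination - psi
    reverse = {}
    for u, v in level1:
        reverse.setdefault(v, []).append(u)
    useful = set(targets)
    stack = list(targets)
    while stack:
        node = stack.pop()
        for pred in reverse.get(node, []):
            if pred not in useful:
                useful.add(pred)
                stack.append(pred)

    # Keep only edges that lie on some path towards a remaining target.
    return [(u, v) for u, v in level1 if v in useful]


# Hypothetical toy inputs (not the network from the slide).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "f"), ("g", "c")]
pruned = prune(edges, destination={"b", "c", "d"}, psi={"b"})
print(pruned)
```

Here the edges touching b disappear at Level 1, and the branch a, e, f disappears at Level 2 because it no longer leads to any remaining destination node.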
  • 40. Experiments • Datasets: Two real world co-authorship networks 1. High Energy Physics - Theory (HEPT) section of e-print arXiv Dense network: 15233 nodes / 58891 edges 2. Conference on Software Maintenance and Re-engineering (SMRE) Sparse network : 1336 nodes / 2200 edges • Comparison with state of the art: • Subset Adapted Greedy • Subset Adapted CELF (Cost Effective Lazy Forward) [Leskovec, et al. 2007] • System parameter - γ : {p4 , p3 , p2 , p, 2p, 4p} where p is the propagation probability in ICM.
  • 41. Results: [ Dataset 1 ] Dense Network • Iterative Pruning (γ = p4 ) vs. Subset Adapted Greedy: • 96% improvement in efficiency. • 10% drop in performance (influence spread). • Iterative Pruning with CELF (γ = p4 ) vs. Subset Adapted CELF: • 52% improvement in efficiency. • 10% drop in performance.
  • 42. Results: [ Dataset 2 ] Sparse Network • Iterative Pruning (γ = p4 ) vs. Subset Adapted Greedy: • 73% improvement in efficiency. • 21% drop in performance (influence spread). • Iterative Pruning with CELF (γ = p4 ) vs. Subset Adapted CELF: • 38% improvement in efficiency. • 21% drop in performance.
  • 43. Key Inferences • Low values of γ are highly efficient, but at the cost of performance. • Choose a low value of γ for dense networks and a high value of γ for sparse networks, in order to achieve a desirable performance. • The relatively low efficiency gains with CELF arise because the pruning operation causes a simultaneous reduction in the marginal contribution of several nodes.
  • 48. Analytical Framework • Known: Influence spread function σ(A) is sub-modular when the underlying graph G(V, E) is static across iterations. [Kempe, et al. 2003] • Is $\sigma_{\langle G_i \rangle}(A)$ sub-modular when the underlying graph $G_i(V, E_i)$ is iteratively pruned, where $G_i$ is the graph after the ith iteration? Yes. Details in our paper. • Can we estimate σ(A) from $\sigma_{\langle G_i \rangle}(A)$? No, but we derive the following lower bound: $\sigma(A) \ge \sigma_{\langle G_i \rangle}(A) + \sum_{j=1}^{i-1} \sum_{u \in \psi_j \setminus \psi_{j+1}} L_u(A)$, where $\psi_j$ is the set of influenced nodes after the jth iteration.
  • 49. Summary • Iterative network pruning algorithm for subset specific top-k influential detection. • Evaluation of our algorithm on two real world datasets showed significant efficiency gains with an acceptable drop in performance. • A tunable parameter γ for performance vs. efficiency trade-off. • Analytical framework to show the sub-modularity of influence spread function when the underlying graph is iteratively pruned thus enabling the evaluation of performance guarantees.
  • 50. Scope for Future Work • Design of more efficient algorithms. • Evaluation with real world distributions of γ (susceptibility). • Extension to non-progressive models of diffusion.
  • 51. References [1] P. Domingos and M. Richardson, “Mining the network value of customers,” in Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, ser. KDD ’01. ACM, 2001, pp. 57–66. [Online]. Available: http://doi.acm.org/10.1145/502512.502525 [2] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance, “Cost-effective outbreak detection in networks,” in Proceedings of the thirteenth ACM SIGKDD international conference on Knowledge discovery and data mining, ser. KDD ’07. ACM, 2007, pp. 420–429. [Online]. Available: http://doi.acm.org/10.1145/1281192.1281239 [3] N. A. Christakis and J. H. Fowler, “The spread of obesity in a large social network over 32 years,” The New England Journal of Medicine, vol. 357, no. 4, pp. 370–379, July 2007. [Online]. Available: http://health-equity.pitt.edu/767/ [4] D. Kempe, J. Kleinberg, and E. Tardos, “Maximizing the spread of influence through a social network,” in Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, ser. KDD ’03. ACM, 2003, pp. 137–146. [Online]. Available: http://doi.acm.org/10.1145/956750.956769 [5] C. C. Aggarwal, A. Khan, and X. Yan, “On flow authority discovery in social networks,” in Proceedings of the eleventh SIAM international conference on Data mining, ser. SDM ’11. SIAM / Omnipress, 2011, pp. 522–533. [6] M. Granovetter, “Threshold Models of Collective Behavior,” American Journal of Sociology, vol. 83, no. 6, pp. 1420–1443, 1978. [Online]. Available: http://dx.doi.org/10.2307/2778111 [7] J. Goldenberg, B. Libai, and E. Muller, “Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth,” Marketing Letters, vol. 12, no. 3, pp. 211–223, Aug. 2001. [Online]. Available: http://www.ingentaconnect.com/content/klu/mark/2001/00000012/00000003/00350022 [8] S. Aral and D. Walker, “Creating Social Contagion Through Viral Product Design: A Randomized Trial of
  • 52. Questions? P. Chandra and A. Kalyanasundaram, “A Network Pruning Based Approach for Subset Specific Influential Detection”, in 4th Annual ACM conference on Web Science (WebSci 2012), Evanston, Illinois, USA, Jun. 2012. [Figure: four copies of the example network with nodes a to g, with the Destination Set (D0), the Subset Specific Influential node, and the Influenced set (ψ) marked.]