Scalable and Parallelizable Processing
of Influence Maximization
for Large-Scale Social Networks
Apr 9, 2013
Jinha Kim, Seung-Keol Kim, Hwanjo Yu
Pohang University of Science and Technology (POSTECH)
2
Goal
• Boosting Influence Maximization processing
by efficient influence evaluation
3
4
Viral Marketing
Influence Maximization Problem
Graph
Diffusion Model
Processing Algorithm
5
Word of Mouth Effect
...
...
...
7
A Marketer’s Perspective
...
...
...
PERSUADE ONE!
Making Money!!!!
9
How to find influential users in an
algorithmic way?
10
Viral Marketing
Influence Maximization Problem
Graph
Diffusion Model
Processing Algorithm
11
Quantifying Influence
σ(S): the expected number of users influenced by S
12
Influence Maximization
Problem (KKT 03)
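The formula on this slide did not survive extraction. The standard statement from KKT 03 can be written as below, where V is the node set and k is the seed budget (both symbols are assumed here, not taken from the slide):

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq V,\; |S| = k} \sigma(S)
```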
13
Viral Marketing
Influence Maximization Problem
Graph
Diffusion Model
Processing Algorithm
14
Abstracting Social
Networks
15
Abstracting Social
Network
u v
e
16
Viral Marketing
Influence Maximization Problem
Graph
Diffusion Model
Processing Algorithm
17
Quantifying Influence
The expected number of entities influenced by S
DEPENDS ON
how influence is propagated through a graph
19
SEEDS
Independent Cascade
(IC) model
active
inactive
t = 0
20
Independent Cascade
(IC) model
active at t = i
inactive
t = i + 1
active at t < i
21
Independent Cascade
(IC) model
inactive
active at t < j
t = j + 1
Propagation ends!!!
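The three IC-model steps above can be sketched in Python. This is a minimal illustration, assuming the graph is stored as a dict that maps each node to its out-neighbors and their activation probabilities; the names `simulate_ic` and `estimate_sigma` are hypothetical, not from the slides.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade and return the set of activated nodes.

    graph maps each node to a dict {out_neighbor: activation_probability}
    (an assumed representation).
    """
    active = set(seeds)        # every node activated so far
    frontier = list(seeds)     # nodes activated in the latest time step
    while frontier:            # propagation ends when a step activates nobody
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # u gets exactly one chance per inactive out-neighbor
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def estimate_sigma(graph, seeds, runs=1000, seed=0):
    """Average cascade size over repeated runs, as a plain estimate of sigma(S)."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs
```

Averaging many such runs is the simulation-based way to estimate σ(S); the cost of repeating it for every candidate seed is what motivates cheaper evaluation.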
22
Viral Marketing
Influence Maximization Problem
Graph
Diffusion Model
Processing Algorithm
24
Processing Algorithm
Macro Level Processing
Micro Level Processing
25
Processing Algorithm
Macro Level Processing
Micro Level Processing
26
Macro Level (KKT 03)
• Finding the maximum among C(|V|, k)
cases (all seed sets of size k)
• NP-hard (shown by reduction from the
set-covering problem)
27
Greedy Algorithm
(KKT 03)
• Repeatedly selects the node v which gives
the most marginal gain over the current seed set S
• σ({v}) and σ(S ∪ {v}) − σ(S) are two major
evaluation components
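The greedy selection can be sketched as follows. This is a minimal illustration under stated assumptions: `sigma` is any callable that scores a seed set, and the `coverage` function is a toy stand-in chosen so the result is easy to check by hand. It is not the IC-model σ, and all names are hypothetical.

```python
def greedy_seeds(nodes, sigma, k):
    """Pick up to k seeds, each time adding the node with the largest marginal gain."""
    nodes = list(nodes)
    seeds = set()
    for _ in range(min(k, len(nodes))):
        base = sigma(seeds)
        best = max(
            (v for v in nodes if v not in seeds),
            key=lambda v: sigma(seeds | {v}) - base,  # marginal gain of adding v
        )
        seeds.add(best)
    return seeds

# Toy stand-in for sigma: the number of nodes "reached" by the seed set.
reach = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "c": {"c"},
         "d": {"d", "e"}, "e": {"e"}}

def coverage(seeds):
    return len(set().union(*(reach[s] for s in seeds)))

chosen = greedy_seeds(sorted(reach), coverage, k=2)  # picks "a" first, then "d"
```

Each round calls `sigma` once per remaining candidate, which is why the cost of a single influence evaluation dominates the running time.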
28
Processing Algorithm
Macro Level Processing
Micro Level Processing
29
Micro Level (CWW 10)
• Cannot count influence propagation routes
between two nodes
30
Evaluating σ(S)
• Monte-Carlo Simulation (KKT 03)
• Simultaneous simulation (CWY 09)
• Breaking down a graph into communities
(WCS 10)
• Shortest path between two nodes (KS 06)
• Local arborescence based on the most
probable path (CWW 10)
31
Processing Algorithm
Macro Level Processing
IPA
33
Intuition
• How about extremely localizing influence??
• Influence path between two nodes as
influence evaluation unit !!
• Considering all paths is not tractable
(#P-hard)
• Only considering meaningful influence
paths
36
Meaningful Influence
Path in IC model
v1 → v2 → v3 → v4 → v5
0.1 0.1 0.1 0.1
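For the five-node path above, the chance that influence travels the whole path in the IC model is the product of its edge probabilities. The sketch below computes it and applies a cut-off; the `threshold` parameter and the function names are assumptions for illustration, since the slide does not say how "meaningful" is defined.

```python
import math

def path_probability(weights):
    """Every edge on the path must fire, so the edge probabilities multiply."""
    return math.prod(weights)

def is_meaningful(weights, threshold):
    """Keep a path only if its probability reaches the (assumed) threshold."""
    return path_probability(weights) >= threshold

p = path_probability([0.1, 0.1, 0.1, 0.1])  # about 1e-4 for v1 -> v5
```

With a threshold of 0.001, for example, the two-edge prefix v1 → v3 (probability about 0.01) would be kept while the full path to v5 would be dropped.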
37
Traversing Graph
Graph A traversing tree from a
38
Extracting Paths
A traversing tree from a A path collection from a
39
Organizing Paths
A path collection from a
40
Approximating σ({v})
Influence of a node v
infl. of v to itself
Influence of a node v to u
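The traverse, extract, organize, and approximate steps can be sketched together. The slide's formulas were lost in extraction, so this is one plausible form and not necessarily the authors' exact definition: paths are kept while their probability stays above an assumed `threshold`, and the paths reaching the same node are combined as if they were independent (also an assumption). All names are hypothetical.

```python
import math
from collections import defaultdict

def collect_paths(graph, start, threshold):
    """Depth-first traversal from `start`, keeping every simple path whose
    probability (product of edge weights) is at least `threshold`."""
    paths = defaultdict(list)          # end node -> probabilities of kept paths
    stack = [((start,), 1.0)]
    while stack:
        path, prob = stack.pop()
        for nxt, w in graph.get(path[-1], {}).items():
            if nxt in path:            # keep paths simple (no repeated node)
                continue
            p = prob * w
            if p >= threshold:         # extensions can only be less probable
                paths[nxt].append(p)
                stack.append((path + (nxt,), p))
    return paths

def approx_sigma_single(graph, v, threshold):
    """1 for v itself, plus, for each reached node u, the chance that at least
    one kept path from v to u succeeds (paths treated as independent)."""
    paths = collect_paths(graph, v, threshold)
    return 1.0 + sum(1.0 - math.prod(1.0 - p for p in ps) for ps in paths.values())
```

On the graph `{"a": {"b": 0.5, "c": 0.5}, "b": {"c": 0.5}}` with threshold 0.2, node c is reached by two paths (0.5 and 0.25), so its term is 1 − 0.5 × 0.75 = 0.625 and the estimate for a is 1 + 0.5 + 0.625 = 2.125.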
41
Parallel evaluation
• To approximate σ({v}), P_{v→V} is required
• For v ≠ u, P_{v→V} and P_{u→V} do not have common paths
• Independent evaluation of σ({v}) is guaranteed
v1: u11 ... u1n
v2: u21 ... u2n
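Because the evaluation for one start node reads no data written by another, the per-node work can be handed to a worker pool. The sketch below shows only that independence: `one_hop_influence` is a deliberately simple stand-in for the real per-node evaluation, and the function names and the `workers` parameter are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def one_hop_influence(graph, v):
    """Stand-in per-node evaluation: v itself plus its direct out-edges.
    It only reads data reachable from v, so calls share no mutable state."""
    return 1.0 + sum(graph.get(v, {}).values())

def evaluate_all(graph, nodes, workers=4):
    """Evaluate every node independently and collect the scores by node."""
    nodes = list(nodes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda v: one_hop_influence(graph, v), nodes)
        return dict(zip(nodes, scores))
```

A thread pool keeps the example short; for CPU-bound work in CPython a process pool or a compiled implementation would likely be needed to see a real speed-up.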
42
Re-organizing
• Changing perspective from starting nodes
to ending nodes
43
Approximating σ(S ∪ {v}) − σ(S)
is not trivial
• σ({v}) ≠ σ(S ∪ {v}) − σ(S)
• influence blocking!!!!
• v blocks a path from u ∈ S
• We should detect blocked (invalid) paths
u v (before)
u v (after)
44
Detecting influence
blocking
• Current seed set : S
• New seed node : v
• Valid Paths
u v
v u
45
Adding a seed node
46
Detect invalid paths
47
Approximating
σ(S ∪ {v}) − σ(S)
Marginal infl. of a node v
infl. of v to itself
Infl. of seeds S to a node v
Only consider valid paths
51
Empirical Evaluation
52
Dataset
53
Algorithms
• Monte-Carlo[Greedy] (LKG 07)
• PMIA (CWW 10)
• SD (single discount)
• Random (baseline)
• IPA
54
Finding Threshold
55
Processing Time
57
Influence
58
Influence
59
Influence
60
Parallelization Effect
61
Q & A
62
References
• KKT 03 : Kempe, D., Kleinberg, J., and Tardos, E. Maximizing
the spread of influence through a social network.
(KDD ’03)
• KS 06 : Kimura, M., and Saito, K. Tractable models for
information diffusion in social networks.
(PKDD ’06)
• LKG 07 : Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C.,
VanBriesen, J., and Glance, N. Cost-effective outbreak
detection in networks.
(KDD ’07)
• CWY 09 : Chen, W., Wang, Y., and Yang, S. Efficient influence
maximization in social networks.
(KDD ’09)
• CWW 10 : Chen, W., Wang, C., and Wang, Y. Scalable influence
maximization for prevalent viral marketing in large-scale social
networks.
(KDD ’10)
• WCS 10 : Wang, Y., Cong, G., Song, G., and Xie, K. Community-based
greedy algorithm for mining top-k influential nodes in mobile social
networks.
(KDD ’10)
• JSC 11 : Jiang, Q., Song, G., and Cong, G. Simulated Annealing Based
Influence Maximization in Social Networks.
(AAAI ’11)
• LYK 12 : Lee, W., Kim, J., and Yu, H. CT-IC: Continuously activated
and Time-restricted Independent Cascade Model for Viral Marketing.
(ICDM ’12)

Editor's Notes

  1. Hello, my name is Jinha Kim, and I will present our research, titled Scalable and Parallelizable Processing of Influence Maximization for Large-Scale Social Networks. This is joint work with Seung-Keol Kim and my advisor, Hwanjo Yu.
  2. The goal of this work is to devise a method that efficiently evaluates influence, which is the most time-consuming part of the influence maximization problem.
  3. This diagram outlines this talk. First, how viral marketing exploits the social network is shown briefly. Then, to find the most effective users in viral marketing, the influence maximization problem is formulated. To concretize the problem, how social networks are abstracted as graphs and how influence is propagated throughout graphs are described briefly. Finally, how the influence maximization problem is solved using our method is described in detail.
  4. Let me show how viral marketing works in social networks.
  5. In social networks, a user’s opinion spreads throughout the network. For example, a Twitter user writes an impressive post, and his or her followers may retweet it as a sign of agreement. Then the followers of those followers may retweet it again, and this chain reaction affects the whole network. This is called the ‘word of mouth’ effect.
  6. To exploit the word-of-mouth effect, marketers persuade some influential users and hope that their positive opinions spread through the social network. This is how viral marketing works in social networks. Therefore, for a campaign to succeed, finding the most influential people is the most crucial task.
  7. Then an important question arises: how can we find such users in an algorithmic way?
  8. The question is formulated as the influence maximization problem.
  9. First, influence should be quantified. Given a user subset S, the function sigma of S returns the expected number of users influenced by S. This is the quantified influence in a network.
  10. Then, the influence maximization problem is formulated as a combinatorial optimization expression.
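
Written out, the optimization expression might take the following form. This is a sketch, assuming a seed budget k (a symbol these notes do not name) and V as the set of all users:

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq V,\ |S| = k} \sigma(S)
```
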
  11. To define sigma of S concretely, a graph and an influence diffusion process should be modeled.
  12. A social network can be abstracted as a weighted directed graph.
  13. For example, assume that on Facebook a user ‘v’ likes a post by his or her friend ‘u’. In the corresponding graph, users ‘u’ and ‘v’ become nodes, their friendship becomes an edge, and how much user ‘v’ likes the posts of friend ‘u’ becomes the weight of the edge.
  14. A diffusion model should also be defined.
  15. Given a graph, the quantified influence sigma of S depends on how influence is propagated. In this research, our method is based on the independent cascade model which is simple but well-established. In the next few slides, I will explain how the independent cascade model works in an inductive way.
  16. At time zero, several seed nodes are activated by the marketers, and all the other nodes are inactive.
  17. At time i plus one, as shown in the figure, nodes that were activated at time i each have one chance to activate their inactive out-neighbors. In contrast, nodes that were activated before time i have no such chance.
  18. The influence propagation continues until no new nodes are activated. Assuming that no nodes are activated at time j, no inactive node has a chance to be activated at time j plus one, so propagation ends.
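
The three steps of the independent cascade model can be sketched in Python as follows. This is only an illustration, not the authors' code: the function name `simulate_ic`, the dictionary-based graph layout, and the toy graph are assumptions, and each edge weight is read as an activation probability.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the set of active nodes.

    graph maps each node to a dict {out_neighbor: edge_weight}; the weight
    is treated as the activation probability (an assumption of this sketch).
    """
    active = set(seeds)
    frontier = list(seeds)          # nodes activated at the current time step
    while frontier:                 # stop once a step activates nobody
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # u gets exactly one chance to activate each inactive out-neighbor
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier    # older active nodes never try again
    return active

# Hypothetical toy graph; weights 1.0 and 0.0 make the outcome deterministic.
toy = {"a": {"b": 1.0, "c": 0.0}, "b": {"d": 1.0}, "c": {}, "d": {}}
result = simulate_ic(toy, {"a"}, random.Random(0))
print(sorted(result))  # ['a', 'b', 'd']
```
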
  19. After defining the graph and the diffusion model, the influence maximization problem can be solved.
  20. Before describing our method, let me show two challenges that the influence maximization processing confronts.
  21. At the macro level,
  22. The optimization problem itself is NP-hard. Intuitively, finding the optimal solution requires finding the best among all possible seed combinations. The set cover problem can be reduced to it, which proves that it is NP-hard.
  23. To get around the NP-hardness, a greedy algorithm was proposed in the seminal paper on the influence maximization problem. The greedy algorithm repeatedly chooses the node that gives the largest marginal influence increase over the current seed set. In the greedy algorithm, the influence of each node and the marginal influence increase are the two major evaluation components. However, evaluating the exact influence is also hard.
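
The greedy selection loop can be sketched as below. The influence function is passed in as a parameter, so any evaluator can be plugged in. The names `greedy_seeds`, `reach`, and `coverage` are hypothetical, and the coverage function is only a stand-in for sigma, not an actual diffusion model.

```python
def greedy_seeds(nodes, k, influence):
    """Pick up to k seeds, each time adding the node with the largest marginal gain.

    influence is any callable that maps a set of seeds to a number.
    """
    seeds = set()
    for _ in range(k):
        base = influence(seeds)
        best, best_gain = None, float("-inf")
        for v in nodes:
            if v in seeds:
                continue
            gain = influence(seeds | {v}) - base   # marginal influence increase
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:                           # fewer than k nodes available
            break
        seeds.add(best)
    return seeds

# Hypothetical stand-in for sigma: each node "covers" a fixed set of users.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}

def coverage(seed_set):
    return len(set().union(*(reach[v] for v in seed_set)))

chosen = greedy_seeds(list(reach), 2, coverage)
print(sorted(chosen))  # ['a', 'c']
```
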
  24. We call it the micro level challenge of the influence maximization.
  25. Influence evaluation itself is #P-hard. #P is the class of problems that ask for the number of solutions of a given problem, and no polynomial-time method is known for its hard problems. From the influence evaluation perspective, this is related to the fact that we cannot count all influence propagation paths, even between two nodes, in polynomial time.
  26. To overcome the micro-level challenge, several methods have been proposed. In the seminal paper on influence maximization, influence is evaluated using Monte-Carlo simulation, in which the actual diffusion process is repeated over ten thousand times and the average number of activated nodes is taken as the influence. However, Monte-Carlo simulation takes too much time. To speed up the evaluation, other methods use local structures such as the shortest path between two nodes or local arborescences.
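
A minimal sketch of the Monte-Carlo estimate, assuming the same dictionary-based graph layout as a convenience. The function names and the two-node toy graph are hypothetical. It averages the cascade size over many simulated diffusions.

```python
import random

def run_cascade(graph, seeds, rng):
    """One independent-cascade run; returns the number of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        newly = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly.append(v)
        frontier = newly
    return len(active)

def monte_carlo_influence(graph, seeds, runs=10_000, rng_seed=0):
    """Average cascade size over `runs` simulated diffusions."""
    rng = random.Random(rng_seed)
    return sum(run_cascade(graph, seeds, rng) for _ in range(runs)) / runs

toy = {"a": {"b": 0.5}, "b": {}}      # hypothetical two-node graph
estimate = monte_carlo_influence(toy, {"a"})
print(estimate)                        # close to the exact value 1 + 0.5 = 1.5
```

Every estimate costs `runs` full cascades, and the greedy algorithm needs one estimate per candidate node per step, which is why this approach is slow on large graphs.
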
  27. Along with these methods, we propose a more efficient influence approximation heuristic, IPA.
  28. To evaluate influence efficiently, existing methods confine influence diffusion locally. Our intuition is to push this localization to the extreme. That leads us to use all meaningful paths between two nodes as the unit of influence evaluation. The word ‘meaningful’ in this context is formally defined in the next slide.
  29. When an influence path is a sequence of nodes, its influence propagation probability ipp(.) is defined as the product of the edge weights along the sequence. We only consider influence paths whose ipp is no less than a pre-defined threshold theta. For example, assuming that all edge weights are 0.1 and the threshold is 0.001, we only consider influence paths of length up to three; longer paths are ignored.
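
The threshold example can be checked with a few lines of Python. The names `ipp`, `THETA`, and the edge dictionary are illustrative choices for this sketch, and path length is counted in edges.

```python
import math

def ipp(path, weight):
    """Product of edge weights along a path given as a node sequence."""
    return math.prod(weight[(u, v)] for u, v in zip(path, path[1:]))

THETA = 0.001
# Hypothetical chain a -> b -> c -> d -> e with every edge weight 0.1.
weight = {("a", "b"): 0.1, ("b", "c"): 0.1, ("c", "d"): 0.1, ("d", "e"): 0.1}

three_edges = ["a", "b", "c", "d"]
four_edges = ["a", "b", "c", "d", "e"]
print(ipp(three_edges, weight) >= THETA)   # True: kept
print(ipp(four_edges, weight) >= THETA)    # False: ignored
```
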
  30. With the definition of meaningful influence paths, let us see how they are collected and organized to evaluate single-node influence. Suppose that a graph is given as in the left figure. IPA first traverses the graph from each node in a breadth-first manner. The right figure is the result of the traversal from node a. The traversal stops when a cycle is detected or ipp() becomes less than the threshold.
  31. After the traversal, IPA extracts the influence paths from the tree. The influence paths are all the paths from the root to each non-root node in the traversal tree. From the traversal tree in the left figure, ten paths that start from node ‘a’ are extracted. We call this path set P sub a to V.
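
One way to sketch the traversal and path extraction is a breadth-first expansion of partial paths. This is an illustration under assumptions, not the authors' implementation. The function name `influence_paths` and the toy graph are hypothetical, and the graph does not reproduce the one in the figure.

```python
from collections import deque

def influence_paths(graph, source, theta):
    """Collect all cycle-free paths from source whose ipp is at least theta.

    Returns a list of (path, ipp) pairs in breadth-first order.
    """
    paths = []
    queue = deque([((source,), 1.0)])
    while queue:
        path, prob = queue.popleft()
        for nxt, w in graph.get(path[-1], {}).items():
            if nxt in path:              # cycle detected: stop extending
                continue
            p = prob * w
            if p < theta:                # below the threshold: stop extending
                continue
            extended = path + (nxt,)
            paths.append((extended, p))  # root-to-node path in the traversal tree
            queue.append((extended, p))
    return paths

# Hypothetical graph with a cycle a -> b -> c -> a and a weak edge a -> c.
toy = {"a": {"b": 0.5, "c": 0.1}, "b": {"c": 0.5}, "c": {"a": 0.5}}
found = influence_paths(toy, "a", theta=0.2)
print(found)  # [(('a', 'b'), 0.5), (('a', 'b', 'c'), 0.25)]
```
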
  32. For each node, the graph traversal is conducted and all influence paths are collected. The paths are grouped by their starting nodes.
  33. Now IPA can approximate the influence of a single node. The hat symbol indicates that it is an approximation. The influence of a single node ‘v’ is one, which is its influence on itself, plus the sum of the influence from v to each node in ‘v’’s reaching area O sub v. The reaching area of ‘a’ in the example is b, c, d and e. The influence between two nodes is defined as the complement of the probability that none of the paths between them influences the sink node.
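
Read as a formula, the influence from v to a node u is 1 minus the product of (1 - ipp(p)) over all collected paths p from v to u. The sketch below evaluates that reading on hypothetical path probabilities. The function name and the sample paths are assumptions, not data from the slides.

```python
import math
from collections import defaultdict

def single_node_influence(paths_from_v):
    """paths_from_v: list of (path, ipp) pairs that all start at the same node v."""
    by_target = defaultdict(list)
    for path, prob in paths_from_v:
        by_target[path[-1]].append(prob)      # group by sink node
    # 1 for v itself, plus for each reachable u: 1 - prod(1 - ipp(p))
    return 1.0 + sum(
        1.0 - math.prod(1.0 - p for p in probs) for probs in by_target.values()
    )

# Hypothetical paths starting at 'a': one path to b, two paths to c.
paths_a = [(("a", "b"), 0.5), (("a", "c"), 0.5), (("a", "b", "c"), 0.25)]
print(single_node_influence(paths_a))  # 1 + 0.5 + (1 - 0.5 * 0.75) = 2.125
```
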
  34. The parallel evaluation of single-node influence is simple. To approximate the influence of a node v, only P sub v to capital V is required. For two different nodes u and v, P sub v to capital V and P sub u to capital V have no paths in common. Therefore, single-node influence can be evaluated in parallel.
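
Because the path sets of different start nodes are disjoint, each node's influence can be handed to a separate worker without any shared state. The sketch below shows that structure with Python's `concurrent.futures`. The data layout and function names are assumptions. In CPython, threads mainly illustrate the decomposition, and a process pool or native code would be needed for real CPU speed-up.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def influence_from_paths(probs_by_target):
    """probs_by_target: {target: [ipp, ...]} for the paths of one start node."""
    return 1.0 + sum(
        1.0 - math.prod(1.0 - p for p in probs)
        for probs in probs_by_target.values()
    )

def all_influences(paths_by_start, workers=4):
    """Evaluate every start node independently; no data is shared between tasks."""
    starts = list(paths_by_start)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(
            pool.map(influence_from_paths, (paths_by_start[s] for s in starts))
        )
    return dict(zip(starts, values))

# Hypothetical path probabilities grouped by start node, then by target.
paths_by_start = {
    "a": {"b": [0.5], "c": [0.5, 0.25]},
    "b": {"c": [0.5]},
}
scores = all_influences(paths_by_start)
print(scores)  # {'a': 2.125, 'b': 1.5}
```
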
  35. Up to now, IPA has evaluated single-node influence. To evaluate the marginal influence increase, IPA reorganizes the paths. In the single-node influence evaluation, paths are grouped by their starting nodes. Now the paths are regrouped by their ending nodes. With this reorganization, IPA can efficiently evaluate the marginal influence increase in parallel.
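
The regrouping step amounts to re-indexing the same paths by their last node instead of their first. A minimal sketch, with a hypothetical function name and sample paths:

```python
from collections import defaultdict

def regroup_by_end(paths_by_start):
    """Turn {start: [path, ...]} into {end: [path, ...]}."""
    by_end = defaultdict(list)
    for paths in paths_by_start.values():
        for path in paths:
            by_end[path[-1]].append(path)
    return dict(by_end)

# Hypothetical paths grouped by starting node.
paths_by_start = {
    "a": [("a", "c", "e"), ("a", "d")],
    "d": [("d", "e")],
}
paths_by_end = regroup_by_end(paths_by_start)
print(paths_by_end)  # {'e': [('a', 'c', 'e'), ('d', 'e')], 'd': [('a', 'd')]}
```
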
  36. Now we reach the marginal influence increase evaluation phase. It is complicated because the marginal increase is not simply the difference between the influence before and after adding a new seed candidate. For example, before v is added as a seed, a path consisting of u, v and the remaining nodes is valid. However, after v is added, that path becomes meaningless because u cannot attempt to activate v, which is already active, in the independent cascade model. We call this influence blocking, and such invalid paths should be detected.
  37. For the current seed set S, among the paths that start from seed nodes, P sub capital S to capital V, all paths that have v as an element are invalidated. For the new seed node v, among the paths that start from v, P sub v to capital V, all paths that contain any current seed node are invalidated. In sum, a valid path contains only one seed node, as its starting node.
  38. Let us see how invalid paths are detected. Suppose that the current seed set consists of only ‘a’ and that ‘d’ is added as a new seed candidate. The left figure shows that before adding ‘d’, only the paths from a to e are valid. After adding ‘d’ to the seed set, five paths become candidate paths.
  39. However, adding a new seed makes some paths invalid. For example, d blocks the influence of a in the path (a,d,e), and a blocks the influence of d in the path (d,a,c,e). In the end, only three paths are valid and used to evaluate the marginal influence increase.
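
The validity rule, that a valid path contains exactly one seed and that seed is its starting node, can be sketched as a simple filter. Only the two blocked paths (a,d,e) and (d,a,c,e) come from the notes. The other three candidate paths and the function name are hypothetical, chosen to give five candidates in total.

```python
def valid_paths(candidate_paths, seeds):
    """Keep paths that start at a seed and contain no other seed node."""
    seeds = set(seeds)
    return [
        p for p in candidate_paths
        if p[0] in seeds and not seeds & set(p[1:])
    ]

# Five candidate paths ending at e once 'd' joins the seed set {'a'}.
candidates = [
    ("a", "c", "e"),        # hypothetical
    ("a", "b", "e"),        # hypothetical
    ("a", "d", "e"),        # from the notes: d blocks the influence of a
    ("d", "e"),             # hypothetical
    ("d", "a", "c", "e"),   # from the notes: a blocks the influence of d
]
kept = valid_paths(candidates, {"a", "d"})
print(kept)  # [('a', 'c', 'e'), ('a', 'b', 'e'), ('d', 'e')]
```
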
  40. Using the valid paths, the marginal influence increase is evaluated. The marginal influence increase is one, which is the influence of the new seed v on itself, plus the sum of the marginal influence increases from the seed nodes to the nodes in v’s reaching area. The marginal influence increase from the seed nodes to a node u in v’s reaching area is the complement of the probability that none of the valid paths from the seed nodes influences u. As in the single-node influence evaluation, the green box is also parallelizable. This is how IPA evaluates influence.
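
The sketch below shows one possible reading of this step, not the formula on the slide. For each node u in v's reaching area, it assumes the increase is the complement probability computed over valid paths with v in the seed set, minus the same quantity computed without v. The function names, the subtraction, and all path probabilities are assumptions made for illustration.

```python
import math

def reach_prob(paths, seeds):
    """1 - prod(1 - ipp) over paths that start at a seed and hold no other seed."""
    seeds = set(seeds)
    valid = [
        prob for path, prob in paths
        if path[0] in seeds and not seeds & set(path[1:])
    ]
    return 1.0 - math.prod(1.0 - p for p in valid)

def marginal_gain(paths_by_end, seeds, v, reach_of_v):
    """Assumed reading: 1 for v itself plus the per-node increase over v's reach."""
    new_seeds = set(seeds) | {v}
    gain = 1.0
    for u in reach_of_v:
        if u in new_seeds:
            continue
        paths = paths_by_end.get(u, [])
        gain += reach_prob(paths, new_seeds) - reach_prob(paths, seeds)
    return gain

# Hypothetical (path, ipp) pairs grouped by ending node.
paths_by_end = {
    "c": [(("a", "c"), 0.5), (("d", "a", "c"), 0.25)],
    "e": [(("a", "c", "e"), 0.25), (("a", "d", "e"), 0.25),
          (("d", "e"), 0.5), (("d", "a", "c", "e"), 0.125)],
}
gain_d = marginal_gain(paths_by_end, {"a"}, "d", {"a", "c", "e"})
print(gain_d)  # 1 + 0 (for c) + (0.625 - 0.4375) (for e) = 1.1875
```

Since the paths are grouped by ending node, the loop body for each u touches only that node's own path list, which is what makes the per-node terms independent of one another.
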
  41. Now let me show the empirical evaluation results of our method.
  42. Five publicly available real datasets are used. The number of nodes ranges from 75 thousand to 5 million, and the number of edges ranges from 500 thousand to 70 million.
  43. Along with IPA, four other influence evaluation methods are used. Monte-Carlo is the Monte-Carlo simulation method used in the seminal paper on the problem; the number of repetitions is 20,000. PMIA is the state-of-the-art influence evaluation method, which exploits the local arborescence structure. SD is an influence evaluation method that relies only on the graph structure and not on the influence diffusion model. Random picks seed nodes at random. All five influence evaluation methods are plugged into the greedy algorithm.
  44. First, we should find the threshold for IPA and PMIA in each dataset. As shown in the figure, although short processing time and high influence are both desirable, they trade off against each other. Thus, we choose the elbow point, at which neither is sacrificed for the other.
  45. This plot shows the log-scaled processing time of the five methods. Greedy is slow: on Patent and LiveJournal, it had not finished after one hundred thousand seconds. Single discount and Random are trivially fast because they do not consider influence diffusion, but the influence of their solutions is not good. IPA shows an order of magnitude shorter processing time than PMIA, the state of the art. PMIA did not finish on LiveJournal due to a memory problem.
  46. Along with the processing time, we also evaluate how quickly the next seed node is produced after the first seed node is found. As shown in the plots, IPA
  47. These plots show the influence of the solutions of the five methods on the five datasets. In terms of the influence of the seed nodes, greedy is trivially the best because it repeats the influence diffusion simulation until a stable influence is acquired. In Epinion, both IPA and PMIA show influence close to that of greedy. Single discount and Random show low influence. In Stanford, IPA loses only 8% of the influence compared to greedy, but PMIA loses over 20%.
  48. In DBLP, IPA shows slightly lower influence than PMIA, but the difference is small compared to that in the Stanford dataset.
  49. In Patent, IPA shows more influence as the number of seed nodes increases. In LiveJournal, only IPA produces meaningful influence.
  50. Finally, we report the parallelization effect. The parallelization effect is measured by the speed-up, which is the ratio of the processing time of single-threaded IPA to that of multi-threaded IPA. As shown in the figure, IPA parallelizes better when the dataset is bigger.
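
As a formula, with T sub 1 as the single-threaded processing time and T sub p as the processing time with p threads (both symbols introduced here for illustration), the speed-up reads:

```latex
\mathrm{speedup}(p) = \frac{T_{1}}{T_{p}}
```
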
  51. That’s it. This is the end of this talk. Any questions?