Organizational Overlap on Social Networks and its Applications
Mitul Tiwari
Joint work with Cho-Jui Hsieh, Deepak Agarwal,
Xinyi (Lisa) Huang, and Sam Shah
LinkedIn
Wednesday, May 15, 13
Who Am I
Outline
• Motivation
• Organizational Overlap Model
• Problem Definition
• Data Analysis
• Mathematical Formulation
• Experimental Validation
• Applications
• Link Prediction
• Community Detection
Motivation
• Social networks are important for:
• Sharing and Discovery
• Communication
• Networking
• Online Social Networks are partially observed
• Link prediction and entity recommendation are important
Motivation: Rich Member Profile
Motivation: Network Is Important
Motivation: People You May Know
Motivation: Other Entities
Recommender Ecosystem
• Similar Profiles
• Connections
• News
• Skill Endorsements
Motivation
• Member profile contains various types of organizations
• Company, Schools, Groups, ...
• Can we compute edge affinity based on this organizational
information?
• Useful for many applications:
• Recommending members to connect (link prediction)
• Recommending other entities from the same community (community
detection)
Outline
• Motivation
• Organizational Overlap Model
• Problem Definition
• Data Analysis
• Mathematical Formulation
• Experimental Validation
• Applications
• Link Prediction
• Community Detection
Organizational Overlap Problem
• Goal: compute the probability of connection based on the
organizational time overlap
• For a pair of members (A, B) who belonged to the same
organization and overlapped in time, we have organizational
time overlap: T(A, B, O)
• Probability that A and B are connected: P(A, B)
• Assume (A, B) share only one common organization: P(A, B) = f(T(A, B, O), O)
• f is a function of the time overlap in organization O and of the
properties of O
• In short, P(t) = f(t, O), where t = T(A, B, O)
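The time overlap T(A, B, O) defined above can be sketched as an interval intersection. This is only an illustration: the function name `overlap_days`, the use of days as the unit, and the example tenures are assumptions, not taken from the slides.

```python
from datetime import date


def overlap_days(a_start, a_end, b_start, b_end):
    """Days during which two tenures at the same organization overlap.

    Tenures are treated as half-open intervals [start, end); this
    convention is an assumption made for the sketch.
    """
    start = max(a_start, b_start)  # later of the two start dates
    end = min(a_end, b_end)        # earlier of the two end dates
    return max((end - start).days, 0)  # 0 when the tenures do not intersect


# Hypothetical tenures of members A and B at one organization O.
t = overlap_days(date(2008, 1, 1), date(2011, 1, 1),
                 date(2010, 1, 1), date(2013, 1, 1))
print(t)  # 365
```

The value t would then be the input to f(t, O), whose exact form is not given in the slide text.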
Organizational Overlap Data Analysis
• Insight 1: Connection density increases with organizational
time overlap
Organizational Overlap Data Analysis
• Insight 2: Connection density decreases with the size of
the organization
Organizational Overlap Model
Organizational Overlap Model Validation
• Empirical connection
density fits our model
Organizational Overlap Model:
Estimating !
• !: organization dependent
parameter
• Members of a smaller
organization are more likely to
know each other
• Empirical and MLE estimates
for log(!) ~ -0.8 log(|S|)
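A relation of this form is a straight line in log-log space, so its slope can be estimated with ordinary least squares on the logged values. The sketch below assumes |S| is the organization size and calls the organization-dependent parameter `param`; the data are synthetic, generated exactly from a power law with exponent -0.8 purely to show the mechanics, and are not the LinkedIn estimates.

```python
import math


def loglog_slope(sizes, params):
    """Least-squares slope and intercept of log(param) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(p) for p in params]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, illustrative values only: param = size ** -0.8.
sizes = [10, 100, 1000, 10000]
params = [s ** -0.8 for s in sizes]
slope, intercept = loglog_slope(sizes, params)
print(round(slope, 3))
```

On real data the fitted slope would only approximate the reported value, and the MLE mentioned on the slide is a different estimator from this least-squares fit.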
Outline
• Motivation
• Organizational Overlap Model
• Problem Definition
• Data Analysis
• Mathematical Formulation
• Experimental Validation
• Applications
• Link Prediction
• Community Detection
Application: Link Prediction
• Warm start: existing edges
• 2 features: org. overlap time
and size
• Common Neighbors (CN)
• Adamic-Adar (AA)
• Data Sets: LinkedIn, Enron
emails, Wiki talk
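For reference, the two baseline scores named above have standard definitions: Common Neighbors counts the shared neighbors of a node pair, and Adamic-Adar sums 1 / log(degree) over those shared neighbors. The following is a minimal sketch on a hypothetical toy graph stored as adjacency sets; it is not the evaluation code behind the slides.

```python
import math


def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree(z)) over the common neighbors z of u and v."""
    return sum(
        1.0 / math.log(len(adj[z]))
        for z in adj[u] & adj[v]
        if len(adj[z]) > 1  # guard against log(1) == 0
    )


# Hypothetical undirected toy graph as adjacency sets.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}

cn = common_neighbors(adj, "a", "b")
aa = adamic_adar(adj, "a", "b")
print(cn, aa)
```

In the warm-start setting, scores like these would sit alongside the two organizational features (overlap time and size) as inputs to the predictor.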
Application: Link Prediction
• Cold start: no or sparse
edges
• All features:
• time overlap, company size,
company propensity, node
propensity, ...
• logistic regression model
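A logistic regression model turns a weighted sum of features into a connection probability. The sketch below shows only the scoring step; the feature names, weights, and bias are hypothetical values chosen for illustration (positive weight on overlap, negative weight on size, in line with the two data insights), not the fitted model from the talk.

```python
import math


def connection_probability(features, weights, bias):
    """Logistic model: sigmoid of bias plus the weighted feature sum."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights and bias, for illustration only.
weights = {"overlap_years": 0.5, "log_company_size": -0.8}
bias = -1.0

p_short = connection_probability(
    {"overlap_years": 0.5, "log_company_size": math.log(50)}, weights, bias)
p_long = connection_probability(
    {"overlap_years": 3.0, "log_company_size": math.log(50)}, weights, bias)
p_big = connection_probability(
    {"overlap_years": 3.0, "log_company_size": math.log(5000)}, weights, bias)
print(p_short, p_long, p_big)
```

In practice the weights would be learned from labeled member pairs, and the remaining features listed on the slide (company propensity, node propensity) would enter the sum in the same way.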
Application: Community Detection
• Good for candidate generation in entity recommendation
systems, such as companies to follow
• Graph Clustering algorithm (Graclus)
• Members as nodes and an edge between any pair of nodes with overlap
• Organizational overlap model for computing edge weight
• Graclus: minimizes the total weight of the cuts
• Evaluation using
• Virality of company follow within communities
• Virality of article updates
Community Detection Evaluation
• Compared 3 methods
• Organizational overlap based
• Using social connections graph
• Random: partition the nodes in the
same company
• Using Spread of company follow
• Spread: avg # of companies
followed within d days of the
first follow event
• Propagation rate: normalized spread
Community Detection Evaluation
• Virality of article updates within communities
(Figure panels: Avg degree: 4-6; Avg degree: 12-14)
Related Work
Summary
• Motivation
• Organizational Overlap Model
• Problem Definition
• Data Analysis
• Mathematical Formulation
• Experimental Validation
• Applications and Evaluation
• Link Prediction: cold and warm start
• Community Detection
Acknowledgement
• http://data.linkedin.com
• We are hiring!
• Contact: mtiwari[at]linkedin.com
• Follow: @mitultiwari on Twitter
Questions?
