An online user joins multiple social networks to enjoy different services. On each network she joins, she creates an identity made up of three major dimensions: profile, content, and connection network. She largely governs how her identity is formulated on any social network and can therefore manipulate multiple aspects of it. With no global identifier to mark her presence uniquely in the online domain, her online identities remain unlinked, isolated, and difficult to search. The literature has proposed identity search methods based on profile attributes but has left the other identity dimensions, such as content and network, unexplored. In this work, we introduce two novel identity search algorithms based on content and network attributes, and improve on the traditional identity search algorithm based on a user's profile attributes. We apply the proposed identity search algorithms to find a user's identity on Facebook, given her identity on Twitter. We report that a combination of the proposed identity search algorithms found the Facebook identity for 39% of the Twitter users searched, while the traditional method based on profile attributes found the Facebook identity for only 27.4%. Each proposed identity search algorithm accesses only publicly accessible attributes of a user on any social network. We deploy an identity resolution system, Finding Nemo, which uses the proposed identity search methods to find a Twitter user's identity on Facebook. We conclude that including more than one identity search algorithm, each exploiting distinct dimensional attributes of an identity, helps improve the accuracy of an identity resolution process.
@I seek 'fb.me': Identifying Users across Multiple Online Social Networks
1. @I seek ‘fb.me’: Identifying Users across Multiple Online Social Networks
Workshop on Web of Linked Entities (WoLE)
Paridhi Jain¶, Ponnurangam Kumaraguru¶, Anupam Joshi*
¶Indraprastha Institute of Information Technology (IIIT-Delhi)
*University of Maryland, Baltimore County (UMBC)
4. Motivation
Multiple OSNs
Multiple Identities
Social Aggregation site
Difficult to manage? Difficult to find?
Friend Finder? Malicious user? Influential user? User of interest?
13/05/13
@I seek ‘fb.me’: Identifying Users across Multiple Online Social Networks
5. Motivation
Multiple OSNs
Multiple Identities
Social Aggregation site
Difficult to manage? Difficult to find?
Friend Finder? Malicious user? Influential user? User of interest?
Identity Resolution Problem
6. Identity Resolution
• For a user I, given a user identity $I_A$ on a social network A, find user identity $I_B$ on social network B.
Example: $\{I_A\}$ = Alice → $\{I_B\}$ = ??
7. Identity Resolution = Identity Search + Identity Matching
• Identity Search
For a user I, given her identity $I_A$ on a social network A, and a search parameter S, find the set of identities $I_{B_j}$ on social network B such that $S(I_A) \simeq S(I_{B_j})$.
$\{I_A, S\}$ → $\{I_{B_1}, \ldots, I_{B_j}, \ldots, I_{B_N}\} = Q$
• Identity Matching
Given a user identity $I_A$ on a social network A, a set of candidate identities Q on social network B, and a match function M, locate an identity pair $(I_A, I_{B_j})$ such that $M(I_A, I_{B_j}) = \max\{M(I_A, I_{B_1}), \ldots, M(I_A, I_{B_N})\}$.
$\{I_A, Q, M\}$ → $\{I_A, I_{B_j}\}$ → $\{I_B\}$
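The two-stage definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `resolve_identity`, the dictionary representation of an identity, and the `similar` predicate (standing in for the $\simeq$ relation) are all hypothetical.

```python
from typing import Any, Callable, Iterable, Optional

Identity = dict  # hypothetical representation: one identity = a dict of attributes


def resolve_identity(
    i_a: Identity,
    identities_on_b: Iterable[Identity],
    search_key: Callable[[Identity], Any],             # plays the role of S
    match_score: Callable[[Identity, Identity], float],  # plays the role of M
    similar: Callable[[Any, Any], bool] = lambda x, y: x == y,  # stands in for "≃"
) -> Optional[Identity]:
    """Identity resolution = identity search followed by identity matching."""
    # Identity search: build the candidate set Q of identities on network B
    # whose search key resembles the search key of I_A.
    q = [i_b for i_b in identities_on_b
         if similar(search_key(i_a), search_key(i_b))]
    if not q:
        return None  # the search produced no candidates, so nothing can be matched
    # Identity matching: return the candidate that maximises M(I_A, I_Bj).
    return max(q, key=lambda i_b: match_score(i_a, i_b))
```

For instance, `search_key` could return the lower-cased first name and `match_score` could count shared name tokens. Any search parameter and match function with these shapes can be plugged in.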
8. Research Gaps?
– Till now, focus on better identity matching algorithms
– Only profile attributes (private and public) for Identity Search
– Limitations of Profile Search:
  – Restrictive search, owing to non-availability of common attributes across networks. [Gender on Facebook, but not on Twitter]
  – Search with limited attributes → Large candidate set size → Intensive Identity Matching computation
  – Users may choose different profile attributes → Miss out correct identity in the candidate set
– Little research on using content and network attributes to search for candidate identities
– Extensive use of both private and public attributes. Need user authorization for identity search
11. Proposal
– Include content and network attributes as search parameters
– Access only publicly accessible attributes
– Focus on two popular social networks: Twitter and Facebook
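One way to picture the proposal is to run several search methods, one per identity dimension, and pool their candidate sets. The sketch below is an assumption about how such a combination could look, not the system described in the slides: `combined_candidates`, the method names, and the use of plain strings as candidate identifiers are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical signature: a search method takes the known identity and
# returns identifiers of candidate identities on the other network.
SearchMethod = Callable[[dict], Iterable[str]]


def combined_candidates(
    i_a: dict, methods: Dict[str, SearchMethod]
) -> Dict[str, List[str]]:
    """Union the candidates of all search methods.

    Returns a mapping: candidate id -> names of the methods that found it.
    """
    found: Dict[str, List[str]] = {}
    for name, search in methods.items():
        for candidate in search(i_a):
            hits = found.setdefault(candidate, [])
            if name not in hits:  # record each method at most once per candidate
                hits.append(name)
    return found
```

Keeping track of which method surfaced each candidate makes it easy to see what content and network search add beyond profile search alone.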
12. Contribution
– Proposed novel identity search methods on social networks
– Our identity resolution methods return the correct Facebook identity for 39% of Twitter users within top-2 ranks
– We observe an increase in accuracy of identity resolution by 11.6 percentage points, owing to the inclusion of content and network identity search along with improved profile search
14. Identity Matching
– Syntactic Matching
  – Jaro Distance comparison between username and name
  – Example: {alice123, jane_alice}, {Alice Naura, Alice N. Janice}
– Image Matching, where $h_{I_A}$ and $h_{I_{B_j}}$ are the RGB histograms of the profile images and $N_s$ represents the histogram size of $I_A$
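For reference, the standard Jaro similarity used in syntactic matching can be written as below. This is a textbook formulation, not code from the authors; the slides do not say how the strings are normalised beforehand, so lower-casing or punctuation stripping would be an additional assumption left to the caller.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are equal and at most
    # `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order,
    # counted pairwise (hence the division by two).
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3
```

Applied to the slide's example pairs, one would call `jaro_similarity("alice123", "jane_alice")` for usernames and `jaro_similarity("Alice Naura", "Alice N. Janice")` for names, then combine the scores in the match function.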
15. Profile Search
Self-Identification
23. Instance: public friend list of a user extracted from public feeds
29. Take away
Inclusion of content and network attributes for identity search not only improves identity resolution accuracy but also returns the correct Facebook identity within top-2 ranks for the majority of the Twitter users.
30. Current and Future Work
– Extend the social networks to search for a given identity. Example: Google+, Foursquare, etc.
– Extend the search methods to include social-network-specific features
– Find multiple (fake) identities of users within social networks