SlideShare uma empresa Scribd logo
1 de 317
Baixar para ler offline
A MODERN VIEW OF
CENTRALITY MEASURES
Paolo Boldi
Univ. degli Studi di Milano
Work in progress with Sebastiano Vigna
What do these people
have in common?
What do these people
have in common?
What do these people
have in common?
What do these people
have in common?
What do these people
have in common?
PageRank (believe it or not)
PageRank (believe it or not)
These are the top-8 actors of the Hollywood graph
according to PageRank
PageRank (believe it or not)
These are the top-8 actors of the Hollywood graph
according to PageRank
Who is going to tell
that is better?
PageRank (believe it or not)
These are the top-8 actors of the Hollywood graph
according to PageRank
Who is going to tell
that is better?
PageRank (believe it or not)
These are the top-8 actors of the Hollywood graph
according to PageRank
Who is going to tell
that is better?
PageRank sucks!
(or NOT?)
PageRank sucks!
(or NOT?)
The Hollywood graph we used contains 2,000,000
nodes. Most of them are completely unknown!
PageRank sucks!
(or NOT?)
The Hollywood graph we used contains 2,000,000
nodes. Most of them are completely unknown!
PageRank is not singling out the best actors...
PageRank sucks!
(or NOT?)
The Hollywood graph we used contains 2,000,000
nodes. Most of them are completely unknown!
PageRank is not singling out the best actors...
...but still it is not pointing to random individuals, is it?
The glass is half
empty
The glass is half
empty
Link Analysis sucks
The glass is half
empty
Link Analysis sucks
The graph itself does not contain useful information, in
general
The glass is half
empty
Link Analysis sucks
The graph itself does not contain useful information, in
general
Of little (or no) help for IR
The glass is half full
The glass is half full
Link Analysis is good, but you cannot expect
any centrality index to work on any network
The glass is half full
Link Analysis is good, but you cannot expect
any centrality index to work on any network
Probably PageRank is scarcely useful on Hollywood,
but maybe other measures would work like a charm
The glass is half full
Link Analysis is good, but you cannot expect
any centrality index to work on any network
Probably PageRank is scarcely useful on Hollywood,
but maybe other measures would work like a charm
Betweenness? Closeness? Katz? ...
The glass is half full
Link Analysis is good, but you cannot expect
any centrality index to work on any network
Probably PageRank is scarcely useful on Hollywood,
but maybe other measures would work like a charm
Betweenness? Closeness? Katz? ...
A problem here: some indices are computationa!y
unfeasible on large networks!
The grand plan
The grand plan
Centrality indices / link analysis has been with us for 60+
years
The grand plan
Centrality indices / link analysis has been with us for 60+
years
IR, sociology, psychology all have given their
contributions...
The grand plan
Centrality indices / link analysis has been with us for 60+
years
IR, sociology, psychology all have given their
contributions...
...turning it into a jungle
My point, today
My point, today
By contract, I have to convince you that networks
contain a big deal of information useful for IR
My point, today
By contract, I have to convince you that networks
contain a big deal of information useful for IR
But you have to use them in a proper way (i.e., to
compute suitable indices)
My point, today
By contract, I have to convince you that networks
contain a big deal of information useful for IR
But you have to use them in a proper way (i.e., to
compute suitable indices)
And more often than not this calls for new algorithms
because of their size (and sometimes density)
Centrality in social sciences
a historical account
Centrality in social sciences
a historical account
First works by Bavelas at MIT (1948)
Centrality in social sciences
a historical account
First works by Bavelas at MIT (1948)
This sparked countless works (Bavelas 1951; Katz 1953;
Shaw 1954; Beauchamp 1965; Mackenzie 1966; Burgess
1969; Anthonisse 1971; Czapiel 1974...) that Freeman
(1979) tried to summarize concluding that:
Centrality in social sciences
a historical account
First works by Bavelas at MIT (1948)
This sparked countless works (Bavelas 1951; Katz 1953;
Shaw 1954; Beauchamp 1965; Mackenzie 1966; Burgess
1969; Anthonisse 1971; Czapiel 1974...) that Freeman
(1979) tried to summarize concluding that:
several measures are o"en only vaguely
related to the intuitive ideas they purport to
index, and many are so complex that it is
difficult or impossible to discover what, if
anything, they are measuring
The 1990s revival
Link Analysis Ranking
The 1990s revival
Link Analysis Ranking
With the advent of search engines, there was a strong
revamp of centrality (LAR in this context)
The 1990s revival
Link Analysis Ranking
With the advent of search engines, there was a strong
revamp of centrality (LAR in this context)
New scenarios
The 1990s revival
Link Analysis Ranking
With the advent of search engines, there was a strong
revamp of centrality (LAR in this context)
New scenarios
• graphs are directed mainly
The 1990s revival
Link Analysis Ranking
With the advent of search engines, there was a strong
revamp of centrality (LAR in this context)
New scenarios
• graphs are directed mainly
• they are huge
The 1990s revival
Link Analysis Ranking
With the advent of search engines, there was a strong
revamp of centrality (LAR in this context)
New scenarios
• graphs are directed mainly
• they are huge
• new attention to efficiency
The 1990s revival
Link Analysis Ranking
With the advent of search engines, there was a strong
revamp of centrality (LAR in this context)
New scenarios
• graphs are directed mainly
• they are huge
• new attention to efficiency
LAR is the ungrateful reincarnation of Centrality
What a mess!
What a mess!
Only few, brave guys tried to make some order in this
mess!
What a mess!
Only few, brave guys tried to make some order in this
mess!
Noteworthy (in the IR context): Craswell, Upstill,
Hawking (ADCS 2003); Najork, Zaragoza, Taylor
(SIGIR 2007); Najork, Gollapudi, Panigrahy (WSDM
2009)
What a mess!
Only few, brave guys tried to make some order in this
mess!
Noteworthy (in the IR context): Craswell, Upstill,
Hawking (ADCS 2003); Najork, Zaragoza, Taylor
(SIGIR 2007); Najork, Gollapudi, Panigrahy (WSDM
2009)
Guess that the others were scared by computational
burden of classical measures
Begin the begin
Begin the begin
The only point on
which everybody
seems to agree: the
center of a star is more
important than the
other nodes
Begin the begin
The only point on
which everybody
seems to agree: the
center of a star is more
important than the
other nodes
But what does it make
more important?
Begin the begin
Begin the begin
I have the largest
degree
Begin the begin
Begin the begin
I am on most
shortest paths
Begin the begin
Begin the begin
I am maximally
close to everybody
Begin the begin
A tale of three tribes
A tale of three tribes
Indices based on degree
A tale of three tribes
Indices based on degree
Indices based on the number of paths or shortest paths
(geodesics) passing through a vertex;
A tale of three tribes
Indices based on degree
Indices based on the number of paths or shortest paths
(geodesics) passing through a vertex;
Indices based on distances from the vertex to the other
vertices
A tale of three tribes
Indices based on degree
Indices based on the number of paths or shortest paths
(geodesics) passing through a vertex;
Indices based on distances from the vertex to the other
vertices
Let me call these indices geometric
Three dogs strive for a bone, and
the fourth runs away with it...
Three dogs strive for a bone, and
the fourth runs away with it...
The advent of Link Analysis pushed for a third
(winning) tribe: spectral indices
Three dogs strive for a bone, and
the fourth runs away with it...
The advent of Link Analysis pushed for a third
(winning) tribe: spectral indices
Some of the geometric indices have also a spectral
(equivalent) definition
1) The degree tribe
1) The degree tribe
(In-)Degree centrality: the number of incoming links
cdeg(x) = d (x)
1) The degree tribe
(In-)Degree centrality: the number of incoming links
Careful: when dealing with directed networks, some
indices present two variants (e.g., in-degree vs. out-
degree), the ones based on incoming paths being more
interesting
cdeg(x) = d (x)
2) The path tribe
2) The path tribe
Betweenness centrality (Anthonisse 1971):
cbetw(x) =
X
y,z6=x
yz(x)
yz
2) The path tribe
Betweenness centrality (Anthonisse 1971):
Fraction of shortest
paths from y to z
passing through x
cbetw(x) =
X
y,z6=x
yz(x)
yz
2) The path tribe
Betweenness centrality (Anthonisse 1971):
cbetw(x) =
X
y,z6=x
yz(x)
yz
2) The path tribe
Betweenness centrality (Anthonisse 1971):
Katz centrality (Katz 1953):
cKatz(x) =
1X
t=0
↵t
⇧x(t)
cbetw(x) =
X
y,z6=x
yz(x)
yz
2) The path tribe
Betweenness centrality (Anthonisse 1971):
Katz centrality (Katz 1953):
# of paths of length t
ending in x
cKatz(x) =
1X
t=0
↵t
⇧x(t)
cbetw(x) =
X
y,z6=x
yz(x)
yz
2) The path tribe
Betweenness centrality (Anthonisse 1971):
Katz centrality (Katz 1953):
cKatz(x) =
1X
t=0
↵t
⇧x(t)
cbetw(x) =
X
y,z6=x
yz(x)
yz
3) The distance tribe
3) The distance tribe
Closeness centrality (Bavelas 1950):
cclos(x) =
1
P
y d(y, x)
3) The distance tribe
Closeness centrality (Bavelas 1950):
Distance from y
to x
cclos(x) =
1
P
y d(y, x)
3) The distance tribe
Closeness centrality (Bavelas 1950):
cclos(x) =
1
P
y d(y, x)
3) The distance tribe
Closeness centrality (Bavelas 1950):
Lin centrality (Lin 1976):
cclos(x) =
1
P
y d(y, x)
cLin(x) =
canReach(x)2
P
y d(y, x)
3) The distance tribe
Closeness centrality (Bavelas 1950):
Lin centrality (Lin 1976):
cclos(x) =
1
P
y d(y, x)
cLin(x) =
canReach(x)2
P
y d(y, x)
3) The distance tribe
Closeness centrality (Bavelas 1950):
Lin centrality (Lin 1976):
The summation is over all y such that d(y,x)<∞
cclos(x) =
1
P
y d(y, x)
cLin(x) =
canReach(x)2
P
y d(y, x)
3) The distance tribe
Closeness centrality (Bavelas 1950):
Lin centrality (Lin 1976):
The summation is over all y such that d(y,x)<∞
cclos(x) =
1
P
y d(y, x)
cLin(x) =
canReach(x)2
P
y d(y, x)
Completely neglected
by the literature
3) The distance tribe
a new member
3) The distance tribe
a new member
Give a warm welcome to Harmonic centrality:
charm(x) =
X
y6=x
1
d(y, x)
3) The distance tribe
a new member
Give a warm welcome to Harmonic centrality:
The denormalized reciprocal of the harmonic mean of a!
distances (even ∞)
charm(x) =
X
y6=x
1
d(y, x)
3) The distance tribe
a new member
Give a warm welcome to Harmonic centrality:
The denormalized reciprocal of the harmonic mean of a!
distances (even ∞)
Inspired by the use the the harmonic mean in
(Marchiori, Latora 2000)
charm(x) =
X
y6=x
1
d(y, x)
3) The distance tribe
a new member
Give a warm welcome to Harmonic centrality:
The denormalized reciprocal of the harmonic mean of a!
distances (even ∞)
Inspired by the use the the harmonic mean in
(Marchiori, Latora 2000)
Probably already appeared somewhere (e.g., quoted for
undirected graphs in Tore Opsahl’s blog)
charm(x) =
X
y6=x
1
d(y, x)
4) The spectral tribe
4) The spectral tribe
All based on the eigenstructure of some graph-related
matrix
PageRank
(Brin, Page, Motwani, Winograd 1999)
PageRank
(Brin, Page, Motwani, Winograd 1999)
The idea is to start from the basic equation...
cpr(x) =
X
y!x
cpr(y)
d+(y)
PageRank
(Brin, Page, Motwani, Winograd 1999)
The idea is to start from the basic equation...
...with an adjustment to make it have a unique solution
(and more)...
cpr(x) =
X
y!x
cpr(y)
d+(y)
cpr(x) = ↵
X
y!x
cpr(y)
d+(y)
+
1 ↵
n
PageRank
(Brin, Page, Motwani, Winograd 1999)
The idea is to start from the basic equation...
...with an adjustment to make it have a unique solution
(and more)...
cpr(x) =
X
y!x
cpr(y)
d+(y)
cpr(x) = ↵
X
y!x
cpr(y)
d+(y)
+
1 ↵
n
PageRank
(Brin, Page, Motwani, Winograd 1999)
The idea is to start from the basic equation...
...with an adjustment to make it have a unique solution
(and more)...
It is the dominant eigenvector of
cpr(x) =
X
y!x
cpr(y)
d+(y)
cpr(x) = ↵
X
y!x
cpr(y)
d+(y)
+
1 ↵
n
↵Gr + (1 ↵)1T
1/n
PageRank
(Brin, Page, Motwani, Winograd 1999)
The idea is to start from the basic equation...
...with an adjustment to make it have a unique solution
(and more)...
It is the dominant eigenvector of
Katz is the dominant eigenvector of
cpr(x) =
X
y!x
cpr(y)
d+(y)
cpr(x) = ↵
X
y!x
cpr(y)
d+(y)
+
1 ↵
n
↵Gr + (1 ↵)1T
1/n
↵G + (1 ↵)eT
1/n
Seeley index
(Seeley 1949)
Seeley index
(Seeley 1949)
It is essentially like PageRank with no damping factor;
equivalently, the stable state of the natural random walk
on G:
cSeeley(x) =
✓
lim
t!1
1
n
(Gr)t
◆
x
Seeley index
(Seeley 1949)
It is essentially like PageRank with no damping factor;
equivalently, the stable state of the natural random walk
on G:
cSeeley(x) =
✓
lim
t!1
1
n
(Gr)t
◆
x
Seeley index
(Seeley 1949)
It is essentially like PageRank with no damping factor;
equivalently, the stable state of the natural random walk
on G:
It is obtained as the limit of PageRank when the
damping goes to 1
cSeeley(x) =
✓
lim
t!1
1
n
(Gr)t
◆
x
Seeley index
(Seeley 1949)
It is essentially like PageRank with no damping factor;
equivalently, the stable state of the natural random walk
on G:
It is obtained as the limit of PageRank when the
damping goes to 1
In general it is a dominant eigenvector of
cSeeley(x) =
✓
lim
t!1
1
n
(Gr)t
◆
x
Gr
HITS
(Kleinberg 1997)
HITS
(Kleinberg 1997)
The idea is to start from the system:
cHauth(x) =
X
y!x
cHhub(y)
cHhub(x) =
X
x!y
cHauth(y)
HITS
(Kleinberg 1997)
The idea is to start from the system:
HITS centrality is defined to be the “authoritativeness”
score
cHauth(x) =
X
y!x
cHhub(y)
cHhub(x) =
X
x!y
cHauth(y)
HITS
(Kleinberg 1997)
The idea is to start from the system:
HITS centrality is defined to be the “authoritativeness”
score
It is a dominant eigenvector of
cHauth(x) =
X
y!x
cHhub(y)
cHhub(x) =
X
x!y
cHauth(y)
GT
G
HITS (, SALSA etc.)
HITS (, SALSA etc.)
WARNING: These measures were proposed exactly
for ranking results in hyperlinked collections
HITS (, SALSA etc.)
WARNING: These measures were proposed exactly
for ranking results in hyperlinked collections
Should be applied not to the whole graph, but to a
suitable subgraph derived from the query
HITS (, SALSA etc.)
WARNING: These measures were proposed exactly
for ranking results in hyperlinked collections
Should be applied not to the whole graph, but to a
suitable subgraph derived from the query
How the subgraph is derived is very relevant for
effectiveness (Najork, Gollapudi, Panighray 2009)
HITS (, SALSA etc.)
WARNING: These measures were proposed exactly
for ranking results in hyperlinked collections
Should be applied not to the whole graph, but to a
suitable subgraph derived from the query
How the subgraph is derived is very relevant for
effectiveness (Najork, Gollapudi, Panighray 2009)
Not really the central point here, though...
SALSA
(Lempel, Moran 2001)
SALSA
(Lempel, Moran 2001)
The idea is to start from the system:
cSauth(x) =
X
y!x
cShub(y)
d+(y)
cShub(x) =
X
x!y
cSauth(y)
d (y)
SALSA
(Lempel, Moran 2001)
The idea is to start from the system:
SALSA centrality is defined to be the
“authoritativeness” score
cSauth(x) =
X
y!x
cShub(y)
d+(y)
cShub(x) =
X
x!y
cSauth(y)
d (y)
SALSA
(Lempel, Moran 2001)
The idea is to start from the system:
SALSA centrality is defined to be the
“authoritativeness” score
It is a dominant eigenvector of
cSauth(x) =
X
y!x
cShub(y)
d+(y)
cShub(x) =
X
x!y
cSauth(y)
d (y)
GT
c Gr
How?
How to assess centrality
How?
How to assess centrality
Axiomatic approach
How?
How to assess centrality
Axiomatic approach
Ground-truth approach
How?
How to assess centrality
Axiomatic approach
Ground-truth approach
IR approach
How?
How to assess centrality
Axiomatic approach
Ground-truth approach
IR approach
Computational feasibility approach
Axiomatic lens
Axiomatic lens
start from some minimal mathematical requirements
Axiomatic
Sensitivity to size
Axiomatic
Sensitivity to size
k p
Two disjoint (or
very far)
components of a
single network
Axiomatic
Sensitivity to size
When k or p goes to ∞, the nodes of the
corresponding subnetwork must become
more important
k p
Two disjoint (or
very far)
components of a
single network
Axiomatic
Sensitivity to density
Axiomatic
Sensitivity to density
The blue and the red node have the same
importance (the two rings have the same size!)
Axiomatic
Sensitivity to density
The blue and the red node have the same
importance (the two rings have the same size!)
Axiomatic
Sensitivity to density
Densifying the left-hand side, we expect the red node to
become more important than the blue node
Axiomatic
Monotonicity
G
x
y
G’
x
y
Axiomatic
Monotonicity
Adding an arc increases the importance of the
target node
G
x
y
G’
x
y
An axiomatic slaughter
An axiomatic slaughter
An axiomatic slaughter
Size Density Monot.
Degree only k yes yes
Betweennes
s
only p no no
Katz only k yes yes
Closeness no no no
Lin only k no no
Harmonic yes yes yes
PageRank no yes yes
Seeley no yes no
HITS only k yes no
SALSA no yes no
An axiomatic slaughter
Size Density Monot.
Degree only k yes yes
Betweennes
s
only p no no
Katz only k yes yes
Closeness no no no
Lin only k no no
Harmonic yes yes yes
PageRank no yes yes
Seeley no yes no
HITS only k yes no
SALSA no yes no
An axiomatic slaughter
Size Density Monot.
Degree only k yes yes
Betweennes
s
only p no no
Katz only k yes yes
Closeness no no no
Lin only k no no
Harmonic yes yes yes
PageRank no yes yes
Seeley no yes no
HITS only k yes no
SALSA no yes no
An axiomatic slaughter
Size Density Monot.
Degree only k yes yes
Betweennes
s
only p no no
Katz only k yes yes
Closeness no no no
Lin only k no no
Harmonic yes yes yes
PageRank no yes yes
Seeley no yes no
HITS only k yes no
SALSA no yes no
Ground-truth lens
(mostly: anecdoctic / comparative)
Ground-truth lens
(mostly: anecdoctic / comparative)
check them against real (social?) networks on which you have
some ground truth about importance/centrality/...
Hollywood: PageRank
Ron Jeremy Adolf Hitler Lloyd Kaufman George W. Bush
Ronald Reagan Bill Clinton Martin Sheen Debbie Rochon
Hollywood: PageRank
Ron Jeremy Adolf Hitler Lloyd Kaufman George W. Bush
Ronald Reagan Bill Clinton Martin Sheen Debbie Rochon
Hollywood: Degree
William Shatner Bess Flowers Martin Sheen Ronald Reagan
George Clooney Samuel Jackson Robin Williams Tom Hanks
Hollywood: Degree
William Shatner Bess Flowers Martin Sheen Ronald Reagan
George Clooney Samuel Jackson Robin Williams Tom Hanks
Hollywood: Betweenness
Adolf Hitler Lloyd Kaufman Ron Jeremy Tony Robinson
Olu Jacobs Max von Sydow Udo Kier George W. Bush
Hollywood: Betweenness
Adolf Hitler Lloyd Kaufman Ron Jeremy Tony Robinson
Olu Jacobs Max von Sydow Udo Kier George W. Bush
Hollywood: Katz
William Shatner Martin Sheen
George Clooney
Robin WilliamsTom Hanks
Ronald Reagan Bruce Willis Samuel Jackson
Hollywood: Katz
William Shatner Martin Sheen
George Clooney
Robin WilliamsTom Hanks
Ronald Reagan Bruce Willis Samuel Jackson
Hollywood: Closeness
Lina Tjeng Ryan VillapotoAnh Loan Nguyen Thi Chad Reed
Bjorn van Wenum J.P. Ramackers Herbert Sydney R.D. Nicholson
Hollywood: Closeness
Lina Tjeng Ryan VillapotoAnh Loan Nguyen Thi Chad Reed
Bjorn van Wenum J.P. Ramackers Herbert Sydney R.D. Nicholson
Hollywood: Closeness
Lina Tjeng Ryan VillapotoAnh Loan Nguyen Thi Chad Reed
Bjorn van Wenum J.P. Ramackers Herbert Sydney R.D. Nicholson
Isolatednodeshavelargestcentrality
Hollywood: Lin
George Clooney Samuel Jackson Martin Sheen Dennis Hopper
Antonio Banderas Madonna Michael Douglas Tom Cruise
Hollywood: HITS
William Shatner Robin Williams Bruce WillisTom Hanks
Michael Douglas Cameron Diaz Harrison Ford Pierce Brosnan
Hollywood: Harmonic
George Clooney Samuel Jackson Sharon Stone Tom Hanks
Martin Sheen Dennis Hopper Antonio Banderas Madonna
What about the web?
.uk Top Ten
What about the web?
.uk Top Ten
What about the web?
.uk Top Ten
PageRank Katz Lin Harmonic
http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/
http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/
http://www.names.co.uk/
http://www.kelkoo.co.uk/b/a/
kc_top_searches_charts.html
http://news.bbc.co.uk/ http://www.dti.gov.uk/
http://www.names.co.uk/hosting.html
http://www.kelkoo.co.uk/b/a/
co_2765_128501-company-information-
pages.html
http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm
http://www.names.co.uk/email.html
http://www.kelkoo.co.uk/b/a/sm_site-
map.html?displayType=alpha
http://www.google.co.uk/ http://www.google.co.uk/
http://www.names.co.uk/
controlpanel.html
http://www.kelkoo.co.uk/b/a/
co_5199_128501-how-to-use-
kelkoo.html
http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/
http://www.scdc.org.uk/
http://www.kelkoo.co.uk/b/a/
co_2120_128501-shopping-guides-
price-comparison-on-kelkoo-uk.html
http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/
http://www.freelyricsearch.co.uk/
index.html
http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/
http://www.247partypeople.co.uk/
login.asp
http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/
http://www.becs.co.uk/catalog/
cookie_usage.php
http://cgi1.ebay.co.uk/aw-cgi/
eBayISAPI.dll?TimeShow
http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
What about the web?
.uk Top Ten
PageRank Katz Lin Harmonic
http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/
http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/
http://www.names.co.uk/
http://www.kelkoo.co.uk/b/a/
kc_top_searches_charts.html
http://news.bbc.co.uk/ http://www.dti.gov.uk/
http://www.names.co.uk/hosting.html
http://www.kelkoo.co.uk/b/a/
co_2765_128501-company-information-
pages.html
http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm
http://www.names.co.uk/email.html
http://www.kelkoo.co.uk/b/a/sm_site-
map.html?displayType=alpha
http://www.google.co.uk/ http://www.google.co.uk/
http://www.names.co.uk/
controlpanel.html
http://www.kelkoo.co.uk/b/a/
co_5199_128501-how-to-use-
kelkoo.html
http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/
http://www.scdc.org.uk/
http://www.kelkoo.co.uk/b/a/
co_2120_128501-shopping-guides-
price-comparison-on-kelkoo-uk.html
http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/
http://www.freelyricsearch.co.uk/
index.html
http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/
http://www.247partypeople.co.uk/
login.asp
http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/
http://www.becs.co.uk/catalog/
cookie_usage.php
http://cgi1.ebay.co.uk/aw-cgi/
eBayISAPI.dll?TimeShow
http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
What about the web?
.uk Top Ten
PageRank Katz Lin Harmonic
http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/
http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/
http://www.names.co.uk/
http://www.kelkoo.co.uk/b/a/
kc_top_searches_charts.html
http://news.bbc.co.uk/ http://www.dti.gov.uk/
http://www.names.co.uk/hosting.html
http://www.kelkoo.co.uk/b/a/
co_2765_128501-company-information-
pages.html
http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm
http://www.names.co.uk/email.html
http://www.kelkoo.co.uk/b/a/sm_site-
map.html?displayType=alpha
http://www.google.co.uk/ http://www.google.co.uk/
http://www.names.co.uk/
controlpanel.html
http://www.kelkoo.co.uk/b/a/
co_5199_128501-how-to-use-
kelkoo.html
http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/
http://www.scdc.org.uk/
http://www.kelkoo.co.uk/b/a/
co_2120_128501-shopping-guides-
price-comparison-on-kelkoo-uk.html
http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/
http://www.freelyricsearch.co.uk/
index.html
http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/
http://www.247partypeople.co.uk/
login.asp
http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/
http://www.becs.co.uk/catalog/
cookie_usage.php
http://cgi1.ebay.co.uk/aw-cgi/
eBayISAPI.dll?TimeShow
http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
What about the web?
.uk Top Ten
PageRank Katz Lin Harmonic
http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/
http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/
http://www.names.co.uk/
http://www.kelkoo.co.uk/b/a/
kc_top_searches_charts.html
http://news.bbc.co.uk/ http://www.dti.gov.uk/
http://www.names.co.uk/hosting.html
http://www.kelkoo.co.uk/b/a/
co_2765_128501-company-information-
pages.html
http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm
http://www.names.co.uk/email.html
http://www.kelkoo.co.uk/b/a/sm_site-
map.html?displayType=alpha
http://www.google.co.uk/ http://www.google.co.uk/
http://www.names.co.uk/
controlpanel.html
http://www.kelkoo.co.uk/b/a/
co_5199_128501-how-to-use-
kelkoo.html
http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/
http://www.scdc.org.uk/
http://www.kelkoo.co.uk/b/a/
co_2120_128501-shopping-guides-
price-comparison-on-kelkoo-uk.html
http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/
http://www.freelyricsearch.co.uk/
index.html
http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/
http://www.247partypeople.co.uk/
login.asp
http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/
http://www.becs.co.uk/catalog/
cookie_usage.php
http://cgi1.ebay.co.uk/aw-cgi/
eBayISAPI.dll?TimeShow
http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
How do they compare?
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
most geometric indices and HITS
are rather correlated to one another
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
Katz, degree and SALSA are also
highly correlated
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
Betweenness does not correlate
to anything
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074
Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801
Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416
Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865
SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924
closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319
harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549
Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546
HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249
between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923
PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276
PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952
PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000
Table 1: Hollywood
1
Hollywood
Kendall’s τ
PageRank stands alone
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
Betweenness could not be computed
because of graph size (106M nodes)
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
The same correlations as with Hollywood,
but even more emphasized
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
Exception: HITS used to be correlated with the geometric
indices, while now it is alone
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784
Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595
Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611
Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622
SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747
closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004
harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072
Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069
HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415
PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289
PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088
PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000
Table 1: .uk
1
.uk (May 2007 snapshot)
Kendall’s τ
A larger correlation between PageRank and Katz & degree
Orkut (2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3508 0.7265 0.7596 0.8016
Katz 1/4 0.9522 1.0000 0.9489 0.8750 0.9522 0.5972 0.5875 0.5967 0.3984 0.6868 0.7179 0.7577
Katz 1/2 0.8982 0.9489 1.0000 0.9275 0.8982 0.6382 0.6338 0.6380 0.4491 0.6400 0.6690 0.7067
Katz 3/4 0.8242 0.8750 0.9275 1.0000 0.8242 0.6839 0.6910 0.6842 0.5213 0.5742 0.6005 0.6355
SALSA 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3505 0.7265 0.7596 0.8016
closeness 0.5521 0.5972 0.6382 0.6839 0.5521 1.0000 0.9458 0.9830 0.6539 0.3862 0.4040 0.4268
harmonic 0.5391 0.5875 0.6338 0.6910 0.5391 0.9458 1.0000 0.9471 0.7090 0.3612 0.3777 0.3992
Lin 0.5513 0.5967 0.6380 0.6842 0.5513 0.9830 0.9471 1.0000 0.6546 0.3852 0.4030 0.4257
HITS 0.3508 0.3984 0.4491 0.5213 0.3505 0.6539 0.7090 0.6546 1.0000 0.1689 0.1778 0.1917
PR 1/4 0.7265 0.6868 0.6400 0.5742 0.7265 0.3862 0.3612 0.3852 0.1689 1.0000 0.9520 0.8889
PR 1/2 0.7596 0.7179 0.6690 0.6005 0.7596 0.4040 0.3777 0.4030 0.1778 0.9520 1.0000 0.9363
PR 3/4 0.8016 0.7577 0.7067 0.6355 0.8016 0.4268 0.3992 0.4257 0.1917 0.8889 0.9363 1.0000
Table 1: Orkut
1
Orkut (2007 snapshot)
Kendall’s τ
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3508 0.7265 0.7596 0.8016
Katz 1/4 0.9522 1.0000 0.9489 0.8750 0.9522 0.5972 0.5875 0.5967 0.3984 0.6868 0.7179 0.7577
Katz 1/2 0.8982 0.9489 1.0000 0.9275 0.8982 0.6382 0.6338 0.6380 0.4491 0.6400 0.6690 0.7067
Katz 3/4 0.8242 0.8750 0.9275 1.0000 0.8242 0.6839 0.6910 0.6842 0.5213 0.5742 0.6005 0.6355
SALSA 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3505 0.7265 0.7596 0.8016
closeness 0.5521 0.5972 0.6382 0.6839 0.5521 1.0000 0.9458 0.9830 0.6539 0.3862 0.4040 0.4268
harmonic 0.5391 0.5875 0.6338 0.6910 0.5391 0.9458 1.0000 0.9471 0.7090 0.3612 0.3777 0.3992
Lin 0.5513 0.5967 0.6380 0.6842 0.5513 0.9830 0.9471 1.0000 0.6546 0.3852 0.4030 0.4257
HITS 0.3508 0.3984 0.4491 0.5213 0.3505 0.6539 0.7090 0.6546 1.0000 0.1689 0.1778 0.1917
PR 1/4 0.7265 0.6868 0.6400 0.5742 0.7265 0.3862 0.3612 0.3852 0.1689 1.0000 0.9520 0.8889
PR 1/2 0.7596 0.7179 0.6690 0.6005 0.7596 0.4040 0.3777 0.4030 0.1778 0.9520 1.0000 0.9363
PR 3/4 0.8016 0.7577 0.7067 0.6355 0.8016 0.4268 0.3992 0.4257 0.1917 0.8889 0.9363 1.0000
Table 1: Orkut
1
Orkut (2007 snapshot)
Kendall’s τ
The same correlation as in Hollywood,
even if this time SALSA is also pretty correlated
with PageRank as well
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3508 0.7265 0.7596 0.8016
Katz 1/4 0.9522 1.0000 0.9489 0.8750 0.9522 0.5972 0.5875 0.5967 0.3984 0.6868 0.7179 0.7577
Katz 1/2 0.8982 0.9489 1.0000 0.9275 0.8982 0.6382 0.6338 0.6380 0.4491 0.6400 0.6690 0.7067
Katz 3/4 0.8242 0.8750 0.9275 1.0000 0.8242 0.6839 0.6910 0.6842 0.5213 0.5742 0.6005 0.6355
SALSA 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3505 0.7265 0.7596 0.8016
closeness 0.5521 0.5972 0.6382 0.6839 0.5521 1.0000 0.9458 0.9830 0.6539 0.3862 0.4040 0.4268
harmonic 0.5391 0.5875 0.6338 0.6910 0.5391 0.9458 1.0000 0.9471 0.7090 0.3612 0.3777 0.3992
Lin 0.5513 0.5967 0.6380 0.6842 0.5513 0.9830 0.9471 1.0000 0.6546 0.3852 0.4030 0.4257
HITS 0.3508 0.3984 0.4491 0.5213 0.3505 0.6539 0.7090 0.6546 1.0000 0.1689 0.1778 0.1917
PR 1/4 0.7265 0.6868 0.6400 0.5742 0.7265 0.3862 0.3612 0.3852 0.1689 1.0000 0.9520 0.8889
PR 1/2 0.7596 0.7179 0.6690 0.6005 0.7596 0.4040 0.3777 0.4030 0.1778 0.9520 1.0000 0.9363
PR 3/4 0.8016 0.7577 0.7067 0.6355 0.8016 0.4268 0.3992 0.4257 0.1917 0.8889 0.9363 1.0000
Table 1: Orkut
1
Orkut (2007 snapshot)
Kendall’s τ
IR lens
(using TREC .gov2)
IR lens
(using TREC .gov2)
use centrality (in isolation or combined with textual features)
to rerank query results and see how good (bad) they do
TREC .gov2
TREC .gov2
150 queries (query title words, in AND; with stemming,
no stopword elimination)
TREC .gov2
150 queries (query title words, in AND; with stemming,
no stopword elimination)
Generated the result graph using the method described
by Najork et al. 2009 (a variant of Kleinberg’s HITS
graph, taking a in-links and b out-links)
TREC .gov2
150 queries (query title words, in AND; with stemming,
no stopword elimination)
Generated the result graph using the method described
by Najork et al. 2009 (a variant of Kleinberg’s HITS
graph, taking a in-links and b out-links)
Considered many combinations: here I present only
the cases a=b=0 (i.e., subgraph induced by the result set)
TREC .gov2
150 queries (query title words, in AND; with stemming,
no stopword elimination)
Generated the result graph using the method described
by Najork et al. 2009 (a variant of Kleinberg’s HITS
graph, taking a in-links and b out-links)
Considered many combinations: here I present only
the cases a=b=0 (i.e., subgraph induced by the result set)
With or without intra-host links
P@10 and NDCG@10
P@10 and NDCG@10
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Betweenness 0.0584 0.0595 0.0577 0.0588
Closeness 0.1101 0.1061 0.1121 0.1168
PageRank (best) 0.1107 0.1078 0.1295 0.1347
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
Katz (best) 0.1242 0.1228 0.1262 0.1297
Lin 0.1295 0.1308 0.1248 0.1286
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Betweenness 0.0584 0.0595 0.0577 0.0588
Closeness 0.1101 0.1061 0.1121 0.1168
PageRank (best) 0.1107 0.1078 0.1295 0.1347
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
Katz (best) 0.1242 0.1228 0.1262 0.1297
Lin 0.1295 0.1308 0.1248 0.1286
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Betweenness 0.0584 0.0595 0.0577 0.0588
Closeness 0.1101 0.1061 0.1121 0.1168
PageRank (best) 0.1107 0.1078 0.1295 0.1347
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
Katz (best) 0.1242 0.1228 0.1262 0.1297
Lin 0.1295 0.1308 0.1248 0.1286
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Betweenness 0.0584 0.0595 0.0577 0.0588
Closeness 0.1101 0.1061 0.1121 0.1168
PageRank (best) 0.1107 0.1078 0.1295 0.1347
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
Katz (best) 0.1242 0.1228 0.1262 0.1297
Lin 0.1295 0.1308 0.1248 0.1286
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Betweenness 0.0584 0.0595 0.0577 0.0588
Closeness 0.1101 0.1061 0.1121 0.1168
PageRank (best) 0.1107 0.1078 0.1295 0.1347
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
Katz (best) 0.1242 0.1228 0.1262 0.1297
Lin 0.1295 0.1308 0.1248 0.1286
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
Intra-host links?
Intra-host links?
Keep them or throw them away?
Intra-host links?
Keep them or throw them away?
Most indices get better if you throw them away...
Intra-host links?
Keep them or throw them away?
Most indices get better if you throw them away...
Throwing such links away injects a lot of information,
but apparently harmonic doesn’t need it!
Intra-host links?
Keep them or throw them away?
Most indices get better if you throw them away...
Throwing such links away injects a lot of information,
but apparently harmonic doesn’t need it!
...but harmonic is better (and best of all) with the
whole thing!
.uk (May 2007 snapshot)
Kendall’s τ with no intra-host links
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955
Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958
closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892
harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946
Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946
HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727
PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997
PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999
PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000
Table 1: .uk-nn
1
.uk (May 2007 snapshot)
Kendall’s τ with no intra-host links
Everything becomes correlated. Why???
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955
Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958
closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892
harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946
Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946
HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727
PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997
PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999
PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000
Table 1: .uk-nn
1
.uk (May 2007 snapshot)
Kendall’s τ with no intra-host links
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955
Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958
closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892
harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946
Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946
HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727
PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997
PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999
PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000
Table 1: .uk-nn
1
.uk (May 2007 snapshot)
Kendall’s τ with no intra-host links
Because most nodes have degree 0 (and hence
they also have all the other scores at a tie)
degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4
degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955
Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953
SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958
closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892
harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946
Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946
HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727
PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997
PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999
PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000
Table 1: .uk-nn
1
P@10 and NDCG@10
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
d (x) · canReach(x)
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
d (x) · connected(x)
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
P@10 and NDCG@10
All linksAll links Inter-host links onlyInter-host links only
P@10 NDCG@10 P@10 NDCG@10
Weighted degree 0.1356 0.1373 0.1262 0.1295
SALSinA 0.1349 0.1357 0.1255 0.1318
Degree 0.1208 0.1091 0.1248 0.1283
SALSA 0.1221 0.1194 0.1282 0.1384
HITS 0.1349 0.1364 0.1107 0.1179
Harmonic 0.1430 0.1449 0.1262 0.1293
BM25 0.5644 0.5842 0.5644 0.5842
TODO
Study features in
combination
Computational feasibility lens
Computational feasibility lens
which indices are computable effectively on large
networks; consider also parallelizability / distributability...
Best algorithms so far
Best algorithms so far
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Best algorithms so far
How? Progr.
Degree Trivial O(1) - - +++
Betweennes
s
O(nm) [Brandes 2001] - no -
Katz Iterative (Gauss-Seidel) Fast convergence yes +
PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++
Seeley Iterative (PM) Slow convergence no -
HITS Iterative (PM) Slow convergence no -
SALSA Direct O(n+) - no +++
Geometric
Geometric
What about geometic measures, like closeness, Lin
and harmonic?
Computing harmonic
Computing harmonic
Let us take harmonic as an example
Computing harmonic
Let us take harmonic as an example
But how easy is it to compute?
charm(x) =
X
y6=x
1
d(y, x)
Computing harmonic
Let us take harmonic as an example
But how easy is it to compute?
...or...
charm(x) =
X
y6=x
1
d(y, x)
charm(x) =
1X
t=1
|Bt(x)| |Bt 1(x)|
t
Computing harmonic
Let us take harmonic as an example
But how easy is it to compute?
...or...
charm(x) =
X
y6=x
1
d(y, x)
charm(x) =
1X
t=1
|Bt(x)| |Bt 1(x)|
t Ball of radius t
about x
Computing harmonic
Let us take harmonic as an example
But how easy is it to compute?
...or...
charm(x) =
X
y6=x
1
d(y, x)
charm(x) =
1X
t=1
|Bt(x)| |Bt 1(x)|
t
Computing by diffusion
Computing by diffusion
Clearly B0(x) = {x}
Computing by diffusion
Clearly
Moreover
B0(x) = {x}
Bt+1(x) = {x} [
[
x!y
Bt(y)
Computing by diffusion
Clearly
Moreover
So one needs just one single sequential scan of the
graph to compute the balls at the next iteration
B0(x) = {x}
Bt+1(x) = {x} [
[
x!y
Bt(y)
A round of updates
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺
☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺
☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺
☺☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺
☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺
☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺
☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺☺
☺☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺☺ ☺
☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺☺ ☺
☺
☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺☺ ☺
☺☺
☺
A round of updates
☺
☺
☺
☺
☺
☺
☺☺ ☺
☺ ☺
☺☺
☺☺ ☺
☺☺ ☺
☺☺ ☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺
☺ ☺
☺☺☺
☺ ☺
☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺
☺ ☺
☺☺☺
☺ ☺
☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺☺ ☺
☺☺☺
☺ ☺
☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺☺ ☺☺☺☺
☺ ☺
☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺☺ ☺☺☺☺
☺ ☺
☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺☺ ☺☺☺☺
☺ ☺☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺☺ ☺☺☺☺
☺ ☺☺☺ ☺
☺☺
☺☺☺
Another round...
☺☺
☺ ☺ ☺
☺☺ ☺
☺ ☺
☺☺ ☺
☺☺☺
☺ ☺ ☺☺ ☺☺☺☺
☺ ☺☺☺ ☺
☺☺☺☺☺
Easy but expensive
Easy but expensive
Each set uses linear space; overall quadratic
Easy but expensive
Each set uses linear space; overall quadratic
Impossible!
Easy but expensive
Each set uses linear space; overall quadratic
Impossible!
But what if we use approximate sets?
Easy but expensive
Each set uses linear space; overall quadratic
Impossible!
But what if we use approximate sets?
Idea: use probabilistic counters, which represent sets but
answer just to “size?” questions
Easy but expensive
Each set uses linear space; overall quadratic
Impossible!
But what if we use approximate sets?
Idea: use probabilistic counters, which represent sets but
answer just to “size?” questions
Very small! With 40 bits you can count up to 4 billion
with a standard deviation of 6%
Use the Force
Use the Force
HyperBall (http://webgraph.dsi.unimi.it/)
does the job
Use the Force
HyperBall (http://webgraph.dsi.unimi.it/)
does the job
Fully exploits multicore architectures; uses broadword
microparallelization
Use the Force
HyperBall (http://webgraph.dsi.unimi.it/)
does the job
Fully exploits multicore architectures; uses broadword
microparallelization
Works like a charm on networks with hundreds of
millions of nodes
What is HyperBall
What is HyperBall
It uses the diffusion idea to compute (at the same
time):
What is HyperBall
It uses the diffusion idea to compute (at the same
time):
Lin + Closeness + Harmonic + Number of reachable
nodes...
What is HyperBall
It uses the diffusion idea to compute (at the same
time):
Lin + Closeness + Harmonic + Number of reachable
nodes...
It employs Flajolet’s HyperLogLog counters to store
sets
What is HyperBall
It uses the diffusion idea to compute (at the same
time):
Lin + Closeness + Harmonic + Number of reachable
nodes...
It employs Flajolet’s HyperLogLog counters to store
sets
More on HyperBall in the next few slides...
Main trick
Main trick
Choose an approximate set such that unions can be
computed quickly
Main trick
Choose an approximate set such that unions can be
computed quickly
ANF [Palmer et al., KDD ’02] uses Martin–Flajolet
(MF) counters (log n+c space)
Main trick
Choose an approximate set such that unions can be
computed quickly
ANF [Palmer et al., KDD ’02] uses Martin–Flajolet
(MF) counters (log n+c space)
We use HyperLogLog counters [Flajolet et al.,2007]
(loglog n space)
Main trick
Choose an approximate set such that unions can be
computed quickly
ANF [Palmer et al., KDD ’02] uses Martin–Flajolet
(MF) counters (log n+c space)
We use HyperLogLog counters [Flajolet et al.,2007]
(loglog n space)
MF counters can be combined with an OR
Main trick
Choose an approximate set such that unions can be
computed quickly
ANF [Palmer et al., KDD ’02] uses Martin–Flajolet
(MF) counters (log n+c space)
We use HyperLogLog counters [Flajolet et al.,2007]
(loglog n space)
MF counters can be combined with an OR
We use broadword programming to combine
HyperLogLog counters quickly!
HyperLogLog counters
HyperLogLog counters
Instead of actually counting, we observe a statistical
feature of a set (think stream) of elements
HyperLogLog counters
Instead of actually counting, we observe a statistical
feature of a set (think stream) of elements
The feature: the number of trailing zeroes of the value
of a very good hash function
HyperLogLog counters
Instead of actually counting, we observe a statistical
feature of a set (think stream) of elements
The feature: the number of trailing zeroes of the value
of a very good hash function
We keep track of the maximum m (log log n bits!)
HyperLogLog counters
Instead of actually counting, we observe a statistical
feature of a set (think stream) of elements
The feature: the number of trailing zeroes of the value
of a very good hash function
We keep track of the maximum m (log log n bits!)
The number of distinct elements ∝ 2m
HyperLogLog counters
Instead of actually counting, we observe a statistical
feature of a set (think stream) of elements
The feature: the number of trailing zeroes of the value
of a very good hash function
We keep track of the maximum m (log log n bits!)
The number of distinct elements ∝ 2m
Important: the counter of stream AB is simply the
maximum of the counters of A and B!
Other ideas
Other ideas
We keep track of modifications: we do not maximize
with unmodified counters
Other ideas
We keep track of modifications: we do not maximize
with unmodified counters
Systolic computation: each modified set signals back to
predecessors that something is going to happen (much
fewer updates!)
Other ideas
We keep track of modifications: we do not maximize
with unmodified counters
Systolic computation: each modified set signals back to
predecessors that something is going to happen (much
fewer updates!)
Multicore exploitation by decomposition: a task is
updating just 1000 counters (almost linear scaling)
Footprint
Footprint
Scalability: a minimum of 20 bytes per node
Footprint
Scalability: a minimum of 20 bytes per node
On a 2TiB machine, 100 billion nodes
Footprint
Scalability: a minimum of 20 bytes per node
On a 2TiB machine, 100 billion nodes
Graph structure is accessed by memory-mapping in a
compressed form (WebGraph)
Footprint
Scalability: a minimum of 20 bytes per node
On a 2TiB machine, 100 billion nodes
Graph structure is accessed by memory-mapping in a
compressed form (WebGraph)
Pointers to the graph are stored using succinct lists
(Elias-Fano representation)
Performance
Performance
On a 177K nodes
Performance
On a 177K nodes
Hadoop: 2875s per iteration [Kang, Papadimitriou,
Sun and H. Tong, 2011]
Performance
On a 177K nodes
Hadoop: 2875s per iteration [Kang, Papadimitriou,
Sun and H. Tong, 2011]
HyperBall on this laptop: 70s per iteration
Performance
On a 177K nodes
Hadoop: 2875s per iteration [Kang, Papadimitriou,
Sun and H. Tong, 2011]
HyperBall on this laptop: 70s per iteration
On a 32-core workstation: 23s per iteration
Performance
On a 177K nodes
Hadoop: 2875s per iteration [Kang, Papadimitriou,
Sun and H. Tong, 2011]
HyperBall on this laptop: 70s per iteration
On a 32-core workstation: 23s per iteration
On ClueWeb09 (4.8G nodes, 8G arcs) on a 40-core
workstation: 141m (avg. 40s per iteration)
Convergence
5 15 25 35 45 55 65 75 85 95
0.0000.0020.0040.006
Harmonic centrality
# runs
Relativeerror
And when the Force is not enough?
And when the Force is not enough?
Diffusion processes are easily distributable
And when the Force is not enough?
Diffusion processes are easily distributable
A Pregel-like implementation of HyperANF is
underway (by Sebastian Schelter, TU Berlin)
Did you see the
light?
Did you see the
light?
No? Me neither... But we have a path
Did you see the
light?
No? Me neither... But we have a path
Some things have been ruled out, at least
Did you see the
light?
No? Me neither... But we have a path
Some things have been ruled out, at least
Some others, that were neglected, have found revenge
Did you see the
light?
No? Me neither... But we have a path
Some things have been ruled out, at least
Some others, that were neglected, have found revenge
Some new ones seem to be promising
Questions?

Mais conteúdo relacionado

Semelhante a Centralities_PaoloBoldi

Data/Visualization - Digital Center Cohort - 13_0222
Data/Visualization - Digital Center Cohort - 13_0222Data/Visualization - Digital Center Cohort - 13_0222
Data/Visualization - Digital Center Cohort - 13_0222jeffreylancaster
 
What Shakespeare Taught Us About Visualization and Data Science
What Shakespeare Taught Us About Visualization and Data ScienceWhat Shakespeare Taught Us About Visualization and Data Science
What Shakespeare Taught Us About Visualization and Data Sciencegleicher
 
Choosing the Right Database
Choosing the Right DatabaseChoosing the Right Database
Choosing the Right DatabaseDavid Simons
 
Interactive visualization and exploration of network data with Gephi
Interactive visualization and exploration of network data with GephiInteractive visualization and exploration of network data with Gephi
Interactive visualization and exploration of network data with GephiDigital Methods Initiative
 
Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016Roy Clariana
 
Why Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveWhy Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveJames Hendler
 
Bristol Uni - Use Cases of NoSQL
Bristol Uni - Use Cases of NoSQLBristol Uni - Use Cases of NoSQL
Bristol Uni - Use Cases of NoSQLDavid Simons
 
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...taxonbytes
 
Interactive visualization and exploration of network data with gephi
Interactive visualization and exploration of network data with gephiInteractive visualization and exploration of network data with gephi
Interactive visualization and exploration of network data with gephiBernhard Rieder
 
02 Network Data Collection
02 Network Data Collection02 Network Data Collection
02 Network Data Collectiondnac
 
Spark Summit EU talk by Casey Stella
Spark Summit EU talk by Casey StellaSpark Summit EU talk by Casey Stella
Spark Summit EU talk by Casey StellaSpark Summit
 
Public opinion leaderships analysis using methods of social network analysis ...
Public opinion leaderships analysis using methods of social network analysis ...Public opinion leaderships analysis using methods of social network analysis ...
Public opinion leaderships analysis using methods of social network analysis ...Miguel Oliva
 
Introduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependenciesIntroduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependenciesMariano Rodriguez-Muro
 
Compare Contrast Essay Template
Compare Contrast Essay TemplateCompare Contrast Essay Template
Compare Contrast Essay TemplateAlly Gonzales
 
Statistical Programming with JavaScript
Statistical Programming with JavaScriptStatistical Programming with JavaScript
Statistical Programming with JavaScriptDavid Simons
 
Discovery Analytics: Tracking Ebola Spread
Discovery Analytics: Tracking Ebola SpreadDiscovery Analytics: Tracking Ebola Spread
Discovery Analytics: Tracking Ebola SpreadHPCC Systems
 
Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...
Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...
Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...COST Action TD1210
 

Semelhante a Centralities_PaoloBoldi (20)

Data/Visualization - Digital Center Cohort - 13_0222
Data/Visualization - Digital Center Cohort - 13_0222Data/Visualization - Digital Center Cohort - 13_0222
Data/Visualization - Digital Center Cohort - 13_0222
 
What Shakespeare Taught Us About Visualization and Data Science
What Shakespeare Taught Us About Visualization and Data ScienceWhat Shakespeare Taught Us About Visualization and Data Science
What Shakespeare Taught Us About Visualization and Data Science
 
Choosing the Right Database
Choosing the Right DatabaseChoosing the Right Database
Choosing the Right Database
 
Interactive visualization and exploration of network data with Gephi
Interactive visualization and exploration of network data with GephiInteractive visualization and exploration of network data with Gephi
Interactive visualization and exploration of network data with Gephi
 
Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016
 
Why Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveWhy Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspective
 
Bristol Uni - Use Cases of NoSQL
Bristol Uni - Use Cases of NoSQLBristol Uni - Use Cases of NoSQL
Bristol Uni - Use Cases of NoSQL
 
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
Franz et al ice 2016 addressing the name meaning drift challenge in open ende...
 
NGSS Earth and Space Science
NGSS Earth and Space ScienceNGSS Earth and Space Science
NGSS Earth and Space Science
 
Uo march2015 talk
Uo march2015 talkUo march2015 talk
Uo march2015 talk
 
Interactive visualization and exploration of network data with gephi
Interactive visualization and exploration of network data with gephiInteractive visualization and exploration of network data with gephi
Interactive visualization and exploration of network data with gephi
 
02 Network Data Collection
02 Network Data Collection02 Network Data Collection
02 Network Data Collection
 
02 Network Data Collection (2016)
02 Network Data Collection (2016)02 Network Data Collection (2016)
02 Network Data Collection (2016)
 
Spark Summit EU talk by Casey Stella
Spark Summit EU talk by Casey StellaSpark Summit EU talk by Casey Stella
Spark Summit EU talk by Casey Stella
 
Public opinion leaderships analysis using methods of social network analysis ...
Public opinion leaderships analysis using methods of social network analysis ...Public opinion leaderships analysis using methods of social network analysis ...
Public opinion leaderships analysis using methods of social network analysis ...
 
Introduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependenciesIntroduction to query rewriting optimisation with dependencies
Introduction to query rewriting optimisation with dependencies
 
Compare Contrast Essay Template
Compare Contrast Essay TemplateCompare Contrast Essay Template
Compare Contrast Essay Template
 
Statistical Programming with JavaScript
Statistical Programming with JavaScriptStatistical Programming with JavaScript
Statistical Programming with JavaScript
 
Discovery Analytics: Tracking Ebola Spread
Discovery Analytics: Tracking Ebola SpreadDiscovery Analytics: Tracking Ebola Spread
Discovery Analytics: Tracking Ebola Spread
 
Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...
Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...
Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organi...
 

Mais de Yandex

Предсказание оттока игроков из World of Tanks
Предсказание оттока игроков из World of TanksПредсказание оттока игроков из World of Tanks
Предсказание оттока игроков из World of TanksYandex
 
Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...
Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...
Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...Yandex
 
Структурированные данные, Юлия Тихоход, лекция в Школе вебмастеров Яндекса
Структурированные данные, Юлия Тихоход, лекция в Школе вебмастеров ЯндексаСтруктурированные данные, Юлия Тихоход, лекция в Школе вебмастеров Яндекса
Структурированные данные, Юлия Тихоход, лекция в Школе вебмастеров ЯндексаYandex
 
Представление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров Яндекса
Представление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров ЯндексаПредставление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров Яндекса
Представление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров ЯндексаYandex
 
Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...
Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...
Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...Yandex
 
Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...
Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...
Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...Yandex
 
Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...
Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...
Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...Yandex
 
Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...
Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...
Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...Yandex
 
Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...
Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...
Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...Yandex
 
Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...
Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...
Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...Yandex
 
Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...
Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...
Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...Yandex
 
Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...
Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...
Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...Yandex
 
Как защитить свой сайт, Пётр Волков, лекция в Школе вебмастеров
Как защитить свой сайт, Пётр Волков, лекция в Школе вебмастеровКак защитить свой сайт, Пётр Волков, лекция в Школе вебмастеров
Как защитить свой сайт, Пётр Волков, лекция в Школе вебмастеровYandex
 
Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...
Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...
Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...Yandex
 
Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...
Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...
Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...Yandex
 
Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...
Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...
Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...Yandex
 
Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...
Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...
Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...Yandex
 
Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...
Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...
Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...Yandex
 
Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...
Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...
Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...Yandex
 
Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...
Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...
Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...Yandex
 

Mais de Yandex (20)

Предсказание оттока игроков из World of Tanks
Предсказание оттока игроков из World of TanksПредсказание оттока игроков из World of Tanks
Предсказание оттока игроков из World of Tanks
 
Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...
Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...
Как принять/организовать работу по поисковой оптимизации сайта, Сергей Царик,...
 
Структурированные данные, Юлия Тихоход, лекция в Школе вебмастеров Яндекса
Структурированные данные, Юлия Тихоход, лекция в Школе вебмастеров ЯндексаСтруктурированные данные, Юлия Тихоход, лекция в Школе вебмастеров Яндекса
Структурированные данные, Юлия Тихоход, лекция в Школе вебмастеров Яндекса
 
Представление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров Яндекса
Представление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров ЯндексаПредставление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров Яндекса
Представление сайта в поиске, Сергей Лысенко, лекция в Школе вебмастеров Яндекса
 
Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...
Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...
Плохие методы продвижения сайта, Екатерины Гладких, лекция в Школе вебмастеро...
 
Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...
Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...
Основные принципы ранжирования, Сергей Царик и Антон Роменский, лекция в Школ...
 
Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...
Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...
Основные принципы индексирования сайта, Александр Смирнов, лекция в Школе веб...
 
Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...
Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...
Мобильное приложение: как и зачем, Александр Лукин, лекция в Школе вебмастеро...
 
Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...
Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...
Сайты на мобильных устройствах, Олег Ножичкин, лекция в Школе вебмастеров Янд...
 
Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...
Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...
Качественная аналитика сайта, Юрий Батиевский, лекция в Школе вебмастеров Янд...
 
Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...
Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...
Что можно и что нужно измерять на сайте, Петр Аброськин, лекция в Школе вебма...
 
Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...
Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...
Как правильно поставить ТЗ на создание сайта, Алексей Бородкин, лекция в Школ...
 
Как защитить свой сайт, Пётр Волков, лекция в Школе вебмастеров
Как защитить свой сайт, Пётр Волков, лекция в Школе вебмастеровКак защитить свой сайт, Пётр Волков, лекция в Школе вебмастеров
Как защитить свой сайт, Пётр Волков, лекция в Школе вебмастеров
 
Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...
Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...
Как правильно составить структуру сайта, Дмитрий Сатин, лекция в Школе вебмас...
 
Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...
Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...
Технические особенности создания сайта, Дмитрий Васильева, лекция в Школе веб...
 
Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...
Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...
Конструкторы для отдельных элементов сайта, Елена Першина, лекция в Школе веб...
 
Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...
Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...
Контент для интернет-магазинов, Катерина Ерошина, лекция в Школе вебмастеров ...
 
Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...
Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...
Как написать хороший текст для сайта, Катерина Ерошина, лекция в Школе вебмас...
 
Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...
Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...
Usability и дизайн - как не помешать пользователю, Алексей Иванов, лекция в Ш...
 
Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...
Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...
Cайт. Зачем он и каким должен быть, Алексей Иванов, лекция в Школе вебмастеро...
 

Último

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Último (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Centralities_PaoloBoldi

  • 1. A MODERN VIEW OF CENTRALITY MEASURES Paolo Boldi Univ. degli Studi di Milano Work in progress with Sebastiano Vigna
  • 2. What do these people have in common?
  • 3. What do these people have in common?
  • 4. What do these people have in common?
  • 5. What do these people have in common?
  • 6. What do these people have in common?
  • 8. PageRank (believe it or not) These are the top-8 actors of the Hollywood graph according to PageRank
  • 9. PageRank (believe it or not) These are the top-8 actors of the Hollywood graph according to PageRank Who is going to tell that is better?
  • 10. PageRank (believe it or not) These are the top-8 actors of the Hollywood graph according to PageRank Who is going to tell that is better?
  • 11. PageRank (believe it or not) These are the top-8 actors of the Hollywood graph according to PageRank Who is going to tell that is better?
  • 13. PageRank sucks! (or NOT?) The Hollywood graph we used contains 2,000,000 nodes. Most of them are completely unknown!
  • 14. PageRank sucks! (or NOT?) The Hollywood graph we used contains 2,000,000 nodes. Most of them are completely unknown! PageRank is not singling out the best actors...
  • 15. PageRank sucks! (or NOT?) The Hollywood graph we used contains 2,000,000 nodes. Most of them are completely unknown! PageRank is not singling out the best actors... ...but still it is not pointing to random individuals, is it?
  • 16. The glass is half empty
  • 17. The glass is half empty Link Analysis sucks
  • 18. The glass is half empty Link Analysis sucks The graph itself does not contain useful information, in general
  • 19. The glass is half empty Link Analysis sucks The graph itself does not contain useful information, in general Of little (or no) help for IR
  • 20. The glass is half full
  • 21. The glass is half full Link Analysis is good, but you cannot expect any centrality index to work on any network
  • 22. The glass is half full Link Analysis is good, but you cannot expect any centrality index to work on any network Probably PageRank is scarcely useful on Hollywood, but maybe other measures would work like a charm
  • 23. The glass is half full Link Analysis is good, but you cannot expect any centrality index to work on any network Probably PageRank is scarcely useful on Hollywood, but maybe other measures would work like a charm Betweenness? Closeness? Katz? ...
  • 24. The glass is half full Link Analysis is good, but you cannot expect any centrality index to work on any network Probably PageRank is scarcely useful on Hollywood, but maybe other measures would work like a charm Betweenness? Closeness? Katz? ... A problem here: some indices are computationa!y unfeasible on large networks!
  • 26. The grand plan Centrality indices / link analysis has been with us for 60+ years
  • 27. The grand plan Centrality indices / link analysis has been with us for 60+ years IR, sociology, psychology all have given their contributions...
  • 28. The grand plan Centrality indices / link analysis has been with us for 60+ years IR, sociology, psychology all have given their contributions... ...turning it into a jungle
  • 30. My point, today By contract, I have to convince you that networks contain a big deal of information useful for IR
  • 31. My point, today By contract, I have to convince you that networks contain a big deal of information useful for IR But you have to use them in a proper way (i.e., to compute suitable indices)
  • 32. My point, today By contract, I have to convince you that networks contain a big deal of information useful for IR But you have to use them in a proper way (i.e., to compute suitable indices) And more often than not this calls for new algorithms because of their size (and sometimes density)
  • 33. Centrality in social sciences a historical account
  • 34. Centrality in social sciences a historical account First works by Bavelas at MIT (1948)
  • 35. Centrality in social sciences a historical account First works by Bavelas at MIT (1948) This sparked countless works (Bavelas 1951; Katz 1953; Shaw 1954; Beauchamp 1965; Mackenzie 1966; Burgess 1969; Anthonisse 1971; Czapiel 1974...) that Freeman (1979) tried to summarize concluding that:
  • 36. Centrality in social sciences a historical account First works by Bavelas at MIT (1948) This sparked countless works (Bavelas 1951; Katz 1953; Shaw 1954; Beauchamp 1965; Mackenzie 1966; Burgess 1969; Anthonisse 1971; Czapiel 1974...) that Freeman (1979) tried to summarize concluding that: several measures are o"en only vaguely related to the intuitive ideas they purport to index, and many are so complex that it is difficult or impossible to discover what, if anything, they are measuring
  • 37. The 1990s revival Link Analysis Ranking
  • 38. The 1990s revival Link Analysis Ranking With the advent of search engines, there was a strong revamp of centrality (LAR in this context)
  • 39. The 1990s revival Link Analysis Ranking With the advent of search engines, there was a strong revamp of centrality (LAR in this context) New scenarios
  • 40. The 1990s revival Link Analysis Ranking With the advent of search engines, there was a strong revamp of centrality (LAR in this context) New scenarios • graphs are directed mainly
  • 41. The 1990s revival Link Analysis Ranking With the advent of search engines, there was a strong revamp of centrality (LAR in this context) New scenarios • graphs are directed mainly • they are huge
  • 42. The 1990s revival Link Analysis Ranking With the advent of search engines, there was a strong revamp of centrality (LAR in this context) New scenarios • graphs are directed mainly • they are huge • new attention to efficiency
  • 43. The 1990s revival Link Analysis Ranking With the advent of search engines, there was a strong revamp of centrality (LAR in this context) New scenarios • graphs are directed mainly • they are huge • new attention to efficiency LAR is the ungrateful reincarnation of Centrality
  • 45. What a mess! Only few, brave guys tried to make some order in this mess!
  • 46. What a mess! Only few, brave guys tried to make some order in this mess! Noteworthy (in the IR context): Craswell, Upstill, Hawking (ADCS 2003); Najork, Zaragoza, Taylor (SIGIR 2007); Najork, Gollapudi, Panigrahy (WSDM 2009)
  • 47. What a mess! Only few, brave guys tried to make some order in this mess! Noteworthy (in the IR context): Craswell, Upstill, Hawking (ADCS 2003); Najork, Zaragoza, Taylor (SIGIR 2007); Najork, Gollapudi, Panigrahy (WSDM 2009) Guess that the others were scared by computational burden of classical measures
  • 49. Begin the begin The only point on which everybody seems to agree: the center of a star is more important than the other nodes
  • 50. Begin the begin The only point on which everybody seems to agree: the center of a star is more important than the other nodes But what does it make more important?
  • 52. Begin the begin I have the largest degree
  • 54. Begin the begin I am on most shortest paths
  • 56. Begin the begin I am maximally close to everybody
  • 58. A tale of three tribes
  • 59. A tale of three tribes Indices based on degree
  • 60. A tale of three tribes Indices based on degree Indices based on the number of paths or shortest paths (geodesics) passing through a vertex;
  • 61. A tale of three tribes Indices based on degree Indices based on the number of paths or shortest paths (geodesics) passing through a vertex; Indices based on distances from the vertex to the other vertices
  • 62. A tale of three tribes Indices based on degree Indices based on the number of paths or shortest paths (geodesics) passing through a vertex; Indices based on distances from the vertex to the other vertices Let me call these indices geometric
  • 63. Three dogs strive for a bone, and the fourth runs away with it...
  • 64. Three dogs strive for a bone, and the fourth runs away with it... The advent of Link Analysis pushed for a third (winning) tribe: spectral indices
  • 65. Three dogs strive for a bone, and the fourth runs away with it... The advent of Link Analysis pushed for a third (winning) tribe: spectral indices Some of the geometric indices have also a spectral (equivalent) definition
  • 66. 1) The degree tribe
  • 67. 1) The degree tribe (In-)Degree centrality: the number of incoming links cdeg(x) = d (x)
  • 68. 1) The degree tribe (In-)Degree centrality: the number of incoming links Careful: when dealing with directed networks, some indices present two variants (e.g., in-degree vs. out- degree), the ones based on incoming paths being more interesting cdeg(x) = d (x)
  • 69. 2) The path tribe
  • 70. 2) The path tribe Betweenness centrality (Anthonisse 1971): cbetw(x) = X y,z6=x yz(x) yz
  • 71. 2) The path tribe Betweenness centrality (Anthonisse 1971): Fraction of shortest paths from y to z passing through x cbetw(x) = X y,z6=x yz(x) yz
  • 72. 2) The path tribe Betweenness centrality (Anthonisse 1971): cbetw(x) = X y,z6=x yz(x) yz
  • 73. 2) The path tribe Betweenness centrality (Anthonisse 1971): Katz centrality (Katz 1953): cKatz(x) = 1X t=0 ↵t ⇧x(t) cbetw(x) = X y,z6=x yz(x) yz
  • 74. 2) The path tribe Betweenness centrality (Anthonisse 1971): Katz centrality (Katz 1953): # of paths of length t ending in x cKatz(x) = 1X t=0 ↵t ⇧x(t) cbetw(x) = X y,z6=x yz(x) yz
  • 75. 2) The path tribe Betweenness centrality (Anthonisse 1971): Katz centrality (Katz 1953): cKatz(x) = 1X t=0 ↵t ⇧x(t) cbetw(x) = X y,z6=x yz(x) yz
  • 77. 3) The distance tribe Closeness centrality (Bavelas 1950): cclos(x) = 1 P y d(y, x)
  • 78. 3) The distance tribe Closeness centrality (Bavelas 1950): Distance from y to x cclos(x) = 1 P y d(y, x)
  • 79. 3) The distance tribe Closeness centrality (Bavelas 1950): cclos(x) = 1 P y d(y, x)
  • 80. 3) The distance tribe Closeness centrality (Bavelas 1950): Lin centrality (Lin 1976): cclos(x) = 1 P y d(y, x) cLin(x) = canReach(x)2 P y d(y, x)
  • 81. 3) The distance tribe Closeness centrality (Bavelas 1950): Lin centrality (Lin 1976): cclos(x) = 1 P y d(y, x) cLin(x) = canReach(x)2 P y d(y, x)
  • 82. 3) The distance tribe Closeness centrality (Bavelas 1950): Lin centrality (Lin 1976): The summation is over all y such that d(y,x)<∞ cclos(x) = 1 P y d(y, x) cLin(x) = canReach(x)2 P y d(y, x)
  • 83. 3) The distance tribe Closeness centrality (Bavelas 1950): Lin centrality (Lin 1976): The summation is over all y such that d(y,x)<∞ cclos(x) = 1 P y d(y, x) cLin(x) = canReach(x)2 P y d(y, x) Completely neglected by the literature
  • 84. 3) The distance tribe a new member
  • 85. 3) The distance tribe a new member Give a warm welcome to Harmonic centrality: charm(x) = X y6=x 1 d(y, x)
  • 86. 3) The distance tribe a new member Give a warm welcome to Harmonic centrality: The denormalized reciprocal of the harmonic mean of a! distances (even ∞) charm(x) = X y6=x 1 d(y, x)
  • 87. 3) The distance tribe a new member Give a warm welcome to Harmonic centrality: The denormalized reciprocal of the harmonic mean of a! distances (even ∞) Inspired by the use the the harmonic mean in (Marchiori, Latora 2000) charm(x) = X y6=x 1 d(y, x)
  • 88. 3) The distance tribe a new member Give a warm welcome to Harmonic centrality: The denormalized reciprocal of the harmonic mean of a! distances (even ∞) Inspired by the use the the harmonic mean in (Marchiori, Latora 2000) Probably already appeared somewhere (e.g., quoted for undirected graphs in Tore Opsahl’s blog) charm(x) = X y6=x 1 d(y, x)
  • 90. 4) The spectral tribe All based on the eigenstructure of some graph-related matrix
  • 92. PageRank (Brin, Page, Motwani, Winograd 1999) The idea is to start from the basic equation... cpr(x) = X y!x cpr(y) d+(y)
  • 93. PageRank (Brin, Page, Motwani, Winograd 1999) The idea is to start from the basic equation... ...with an adjustment to make it have a unique solution (and more)... cpr(x) = X y!x cpr(y) d+(y) cpr(x) = ↵ X y!x cpr(y) d+(y) + 1 ↵ n
  • 94. PageRank (Brin, Page, Motwani, Winograd 1999) The idea is to start from the basic equation... ...with an adjustment to make it have a unique solution (and more)... cpr(x) = X y!x cpr(y) d+(y) cpr(x) = ↵ X y!x cpr(y) d+(y) + 1 ↵ n
  • 95. PageRank (Brin, Page, Motwani, Winograd 1999) The idea is to start from the basic equation... ...with an adjustment to make it have a unique solution (and more)... It is the dominant eigenvector of cpr(x) = X y!x cpr(y) d+(y) cpr(x) = ↵ X y!x cpr(y) d+(y) + 1 ↵ n ↵Gr + (1 ↵)1T 1/n
  • 96. PageRank (Brin, Page, Motwani, Winograd 1999) The idea is to start from the basic equation... ...with an adjustment to make it have a unique solution (and more)... It is the dominant eigenvector of Katz is the dominant eigenvector of cpr(x) = X y!x cpr(y) d+(y) cpr(x) = ↵ X y!x cpr(y) d+(y) + 1 ↵ n ↵Gr + (1 ↵)1T 1/n ↵G + (1 ↵)eT 1/n
  • 98. Seeley index (Seeley 1949) It is essentially like PageRank with no damping factor; equivalently, the stable state of the natural random walk on G: cSeeley(x) = ✓ lim t!1 1 n (Gr)t ◆ x
  • 99. Seeley index (Seeley 1949) It is essentially like PageRank with no damping factor; equivalently, the stable state of the natural random walk on G: cSeeley(x) = ✓ lim t!1 1 n (Gr)t ◆ x
  • 100. Seeley index (Seeley 1949) It is essentially like PageRank with no damping factor; equivalently, the stable state of the natural random walk on G: It is obtained as the limit of PageRank when the damping goes to 1 cSeeley(x) = ✓ lim t!1 1 n (Gr)t ◆ x
  • 101. Seeley index (Seeley 1949) It is essentially like PageRank with no damping factor; equivalently, the stable state of the natural random walk on G: It is obtained as the limit of PageRank when the damping goes to 1 In general it is a dominant eigenvector of cSeeley(x) = ✓ lim t!1 1 n (Gr)t ◆ x Gr
  • 103. HITS (Kleinberg 1997) The idea is to start from the system: cHauth(x) = X y!x cHhub(y) cHhub(x) = X x!y cHauth(y)
  • 104. HITS (Kleinberg 1997) The idea is to start from the system: HITS centrality is defined to be the “authoritativeness” score cHauth(x) = X y!x cHhub(y) cHhub(x) = X x!y cHauth(y)
  • 105. HITS (Kleinberg 1997) The idea is to start from the system: HITS centrality is defined to be the “authoritativeness” score It is a dominant eigenvector of cHauth(x) = X y!x cHhub(y) cHhub(x) = X x!y cHauth(y) GT G
  • 106. HITS (, SALSA etc.)
  • 107. HITS (, SALSA etc.) WARNING: These measures were proposed exactly for ranking results in hyperlinked collections
  • 108. HITS (, SALSA etc.) WARNING: These measures were proposed exactly for ranking results in hyperlinked collections Should be applied not to the whole graph, but to a suitable subgraph derived from the query
  • 109. HITS (, SALSA etc.) WARNING: These measures were proposed exactly for ranking results in hyperlinked collections Should be applied not to the whole graph, but to a suitable subgraph derived from the query How the subgraph is derived is very relevant for effectiveness (Najork, Gollapudi, Panighray 2009)
  • 110. HITS (, SALSA etc.) WARNING: These measures were proposed exactly for ranking results in hyperlinked collections Should be applied not to the whole graph, but to a suitable subgraph derived from the query How the subgraph is derived is very relevant for effectiveness (Najork, Gollapudi, Panighray 2009) Not really the central point here, though...
  • 112. SALSA (Lempel, Moran 2001) The idea is to start from the system: cSauth(x) = X y!x cShub(y) d+(y) cShub(x) = X x!y cSauth(y) d (y)
  • 113. SALSA (Lempel, Moran 2001) The idea is to start from the system: SALSA centrality is defined to be the “authoritativeness” score cSauth(x) = X y!x cShub(y) d+(y) cShub(x) = X x!y cSauth(y) d (y)
  • 114. SALSA (Lempel, Moran 2001) The idea is to start from the system: SALSA centrality is defined to be the “authoritativeness” score It is a dominant eigenvector of cSauth(x) = X y!x cShub(y) d+(y) cShub(x) = X x!y cSauth(y) d (y) GT c Gr
  • 115. How? How to assess centrality
  • 116. How? How to assess centrality Axiomatic approach
  • 117. How? How to assess centrality Axiomatic approach Ground-truth approach
  • 118. How? How to assess centrality Axiomatic approach Ground-truth approach IR approach
  • 119. How? How to assess centrality Axiomatic approach Ground-truth approach IR approach Computational feasibility approach
  • 121. Axiomatic lens start from some minimal mathematical requirements
  • 123. Axiomatic Sensitivity to size k p Two disjoint (or very far) components of a single network
  • 124. Axiomatic Sensitivity to size When k or p goes to ∞, the nodes of the corresponding subnetwork must become more important k p Two disjoint (or very far) components of a single network
  • 126. Axiomatic Sensitivity to density The blue and the red node have the same importance (the two rings have the same size!)
  • 127. Axiomatic Sensitivity to density The blue and the red node have the same importance (the two rings have the same size!)
  • 128. Axiomatic Sensitivity to density Densifying the left-hand side, we expect the red node to become more important than the blue node
  • 130. Axiomatic Monotonicity Adding an arc increases the importance of the target node G x y G’ x y
  • 133. An axiomatic slaughter Size Density Monot. Degree only k yes yes Betweennes s only p no no Katz only k yes yes Closeness no no no Lin only k no no Harmonic yes yes yes PageRank no yes yes Seeley no yes no HITS only k yes no SALSA no yes no
  • 134. An axiomatic slaughter Size Density Monot. Degree only k yes yes Betweennes s only p no no Katz only k yes yes Closeness no no no Lin only k no no Harmonic yes yes yes PageRank no yes yes Seeley no yes no HITS only k yes no SALSA no yes no
  • 135. An axiomatic slaughter Size Density Monot. Degree only k yes yes Betweennes s only p no no Katz only k yes yes Closeness no no no Lin only k no no Harmonic yes yes yes PageRank no yes yes Seeley no yes no HITS only k yes no SALSA no yes no
  • 136. An axiomatic slaughter Size Density Monot. Degree only k yes yes Betweennes s only p no no Katz only k yes yes Closeness no no no Lin only k no no Harmonic yes yes yes PageRank no yes yes Seeley no yes no HITS only k yes no SALSA no yes no
  • 138. Ground-truth lens (mostly: anecdoctic / comparative) check them against real (social?) networks on which you have some ground truth about importance/centrality/...
  • 139. Hollywood: PageRank Ron Jeremy Adolf Hitler Lloyd Kaufman George W. Bush Ronald Reagan Bill Clinton Martin Sheen Debbie Rochon
  • 140. Hollywood: PageRank Ron Jeremy Adolf Hitler Lloyd Kaufman George W. Bush Ronald Reagan Bill Clinton Martin Sheen Debbie Rochon
  • 141. Hollywood: Degree William Shatner Bess Flowers Martin Sheen Ronald Reagan George Clooney Samuel Jackson Robin Williams Tom Hanks
  • 142. Hollywood: Degree William Shatner Bess Flowers Martin Sheen Ronald Reagan George Clooney Samuel Jackson Robin Williams Tom Hanks
  • 143. Hollywood: Betweenness Adolf Hitler Lloyd Kaufman Ron Jeremy Tony Robinson Olu Jacobs Max von Sydow Udo Kier George W. Bush
  • 144. Hollywood: Betweenness Adolf Hitler Lloyd Kaufman Ron Jeremy Tony Robinson Olu Jacobs Max von Sydow Udo Kier George W. Bush
  • 145. Hollywood: Katz William Shatner Martin Sheen George Clooney Robin WilliamsTom Hanks Ronald Reagan Bruce Willis Samuel Jackson
  • 146. Hollywood: Katz William Shatner Martin Sheen George Clooney Robin WilliamsTom Hanks Ronald Reagan Bruce Willis Samuel Jackson
  • 147. Hollywood: Closeness Lina Tjeng Ryan VillapotoAnh Loan Nguyen Thi Chad Reed Bjorn van Wenum J.P. Ramackers Herbert Sydney R.D. Nicholson
  • 148. Hollywood: Closeness Lina Tjeng Ryan VillapotoAnh Loan Nguyen Thi Chad Reed Bjorn van Wenum J.P. Ramackers Herbert Sydney R.D. Nicholson
  • 149. Hollywood: Closeness Lina Tjeng Ryan VillapotoAnh Loan Nguyen Thi Chad Reed Bjorn van Wenum J.P. Ramackers Herbert Sydney R.D. Nicholson Isolatednodeshavelargestcentrality
  • 150. Hollywood: Lin George Clooney Samuel Jackson Martin Sheen Dennis Hopper Antonio Banderas Madonna Michael Douglas Tom Cruise
  • 151. Hollywood: HITS William Shatner Robin Williams Bruce WillisTom Hanks Michael Douglas Cameron Diaz Harrison Ford Pierce Brosnan
  • 152. Hollywood: Harmonic George Clooney Samuel Jackson Sharon Stone Tom Hanks Martin Sheen Dennis Hopper Antonio Banderas Madonna
  • 153. What about the web? .uk Top Ten
  • 154. What about the web? .uk Top Ten
  • 155. What about the web? .uk Top Ten PageRank Katz Lin Harmonic http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/ http://www.names.co.uk/ http://www.kelkoo.co.uk/b/a/ kc_top_searches_charts.html http://news.bbc.co.uk/ http://www.dti.gov.uk/ http://www.names.co.uk/hosting.html http://www.kelkoo.co.uk/b/a/ co_2765_128501-company-information- pages.html http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.names.co.uk/email.html http://www.kelkoo.co.uk/b/a/sm_site- map.html?displayType=alpha http://www.google.co.uk/ http://www.google.co.uk/ http://www.names.co.uk/ controlpanel.html http://www.kelkoo.co.uk/b/a/ co_5199_128501-how-to-use- kelkoo.html http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/ http://www.scdc.org.uk/ http://www.kelkoo.co.uk/b/a/ co_2120_128501-shopping-guides- price-comparison-on-kelkoo-uk.html http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/ http://www.freelyricsearch.co.uk/ index.html http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/ http://www.247partypeople.co.uk/ login.asp http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/ http://www.becs.co.uk/catalog/ cookie_usage.php http://cgi1.ebay.co.uk/aw-cgi/ eBayISAPI.dll?TimeShow http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
  • 156. What about the web? .uk Top Ten PageRank Katz Lin Harmonic http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/ http://www.names.co.uk/ http://www.kelkoo.co.uk/b/a/ kc_top_searches_charts.html http://news.bbc.co.uk/ http://www.dti.gov.uk/ http://www.names.co.uk/hosting.html http://www.kelkoo.co.uk/b/a/ co_2765_128501-company-information- pages.html http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.names.co.uk/email.html http://www.kelkoo.co.uk/b/a/sm_site- map.html?displayType=alpha http://www.google.co.uk/ http://www.google.co.uk/ http://www.names.co.uk/ controlpanel.html http://www.kelkoo.co.uk/b/a/ co_5199_128501-how-to-use- kelkoo.html http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/ http://www.scdc.org.uk/ http://www.kelkoo.co.uk/b/a/ co_2120_128501-shopping-guides- price-comparison-on-kelkoo-uk.html http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/ http://www.freelyricsearch.co.uk/ index.html http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/ http://www.247partypeople.co.uk/ login.asp http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/ http://www.becs.co.uk/catalog/ cookie_usage.php http://cgi1.ebay.co.uk/aw-cgi/ eBayISAPI.dll?TimeShow http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
  • 157. What about the web? .uk Top Ten PageRank Katz Lin Harmonic http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/ http://www.names.co.uk/ http://www.kelkoo.co.uk/b/a/ kc_top_searches_charts.html http://news.bbc.co.uk/ http://www.dti.gov.uk/ http://www.names.co.uk/hosting.html http://www.kelkoo.co.uk/b/a/ co_2765_128501-company-information- pages.html http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.names.co.uk/email.html http://www.kelkoo.co.uk/b/a/sm_site- map.html?displayType=alpha http://www.google.co.uk/ http://www.google.co.uk/ http://www.names.co.uk/ controlpanel.html http://www.kelkoo.co.uk/b/a/ co_5199_128501-how-to-use- kelkoo.html http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/ http://www.scdc.org.uk/ http://www.kelkoo.co.uk/b/a/ co_2120_128501-shopping-guides- price-comparison-on-kelkoo-uk.html http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/ http://www.freelyricsearch.co.uk/ index.html http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/ http://www.247partypeople.co.uk/ login.asp http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/ http://www.becs.co.uk/catalog/ cookie_usage.php http://cgi1.ebay.co.uk/aw-cgi/ eBayISAPI.dll?TimeShow http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
  • 158. What about the web? .uk Top Ten PageRank Katz Lin Harmonic http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.kelkoo.co.uk/ http://www.bbc.co.uk/accessibility/ http://news.bbc.co.uk/ http://www.names.co.uk/ http://www.kelkoo.co.uk/b/a/ kc_top_searches_charts.html http://news.bbc.co.uk/ http://www.dti.gov.uk/ http://www.names.co.uk/hosting.html http://www.kelkoo.co.uk/b/a/ co_2765_128501-company-information- pages.html http://www.dti.gov.uk/ http://www.direct.gov.uk/en/index.htm http://www.names.co.uk/email.html http://www.kelkoo.co.uk/b/a/sm_site- map.html?displayType=alpha http://www.google.co.uk/ http://www.google.co.uk/ http://www.names.co.uk/ controlpanel.html http://www.kelkoo.co.uk/b/a/ co_5199_128501-how-to-use- kelkoo.html http://www.guardian.co.uk/ http://www.bbc.co.uk/accessibility/ http://www.scdc.org.uk/ http://www.kelkoo.co.uk/b/a/ co_2120_128501-shopping-guides- price-comparison-on-kelkoo-uk.html http://www.homeoffice.gov.uk/ http://www.homeoffice.gov.uk/ http://www.freelyricsearch.co.uk/ index.html http://www.ebay.co.uk/ http://www.statistics.gov.uk/ http://www.statistics.gov.uk/ http://www.247partypeople.co.uk/ login.asp http://www.top50scrappers.co.uk/ http://www.bbc.co.uk/privacy/ http://www.bbc.co.uk/privacy/ http://www.becs.co.uk/catalog/ cookie_usage.php http://cgi1.ebay.co.uk/aw-cgi/ eBayISAPI.dll?TimeShow http://www.bbc.co.uk/info/ http://www.bbc.co.uk/info/
  • 159. How do they compare?
  • 160. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ
  • 161. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ most geometric indices and HITS are rather correlated to one another
  • 162. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ
  • 163. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ Katz, degree and SALSA are also highly correlated
  • 164. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ
  • 165. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ Betweenness does not correlate to anything
  • 166. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ
  • 167. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS between PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9709 0.9287 0.8627 0.9005 0.4357 0.5526 0.5512 0.5170 0.5034 0.3699 0.4225 0.5074 Katz 1/4 0.9709 1.0000 0.9609 0.8957 0.8719 0.4638 0.5816 0.5801 0.5476 0.5026 0.3448 0.3964 0.4801 Katz 1/2 0.9287 0.9609 1.0000 0.9369 0.8291 0.4965 0.6157 0.6139 0.5849 0.4952 0.3108 0.3605 0.4416 Katz 3/4 0.8627 0.8957 0.9369 1.0000 0.7630 0.5488 0.6697 0.6676 0.6478 0.4811 0.2633 0.3098 0.3865 SALSA 0.9005 0.8719 0.8291 0.7630 1.0000 0.5371 0.4519 0.4504 0.4185 0.4692 0.4496 0.5042 0.5924 closeness 0.4357 0.4638 0.4965 0.5488 0.5371 1.0000 0.8503 0.8508 0.7366 0.3293 0.1529 0.1813 0.2319 harmonic 0.5526 0.5816 0.6157 0.6697 0.4519 0.8503 1.0000 0.9925 0.8694 0.3929 0.0752 0.1041 0.1549 Lin 0.5512 0.5801 0.6139 0.6676 0.4504 0.8508 0.9925 1.0000 0.8680 0.3916 0.0753 0.1041 0.1546 HITS 0.5170 0.5476 0.5849 0.6478 0.4185 0.7366 0.8694 0.8680 1.0000 0.3645 0.0518 0.0780 0.1249 between 0.5034 0.5026 0.4952 0.4811 0.4692 0.3293 0.3929 0.3916 0.3696 1.0000 0.4852 0.4909 0.4923 PR 1/4 0.3699 0.3448 0.3108 0.2633 0.4496 0.1529 0.0752 0.0753 0.0518 0.4852 1.0000 0.9317 0.8276 PR 1/2 0.4225 0.3964 0.3605 0.3098 0.5042 0.1813 0.1041 0.1041 0.0780 0.4909 0.9317 1.0000 0.8952 PR 3/4 0.5074 0.4801 0.4416 0.3865 0.5924 0.2319 0.1549 0.1546 0.1249 0.4923 0.8276 0.8952 1.0000 Table 1: Hollywood 1 Hollywood Kendall’s τ PageRank stands alone
  • 168. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ
  • 169. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ
  • 170. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ Betweenness could not be computed because of graph size (106M nodes)
  • 171. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ
  • 172. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ The same correlations as with Hollywood, but even more emphasized
  • 173. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ
  • 174. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ Exception: HITS used to be correlated with the geometric indices, while now it is alone
  • 175. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ
  • 176. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9053 0.9024 0.9000 0.9114 0.1950 0.2060 0.2060 0.2853 0.6449 0.6161 0.5784 Katz 1/4 0.9053 1.0000 0.9957 0.9922 0.8141 0.2059 0.2268 0.2265 0.2773 0.5917 0.5820 0.5595 Katz 1/2 0.9024 0.9957 1.0000 0.9966 0.8112 0.2078 0.2289 0.2286 0.2776 0.5914 0.5827 0.5611 Katz 3/4 0.9000 0.9922 0.9966 1.0000 0.8089 0.2094 0.2307 0.2303 0.2778 0.5911 0.5832 0.5622 SALSA 0.9114 0.8141 0.8112 0.8089 1.0000 0.1782 0.1617 0.1619 0.1917 0.6445 0.6146 0.5747 closeness 0.1950 0.2059 0.2078 0.2094 0.1782 1.0000 0.8592 0.8566 0.3817 0.1518 0.1746 0.2004 harmonic 0.2060 0.2268 0.2289 0.2307 0.1617 0.8592 1.0000 0.9694 0.4253 0.1503 0.1770 0.2072 Lin 0.2060 0.2265 0.2286 0.2303 0.1619 0.8566 0.9694 1.0000 0.4272 0.1503 0.1768 0.2069 HITS 0.2853 0.2773 0.2776 0.2778 0.1917 0.3817 0.4253 0.4272 1.0000 0.1529 0.1484 0.1415 PR 1/4 0.6449 0.5917 0.5914 0.5911 0.6445 0.1518 0.1503 0.1503 0.1529 1.0000 0.9182 0.8289 PR 1/2 0.6161 0.5820 0.5827 0.5832 0.6146 0.1746 0.1770 0.1768 0.1484 0.9182 1.0000 0.9088 PR 3/4 0.5784 0.5595 0.5611 0.5622 0.5747 0.2004 0.2072 0.2069 0.1415 0.8289 0.9088 1.0000 Table 1: .uk 1 .uk (May 2007 snapshot) Kendall’s τ A larger correlation between PageRank and Katz & degree
  • 178. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3508 0.7265 0.7596 0.8016 Katz 1/4 0.9522 1.0000 0.9489 0.8750 0.9522 0.5972 0.5875 0.5967 0.3984 0.6868 0.7179 0.7577 Katz 1/2 0.8982 0.9489 1.0000 0.9275 0.8982 0.6382 0.6338 0.6380 0.4491 0.6400 0.6690 0.7067 Katz 3/4 0.8242 0.8750 0.9275 1.0000 0.8242 0.6839 0.6910 0.6842 0.5213 0.5742 0.6005 0.6355 SALSA 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3505 0.7265 0.7596 0.8016 closeness 0.5521 0.5972 0.6382 0.6839 0.5521 1.0000 0.9458 0.9830 0.6539 0.3862 0.4040 0.4268 harmonic 0.5391 0.5875 0.6338 0.6910 0.5391 0.9458 1.0000 0.9471 0.7090 0.3612 0.3777 0.3992 Lin 0.5513 0.5967 0.6380 0.6842 0.5513 0.9830 0.9471 1.0000 0.6546 0.3852 0.4030 0.4257 HITS 0.3508 0.3984 0.4491 0.5213 0.3505 0.6539 0.7090 0.6546 1.0000 0.1689 0.1778 0.1917 PR 1/4 0.7265 0.6868 0.6400 0.5742 0.7265 0.3862 0.3612 0.3852 0.1689 1.0000 0.9520 0.8889 PR 1/2 0.7596 0.7179 0.6690 0.6005 0.7596 0.4040 0.3777 0.4030 0.1778 0.9520 1.0000 0.9363 PR 3/4 0.8016 0.7577 0.7067 0.6355 0.8016 0.4268 0.3992 0.4257 0.1917 0.8889 0.9363 1.0000 Table 1: Orkut 1 Orkut (2007 snapshot) Kendall’s τ
  • 179. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3508 0.7265 0.7596 0.8016 Katz 1/4 0.9522 1.0000 0.9489 0.8750 0.9522 0.5972 0.5875 0.5967 0.3984 0.6868 0.7179 0.7577 Katz 1/2 0.8982 0.9489 1.0000 0.9275 0.8982 0.6382 0.6338 0.6380 0.4491 0.6400 0.6690 0.7067 Katz 3/4 0.8242 0.8750 0.9275 1.0000 0.8242 0.6839 0.6910 0.6842 0.5213 0.5742 0.6005 0.6355 SALSA 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3505 0.7265 0.7596 0.8016 closeness 0.5521 0.5972 0.6382 0.6839 0.5521 1.0000 0.9458 0.9830 0.6539 0.3862 0.4040 0.4268 harmonic 0.5391 0.5875 0.6338 0.6910 0.5391 0.9458 1.0000 0.9471 0.7090 0.3612 0.3777 0.3992 Lin 0.5513 0.5967 0.6380 0.6842 0.5513 0.9830 0.9471 1.0000 0.6546 0.3852 0.4030 0.4257 HITS 0.3508 0.3984 0.4491 0.5213 0.3505 0.6539 0.7090 0.6546 1.0000 0.1689 0.1778 0.1917 PR 1/4 0.7265 0.6868 0.6400 0.5742 0.7265 0.3862 0.3612 0.3852 0.1689 1.0000 0.9520 0.8889 PR 1/2 0.7596 0.7179 0.6690 0.6005 0.7596 0.4040 0.3777 0.4030 0.1778 0.9520 1.0000 0.9363 PR 3/4 0.8016 0.7577 0.7067 0.6355 0.8016 0.4268 0.3992 0.4257 0.1917 0.8889 0.9363 1.0000 Table 1: Orkut 1 Orkut (2007 snapshot) Kendall’s τ The same correlation as in Hollywood, even if this time SALSA is also pretty correlated with PageRank as well
  • 180. degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3508 0.7265 0.7596 0.8016 Katz 1/4 0.9522 1.0000 0.9489 0.8750 0.9522 0.5972 0.5875 0.5967 0.3984 0.6868 0.7179 0.7577 Katz 1/2 0.8982 0.9489 1.0000 0.9275 0.8982 0.6382 0.6338 0.6380 0.4491 0.6400 0.6690 0.7067 Katz 3/4 0.8242 0.8750 0.9275 1.0000 0.8242 0.6839 0.6910 0.6842 0.5213 0.5742 0.6005 0.6355 SALSA 1.0000 0.9522 0.8982 0.8242 1.0000 0.5521 0.5391 0.5513 0.3505 0.7265 0.7596 0.8016 closeness 0.5521 0.5972 0.6382 0.6839 0.5521 1.0000 0.9458 0.9830 0.6539 0.3862 0.4040 0.4268 harmonic 0.5391 0.5875 0.6338 0.6910 0.5391 0.9458 1.0000 0.9471 0.7090 0.3612 0.3777 0.3992 Lin 0.5513 0.5967 0.6380 0.6842 0.5513 0.9830 0.9471 1.0000 0.6546 0.3852 0.4030 0.4257 HITS 0.3508 0.3984 0.4491 0.5213 0.3505 0.6539 0.7090 0.6546 1.0000 0.1689 0.1778 0.1917 PR 1/4 0.7265 0.6868 0.6400 0.5742 0.7265 0.3862 0.3612 0.3852 0.1689 1.0000 0.9520 0.8889 PR 1/2 0.7596 0.7179 0.6690 0.6005 0.7596 0.4040 0.3777 0.4030 0.1778 0.9520 1.0000 0.9363 PR 3/4 0.8016 0.7577 0.7067 0.6355 0.8016 0.4268 0.3992 0.4257 0.1917 0.8889 0.9363 1.0000 Table 1: Orkut 1 Orkut (2007 snapshot) Kendall’s τ
  • 182. IR lens (using TREC .gov2) use centrality (in isolation or combined with textual features) to rerank query results and see how good (bad) they do
  • 184. TREC .gov2 150 queries (query title words, in AND; with stemming, no stopword elimination)
  • 185. TREC .gov2 150 queries (query title words, in AND; with stemming, no stopword elimination) Generated the result graph using the method described by Najork et al. 2009 (a variant of Kleinberg’s HITS graph, taking a in-links and b out-links)
  • 186. TREC .gov2 150 queries (query title words, in AND; with stemming, no stopword elimination) Generated the result graph using the method described by Najork et al. 2009 (a variant of Kleinberg’s HITS graph, taking a in-links and b out-links) Considered many combinations: here I present only the cases a=b=0 (i.e., subgraph induced by the result set)
  • 187. TREC .gov2 150 queries (query title words, in AND; with stemming, no stopword elimination) Generated the result graph using the method described by Najork et al. 2009 (a variant of Kleinberg’s HITS graph, taking a in-links and b out-links) Considered many combinations: here I present only the cases a=b=0 (i.e., subgraph induced by the result set) With or without intra-host links
  • 190. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Betweenness 0.0584 0.0595 0.0577 0.0588 Closeness 0.1101 0.1061 0.1121 0.1168 PageRank (best) 0.1107 0.1078 0.1295 0.1347 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 Katz (best) 0.1242 0.1228 0.1262 0.1297 Lin 0.1295 0.1308 0.1248 0.1286 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293
  • 191. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Betweenness 0.0584 0.0595 0.0577 0.0588 Closeness 0.1101 0.1061 0.1121 0.1168 PageRank (best) 0.1107 0.1078 0.1295 0.1347 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 Katz (best) 0.1242 0.1228 0.1262 0.1297 Lin 0.1295 0.1308 0.1248 0.1286 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293
  • 192. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Betweenness 0.0584 0.0595 0.0577 0.0588 Closeness 0.1101 0.1061 0.1121 0.1168 PageRank (best) 0.1107 0.1078 0.1295 0.1347 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 Katz (best) 0.1242 0.1228 0.1262 0.1297 Lin 0.1295 0.1308 0.1248 0.1286 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293
  • 193. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Betweenness 0.0584 0.0595 0.0577 0.0588 Closeness 0.1101 0.1061 0.1121 0.1168 PageRank (best) 0.1107 0.1078 0.1295 0.1347 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 Katz (best) 0.1242 0.1228 0.1262 0.1297 Lin 0.1295 0.1308 0.1248 0.1286 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293
  • 194. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Betweenness 0.0584 0.0595 0.0577 0.0588 Closeness 0.1101 0.1061 0.1121 0.1168 PageRank (best) 0.1107 0.1078 0.1295 0.1347 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 Katz (best) 0.1242 0.1228 0.1262 0.1297 Lin 0.1295 0.1308 0.1248 0.1286 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293
  • 196. Intra-host links? Keep them or throw them away?
  • 197. Intra-host links? Keep them or throw them away? Most indices get better if you throw them away...
  • 198. Intra-host links? Keep them or throw them away? Most indices get better if you throw them away... Throwing such links away injects a lot of information, but apparently harmonic doesn’t need it!
  • 199. Intra-host links? Keep them or throw them away? Most indices get better if you throw them away... Throwing such links away injects a lot of information, but apparently harmonic doesn’t need it! ...but harmonic is better (and best of all) with the whole thing!
  • 200. .uk (May 2007 snapshot) Kendall’s τ with no intra-host links degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955 Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958 closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892 harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946 Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946 HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727 PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997 PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999 PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000 Table 1: .uk-nn 1
  • 201. .uk (May 2007 snapshot) Kendall’s τ with no intra-host links Everything becomes correlated. Why??? degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955 Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958 closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892 harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946 Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946 HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727 PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997 PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999 PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000 Table 1: .uk-nn 1
  • 202. .uk (May 2007 snapshot) Kendall’s τ with no intra-host links degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955 Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958 closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892 harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946 Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946 HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727 PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997 PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999 PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000 Table 1: .uk-nn 1
  • 203. .uk (May 2007 snapshot) Kendall’s τ with no intra-host links Because most nodes have degree 0 (and hence they also have all the other scores at a tie) degree Katz 1/4 Katz 1/2 Katz 3/4 SALSA closeness harmonic Lin HITS PR 1/4 PR 1/2 PR 3/4 degree 1.0000 0.9995 0.9995 0.9995 0.9965 0.9883 0.9984 0.9984 0.8767 0.9956 0.9956 0.9955 Katz 1/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 1/2 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 Katz 3/4 0.9995 1.0000 1.0000 1.0000 0.9960 0.9870 0.9986 0.9978 0.8763 0.9952 0.9953 0.9953 SALSA 0.9965 0.9960 0.9960 0.9960 1.0000 0.9904 0.9950 0.9950 0.8718 0.9959 0.9959 0.9958 closeness 0.9883 0.9870 0.9870 0.9870 0.9904 1.0000 0.9859 0.9876 0.8714 0.9894 0.9893 0.9892 harmonic 0.9984 0.9986 0.9986 0.9986 0.9950 0.9859 1.0000 0.9984 0.8759 0.9944 0.9945 0.9946 Lin 0.9984 0.9978 0.9978 0.9978 0.9950 0.9876 0.9984 1.0000 0.8759 0.9945 0.9946 0.9946 HITS 0.8767 0.8763 0.8763 0.8763 0.8718 0.8714 0.8759 0.8759 1.0000 0.8727 0.8727 0.8727 PR 1/4 0.9956 0.9952 0.9952 0.9952 0.9959 0.9894 0.9944 0.9945 0.8727 1.0000 0.9998 0.9997 PR 1/2 0.9956 0.9953 0.9953 0.9953 0.9959 0.9893 0.9945 0.9946 0.8727 0.9998 1.0000 0.9999 PR 3/4 0.9955 0.9953 0.9953 0.9953 0.9958 0.9892 0.9946 0.9946 0.8727 0.9997 0.9999 1.0000 Table 1: .uk-nn 1
  • 205. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 206. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 207. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842 d (x) · canReach(x)
  • 208. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 209. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 210. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842 d (x) · connected(x)
  • 211. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 212. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 213. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 214. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842
  • 215. P@10 and NDCG@10 All linksAll links Inter-host links onlyInter-host links only P@10 NDCG@10 P@10 NDCG@10 Weighted degree 0.1356 0.1373 0.1262 0.1295 SALSinA 0.1349 0.1357 0.1255 0.1318 Degree 0.1208 0.1091 0.1248 0.1283 SALSA 0.1221 0.1194 0.1282 0.1384 HITS 0.1349 0.1364 0.1107 0.1179 Harmonic 0.1430 0.1449 0.1262 0.1293 BM25 0.5644 0.5842 0.5644 0.5842 TODO Study features in combination
  • 217. Computational feasibility lens which indices are computable effectively on large networks; consider also parallelizability / distributability...
  • 220. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 221. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 222. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 223. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 224. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 225. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 226. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 227. Best algorithms so far How? Progr. Degree Trivial O(1) - - +++ Betweennes s O(nm) [Brandes 2001] - no - Katz Iterative (Gauss-Seidel) Fast convergence yes + PageRank Iterative (PM, GS, Jacobi...) Fast convergence yes +++ Seeley Iterative (PM) Slow convergence no - HITS Iterative (PM) Slow convergence no - SALSA Direct O(n+) - no +++
  • 229. Geometric What about geometic measures, like closeness, Lin and harmonic?
  • 231. Computing harmonic Let us take harmonic as an example
  • 232. Computing harmonic Let us take harmonic as an example But how easy is it to compute? charm(x) = X y6=x 1 d(y, x)
  • 233. Computing harmonic Let us take harmonic as an example But how easy is it to compute? ...or... charm(x) = X y6=x 1 d(y, x) charm(x) = 1X t=1 |Bt(x)| |Bt 1(x)| t
  • 234. Computing harmonic Let us take harmonic as an example But how easy is it to compute? ...or... charm(x) = X y6=x 1 d(y, x) charm(x) = 1X t=1 |Bt(x)| |Bt 1(x)| t Ball of radius t about x
  • 235. Computing harmonic Let us take harmonic as an example But how easy is it to compute? ...or... charm(x) = X y6=x 1 d(y, x) charm(x) = 1X t=1 |Bt(x)| |Bt 1(x)| t
  • 238. Computing by diffusion Clearly Moreover B0(x) = {x} Bt+1(x) = {x} [ [ x!y Bt(y)
  • 239. Computing by diffusion Clearly Moreover So one needs just one single sequential scan of the graph to compute the balls at the next iteration B0(x) = {x} Bt+1(x) = {x} [ [ x!y Bt(y)
  • 240. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 241. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 242. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 243. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 244. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 245. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 246. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 247. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 248. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 249. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 250. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 251. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺
  • 252. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺☺ ☺☺ ☺ ☺
  • 253. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺☺ ☺ ☺ ☺ ☺
  • 254. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺☺ ☺ ☺ ☺ ☺
  • 255. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺☺ ☺ ☺☺ ☺
  • 256. A round of updates ☺ ☺ ☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺☺ ☺ ☺☺ ☺ ☺☺ ☺
  • 257. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺ ☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺ ☺☺ ☺☺☺
  • 258. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺ ☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺ ☺☺ ☺☺☺
  • 259. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺ ☺☺ ☺☺☺
  • 260. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺☺☺☺ ☺ ☺ ☺☺ ☺ ☺☺ ☺☺☺
  • 261. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺☺☺☺ ☺ ☺ ☺☺ ☺ ☺☺ ☺☺☺
  • 262. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺☺☺☺ ☺ ☺☺☺ ☺ ☺☺ ☺☺☺
  • 263. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺☺☺☺ ☺ ☺☺☺ ☺ ☺☺ ☺☺☺
  • 264. Another round... ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺ ☺ ☺☺ ☺ ☺☺☺ ☺ ☺ ☺☺ ☺☺☺☺ ☺ ☺☺☺ ☺ ☺☺☺☺☺
  • 266. Easy but expensive Each set uses linear space; overall quadratic
  • 267. Easy but expensive Each set uses linear space; overall quadratic Impossible!
  • 268. Easy but expensive Each set uses linear space; overall quadratic Impossible! But what if we use approximate sets?
  • 269. Easy but expensive Each set uses linear space; overall quadratic Impossible! But what if we use approximate sets? Idea: use probabilistic counters, which represent sets but answer just to “size?” questions
  • 270. Easy but expensive Each set uses linear space; overall quadratic Impossible! But what if we use approximate sets? Idea: use probabilistic counters, which represent sets but answer just to “size?” questions Very small! With 40 bits you can count up to 4 billion with a standard deviation of 6%
  • 272. Use the Force HyperBall (http://webgraph.dsi.unimi.it/) does the job
  • 273. Use the Force HyperBall (http://webgraph.dsi.unimi.it/) does the job Fully exploits multicore architectures; uses broadword microparallelization
  • 274. Use the Force HyperBall (http://webgraph.dsi.unimi.it/) does the job Fully exploits multicore architectures; uses broadword microparallelization Works like a charm on networks with hundreds of millions of nodes
  • 276. What is HyperBall It uses the diffusion idea to compute (at the same time):
  • 277. What is HyperBall It uses the diffusion idea to compute (at the same time): Lin + Closeness + Harmonic + Number of reachable nodes...
  • 278. What is HyperBall It uses the diffusion idea to compute (at the same time): Lin + Closeness + Harmonic + Number of reachable nodes... It employs Flajolet’s HyperLogLog counters to store sets
  • 279. What is HyperBall It uses the diffusion idea to compute (at the same time): Lin + Closeness + Harmonic + Number of reachable nodes... It employs Flajolet’s HyperLogLog counters to store sets More on HyperBall in the next few slides...
  • 281. Main trick Choose an approximate set such that unions can be computed quickly
  • 282. Main trick Choose an approximate set such that unions can be computed quickly ANF [Palmer et al., KDD ’02] uses Martin–Flajolet (MF) counters (log n+c space)
  • 283. Main trick Choose an approximate set such that unions can be computed quickly ANF [Palmer et al., KDD ’02] uses Martin–Flajolet (MF) counters (log n+c space) We use HyperLogLog counters [Flajolet et al.,2007] (loglog n space)
  • 284. Main trick Choose an approximate set such that unions can be computed quickly ANF [Palmer et al., KDD ’02] uses Martin–Flajolet (MF) counters (log n+c space) We use HyperLogLog counters [Flajolet et al.,2007] (loglog n space) MF counters can be combined with an OR
  • 285. Main trick Choose an approximate set such that unions can be computed quickly ANF [Palmer et al., KDD ’02] uses Martin–Flajolet (MF) counters (log n+c space) We use HyperLogLog counters [Flajolet et al.,2007] (loglog n space) MF counters can be combined with an OR We use broadword programming to combine HyperLogLog counters quickly!
  • 287. HyperLogLog counters Instead of actually counting, we observe a statistical feature of a set (think stream) of elements
  • 288. HyperLogLog counters Instead of actually counting, we observe a statistical feature of a set (think stream) of elements The feature: the number of trailing zeroes of the value of a very good hash function
  • 289. HyperLogLog counters Instead of actually counting, we observe a statistical feature of a set (think stream) of elements The feature: the number of trailing zeroes of the value of a very good hash function We keep track of the maximum m (log log n bits!)
  • 290. HyperLogLog counters Instead of actually counting, we observe a statistical feature of a set (think stream) of elements The feature: the number of trailing zeroes of the value of a very good hash function We keep track of the maximum m (log log n bits!) The number of distinct elements ∝ 2m
  • 291. HyperLogLog counters Instead of actually counting, we observe a statistical feature of a set (think stream) of elements The feature: the number of trailing zeroes of the value of a very good hash function We keep track of the maximum m (log log n bits!) The number of distinct elements ∝ 2m Important: the counter of stream AB is simply the maximum of the counters of A and B!
  • 293. Other ideas We keep track of modifications: we do not maximize with unmodified counters
  • 294. Other ideas We keep track of modifications: we do not maximize with unmodified counters Systolic computation: each modified set signals back to predecessors that something is going to happen (much fewer updates!)
  • 295. Other ideas We keep track of modifications: we do not maximize with unmodified counters Systolic computation: each modified set signals back to predecessors that something is going to happen (much fewer updates!) Multicore exploitation by decomposition: a task is updating just 1000 counters (almost linear scaling)
  • 297. Footprint Scalability: a minimum of 20 bytes per node
  • 298. Footprint Scalability: a minimum of 20 bytes per node On a 2TiB machine, 100 billion nodes
  • 299. Footprint Scalability: a minimum of 20 bytes per node On a 2TiB machine, 100 billion nodes Graph structure is accessed by memory-mapping in a compressed form (WebGraph)
  • 300. Footprint Scalability: a minimum of 20 bytes per node On a 2TiB machine, 100 billion nodes Graph structure is accessed by memory-mapping in a compressed form (WebGraph) Pointers to the graph are stored using succinct lists (Elias-Fano representation)
  • 303. Performance On a 177K nodes Hadoop: 2875s per iteration [Kang, Papadimitriou, Sun and H. Tong, 2011]
  • 304. Performance On a 177K nodes Hadoop: 2875s per iteration [Kang, Papadimitriou, Sun and H. Tong, 2011] HyperBall on this laptop: 70s per iteration
  • 305. Performance On a 177K nodes Hadoop: 2875s per iteration [Kang, Papadimitriou, Sun and H. Tong, 2011] HyperBall on this laptop: 70s per iteration On a 32-core workstation: 23s per iteration
  • 306. Performance On a 177K nodes Hadoop: 2875s per iteration [Kang, Papadimitriou, Sun and H. Tong, 2011] HyperBall on this laptop: 70s per iteration On a 32-core workstation: 23s per iteration On ClueWeb09 (4.8G nodes, 8G arcs) on a 40-core workstation: 141m (avg. 40s per iteration)
  • 307. Convergence 5 15 25 35 45 55 65 75 85 95 0.0000.0020.0040.006 Harmonic centrality # runs Relativeerror
  • 308. And when the Force is not enough?
  • 309. And when the Force is not enough? Diffusion processes are easily distributable
  • 310. And when the Force is not enough? Diffusion processes are easily distributable A Pregel-like implementation of HyperANF is underway (by Sebastian Schelter, TU Berlin)
  • 311. Did you see the light?
  • 312. Did you see the light? No? Me neither... But we have a path
  • 313. Did you see the light? No? Me neither... But we have a path Some things have been ruled out, at least
  • 314. Did you see the light? No? Me neither... But we have a path Some things have been ruled out, at least Some others, that were neglected, have found revenge
  • 315. Did you see the light? No? Me neither... But we have a path Some things have been ruled out, at least Some others, that were neglected, have found revenge Some new ones seem to be promising
  • 316.