A talk I gave at the King's College doctoral school with Mathieu Jacomy on the notion of social borders and the advantage of adding continuity to social research through digital navigation.
10. Extensive and
intensive data
Venturini, Tommaso and Bruno Latour, 2010
“The Social Fabric: Digital Traces and Quali-Quantitative Methods”
in Proceedings of Future En Seine 2009, pp. 87–101
Paris: Editions Future en Seine.
11. This is a world where massive amounts of data and applied
mathematics replace every other tool that might be brought to bear. Out
with every theory of human behavior, from linguistics to sociology.
Forget taxonomy, ontology, and psychology. Who knows why people do
what they do? The point is they do it, and we can track and measure it
with unprecedented fidelity. With enough data, the numbers speak for
themselves.
Chris Anderson
http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
The end of theory?
12. Askitas, N., & Zimmermann, K. (2011)
Health and Well-Being in the Crisis
IZA Discussion Paper
Beware: digital data
is not your data!
19. Extensive and
intensive data
Latour, Bruno, Pablo Jensen, Tommaso Venturini,
Sébastian Grauwin and Dominique Boullier, 2012.
“‘The Whole Is Always Smaller than Its Parts’:
A Digital Test of Gabriel Tardes’ Monads.”
The British Journal of Sociology 63(4), pp. 590–615
22. An ontological and
emergent boundary
The collective self is not a simple epiphenomenon of
its morphologic base, precisely as the individual self
is not a simple efflorescence of the nervous system.
For the collective self to appear, a sui generis
synthesis of individual self has to be produced. This
synthesis creates a world of feelings, ideas, images
that, once come to life, follow their own laws.
Emile Durkheim, 1912
Les formes élémentaires de
la vie religieuse
23. …that may hide other
more relevant boundaries
zgrossbart.github.io/hboecycling/
24. From boundaries
to boundary work
Fences make good neighbors
Gieryn, Thomas F. (1983)
Boundary-work
the demarcation of science from non-science
American Sociological Review 48(6): 781–795
Demarcation is as
much a practical
problem for scientists
as an analytical
problem for
sociologists and
philosophers
25. The lesson of ANT
(and of constructivism)
It is not that in collective life there are no boundaries
(between micro and macro, science and politics…),
It is that all boundaries are constantly
constructed, de-constructed and re-constructed
(and this work is the object of social research)
Venturini, T. (2010).
Diving in magma: how to explore controversies with actor-network theory.
Public Understanding of Science, 19(3), 258–273.
27. Part IV Becoming sensitive to the
differences in the density of
association
28. 3 discontinuities
• 1. In data:
intensive data / extensive data
• 2. In methods:
situating / aggregating
• 3. In theory:
micro-interactions / macro-structure
29. 3 discontinuities
to cross
• 1. In data:
intensive data / extensive data
Digital traceability and computation (data geeks)
• 2. In methods:
situating / aggregating
Datascape navigation (designers)
• 3. In theory:
micro-interactions / macro-structure
A non-emergentist theory of action (actor-network theorists)
31. A network (graph)
is not a network (actor-network)
Actor-Network Theory / Visual Network Analysis
Actors and networks have the same properties (they are the same) ≠ Networks are composite while nodes are indivisible and uncombinable
Different mediations (can) have different effects ≠ All edges have the same effect (possibly with different weights)
Different actors (can) have different association potential ≠ All nodes have equal linking potential
Actor-networks are always seen from one or more specific viewpoints ≠ Networks are usually seen from above/outside
What counts is change ≠ Networks are static
32. A question
of resonance
A diagram of a network, then, does not look
like a network but maintains the same
qualities of relations – proximities, degrees
of separation, and so forth – that a network
also requires in order to form.
Resemblance should here be considered a
resonating rather than a hierarchy (a form)
that arranges signifiers and signified within
a sign
(p. 24).
Munster, A. (2013).
An Aesthesia of Networks
Cambridge Mass.: MIT Press
33. The fabric of
(cooked) rice
Roland Barthes (1970)
The Empire of Signs
Cooked rice (whose absolutely
special identity is attested by a
special name, which is not that of
raw rice) can be defined only by a
contradiction of substance; it is at
once cohesive and detachable; its
substantial destination is the
fragment, the clump; the volatile
conglomerate… it constitutes in the
picture a compact whiteness,
granular (contrary to that of our
bread) and yet friable:
what comes to the table, dense and stuck together, comes undone at a touch
of the chopsticks, though without ever scattering, as if division occurred only to produce
still another irreducible cohesion (pp. 12-14).
34. The fabric of
collective life
Jacob L. Moreno, April 3, 1933
The New York Times
Social life is continuous but not homogenous
Doing social research is becoming sensitive to
the differences in the density of association
37. Force-vectors’ magic trick
Jacomy, M., Venturini, T., Heymann, S. & Bastian, M. (2014)
ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network
Visualization Designed for the Gephi Software.
PLoS ONE, 9(6)
38. Networks as maps
London Underground 1920 map
homepage.ntlworld.com/clivebillson/tube/tube.html - www.fourthway.co.uk/tfl.html
39. Networks as maps
London Underground 1933 map (Harry Beck)
homepage.ntlworld.com/clivebillson/tube/tube.html - www.fourthway.co.uk/tfl.html
43. Visual network analysis questions
A. Position (force-vector spatialization)
1. Node density
Where are structural holes (under-populated regions)?
Where are clusters and sub-clusters (over-populated regions)?
Which are the largest and most cohesive clusters?
2. Relative position
Which nodes/clusters are globally and locally central?
Which nodes/clusters are global and local bridges (between clusters)?
B. Size (ranking by in-degree / out-degree)
3. Node connectivity
Which nodes are the authorities (receive most connections)?
Which nodes are the hubs (originate most connections)?
C. Color (color by partition)
4. Distribution
Is typology coherent with topology (partitions coincide with clusters)?
Which are the exceptions (‘misplaced nodes’)?
59. Visual network analysis
Venturini, T., Jacomy, M. and De Carvalho Pereira, D.
Visual Network Analysis:
The example of the Rio+20 online debate
(working paper)
A few years ago, Chris Anderson published a controversial article in Wired magazine, in which he announced ‘The End of Theory’:
“At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right. Google's founding philosophy is that we don't know why this page is better than that one: If the statistics of incoming links say it is, that's good enough. No semantic or causal analysis is required. That's why Google can translate languages without actually "knowing" them (given equal corpus data, Google can translate Klingon into Farsi as easily as it can translate French into German). And why it can match ads to content without any knowledge or assumptions about the ads or the content”.
This argument is misleading for the reason I gave in the previous paragraph: learning something from digital traces requires separating information from noise. But things are even more complicated, because there is no way to tell what is information and what is noise without knowing how the traces have been constructed.
An example will make my argument clearer. Some years ago, I was striving with some colleagues to make sense of Google Insights for Search data and use them for social research. Reading the literature, we stumbled on an amazing discussion paper by Askitas and Zimmermann (2011), in which the two economists claimed to have found a striking correlation between the unemployment rate and searches for the side effects of antidepressants. The result was compelling: when the unemployment rate began to rise because of the economic crisis of 2008, so did the queries for antidepressants' side effects.
Trying to reproduce these findings, however, we noticed something strange: it was not the names of the antidepressants that matched unemployment, but the expression ‘side effects’. At first we thought that people might be taking more medicines in general when they lose their jobs, but then we found out that other words had the same curve, in particular the word ‘template’, which also started being searched more at the end of 2008.
We were striving to make sense of this when it occurred to us that in late 2008 Google enabled its ‘suggest’ feature by default. This feature is meant to auto-complete common search expressions: when you ask Google about a dish, it will ask you if you want to know its recipe; when you ask about a motivational letter, it will ask you if you are looking for a template; and when you ask about a drug, it will ask you if you want to know about its side effects. The curve we were looking at, in other words, was most likely tracing a change in Google's interface rather than a change in people's behavior.
The main aim of this course is to teach you how to avoid jumping from the frying pan of positivism to the fire of relativism.
Or, as they say in Thailand, escape a tiger, meet a crocodile.
marc lombardi
In the next chapter, we will see how the power of networks as tools for computing, visualizing and manipulating information, combined with the growing availability of data brought by digital traceability, could transform the very roots of the social sciences. The advantages of networks, however, should not lead us to neglect the many differences that exist between actor-network theory and network analysis. Four in particular make classic network analysis unfit to operationalize actor-network theory.
The first and possibly the most important is that, while in ANT ‘networks’ and ‘actors’ are the same thing, in network analysis they have completely different properties: nodes are indivisible and impenetrable (as atoms were supposed to be in physics, before smaller elementary particles took their place), whereas networks are by definition composite. The second and third difficulties come from the lack of differentiation in standard graph theory. In ANT, different associations can have different effects (opposing someone does not have the same effect as supporting them), while in network analysis edges can be of different types but they all have the same mathematical effect (possibly with a different weight or direction). Likewise, whereas in ANT actors differ in their potential for association (remember the example of the shepherd, the dog and the fence, which are able to associate with the sheep in very different ways), in network analysis all nodes connect in the same way. Finally, ANT is a theory of change: what counts in it is the transformation of the actors and their relations. Network analysis, at least in its standard form, was developed for static networks and handles dynamics very poorly.
But networks are also maps. One of the first proofs of this was provided in 1933, when the sociologist Jacob Moreno published this image in the New York Times. The network portrays the friendship relations in an elementary school class. The title of the article reads “Emotions Mapped by a New Geography”, explicitly stating that the purpose of the visualization is to represent social relations as in a geographical map. Once you know that the triangles in the image represent the boys of the class and the circles represent the girls, the gender separation becomes evident, as well as the first (romantic?) relationship within the class.
In this course, however, we will spatialize networks using a family of algorithms called ‘force-vector’. These algorithms arrange the nodes in space by simulating a physical system in which nodes repel each other while edges bind them together like springs.
When the algorithm is launched, the nodes are moved by these opposing forces until they reach a state of equilibrium.
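To make the mechanism concrete, here is a minimal sketch of a force-vector layout in plain Python. It is not ForceAtlas2 and not what Gephi runs: the inverse-square repulsion, the linear spring attraction, the circular starting positions and every parameter value (`repulsion`, `spring`, `step`, `max_move`) are assumptions chosen for illustration, and the six-node graph is made up.

```python
import math

def force_layout(nodes, edges, iterations=1000, repulsion=1.0,
                 spring=0.5, step=0.05, max_move=0.1):
    """Toy force-directed layout (illustrative only): returns {node: (x, y)}."""
    # Deterministic start: nodes evenly spaced on a unit circle, in list order.
    pos = {}
    for i, n in enumerate(nodes):
        angle = 2 * math.pi * i / len(nodes)
        pos[n] = [math.cos(angle), math.sin(angle)]

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart, more weakly with distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist

        # Attraction: each edge pulls its two endpoints together like a spring.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            force[a][0] -= spring * dx
            force[a][1] -= spring * dy
            force[b][0] += spring * dx
            force[b][1] += spring * dy

        # Move each node along its net force, capping the displacement.
        for n in nodes:
            fx, fy = force[n]
            move = step * math.hypot(fx, fy)
            scale = step if move <= max_move else step * max_move / move
            pos[n][0] += fx * scale
            pos[n][1] += fy * scale

    return {n: tuple(p) for n, p in pos.items()}

def mean_distance(pos, pairs):
    """Average Euclidean distance over a list of node pairs."""
    return sum(math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])
               for a, b in pairs) / len(pairs)

# Made-up example: two triangles joined by a single bridge edge (a1-b1).
nodes = ["a2", "a3", "a1", "b1", "b2", "b3"]
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("a1", "b1")]
non_edges = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]
             if (a, b) not in edges and (b, a) not in edges]

pos = force_layout(nodes, edges)
print(mean_distance(pos, edges), mean_distance(pos, non_edges))
```

In this toy run, connected nodes should end up closer to each other on average than unconnected ones, which is the property that lets us read proximity in the layout as a sign of association.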
Networks can be interpreted as geographical maps because the proximity of their points is significant: it means something. Of course there is a capital difference between geographical maps and networks. In the former, the position of the points depends on a system of coordinates defined before, and independently of, the points. In the latter, on the contrary, it is the nodes and their relations that define a space that has no autonomous existence.
The clearest illustration of this difference can be drawn from the history of underground maps. Until the 1930s, underground maps were designed by placing the stations according to their geographical coordinates and then drawing the lines that connected them.
Then came Harry Beck, who understood that he could improve legibility by positioning nodes according to their connectivity, rather than their coordinates. Nowadays all underground maps are designed this way. This does not mean, of course, that distance in underground maps has lost all meaning: only that its meaning has changed from a geographical distance to a distance in connectivity.
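One simple way to make 'distance in connectivity' concrete is to count the number of hops on the shortest path between two stations. The sketch below does this with a breadth-first search; the station names and the tiny network are hypothetical, and hop count is only one possible reading of connectivity distance, not the method behind any actual underground map.

```python
from collections import deque

def hop_distance(adjacency, start, goal):
    """Number of edges on a shortest path from start to goal (None if unreachable)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == goal:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None

# Hypothetical stations: one line A-B-C-D, plus a branch from B to E.
adjacency = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}

print(hop_distance(adjacency, "A", "D"))  # three hops: A-B, B-C, C-D
print(hop_distance(adjacency, "E", "C"))  # two hops: E-B, B-C
```

Two stations can be far apart on the ground and still be one hop away from each other; this measure captures the second kind of distance and ignores the first.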
… it is easy to identify the areas which contain few or no nodes, also called structural holes …
- Central clusters (located in the middle of the network), because centrality in a spatialized graph is a sign of high and highly diverse connectivity.
- Bridging clusters (located in-between two clusters), because these clusters play a crucial role in allowing the circulation of things in the network.
- The in-degree, corresponding to the number of incoming edges (the number of connections pointing toward the node). The in-degree of a node is also called its ‘authority score’, because receiving many connections is generally correlated with the node being considered ‘important’ or ‘remarkable’ by the other nodes of the network.
- The out-degree, corresponding to the number of outgoing edges (the number of connections starting from the node). The out-degree of a node is also called its ‘hub score’. Hubs are important in networks because they play a crucial role in the circulation of information.
Of course, in-degree and out-degree can only be computed in directed graphs (graphs in which the connections have a direction). In non-directed graphs (such as a graph of friendship, if we assume that friendship is always mutual), it is however possible to compute the degree of nodes (the number of edges connected to each node).
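The three measures above can be computed by simple counting. The following sketch uses only the Python standard library; the edge list is made up for illustration, and each pair is read as (source, target).

```python
from collections import Counter

def degrees(edges):
    """Return (in_degree, out_degree) dictionaries for a directed edge list."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in edges:
        out_degree[source] += 1
        in_degree[target] += 1
    # Make sure every node appears in both dictionaries, even with a count of 0.
    nodes = set(in_degree) | set(out_degree)
    return ({n: in_degree[n] for n in nodes},
            {n: out_degree[n] for n in nodes})

# Hypothetical directed graph: each pair reads "source links to target".
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "c"), ("a", "d")]

in_degree, out_degree = degrees(edges)
authority = max(in_degree, key=in_degree.get)   # node receiving most connections
hub = max(out_degree, key=out_degree.get)       # node originating most connections

# Ignoring direction, the degree of a node is the number of edges touching it.
degree = {n: in_degree[n] + out_degree[n] for n in in_degree}

print(authority, hub, degree)
```

In this toy graph, `c` receives the most connections and `a` originates the most, so they are respectively the authority and the hub in the sense used above.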
But it is also interesting to observe whether topology and classification are consistent (if most of the nodes of a given type are located within the same clusters and, conversely, if clusters are formed by nodes of the same type).
If topology and classification are consistent, it is then interesting to zoom in on the exceptions and take a closer look at the nodes that have an unusual position compared to the other nodes of the same type.
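As a rough sketch of this check, suppose each node has already been assigned to a cluster (by eye or by any clustering method) and to a type. We can then measure, for each cluster, the share of nodes belonging to its majority type and list the exceptions. The node names, cluster labels and types below are invented, and the 'purity' measure is just one simple way to express consistency.

```python
from collections import Counter, defaultdict

def typology_vs_topology(cluster_of, type_of):
    """For each cluster: its majority type, the share of that type, and the exceptions."""
    members = defaultdict(list)
    for node, cluster in cluster_of.items():
        members[cluster].append(node)

    report = {}
    for cluster, nodes in members.items():
        counts = Counter(type_of[n] for n in nodes)
        majority, size = counts.most_common(1)[0]
        report[cluster] = {
            "majority_type": majority,
            "purity": size / len(nodes),
            "misplaced": sorted(n for n in nodes if type_of[n] != majority),
        }
    return report

# Hypothetical data: where each node sits (cluster) and what it is (type).
cluster_of = {"n1": "left", "n2": "left", "n3": "left", "n4": "left",
              "n5": "right", "n6": "right", "n7": "right"}
type_of = {"n1": "blog", "n2": "blog", "n3": "blog", "n4": "institution",
           "n5": "institution", "n6": "institution", "n7": "institution"}

report = typology_vs_topology(cluster_of, type_of)
for cluster, summary in report.items():
    print(cluster, summary)
```

Here the 'left' cluster is made mostly of blogs, and `n4` is the misplaced node worth a closer look: an institution sitting among the blogs.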