Engines of Order
Social Media and the Rise of Algorithmic Knowing
Bernhard Rieder
Universiteit van Amsterdam
Mediastudies Department
Starting point
"Algorithms play an increasingly important role in selecting what information is considered
most relevant to us, a crucial feature of our participation in public life." (Gillespie 2015)
From search engines to social media and beyond, the impression is that
socially and culturally relevant tasks are delegated to and performed by
algorithms.
Because algorithms draw together many different things, there are many
ways of beginning to address them.
New forms of "knowing" that have quite different means of producing
knowledge and of making it performative. Can we think of it as a "style of
reasoning" (Hacking 1992)?
My approach to the question
As researcher and software developer with the Digital Methods Initiative, I
build and apply tools that contribute to "knowing" what is happening on
social media, most recently:
☉ Netvizz (Facebook data extraction), Rieder 2013
https://apps.facebook.com/netvizz/
☉ DMI-TCAT (DMI Twitter Capture and Analysis Toolkit), Borra & Rieder 2014
https://github.com/digitalmethodsinitiative/dmi-tcat/
This project is more closely aligned with a book project that investigates
the conceptual content and history of algorithmic information processing.
A critical approach is necessary both for my own role in algorithmic
knowledge production and for understanding how social media make use
of algorithms on various levels.
Algorithms used by computational researchers and platforms are similar.
Algorithmic configurations
[Diagram: input => algorithm => output]
input (system in use: interface elements, contents, users and uses)
- capture
- formalization
- semantics
- latent order
algorithm
- techniques
- parameters
- internal states
output (system in use: interface elements, contents, users and uses)
- display
- interactivity
- performativity
- revealed order
Example: users tweeting, clicking, navigating, reading, etc. (loads of data) => some math => 10 trending phrases (results) => possible effects
Very large numbers and variety in users,
contents, purposes, arrangements, etc.
"[Commensuration] standardizes
relations between disparate things and
reduces the relevance of context."
(Espeland & Stevens 1998)
Platforms like Twitter
provide opportunities for
creating connections
between defined types of
entities (users, messages,
hashtags, resources, etc.).
They formalize and channel
expression, exchange, and
coordination.
"You cannot reply to a
hashtag."
"Simply put, a system can only
track what it can capture, and it
can only capture information that
can be expressed within a
grammar of action that has been
imposed upon the activity." (Agre
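A minimal sketch of what a "grammar of action" could look like in code, assuming a toy platform: the entity types follow the slide (users, messages, hashtags, resources), but the verbs, the `GRAMMAR` table, and the `capture` function are hypothetical and not taken from any real platform API. The point it illustrates is that an action the grammar cannot express is never recorded at all.

```python
from dataclasses import dataclass

# Hypothetical grammar: which verbs may be applied to which entity types.
GRAMMAR = {
    "user": {"follow", "mention"},
    "message": {"reply", "retweet", "favorite"},
    "hashtag": {"use", "search"},
    "resource": {"link"},
}


@dataclass(frozen=True)
class Action:
    actor: str
    verb: str
    target_type: str
    target: str


def capture(log, actor, verb, target_type, target):
    """Record the action if the grammar can express it; return True if captured."""
    if verb not in GRAMMAR.get(target_type, set()):
        return False  # outside the grammar: nothing is stored
    log.append(Action(actor, verb, target_type, target))
    return True


log = []
capture(log, "alice", "reply", "message", "msg_42")       # expressible, so captured
capture(log, "alice", "reply", "hashtag", "#algorithms")  # "you cannot reply to a hashtag"
```

After both calls the log holds a single entry: the second action leaves no trace because the grammar has no slot for it.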
Using social media and the Web
is like living in a survey.
Or rather, in an experiment,
since so many parameters are
controlled.
Grammars need to become
more pervasive or more explicit
("deeper") so that more
semantic data can be captured.
Data pools in social media are
centralized and searchable.
Data is used by social media
platforms at various instances
for various goals.
Data is made accessible to
varying degrees to various actors
for various reasons.
Taxonomy of the Encyclopédie
(Diderot and d'Alembert ca. 1783)
United States Census Form, 1910
Knowing the many
Similar experience of "too many" in different fields:
☉ Maxwell (1859): even if atoms are fully deterministic, we could never model the
behavior of a gas by observing individual atoms; => statistical mechanics
☉ Foucault (2004): epidemics, economic dynamics, etc. cast doubt on the family as a
model for understanding and governing society; => "population" and social sciences
☉ Bush (1945): "There is a growing mountain of research." => information retrieval
Between 1850 and 1940 many techniques to think and analyze "the many"
are introduced, looking at the structure and dynamics of interacting
ensembles.
The "erosion of determinism" (Hacking 1981) means that modes of
description are increasingly probabilistic and oriented towards "acting in
an uncertain world" (Callon, Lascoumes, Barthe 2001) that can be "tamed"
(Hacking 1990) through statistical techniques.
Social media deal with various kinds of "the many" (users,
messages, products, ideas, etc.) and strive to provide
answers to questions like who to talk to, what to read,
where to go, what to buy, etc. in the form of decisions.
They make use of various techniques to algorithmically
reduce complexity to allow continuous activity.
From classification to calculation
Classifications as information infrastructures (cf. Bowker and Star 1999) that
orient practice through normalization, standardization, selective
discarding, reformulation, positioning, navigational structuring, etc.
are still relevant.
But various forms of process and calculation are making things much,
much more complicated.
We are currently seeing a race toward understanding the semantics of
expression, behavior, and cultural artifacts.
There are different ways of
producing "semantic" data.
Users are not only filling up the fields, they are
increasingly participating in shaping formalizations.
From classifications to classification procedures.
"One of the simplest ways to
derive information about a user
is to look at the way he uses the
system." (Rich 1983)
Let's not forget that some of
the valuable data are simply a
byproduct of people using the
system.
What are "personal data"?
"Facebook Likes can be used to automatically
and accurately predict a range of highly
sensitive personal attributes including:
sexual orientation, ethnicity, religious and
political views, personality traits,
intelligence, happiness, use of addictive
substances, parental separation, age, and
gender." (Kosinski, Stillwell, Graepel 2013)
The data used in this study does not
even include friends' likes.
Prediction is determination of
likelihood based on knowledge of
previous events.
Data is analyzed and made
performative immediately inside
of the system.
New categories can be derived
from other data and are instantly
made actionable.
Recapitulation
By providing functionality through ever more fine-grained grammars of
action (and other data capturing techniques), social media platforms
accumulate loads of structured and unstructured data.
The semantization of data in relation to operational contexts (through
formalization, derivation, etc.) begins early on.
Classification is deeply caught up with calculation and process.
Algorithmic configurations
[Diagram: input (capture, formalization, semantics; latent order) => algorithm (techniques, parameters, internal states) => output (display, interactivity, performativity; revealed order), with a system in use (interface elements, contents, users and uses) on both sides]
Algorithmic configurations imply "distributed calculative agencies" (Callon
and Muniesa 2005) that run through the system and its users.
The data arriving at the algorithm has both latent meaning and order: it is
related to actual practices and not random noise.
Correlation Coefficient (Galton 1885)
Linear Regression (Pearson 1901)
Sociogram (Moreno 1934)
Sociometric Matrix (Forsyth and Katz 1946)
Word-Pair Linkages (Luhn 1959)
Semantic Road Maps (Doyle 1961)
My Facebook Network
Friendship connections
My Facebook Network
Friends and their 'likes'
My 290 Friends liked at
least 20588 objects
My Facebook Network
Mapping users
according to 'likes'
My Facebook Network
Classifying users
according to 'likes'
My Facebook Network
My post-demographic
profile or sphere
Techniques
There are many different algorithmic techniques that have complex
histories. Each technique reveals the data from a specific angle, but they
are highly plastic and can be easily combined.
They may be reductionist (e.g. graph theory: everything is a point or line),
but also very generative (unlimited number of "views").
Many techniques focus on the relationship between populations and
individuals. In social media, units can be qualified in terms of other units.
All of these techniques are "revealing" (in the sense of Heidegger) the data:
they show certain aspects of the latent order in certain ways; they make truth
that is caught up in a position towards the world, a finality.
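One way to picture "qualifying units in terms of other units" is to describe each user by the set of objects they liked and then compare users pairwise. The sketch below uses Jaccard similarity as one possible measure; the choice of measure, the function names, and the toy data are assumptions for illustration, not the technique behind the network maps shown in the slides.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of liked objects two users have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def like_similarities(likes):
    """Return {(user1, user2): similarity} for every pair of users."""
    return {
        (u, v): jaccard(likes[u], likes[v])
        for u, v in combinations(sorted(likes), 2)
    }


# Hypothetical toy data: each user is described only by what they liked.
likes = {
    "anna": {"page_jazz", "page_cycling", "page_amsterdam"},
    "ben": {"page_jazz", "page_cycling", "page_berlin"},
    "carla": {"page_football", "page_berlin"},
}
similarities = like_similarities(likes)
```

A different measure (cosine similarity, co-occurrence counts, and so on) would give a different "view" of the same likes, which is the sense in which each technique reveals the data from a specific angle.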
Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.25)
Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.40)
Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.55)
Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.70)
Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.85)
Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 1)
Parameters
Any somewhat complex technique reacts (strongly) to variation in
parameters and data. This means that without knowledge of parameters
and data, it is hard to understand/critique an algorithm.
A single parameter can encode a commitment to a specific theory of power
(PageRank at low α is "one person, one vote", at high α "patronage of the
powerful").
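The effect of α can be reproduced on a very small graph. The following is a sketch under stated assumptions: a plain power-iteration PageRank written from the textbook definition (not the code behind the figures), run on a hypothetical eight-node network in which "popular" receives two links from peripheral nodes while "protege" receives a single link from a well-connected "hub".

```python
def pagerank(links, alpha, iterations=100):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share (1 - alpha) / n.
        new_rank = {node: (1.0 - alpha) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = alpha * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += alpha * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical toy network: three nodes endorse "hub", "hub" endorses
# "protege", and two other nodes endorse "popular".
links = {
    "a": ["hub"], "b": ["hub"], "c": ["hub"],
    "hub": ["protege"],
    "d": ["popular"], "e": ["popular"],
}
low = pagerank(links, alpha=0.25)
high = pagerank(links, alpha=0.85)
```

At α = 0.25 "popular" outranks "protege" because it has more incoming links; at α = 0.85 the order flips because the single link from "hub" passes on more status. In this particular graph the crossover sits at α = 1/3.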
Parameters are now often set through continuous testing. They are one of
the places where empirical practices and operational goals can be brought
to converge automatically.
We move from "what should the formula be according to our ideas about
relevance?" to "what has our testing engine identified as the optimal
parameters given our operational goal of more user interaction?".
Whenever you read "n000 factors", machine learning techniques are at work.
Machine learning techniques (e.g. Bayesian filters,
maximum entropy classifiers, etc.) can learn to
"interpret" any input signal in relation to
categories, based on feedback ("supervision").
In these techniques, the state of the machine (i.e.
the statistical model) becomes the algorithm.
These self-optimizing, empirical machines are
becoming increasingly common.
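To make the idea that "the state of the machine becomes the algorithm" concrete, here is a minimal sketch of a Bayesian filter (multinomial naive Bayes with add-one smoothing). The class name, labels, and training messages are hypothetical. The code itself contains no rule about what counts as spam: the counts accumulated through feedback decide every classification.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesFilter:
    """Tiny multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()             # how often each label was seen
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocabulary = set()

    def train(self, tokens, label):
        """Feedback ("supervision"): associate the tokens with a label."""
        self.label_counts[label] += 1
        for token in tokens:
            self.token_counts[label][token] += 1
            self.vocabulary.add(token)

    def classify(self, tokens):
        """Return the label with the highest log-probability for the tokens."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            label_total = sum(self.token_counts[label].values())
            for token in tokens:
                numerator = self.token_counts[label][token] + 1
                score += math.log(numerator / (label_total + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training feedback.
spam_filter = NaiveBayesFilter()
spam_filter.train("win free prize now".split(), "spam")
spam_filter.train("free money offer".split(), "spam")
spam_filter.train("meeting agenda for monday".split(), "ham")
spam_filter.train("lunch on monday".split(), "ham")
```

Training the same class with different feedback yields a different classifier, even though not a single line of code changes.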
The "risk technology" is trained by associating "thousands of
pieces of data" with a probability of defaulting or not defaulting.
Every signal receives meaning as predictor for defaulting.
States
In digital media, we often need to do precious little to "make things
calculable", since everything has already been made so.
Algorithms are increasingly empirical knowledge machines that tie the
"real world" to operational modes of optimization and validation.
The epistemological commitment, then, is no longer to a theory or model,
but to a method for generating models.
The difference is thus not just between the "editorial" and the
"algorithmic" (Gillespie 2012), but also between "editorial algorithms" and
"generated algorithms".
"To date, the complexity of mobile and the disparate, closed
platforms that dominate it have caused most people to ignore the
possibility and benefits of A/B testing. […] To us at Taplytics this is
crazy. If you are developing on the web everything is calculated
and optimized and viewed in terms of hypotheses, significance
levels and confidence intervals. On mobile, however, for the past
6 years we have been living in the era of the 'artform' of mobile
apps, where things are viewed in terms of gut feel and shooting
from the hip." (Druxerman 2014)
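The "hypotheses, significance levels and confidence intervals" mentioned in the quote can be sketched with a standard two-proportion z-test. This is an illustrative assumption about how an A/B comparison might be evaluated, not a description of any particular testing product; the function name and the visitor numbers are invented.

```python
import math


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Compare two conversion rates; return (z, two-sided p-value, 95% CI of the difference)."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Under the null hypothesis both variants share one pooled rate.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se_pooled
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # 95% confidence interval for the difference, using the unpooled standard error.
    se_diff = math.sqrt(p_a * (1 - p_a) / visitors_a + p_b * (1 - p_b) / visitors_b)
    margin = 1.96 * se_diff
    return z, p_value, (p_b - p_a - margin, p_b - p_a + margin)


# Hypothetical experiment: variant B changes one interface parameter.
z, p_value, interval = two_proportion_z_test(100, 1000, 130, 1000)
```

With these invented numbers the difference is significant at the 5% level, so an automated testing engine would adopt variant B without anyone having to argue why it works better.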
Since the digital operational environment is fully
integrated, data collection, analysis, decision-making,
and execution are all folded into one.
These are engines of order.
Conclusions
Moving from classification to calculation implies a move from "thing
concepts" (Dingbegriffe) to "relational concepts" (Relationsbegriffe), or
from substance notions of knowledge to functional ones (cf. Cassirer
1910).
Markets, and in particular multi-sided markets (Rochet and Tirole 2004),
are a good analogue to algorithmic configurations on social media platforms.
Just like markets, algorithmic configurations are "places of truth" (Foucault
2004), not in that they show "the truth" but in that truth is produced as a
byproduct of their optimal functioning, e.g. the right price, the right
trending topics, the right number and type of stories shown, etc.
The right algorithm is the one that produces an optimal equilibrium
between user satisfaction and value extraction through advertising.
Conclusions
"The current mythology of big data is that with more data comes greater accuracy and truth. This
epistemological position is so seductive that many industries, from advertising to automobile
manufacturing, are repositioning themselves for massive data gathering." (Crawford 2014)
This critique is problematic and potentially dangerous if it frames
proponents as either naïve ("they don't know what they are saying") or
cynical ("they don't believe what they are saying").
The danger is not that "big data" acolytes are wrong, but that they are
right. We should consider this as a real possibility.
Conclusions
If they are right, we face a series of really big problems:
☉ If better data + algorithms means better truth, we can expect further
concentration and concentric diversification of large Internet companies through
tipping markets;
☉ Operational concepts of knowledge and truth would become even more pervasive;
☉ Privacy issues pale compared to the threat of knowledge monopolization and the
reconfiguration of publicness according to operational goals that are geared
toward profit maximization;
☉ Political institutions and critical forces are direly unprepared for dealing with
algorithmic engines of order, both technically and normatively.
"I will argue that democratic talk is not essentially spontaneous but essentially rule-governed,
essentially civil, and unlike the kinds of conversation often held in highest
esteem for their freedom and their wit, it is essentially oriented to problem-solving."
(Schudson 1997)
Thank You
rieder@uva.nl
@RiederB
http://thepoliticsofsystems.net
https://www.digitalmethods.net

TEACHER REFLECTION FORM (NEW SET........).docxruthvilladarez
 
Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEaurabinda banchhor
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 

Último (20)

4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
TEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxTEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docx
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSE
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 

Engines of Order. Social Media and the Rise of Algorithmic Knowing.

  • 1. Engines of Order Social Media and the Rise of Algorithmic Knowing Bernhard Rieder Universiteit van Amsterdam Mediastudies Department
  • 2. Starting point "Algorithms play an increasingly important role in selecting what information is considered most relevant to us, a crucial feature of our participation in public life." (Gillespie 2015) From search engines to social media and beyond, the impression is that socially and culturally relevant tasks are delegated to and performed by algorithms. Because algorithms draw together many different things, there are many ways of beginning to address them. New forms of "knowing" that have quite different means of producing knowledge and of making it performative. Can we think of it as a "style of reasoning" (Hacking 1992)?
  • 3. My approach to the question As researcher and software developer with the Digital Methods Initiative, I build and apply tools that contribute to "knowing" what is happening on social media, most recently: ☉ Netvizz (Facebook data extraction), Rieder 2013 https://apps.facebook.com/netvizz/ ☉ DMI-TCAT (DMI Twitter Capture and Analysis Toolkit), Borra & Rieder 2014 https://github.com/digitalmethodsinitiative/dmi-tcat/ This project is more closely aligned with a book project that investigates the conceptual content and history of algorithmic information processing. A critical approach is necessary both for my own role in algorithmic knowledge production and for understanding how social media make use of algorithms on various levels. Algorithms used by computational researchers and platforms are similar.
  • 4. [Diagram] Algorithmic configurations: input => algorithm => output. System in use (shown on both the input and the output side): interface elements, contents, users and uses. Input: capture, formalization, semantics. Algorithm: techniques, parameters, internal states. Output: display, interactivity, performativity. Latent order => revealed order. Example: users tweeting, clicking, navigating, reading, etc. => some math => 10 trending phrases. Loads of data => results => possible effects.
  • 5. Very large numbers and variety in users, contents, purposes, arrangements, etc. "[Commensuration] standardizes relations between disparate things and reduces the relevance of context." (Espeland & Stevens 1998)
  • 6. Platforms like Twitter provide opportunities for creating connections between defined types of entities (users, messages, hashtags, resources, etc.). They formalize and channel expression, exchange, and coordination. "You cannot reply to a hashtag." "Simply put, a system can only track what it can capture, and it can only capture information that can be expressed within a grammar of action that has been imposed upon the activity." (Agre
  • 7. Using social media and the Web is like living in a survey. Or rather, in an experiment, since so many parameters are controlled. Grammars need to become more pervasive or more explicit ("deeper") so that more semantic data can be captured.
  • 8. Data pools in social media are centralized and searchable. Data is used by social media platforms at various instances for various goals. Data is made accessible to varying degrees to various actors for various reasons.
  • 9. Taxonomy of the Encyclopédie (Diderot and d'Alembert ca. 1783)
  • 10. United States Census Form, 1910
  • 11. Knowing the many Similar experience of "too many" in different fields: ☉ Maxwell (1859): even if atoms are fully deterministic, we could never model the behavior of a gas by observing individual atoms; => statistical mechanics ☉ Foucault (2004): epidemics, economic dynamics, etc. cast doubt on the family as a model for understanding and governing society; => "population" and social sciences ☉ Bush (1945): "There is a growing mountain of research." => information retrieval Between 1850 and 1940 many techniques for thinking about and analyzing "the many" are introduced, looking at the structure and dynamics of interacting ensembles. The "erosion of determinism" (Hacking 1981) means that modes of description are increasingly probabilistic and oriented towards "acting in an uncertain world" (Callon, Lascoumes, Barthe 2001) that can be "tamed" (Hacking 1990) through statistical techniques.
  • 12. Social media deal with various kinds of "the many" (users, messages, products, ideas, etc.) and strive to provide answers to questions like who to talk to, what to read, where to go, what to buy, etc. in the form of decisions. They make use of various techniques to algorithmically reduce complexity to allow continuous activity.
  • 13. From classification to calculation Classifications as information infrastructures (cf. Bowker and Star 1999) that orient practice through normalization, standardization, selective discarding, reformulation, positioning, navigational structuring, etc. are still relevant. But various forms of process and calculation are making things much, much more complicated. We are currently seeing a race toward understanding the semantics of expression, behavior, and cultural artifacts.
  • 14.
  • 15. There are different ways of producing "semantic" data. Users are not only filling up the fields, they are increasingly participating in shaping formalizations. From classifications to classification procedures.
  • 16. "One of the simplest ways to derive information about a user is to look at the way he uses the system." (Rich 1983) Let's not forget that some of the valuable data are simply a byproduct of people using the system.
  • 17. What are "personal data"? "Facebook Likes can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender." (Kosinski, Stillwell, Graepel 2013) The data used in this study does not even include friends' likes. Prediction is determination of likelihood based on knowledge of previous events.
  • 18. Data is analyzed and made performative immediately inside of the system. New categories can be derived from other data and are instantly made actionable.
  • 19.
  • 20. Recapitulation By providing functionality through always more fine-grained grammars of action (and other data capturing techniques), social media platforms accumulate loads of structured and unstructured data. The semantization of data in relation to operational contexts (through formalization, derivation, etc.) begins early on. Classification is deeply caught up with calculation and process.
  • 21. [Diagram] Algorithmic configurations: input => algorithm => output. System in use (shown on both the input and the output side): interface elements, contents, users and uses. Input: capture, formalization, semantics. Algorithm: techniques, parameters, internal states. Output: display, interactivity, performativity. Latent order => revealed order. Algorithmic configurations imply "distributed calculative agencies" (Callon and Muniesa 2005) that run through the system and its users. The data arriving at the algorithm has both latent meaning and order: it is related to actual practices and not random noise.
  • 22. Correlation Coefficient (Galton 1885) Linear Regression (Pearson 1901)
  • 23. Sociogram (Moreno 1934) Sociometric Matrix (Forsyth and Katz 1946)
  • 24. Word-Pair Linkages (Luhn 1959) Semantic Road Maps (Doyle 1961)
  • 26. My Facebook Network Friends and their 'likes' My 290 Friends liked at least 20588 objects
  • 27. My Facebook Network Mapping users according to 'likes'
  • 28. My Facebook Network Classifying users according to 'likes'
  • 29. My Facebook Network My post-demographic profile or sphere
  • 30. Techniques There are many different algorithmic techniques that have complex histories. Each technique reveals the data from a specific angle, but they are highly plastic and can be easily combined. They may be reductionist (e.g. graph theory: everything is a point or line), but also very generative (unlimited number of "views"). Many techniques focus on the relationship between populations and individuals. In social media units can be qualified in terms of other units. All of these techniques are "revealing" (in the sense of Heidegger) the data: they show certain aspects of the latent order in certain ways; they make truth that is caught up in a position towards the world, a finality.
  • 31. Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.25)
  • 32. Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.40)
  • 33. Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.55)
  • 34. Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.70)
  • 35. Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 0.85)
  • 36. Random Network, Size: inDegree, Color (blue => yellow => red): PageRank (α = 1)
  • 37. Parameters Any somewhat complex technique reacts (strongly) to variation in parameters and data. This means that without knowledge of parameters and data, it is hard to understand/critique an algorithm. A single parameter can encode a commitment to a specific theory of power (PageRank at low α is "one person, one vote", at high α "patronage of the powerful"). Parameters are now often set through continuous testing. They are one of the places where empirical practices and operational goals can be brought to converge, and to do so automatically.
  • 38. We move from "what should the formula be according to our ideas about relevance?" to "what has our testing engine identified as the optimal parameters given our operational goal of more user interaction?". Whenever you read "n000 factors", machine learning techniques are at work.
  • 39. Machine learning techniques (e.g. Bayesian filters, maximum entropy classifiers, etc.) can learn to "interpret" any input signal in relation to categories, based on feedback ("supervision"). In these techniques, the state of the machine (i.e. the statistical model) becomes the algorithm. These self-optimizing, empirical machines are becoming increasingly common.
  • 40. The "risk technology" is trained by associating "thousands of pieces of data" with a probability of defaulting or not defaulting. Every signal receives meaning as predictor for defaulting.
  • 41. States In digital media, we often need to do precious little to "make things calculable", since everything already has been made so. Algorithms are increasingly empirical knowledge machines that tie the "real world" to operational modes of optimization and validation. The epistemological commitment, then, is no longer to a theory or model, but to a method for generating models. The difference is thus not just between the "editorial" and the "algorithmic" (Gillespie 2012), but also between "editorial algorithms" and "generated algorithms".
  • 42. "To date, the complexity of mobile and the disparate, closed platforms that dominate it have caused most people to ignore the possibility and benefits of A/B testing. […] To us at Taplytics this is crazy. If you are developing on the web everything is calculated and optimized and viewed in terms of hypotheses, significance levels and confidence intervals. On mobile, however, for the past 6 years we have been living in the era of the 'artform' of mobile apps, where things are viewed in terms of gut feel and shooting from the hip." (Druxerman 2014)
  • 43. Since the digital operational environment is fully integrated, data collection, analysis, decision-making, and execution are all folded into one. These are engines of order.
  • 44. Conclusions Moving from classification to calculation implies a move from "thing concepts" (Dingbegriffe) to "relational concepts" (Relationsbegriffe), or from substance notions of knowledge to functional ones (cf. Cassirer 1910). A good analogue to algorithmic configurations on social media platforms are markets and in particular multi-sided markets (Rochet and Tirole 2004). Just like markets, algorithmic configurations are "places of truth" (Foucault 2004) not in that they show "the truth" but in that truth is produced as a byproduct of their optimal functioning, e.g. the right price, the right trending topics, the right number and type of stories shown, etc. The right algorithm is the one that produces an optimal equilibrium between user satisfaction and value extraction through advertising.
  • 45. Conclusions "The current mythology of big data is that with more data comes greater accuracy and truth. This epistemological position is so seductive that many industries, from advertising to automobile manufacturing, are repositioning themselves for massive data gathering." (Crawford 2014) This position is problematic and potentially dangerous if it frames proponents as either naïve ("they don’t know what they are saying") or cynical ("they don't believe what they are saying"). The danger is not that "big data" acolytes are wrong, but that they are right. We should consider this as a real possibility.
  • 46. Conclusions If they are right, we face a series of really big problems: ☉ If better data + algorithms means better truth, we can expect further concentration and concentric diversification of large Internet companies through tipping markets; ☉ Operational concepts of knowledge and truth would become even more pervasive; ☉ Privacy issues pale compared to the threat of knowledge monopolization and the reconfiguration of publicness according to operational goals that are geared toward profit maximization; ☉ Political institutions and critical forces are direly unprepared for dealing with algorithmic engines of order, both technically and normatively; "I will argue that democratic talk is not essentially spontaneous but essentially role- governed, essentially civil, and unlike the kinds of conversation often held in highest esteem for their freedom and their wit, it is essentially oriented to problem-solving." (Schudson 1997)
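
The sensitivity to α shown on slides 31 to 37 can be explored with a small sketch. It assumes that α is PageRank's damping factor (the probability of following a link rather than jumping to a random node). The function name `pagerank` and the toy edge list are hypothetical illustrations; the slides do not say which implementation or network produced the figures.

```python
def pagerank(edges, alpha=0.85, iterations=100):
    """Power-iteration PageRank on a directed edge list.

    alpha is assumed to be the damping factor: the probability of
    following a link instead of jumping to a random node.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - alpha) / n + alpha * dangling / n for node in nodes
        }
        # Every node passes its rank on, split evenly over its out-links.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += alpha * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical toy network, not the random network from the slides.
toy_edges = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "c"), ("e", "d")]

if __name__ == "__main__":
    for alpha in (0.25, 0.55, 0.85):
        ranks = pagerank(toy_edges, alpha=alpha)
        ordered = sorted(ranks.items(), key=lambda item: -item[1])
        print(alpha, [(node, round(score, 3)) for node, score in ordered])
```

Running the loop with different values of `alpha` on the same edge list shows how a single parameter redistributes scores while the data stay unchanged.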

Editor's Notes

  1. The question of classification is obviously not new, and conflicts around classification have a long history.
  2. Parameters: a little bit shorter
  3. Image from Techcrunch: http://techcrunch.com/2014/04/03/the-filtered-feed-problem/
  4. The lists can be seen as vectors as well and then treated with the full arsenal of geometry (e.g. to calculate a similarity coefficient between two such vectors)
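
The idea in note 4 can be sketched as follows. This is a minimal illustration, assuming each list of 'likes' is read as a binary vector (1 if the item is liked, 0 otherwise) and that cosine similarity is the coefficient of interest; the note does not name a specific coefficient, and the function name and sample data are hypothetical.

```python
import math


def cosine_similarity(likes_a, likes_b):
    """Cosine similarity between two like-lists read as binary vectors."""
    set_a, set_b = set(likes_a), set(likes_b)
    if not set_a or not set_b:
        return 0.0  # an empty list has no direction to compare
    # For binary vectors the dot product is the number of shared items,
    # and each vector's length is the square root of its item count.
    shared = len(set_a & set_b)
    return shared / math.sqrt(len(set_a) * len(set_b))


# Hypothetical sample data.
user_1 = ["page_x", "page_y"]
user_2 = ["page_y", "page_z"]
print(cosine_similarity(user_1, user_2))
```

Other coefficients (for example Jaccard) could be swapped in by changing the last line of the function; which one fits depends on how list length should count.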