Methodology for Social Network Analysis
Montse Fernández Crespo
II Jornadas de Ciberpolítica en España
May 2013

Monitoring channels

Monitoring channels: Twitter

Traditional survey
Predicting the Future with Social Media
“Moreover our predictions are consistently better than those produced by an
information market such as the Hollywood Stock Exchange, the gold standard in the
industry.”

Twitter Not So Good At Predicting Box Office Revenues After All
“A new study of tweets about movies suggests they are not necessarily a good
predictor of box office revenues, say computer scientists.”

Why do their results differ so much?

Ceteris paribus
A method in which every variable in a situation is held constant
except the one whose influence is being studied.
DIFFERENCES: Asur and Huberman vs. Wong et al.
Elements compared:
- Subjects of comparison: 24 ordinary films (Asur and Huberman); 34 Oscar-nominated films (Wong et al.)
- Analysis tools: HSX (Asur and Huberman); IMDb and Rotten Tomatoes (Wong et al.)
- Capture period: 3 months, 2.89 million tweets (Asur and Huberman); 2 months, 12 million tweets (Wong et al.)
- Other in-house methods: sentiment analysis (Asur and Huberman); numerical ratings (Wong et al.)

The Pulse of News in Social Media: Forecasting Popularity
“Our experiments show that it is possible to estimate ranges of popularity with an
overall accuracy of 84% considering only content features… Interestingly we have found
that in terms of number of retweets, the top news sources on twitter are not necessarily
the conventionally popular news agencies and various technology blogs such as
Mashable and the Google Blog are very widely shared in social media. Overall, we
discovered that one of the most important predictors of popularity was the source of
the article.”

Twitter Mood Predicts The Stock Market
“The calmness of the public (measured by GPOMS) is thus predictive of the DJIA rather
than general levels of positive sentiment as measured by OpinionFinder.”
87.6%: accuracy of the daily prediction of Dow Jones closing values
6%: reduction in MAE
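
To make the two figures above concrete, here is a minimal sketch of how a directional accuracy and a mean absolute error (MAE) could be computed. It assumes that "accuracy" means the share of days on which the predicted up or down move matches the actual one. The function names and the sample values are hypothetical and are not taken from the study.

```python
def direction_accuracy(actual, predicted):
    """Share of days on which the predicted up/down move matches the actual one.

    Both moves are measured against the previous day's actual close.
    """
    hits = 0
    for i in range(1, len(actual)):
        actual_up = actual[i] > actual[i - 1]
        predicted_up = predicted[i] > actual[i - 1]
        if actual_up == predicted_up:
            hits += 1
    return hits / (len(actual) - 1)


def mean_absolute_error(actual, predicted):
    """Average absolute gap between predicted and actual closing values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing values, for illustration only.
actual_close = [100, 102, 101, 105]
predicted_close = [100, 101, 103, 104]

accuracy = direction_accuracy(actual_close, predicted_close)
mae = mean_absolute_error(actual_close, predicted_close)
```

A reduction in MAE would then be the relative drop in this error when mood data is added to a baseline model.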

Predicting elections with Twitter:
What 140 characters reveal about political sentiment
“The mere number of tweets mentioning a political party can be considered a plausible
reflection of the vote share and its predictive power even comes close
to traditional election polls.”
Quantitative: mention count
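
The mention-count idea can be sketched as follows. This is only an illustration under simple assumptions: matching is a case-insensitive substring search, and the party names and tweets are invented. The study's actual matching rules are not given on the slide.

```python
from collections import Counter


def mention_shares(tweets, parties):
    """Return each party's share of all party mentions found in the tweets."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for party in parties:
            # Assumption: a tweet "mentions" a party if its name appears anywhere.
            if party.lower() in lowered:
                counts[party] += 1
    total = sum(counts.values())
    return {p: (counts[p] / total if total else 0.0) for p in parties}


# Hypothetical sample data.
sample_tweets = [
    "Voting for Party A",
    "party a rally today",
    "Party B debate tonight",
    "nothing political here",
]
shares = mention_shares(sample_tweets, ["Party A", "Party B"])
```

The resulting shares are what would be compared against the vote shares of each party.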

From tweets to polls:
linking text sentiment to public opinion time series
“While our results vary across datasets, in several cases the correlations are as high as
80%, and capture important large-scale trends. The results highlight the potential of
text streams as a substitute and supplement for traditional polling.”
Presidential job approval in 2009
Presidential election polls in 2008
100% correlation
Non-significant correlation
Qualitative: sentiment analysis (OpinionFinder)
While the results do not come without caution, it is encouraging that expensive and time-
intensive polling can be supplemented or supplanted with the simple-to-gather text data
that is generated from online social networking.
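
As a rough illustration of linking a text sentiment series to a poll series, the sketch below computes a daily sentiment score and its Pearson correlation with poll values. The daily score is assumed here to be the ratio of positive to negative messages. All names and numbers are hypothetical, and no smoothing is applied.

```python
import math


def sentiment_ratio(positive_counts, negative_counts):
    """Daily score: positive messages divided by negative messages (assumed definition)."""
    return [p / n for p, n in zip(positive_counts, negative_counts)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily counts and poll values, for illustration only.
daily_sentiment = sentiment_ratio([30, 40, 60], [20, 20, 20])
poll_values = [48.0, 50.0, 54.0]
correlation = pearson(daily_sentiment, poll_values)
```

A coefficient near 1 means the two series move together, while a value near 0 corresponds to the "non-significant correlation" case.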

Limits of electoral predictions using Twitter
- Dataset 1: 2010 US Senate special election in Massachusetts
- Dataset 2: US Congressional elections 2010
“Unfortunately, we find no correlation between the analysis results and
the electoral outcomes, contradicting previous reports.”
Qualitative: sentiment analysis (OpinionFinder). Quantitative: mention count.

Why do their results differ so much?

Ceteris paribus
A method in which every variable in a situation is held constant
except the one whose influence is being studied.
DIFFERENCES: Tumasjan et al., O’Connor et al. and Gayo-Avello et al.
- Each tweet that mentions a party (or candidate) is counted as a “vote”.
- Tweets that mention opposing candidates were not counted.
- Although the same lexicon was used, each tweet could belong to only one of the three defined categories (positive, negative or neutral), not to several of them.
- Each tweet could belong to several of the three defined categories (positive, negative or neutral).
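
The difference between the two labelling rules can be sketched with a toy lexicon. Everything below is hypothetical: the word lists are invented, and the single-label rule (sign of positive minus negative word counts) is an assumption, since the slide does not state how ties were resolved.

```python
# Toy lexicon, for illustration only (not the OpinionFinder lexicon).
POSITIVE_WORDS = {"good", "great", "win"}
NEGATIVE_WORDS = {"bad", "poor", "lose"}


def multi_label(text):
    """A tweet can be positive and negative at once; neutral if it has neither."""
    words = set(text.lower().split())
    labels = set()
    if words & POSITIVE_WORDS:
        labels.add("positive")
    if words & NEGATIVE_WORDS:
        labels.add("negative")
    return labels or {"neutral"}


def single_label(text):
    """A tweet gets exactly one category, from the balance of its words (assumed rule)."""
    words = text.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


mixed_tweet = "good debate but bad result"
mixed_multi = multi_label(mixed_tweet)
mixed_single = single_label(mixed_tweet)
```

With the same lexicon, the mixed tweet counts toward both the positive and the negative totals under the first rule, but toward neither under the second, which is one way identical data can yield different aggregate sentiment.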

“Exploring the Characteristics of Opinion Expressions for
Political Opinion Classification”
Once we have properly identified a person’s ideology, we may be able to predict his or her
opinions on various political issues.
It is our goal for future work to explore viable approaches for ideology based on political
opinion classification.
“Predicting the Future with Social Media”
Sentiment analysis is a well-studied problem in linguistics and machine learning, with
different classifiers and language models employed in earlier work [13], [14]. It is common to
express this as a classification problem where a given text needs to be labeled as
Positive, Negative or Neutral.

Monitoring channels: Facebook

U.S. Politics on Facebook (2010)
http://www.facebook.com/note.php?note_id=449141550881
United States, 2010
77 winners with more likes; 43 winners with more likes and less money
118 elections

“What is a Social Network Worth?
Facebook and Vote Share in the 2008 Presidential Primaries”
“Explaining Facebook Support in the 2008 Congressional Election Cycle”
Thus while Facebook supporter numbers would not be a useful predictor that foreshadows electoral victory
or defeat, the most electable candidates do have more Facebook supporters.
14,213 followers
60,339 followers

Monitoring channels: Google

“Detecting influenza epidemics using search engine query data”
About 90 million American adults are
believed to search online for information
about specific diseases or medical problems
each year, making web search queries a
uniquely valuable source of information
about health trends.
This system is not designed to be a
replacement for traditional surveillance
networks or supplant the need for
laboratory-based diagnoses and
surveillance. (...) Demographic data, often
provided by traditional surveillance, cannot
be obtained using search queries.
http://www.google.org/flutrends/about/how.html

“On the predictability of the U.S. Elections through search volume activity”
In this paper we report that Google Trends was, actually, not a good predictor of both
the 2008 and 2010 elections
http://cs.wellesley.edu/~webtrust/insights/?cand_id=4
A variable that may have affected G-trends
effectiveness as a tool for predicting political
elections is the sentiment of a user’s query.
It is difficult, though not impossible, to
determine the circumstances behind a user’s
search of the profile of a certain candidate
to make a guess about that candidate’s
public image and why a user might be
interested in the candidate. This is part of
future research that we plan for the next
stage of our work.

Monitoring channels: Others

Early Prediction of Movie Box Office Success based on
Wikipedia Activity Big Data
“However, bridging between "real time monitoring" and "early predicting"
remains as a big challenge. Here, we report on an
endeavor to build a minimalistic predictive model
for the financial success of movies based on
collective activity data of online users. We show
that the popularity of a movie could be predicted
well in advance by measuring and analyzing the
activity level of editors
and viewers of the corresponding entry to the
movie in Wikipedia, the well-known online
encyclopedia.”
boxofficemojo.com + Wikipedia
Monitoring channels: Wikipedia

Republican candidates: The Wikipedia effect
“Millions of Americans use Wikipedia as
their primary source of information about
politicians. The user-edited encyclopedia
comes up as the first or second search result
for every candidate for the Republican
nomination, and in most respects provides a
very thorough and accurate profile of their
lives and careers.”
“Wikipedia preserves every version of an
article ever published, so it's possible to
watch the evolution of a page over time.
While all four major candidates were well
known before the primary began, editors
have continued to finesse their biographies
and quarrel over their records.”

Amazon Election Heat Map 2012
“…Republican-leaning best-sellers account
for 56% of the total sold,
while Democratic-leaning ones
make up 44%.”
“What about categorizing O’Reilly’s book
about President Lincoln as a “red” book?
“Well, Lincoln was a Republican, but that
doesn’t add much.” Schluep says. “We did
take into consideration Mr. O’Reilly’s
background, as well as the buying habits of
people who bought this book.”
Monitoring channels: Amazon

Crawling Big Data in a New Frontier for Socioeconomic Research:
Testing with Social Tagging
“On the other hand, the relation between users and
resources, which is largely employed by
traditional Recommender Systems, changes into a ternary
relation between users, resources,
and tags, which is more complex to manage.”
Monitoring channels: Delicious

Is there a single method?
Social network analysis is characterized by highly heterogeneous
information sources and by the large volumes of data available
for study. While the volume of data is an extremely attractive
feature for research, the diversity of sources, and of the ways
they capture and deliver information, poses a methodological
barrier. In many cases this means that study results are stated
with significant caveats, and it also makes comparison between
“peers” impossible.
Conclusions

With regards to the process of retrieval of
information, the method presented here was
somewhat complex but easy to apply if there is
some computer knowledge. Nevertheless, working
in interdisciplinary teams could greatly help to
develop this kind of knowledge, as it was in our
case. Though the technical process described was
successful, improvements are necessary in the
future…

A presentation by…
Montserrat Fernández Crespo
@montsefc
montsefcfr40@hotmail.com
