SlideShare a Scribd company logo
1 of 10
Download to read offline
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




                                   You Are Who You Talk To:
                              Detecting Roles in Usenet Newsgroups


                      Danyel Fisher, Marc Smith, and Howard T. Welser
                                     Microsoft Research
           danyelf@microsoft.com, masmith@microsoft.com, twelser@u.washington.edu

Abstract
   Understanding the social roles of the members a          understand how their social behavior suggests their
group can help to understand the social context of the      role within the space. By generating these networks
group. We present a method of applying social               for each author in a newsgroup, we can begin to
network analysis to support the task of characterizing      roughly characterize the newsgroup in terms of the
authors in Usenet newsgroups. We compute and                different types of participant authors.
visualize networks created by patterns of replies for          In this paper, we examine the social networks
each author in selected newsgroups and find that            created by patterns of reply between authors
second-degree ego-centric networks give us clear            contributing messages to nine Usenet newsgroups
distinctions between different types of authors and         during a one-month period (Table 1). These nine
newsgroups.                                                 groups were hand-selected to provide a broad
   Results show that newsgroups vary in terms of the        overview of Usenet topics. These newsgroups are of
populations of participants and the roles that they         medium size, relative to the rest of Usenet: they are
play. Newsgroups can be characterized by                    not among either the biggest groups, or the smallest.
populations that include question and answer                   We sought out groups that we already knew were
newsgroups, conversational newsgroups, social               of several general genres: question and answer, social
support newsgroups, and flame newsgroups.                   support, discussion, and flame. We chose these
   This approach has applications for both                  newsgroups because they are exemplars of their types.
researchers seeking to characterize different types of      Microsoft.public.windows.server.general,            for
social cyberspaces as well as participants seeking to       example, is low on discussion, but active with
distinguish interaction partners and content authors.       questions and answers; in contrast, alt.flame is
                                                            uncluttered by rational discussion.
                                                               There have been a number of studies that have
1. Introduction                                             attempted to examine and describe the sets of social
                                                            roles that occur in both online and offline
   Joining a new social group can be a disorienting         conversation [6, 11]. In this paper, we expand on the
experience. Understanding the social roles of different     ideas of Turner et al [11] and Welser [14], in
members of a group is a good way to understand both         examining a small number of visibly different roles.
the social context of the group, and information about         A social role “refers to the behavior of status-
the people within it. A number of different                 occupants that is oriented toward the patterned
visualization tools exist [2, 10, 12] that are meant to     expectations of others” (pg 41) [7]. This notion
give some sense of how a conversation space is              emphasizes structure – in terms of relations to others,
constructed. These various visualizations have all          and structure, in terms of meaningful expectations for
shown different perspectives on metadata within             systematic behavior. We focus on emergent roles that
online social spaces [11].                                  grow from participation in the focal activities, as
   In this paper we present a way of applying social        distinct from formal organizational roles like
network techniques to online conversation spaces. In        “manager”.
particular, we focus on the question “who is this
person with whom I am interacting?” looking to




                                       0-7695-2507-5/06/$20.00 (C) 2006 IEEE                                          1
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




                       Table 1. Nine newsgroups, with some basic statistics.
 Name                       authors    replies % threads with            % authors with
 January 2001 except as                        2 messages  5+ messages   degree > 10 1 message
 noted
 Technical Newsgroups
  comp.soft-sys.matlab           437         712           28%         20%            3%           41%
  microsoft.public.window        855         1489          33%         10%            3%           51%
  s.server.general, Jan.
  2004
 Discussion Newsgroups
  rec.kites                      276         924           19%         32%            12%          32%
  rec.music.phish                1263        6105          23%         22%            20%          29%
 Political Newsgroup
  alt.politics                   2187        16059         17%         40%            20%          26%
 Flame Newsgroups
  alt.flame                      618         3802          16%         34%            13%          32%
  alt.alien-                     755         6494          17%         50%            16%          33%
  vampire.flonk.flonk.flonk
 Social Support Newsgroups
  alt.support.divorce            339         2937          14%         53%            21%          23%
  alt.support.autism             188         3095          17%         60%            32%          20%
    We further focus our attention on the relationship       persistent conversation spaces. All messages we
between the communication behaviors of participants          examine are publicly visible; thus, notions of
in Usenet with their social structural position derived      information transfer within the group are less
from network analysis. In network analysis, we can           confining than they might be in an organizational
identify roles as positions that have a distinct pattern     network. In short, we expect people to choose their
of relations to other positions [13]. Our analysis takes     correspondents, but not to be thoroughly-embedded
fundamental features of an actor’s structural position       within a broader network.
(like their number of neighbors) and compares how               Data to construct the networks presented here was
those basic attributes vary across different                 generated by the Netscan system [8] at Microsoft
conversational spaces.                                       Research. Netscan stores message headers from
                                                             Usenet newsgroups and constructs a range of
2. Methods                                                   measures of author, newsgroup and thread size and
                                                             activity. Netscan was used to generate reports
    Our approach focuses on linking two scales of            showing which authors sent messages in response to
analysis between individual and collective structures        messages from another author. This data was further
in Usenet newsgroups. We alternately examine                 analyzed and graphically presented making extensive
newsgroup behavior and the behavior of individual            use of the JUNG [9] system for visualizing and
authors. Each casts light upon the other. The key            computing social network data.
social structure used throughout is egocentric social           Table 1 shows the nine newsgroups selected, and
networks constructed through patterns of reply. An           some overview data related to these newsgroups.
egocentric network reports the other people and all             We collected data from the “conversational subset”
ties surrounding an actor to a limited distance. In this     of the newsgroup’s messages: the set of messages that
case, we have limited our distance to two: that is, we       are either responded to by someone else, or are
are looking at the neighbors of the neighbors of the         themselves responses to someone else. (These do not,
actors.                                                      then, include messages that are do not receive
    Phrased less formally, we are trying to understand       replies.) This is examining only replies within the
whether the people with whom the actor interacts are         month of January, 2001. (For the more recent
themselves well-connected or poorly-connected, and           newsgroup microsoft.public.windows.server.general,
so we look at their neighbors.                               we examine January of 2004).
    The choice of using egocentric measures is
discussed more thoroughly in [6]. It is driven by the
particular properties of network structure in public




                                                                                                                     2
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




2.1. A Person’s Social Network                               groups. We briefly describe these three strategies
                                                             here.
    The core of our data is composed of social                  A basic way to contrast different newsgroups is to
networks built from the interactions present in a set of     compare the percentage of authors who have appeared
newsgroups. This network will, as with all social            in only one message during the time period. In all
networks, connect pairs of vertices with edges.              newsgroups, appearing only once can be a result of
Vertices represent individual authors within the             many different things: some authors, for example, will
newsgroup who have replied to at least one message,          post infrequently, and so only be seen once. Some
or been replied to at least once. Directed edges             authors are only visiting the newsgroup, or are
represent the replies: an edge from A to B means that        unintentionally cross-posted into the newsgroup.
A replied to B’s message. Edges are further annotated        However, differences in newsgroup behavior can also
with a weight, which indicates the number of times           cause different reasons to appear only once. In
which A has replied to B. Note that we use the word          particular, we might believe that the more inclusive a
“reply” slightly loosely; within newsgroups, one does        newsgroup is, the more likely it is that authors will
not directly reply to someone else, but rather posts a       appear more than once. We can similarly examine
message in response to another’s message. The                thread length: newsgroups that have short threads are
distinction is subtle, but important: while we discuss       likely to be oriented toward question and answer,
interpersonal relations in terms of who replied to           while longer threads suggest discussion.
whom, we acknowledge that it is possible to post a              Another way to compare newsgroups is to compare
follow-up message to a thread without giving any             their in-degree distribution and out-degree
thought to the person who is being responded to.             distribution. Any large newsgroup’s participation
    None the less, we will—for simplicity—elide “A           level reflects a rough power-law curve: there will be a
posted a message, in direct response to a previous           few very active participants, and a great many
message posted by B” to “A replies to B.” As this is         inactive participants. While the shape is the same, we
interpreted as an edge in the graph, we will also speak      can compare the rate at which these power laws drop
of the graph properties as social interaction. Thus, “A      off—that is, the degree of inequality in the group—
is of high out-degree” means “A responded to many            between different groups. We might expect a group
different people,” while “B is of high in-degree”            with a strong central core to have a very different
means “B was responded to by many people.”                   distribution from a group that has broad participation.
    We visualize these networks by showing the               In addition, we can compare the distribution of in-
second-degree egocentric network around any given            degree to the distribution of out-degree for any given
actors: that is, we look at all of that actor’s neighbors,   group. Intuitively, this is the process of seeing
and their neighbors, as in Figure 1b. Directed ties are      whether the most active participants are as replied-to
depicted with arrowheads, while the weight of edges          as they do replying: are they central as people who
is illustrated with thickness.                               start conversations, or as ones who end them?
    We generated one- and two-degree egocentric                 Last, we can adapt the technique used with the
networks for all 6918 authors in our data set, and our       individual network visualizations of looking at the
analysis summarizes insights from comparing these            degree distribution of out-neighbors by combining the
networks across the nine groups. In addition, to look        distributions across the groups. By combining the
at individual connectivity, we calculate a distribution      distributions across the most active actors, we
of the out-degree of each actor’s out-neighbors: that        generate distributions which can be, again, compared.
is, we look at whether the actor typically replies to
well-connected or poorly-connected alters. We                3. Results
examine the interconnectivity of members of
newsgroups, and we look at the reciprocity of ties               We discuss first the individual measures that
within these limited networks.                               identify personal roles. For each newsgroup and each
                                                             role, we present a typical person who exemplifies that
2.2. Group-Wide Comparisons                                  role, with a discussion of that sort of person and some
                                                             of the characteristic identifiers for locating them
   We have applied several techniques to compare             within the group.
groups to each other. We compare the percentage of               Next, we discuss group-wide statistics, and
authors who have appeared only once in the group,            connect the group types we had previously identified
and look at thread length. We analyze collective in-         to these broader statistics.
degree and out-degree statistics, and we compare
degree distribution coefficients across the nine




                                                                                                                       3
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




 Figure 1: A question-
 and-answer, from the
 questioner’s
 perspective, at
 distance 1 (above) and
 2 (right).

 The questioner is the
 large black dot;
 distance 1 is in blue,
 and distance 2 in gray.



3.1. Network Diagrams                                      send most of his or her replies to people who are
                                                           disconnected from each other. There is a small cluster
    We can refine our understanding of the differences
                                                           in the bottom-right where several people of high
between these newsgroups—and the individual
                                                           degree are inter-connected, but that is dwarfed by the
authors within them—by examining visualizations of
                                                           number of individual messages.
the network structures themselves. Let us begin with a
                                                               To re-examine this from an answer person’s
simple example: that of a typical question and answer
                                                           perspective, we see an answer person in Figure 2a.
(Figure 1). From the perspective of the questioner, this
                                                           Note, again, the large number of connections to nodes
is a simple transaction: they (the large dark dot) ask a
                                                           that are not themselves interconnected.
question, and get a response. They get only one
                                                              When we expand to distance 2, we see that some
response, and never connect to any other person. We
                                                           few of the people to which this person interacted were
gain a greater understanding of the social roles
                                                           answer people – but the vast majority was not. To
involved, when we examine the network at distance
                                                           quantify this, we can examine the out-degree
two. This allows us to see the network around the
                                                           distribution of his neighbors: that is what was the out-
person who responded.
                                                           degree of the people to whom this person responded?
    We see that this response was not an unusual thing
                                                           This question allows us to understand whether they
for this answer person. Indeed, this person seems to
                                                           preferentially reply to people who are well-connected,
                                                           or people who are more isolated.




                                                                                                                      4
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




Figure 2. Reply-to network at distance 1 (left) and distance 2 (center), and out-degree
distribution histogram, for one representative member for each of five newsgroups.
From top to bottom, (a) microsoft.public.windows.server.general, (b) rec.kites, (c)
alt.flame, (d) alt.politics, and (e) alt.support.divorce.




                                                                                                    5
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




    We see that this person responded largely to            This suggests that the newsgroup’s core has an
people of low out-degree. The graph at right is on a        exclusive clique aspect to it.
log-log scale, and so we see that virtually all his
communication occurred with people of degree below              We can confirm this by examining newsgroup-
3, with the most at 0. This, then, is a reasonable          wide statistics. Across all of Usenet, according to
characterization: someone who responds to many              Netscan, 66% of posters post only one message.
people of low degree, but infrequently to those who         Ordinarily, 93% of those are initial turns: they are
respond to many.                                            from people who (try to) initiate a thread. In this
    We shall see that this structural pattern is a          newsgroup, however, only 50% of those posts are
distinguishing characteristic of question and answer        initial turns: the rest are replying to messages.
newsgroups.                                                     While we don’t have a direct explanation for how
    In figure 3, we show the average of the                 the newsgroup maintains its exclusion of outsiders, it
distributions of out-degree for the most active posters     is definitely the case that the most prolific posters
within the newsgroup—those who have been involved           clearly prefer to reply to other prolific posters. This is
in ten or more replies—to emphasize that this               shown, again, by the degree distribution.
individual is indeed fairly representative.                     We contrast this with a very different newsgroup:
    Contrast the alt.flame newsgroup (2c). This             alt.support.divorce (figure 2e), which is dedicated to
newsgroup, dedicated to the art of the flame war,           social support of people who are going through or
involves authors flinging insults at each other on a        have recently been through a divorce. We have
broad variety of topics, interest areas, and degrees of     already seen that alt.support.divorce is different from
obscenity and personalization. We see, immediately,         any of the newsgroups we have discussed so far: it
that alt.flame looks significantly different from the       has a smaller percentage of authors who post only one
technical support newsgroup.                                message; it has a more equitable distribution of both
    Perhaps most visible is the density of the network.     in-degree and out-degree across the group. Now we
In alt.flame, many people connect to many others; the       will see that its most active users are also less
network is a tightly-knit one. Only slightly more           exclusive then alt.politics.
subtly, we see many thick edges. In alt.flame, users            Figure 2e shows a not-atypical member of
get into intense pair-wise connections: flame wars, as      alt.support.divorce newsgroups with the by-now-
the two find a topic that mutually interests them.          familiar degree distribution. Note the radically
While reading the newsgroup shows that there are            different degree distribution: the strong bump at the
often multiple participants in early stages of threads,     front indicates a newsgroup-wide interest in
threads often evolve (or devolve) into strong               welcoming new participants, while the large bump at
reciprocal ties. Indeed, a clear indicator of a flame       the back indicates a continuing preference for well-
newsgroup seems to be strong reciprocity. While             linked in-group members.
traditionally, reciprocal turn-taking is a sign of a good
conversation, we should recall that newsgroups tend         3.2. Cumulative Degree Distribution
to act as public spaces; the strong reciprocity is a sign
of an inward focus and vigorous debate.                        We now begin to compare newsgroups to each
    Note that the alt.flame out-degree distribution is      other. In the social networks section, we examined
different: it peaks at a middle value. People in            individual out-neighbor degree distribution. We now
alt.flame preferentially reply to people who are fairly     compare these degree distributions to each other,
well-connected: people who have responded to few            which allows us to compare newsgroups to each
people in the newsgroup are not, in turn, linked.           other.
    These groups were extremes, in opposite                    The histograms are shown in Figure 3. For each
directions. rec.kites (figure 2b) is a happy medium: a      group, we present the average of the number of out-
discussion forum that has an in-group, but admits           degree neighbors in each size bracket: 0, 1, 2-3, 4-7,
outsiders. Note both the well-connected core, and the       8-15, 16-31, 32-63, and 64-127. In order to get only
less-well-connected (but largely still of greater           authors with non-trivial histograms, we focused
degree, and actively participating) periphery.              specifically on authors with more than ten neighbors
    alt.politics (figure 2d) is a very large newsgroup      (that is, people who either they replied to, or who
dedicated to discussion. The distance-1 graph visibly       replied to them): an active minority, ranging from 3%
shows, again, a tightly-connected newsgroup; the            of the group’s membership (in the technical groups)
distance-2 graph shows an interesting phenomenon: a         to 30% of the group’s membership (in the flame
population of outsiders who reply to messages within        groups), as illustrated in Table 1.
the newsgroup, but are largely not, in turn, replied to.




                                                                                                                         6
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




       We note that these patterns reinforce what was                                                                                     (for degree > 0) falls along a power law: there are
    observed above: there is a distinct preference for                                                                                    many at the low end, and far fewer at the high end.
    replying to people who had not replied to anyone for                                                                                     The exponent on a power law can be estimated by
    the technical support groups; there is an opposite                                                                                    examining the slope of the distribution, plotted on
    preference in the politics groups. Support groups have                                                                                log-log axes; a negative exponent indicates a curve
    both a peak at the low end and high end, representing                                                                                 with lower y at higher x [1]. The nine groups are then
    an interest in both connecting to new people and in                                                                                   plotted on a scatter-plot, with in-degree distribution –
    maintaining ties with more senior ones.                                                                                               that is, the distribution of the number of replies that
                                                                                                                                          authors get – is compared to the out-degree
    3.3. Degree Equality Within Newsgroups                                                                                                distribution, as shown in Figure 4.
                                                                                                                                             Recall that the exponent on this power law is
        Another dimension of difference that separates                                                                                    roughly a measure of equality of contribution: a
    newsgroups from one another compares the                                                                                              coefficient of 0 suggests a newsgroup which has a
    distribution of the degrees of nodes to each other. For                                                                               completely uniform contribution, in which each
    all these newsgroups, the distribution of vertex degree                                                                                     member has the same in-degree and out-degree
                            rec.music.phish                                                             rec.kites                               as each other member; a strongly negative
6                                                                                 6
                                                                                                                                                number is more inequitably distributed.
5                                                                                 5                                                                 By this reading, most newsgroups fall nearly
4                                                                                 4                                                             along a balanced axis, with similar distributions
3                                                                                 3
                                                                                                                                                of in-degree and out-degree: that is, most people
2                                                                                 2

1                                                                                 1
                                                                                                                                                have the same sort of reply rate as their being-
0                                                                                 0                                                             replied-to rate. The social support newsgroups,
        0            1,1         2,3     4,7    8,15    16,31    32,63   64,127        0    1,1   2,3     4,7   8,15   16,31   32,63   64,127
                                                                                                                                                alt.support.autism and alt.support.divorce, both
                    microsoft.public.server.                                               comp.software.systems.                               have a more balanced distribution, while the
14
                           general                                                6               matlab                                        other newsgroups have radically different
12                                                                                5                                                             distributions.
10
 8
                                                                                  4
                                                                                                                                                    Note also that server.general and matlab both
                                                                                  3
 6
                                                                                  2
                                                                                                                                                fall substantially off of the main axis. For both
 4
 2                                                                                1                                                             newsgroups, their out-degree is distributed less
 0                                                                                0                                                             equitably than their in-degree. This is, again,
                0          1,1     2,3    4,7    8,15    16,31    32,63 64,127         0    1,1   2,3     4,7   8,15   16,31   32,63   64,127
                                                                                                                                                consistent with the model of a small population
                           alt.alien-vampire.                                                           alt.flame                               of answer people responding to a large
6
                           flonk.flonk.flonk                                      6
                                                                                                                                                population of questions. Virtually everyone, no
                                                                                  5
5
                                                                                                                                                matter their degree, gets a response – and so the
4                                                                                 4


3                                                                                 3

2                                                                                 2                                                                                                                                           0
1                                                                                 1                                                                                   -2.5      -2       -1.5          -1            -0.5          0
0                                                                                 0
        0            1,1         2,3     4,7    8,15    16,31    32,63   64,127        0    1,1   2,3     4,7   8,15   16,31   32,63   64,127
                                                                                                                                                                                                                            -0.5
                                                                                                                                                 In-degree exponent




                       alt.support.autism                                                     alt.support.divorce                                                                                           Autism
                                                                                  6
    6                                                                                                                                                                                              Divorce                   -1
                                                                                  5                                                                                                    Flonk
    5
                                                                                  4                                                                                                 Flam e
    4                                                                                                                                                                                          Kites
                                                                                  3                                                                                             Server
    3                                                                                                                                                                    Matlab             Phish                           -1.5
                                                                                  2
    2
                                                                                  1
                                                                                                                                                                                        Politics
    1

    0
                                                                                  0
                                                                                       0    1,1   2,3     4,7   8,15   16,31   32,63   64,127
                                                                                                                                                                                                                             -2
            0         1,1        2,3     4,7    8,15    16,31    32,63 64,127



                                   alt.politics
                                                                                                                                                                                                                            -2.5
6
                                                                                      Figure 3. Average out-
                                                                                      degree distribution for                                                                         Out-degree exponent
5

4                                                                                     high participators in
3
                                                                                      each of the nine                                          Figure 4. In-degree power-law exponent
2

1
                                                                                      groups. x-axis for each                                   against out-degree power-law exponent
0                                                                                     doubles in size for                                       for the nine groups. Shapes indicate the
        0            1,1         2,3     4,7    8,15    16,31    32,63   64,127
                                                                                      each bar.                                                 general group type.




                                                                                                                                                                                                                                       7
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




curve for in-degree is fairly balanced. In contrast, the   4.1. Roles within Newsgroups
only people doing the answering are the answer
people and so their out-degree is very high compared           These techniques seem to have provided some
to a population with extremely low (often 0) out-          description of the behavior that occurs within Usenet
degree.                                                    newsgroups. While these have are generally
   It should be noted that it is hard to ascribe           accessible by “feel”—by reading the group for a long
statistical significance for power-law measurements;       period and coming to understand it—this is a first
thus, while these techniques are intriguing, they will     pass at a quantitative attempt to describe how
require more development and precision before they         welcoming a group is to strangers, how long
can be said to be reliable.                                conversations run, and how connected members of the
                                                           groups are. This, in turn, seems to partition between
3.4. Authors Appearing Once                                different types of groups nicely.
                                                               In this paper, we have examined several types of
   One of the computationally simplest comparisons         discussion-oriented newsgroups: social support,
that can be made is to look at thread activity within      hobbyist discussion, and political discussion. In
the newsgroups. Counting the number of authors who         addition, we have examined flame groups and
appear only once gives a dramatic idea of how              question-and-answer groups.
committed to the group individual authors feel.                Discussion-oriented newsgroups have several
   This is illustrated most vividly in the question-and-   attributes that cause them to tend toward both longer
answer newsgroups. In these, many authors will             threads and toward more involvement from
show up once to ask their question. If they get a          individuals. Discussion newsgroups are places where
satisfactory answer they may never be seen again.          people attempt to establish reputations [3], get to
Therefore, it might be considered broadly acceptable       know each other, and exchange views. Conversations,
to post a small number of times.                           therefore, run somewhat longer. In contrast, online
   As further evidence that this pattern of question       question-and-answer newsgroups also have a
and answer is prevalent within specific groups, we         population aiming to develop a reputation, but they
examined a distribution of the length of threads within    develop their reputation by answering questions.
the groups. As table 2 illustrates, technical support          As such, the newsgroup has two noticeably
newsgroups had a disproportionately high number of         different populations: those who ask questions (who
threads of length two, while a disproportionately          start conversations, send few messages, and appear
small number of threads of length five or greater. In      infrequently in the newsgroup), and those who answer
contrast, social support groups tended much longer.        questions (who largely respond to messages that have
                                                           been sent, post a great many messages, and appear
4. Discussion                                              continuously in the newsgroup.)
                                                               At the other end of the scale, social support
    We have presented several techniques that allow        newsgroups (like alt.support.divorce) go out of their
different social spaces to be compared to each other       way to greet new people. Group members also seem
based on patterns of social interaction within each        to want to be more participatory -- and so we see that
group. There is a particular, and interesting,             the percentage of people who appear only once is
advantage from using Usenet data: Usenet data comes        small: 21%.
from a largely uniform set of interfaces across groups.        This categorization of newsgroups into “question
While users may access Usenet in a number of               and answer” against “discussion” is overbroad, in
different clients, the underlying platform is uniform.     several ways. First, of course, conversation spaces are
In addition, clients are not directly linked to            not monolithic: even flame groups engage in question
newsgroups: thus, one would not expect everyone in a       and answer. And, second, there are many styles of
particular group to be using a specific client. This       discussion: conversation and social support are very
allows us an unusual degree of comparability: we can       different. Thus, this conversation from alt.flame
examine how one newsgroup develops against others,              “Taht's [sic] like being a very juicy peace [sic]
with no external cues or affordances other than the             of poop”
title of the group. Of course, the content of the group    is radically different from that in a social support
and the culture within it certainly is significant.        newsgroup (in this case, discussing the emotional
                                                           after-effects of therapy):
                                                               Yes - I had hoped that it might help, but in the end
                                                           it only gave me a few pieces of paper. Which was




                                                                                                                      8
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




pretty much expected, but now I know it was right to        have done their part to characterize the content and
expect it....                                               makeup of a group. The contribution here is the start
   We have shown that these structural methods can          of a new set of tools, perhaps more quantitative,
help distinguish between types of authors involved in       which can be used to break down participation in
discussion: support, dialogue, and flame wars.              more detail.

4.2. Analysis Tools for Newsgroups                          4.3. Beyond Newsgroups
   We believe that the strategy of visualizing roles           But newsgroups are not the only place where this
from local networks and structural metrics can be           work may be applied. While other communities may
extended across the Usenet and to other threaded            have different micro-dynamics, it seems at least
discussion spaces. For example, these techniques            reasonable to suspect that these sorts of patterns
could be applied to online spaces that record response      would recur within different online spaces. The back-
patterns (such as Slashdot’s threads); they could also      and-forth of a flamewar, the multiparty discussion,
be applied to email lists. These types of structural        and the hub-and-spoke of an authoritative answer
analyses have important implications for social             person can all be used more generally. Perhaps some
research, and for the ways that people use and access       of these analysis and visualization techniques could
newsgroups.                                                 be extended as “social proxies” [4] to provide
   The underlying structural principles in this analysis    indicators of the activity of different online spaces.
can be extended to the design of user interfaces. There        These patterns may also guide the community
are two important foundations for building such             maintainer in understanding better how to interpret
interfaces: first, the general concepts explored in this    their online community. Depending on their goals for
paper must be expanded into more precise heuristics;        the community development, a transition toward—for
second, the heuristics that describe individual             example—ignoring new members might be not be
behavior must be aggregated to describe groups.             interpreted as a pathology. Instead, it might be read as
   The pathways toward the former are clearly laid          a cue that the group has formed a tight discussion
out: we have demonstrated in this paper a number of         core, and is moving away from a social support phase.
both qualitative and quantitative approaches which
can seem to discriminate between different types of         5. Bibliography
groups and activities. Welser et al. [14] expands on
these techniques with more detailed work on                 [1] Adamic, L. “Zipf, Power-laws, and Pareto: a ranking
identifying answer people.                                      tutorial.” Online, retrieved on 6/14/2005 from
   Characterizing groups might be best done by                  http://www.hpl.hp.com/research/idl/papers/ranking/ran
                                                                king.html
aggregating data about individuals. The latter suggests
an analysis of groups in which the makeup of people         [2] Donath, J., Karahalios, K., and Viegas, F. “Visualizing
might be summarized: perhaps that a Q&A group is                Conversations.” Journal of Computer Mediated
“70% questioners, 20% answers, and 10% discussion,              Communication. 4(4) 1999.
by volume of posts.” Groups could, en masse, by
summarized by these sorts of statistical abbreviations.     [3] Donath, J. “Identity and Deception in the Virtual
   This is desirable for the analyst, of course, but also       Community.” In Kollock, P. and Smith, M. (Eds).
                                                                Communities in Cyberspace. London: Routledge.
for the user. Consider a person searching for
information in a field they know little in. The             [4] Erickson, T. "Designing Visualizations of Social
techniques in this paper, if automated, could point this        Activity: Six Claims" The Proceedings of CHI 2003:
user toward groups that have both discussed this topic          Extended Abstracts. New York: ACM Press, 2003.
and are of the appropriate term: is the user looking for
technical support, or social support? Do they want a        [5] Fisher, D. “Egocentric Networks To Understand
                                                                Communiation.” IEEE Internet Computing. 9(5) 2005.
brisk discussion, or a straightforward answer? Indeed,
these techniques could lead not only toward the             [6] Golder, S. and Donath, J. “Social Roles in Electronic
particular newsgroup, but to the particular persons             Communities.” Presented at the Association of Internet
who are likely to be of greatest interest.                      Researchers (AoIR). Brighton, England. Sept. 19-22,
   This notion of characterizing groups by their                2004
emotional makeup is not a new one. This work shares
it with projects like Loom [2], which—in part—              [7] Merton, T. Social Theory and Social Structure,
                                                                Enlarged Edition. The Free Press: New York. 1968.
attempted to color groups by their emotional content.
Similarly, predecessors to this project [5, 11, 12, 14]




                                                                                                                          9
Proceedings of the 39th Hawaii International Conference on System Sciences - 2006




[8] Netscan, from Microsoft Research. Available at              Collective Action.” Journal of Computer-Mediated
    http://netscan.research.microsoft.com                       Communication. 10(4) 2005.

[9] O'Madadhain, J., Fisher, D., White, S., and Boey, Y.-   [12] Viegas, F. and Smith, M. “Newsgroup Crowds and
    B. “The JUNG (Java Universal Network/Graph)                  AuthorLines: Visualizing the Activity of Individuals in
    Framework.” Technical Report UCI-ICS 03-17. School           Conversational Cyberspaces.” HICSS 2004.
    of Information and Computer Science, UC Irvine.
    Irvine, California, 2003                                [13] Wasserman, S. and Faust, K. Social Network Analysis.
                                                                 Cambridge: Cambridge University Press. 1994.
[10] Sack, W. “Discourse Diagrams: Interface Design for
     Very Large Scale Conversations.” HICSS 2000.           [14] Welser, H., Gleave, E., Fisher, D., and Smith, M.
                                                                 “Visualizing the Signatures of Social Roles in Online
[11] Turner, T., Fisher, D., Smith, M., and Welser, T.           Discussion Groups.” In preparation.
     “Picturing Usenet: Mapping Computer-Mediated




                                                                                                                           10

More Related Content

What's hot

Network Exposure Influence on Facebook Behaviors
Network Exposure Influence on Facebook BehaviorsNetwork Exposure Influence on Facebook Behaviors
Network Exposure Influence on Facebook BehaviorsKyounghee Hazel Kwon
 
Social Network Analysis (SNA) 2018
Social Network Analysis  (SNA) 2018Social Network Analysis  (SNA) 2018
Social Network Analysis (SNA) 2018Arsalan Khan
 
APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...
APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...
APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...ijnlc
 
02 Introduction to Social Networks and Health: Key Concepts and Overview
02 Introduction to Social Networks and Health: Key Concepts and Overview02 Introduction to Social Networks and Health: Key Concepts and Overview
02 Introduction to Social Networks and Health: Key Concepts and OverviewDuke Network Analysis Center
 
Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011guillaume ereteo
 
Community Detection in Social Media
Community Detection in Social MediaCommunity Detection in Social Media
Community Detection in Social Mediarezahk
 
Clique-based Network Clustering
Clique-based Network ClusteringClique-based Network Clustering
Clique-based Network ClusteringGuang Ouyang
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network AnalysisSujoy Bag
 
Mining and analyzing social media part 2 - hicss47 tutorial - dave king
Mining and analyzing social media   part 2 - hicss47 tutorial - dave kingMining and analyzing social media   part 2 - hicss47 tutorial - dave king
Mining and analyzing social media part 2 - hicss47 tutorial - dave kingDave King
 
16 zaman nips10_workshop_v2
16 zaman nips10_workshop_v216 zaman nips10_workshop_v2
16 zaman nips10_workshop_v2talktoharry
 

What's hot (19)

Network Exposure Influence on Facebook Behaviors
Network Exposure Influence on Facebook BehaviorsNetwork Exposure Influence on Facebook Behaviors
Network Exposure Influence on Facebook Behaviors
 
Social Network Analysis (SNA) 2018
Social Network Analysis  (SNA) 2018Social Network Analysis  (SNA) 2018
Social Network Analysis (SNA) 2018
 
APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...
APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...
APPLYING THE AFFECTIVE AWARE PSEUDO ASSOCIATION METHOD TO ENHANCE THE TOP-N R...
 
02 Introduction to Social Networks and Health: Key Concepts and Overview
02 Introduction to Social Networks and Health: Key Concepts and Overview02 Introduction to Social Networks and Health: Key Concepts and Overview
02 Introduction to Social Networks and Health: Key Concepts and Overview
 
Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011Social network analysis course 2010 - 2011
Social network analysis course 2010 - 2011
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Community Detection in Social Media
Community Detection in Social MediaCommunity Detection in Social Media
Community Detection in Social Media
 
Clique-based Network Clustering
Clique-based Network ClusteringClique-based Network Clustering
Clique-based Network Clustering
 
12 SN&H Keynote: Thomas Valente, USC
12 SN&H Keynote: Thomas Valente, USC12 SN&H Keynote: Thomas Valente, USC
12 SN&H Keynote: Thomas Valente, USC
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
 
05 Whole Network Descriptive Stats
05 Whole Network Descriptive Stats05 Whole Network Descriptive Stats
05 Whole Network Descriptive Stats
 
09 Ego Network Analysis
09 Ego Network Analysis09 Ego Network Analysis
09 Ego Network Analysis
 
24 The Evolution of Network Thinking
24 The Evolution of Network Thinking24 The Evolution of Network Thinking
24 The Evolution of Network Thinking
 
09 Diffusion Models & Peer Influence
09 Diffusion Models & Peer Influence09 Diffusion Models & Peer Influence
09 Diffusion Models & Peer Influence
 
Presentation OOC2013
Presentation OOC2013Presentation OOC2013
Presentation OOC2013
 
Mining and analyzing social media part 2 - hicss47 tutorial - dave king
Mining and analyzing social media   part 2 - hicss47 tutorial - dave kingMining and analyzing social media   part 2 - hicss47 tutorial - dave king
Mining and analyzing social media part 2 - hicss47 tutorial - dave king
 
13 Community Detection
13 Community Detection13 Community Detection
13 Community Detection
 
CSE509 Lecture 6
CSE509 Lecture 6CSE509 Lecture 6
CSE509 Lecture 6
 
16 zaman nips10_workshop_v2
16 zaman nips10_workshop_v216 zaman nips10_workshop_v2
16 zaman nips10_workshop_v2
 

Viewers also liked

2011 IEEE Social Computing Nodexl: Group-In-A-Box
2011 IEEE Social Computing Nodexl: Group-In-A-Box2011 IEEE Social Computing Nodexl: Group-In-A-Box
2011 IEEE Social Computing Nodexl: Group-In-A-BoxMarc Smith
 
20110128 connected action-node xl-sea of connections
20110128 connected action-node xl-sea of connections20110128 connected action-node xl-sea of connections
20110128 connected action-node xl-sea of connectionsMarc Smith
 
Jerome Nadel @ M Love
Jerome Nadel @ M LoveJerome Nadel @ M Love
Jerome Nadel @ M LoveJeromeNadel
 
2009-C&T-NodeXL and social queries - a social media network analysis toolkit
2009-C&T-NodeXL and social queries - a social media network analysis toolkit2009-C&T-NodeXL and social queries - a social media network analysis toolkit
2009-C&T-NodeXL and social queries - a social media network analysis toolkitMarc Smith
 
The Key Success Factor in Knowledge Management... What Else? Change Management
The Key Success Factor in Knowledge Management... What Else? Change ManagementThe Key Success Factor in Knowledge Management... What Else? Change Management
The Key Success Factor in Knowledge Management... What Else? Change ManagementPatti Anklam
 
Digital User Experience Strategies: A Roadmap for the Post 2.0 World
Digital User Experience Strategies: A Roadmap for the Post 2.0 WorldDigital User Experience Strategies: A Roadmap for the Post 2.0 World
Digital User Experience Strategies: A Roadmap for the Post 2.0 WorldJeromeNadel
 
Social Network Analysis in Two Parts
Social Network Analysis in Two PartsSocial Network Analysis in Two Parts
Social Network Analysis in Two PartsPatti Anklam
 

Viewers also liked (7)

2011 IEEE Social Computing Nodexl: Group-In-A-Box
2011 IEEE Social Computing Nodexl: Group-In-A-Box2011 IEEE Social Computing Nodexl: Group-In-A-Box
2011 IEEE Social Computing Nodexl: Group-In-A-Box
 
20110128 connected action-node xl-sea of connections
20110128 connected action-node xl-sea of connections20110128 connected action-node xl-sea of connections
20110128 connected action-node xl-sea of connections
 
Jerome Nadel @ M Love
Jerome Nadel @ M LoveJerome Nadel @ M Love
Jerome Nadel @ M Love
 
2009-C&T-NodeXL and social queries - a social media network analysis toolkit
2009-C&T-NodeXL and social queries - a social media network analysis toolkit2009-C&T-NodeXL and social queries - a social media network analysis toolkit
2009-C&T-NodeXL and social queries - a social media network analysis toolkit
 
The Key Success Factor in Knowledge Management... What Else? Change Management
The Key Success Factor in Knowledge Management... What Else? Change ManagementThe Key Success Factor in Knowledge Management... What Else? Change Management
The Key Success Factor in Knowledge Management... What Else? Change Management
 
Digital User Experience Strategies: A Roadmap for the Post 2.0 World
Digital User Experience Strategies: A Roadmap for the Post 2.0 WorldDigital User Experience Strategies: A Roadmap for the Post 2.0 World
Digital User Experience Strategies: A Roadmap for the Post 2.0 World
 
Social Network Analysis in Two Parts
Social Network Analysis in Two PartsSocial Network Analysis in Two Parts
Social Network Analysis in Two Parts
 

Similar to 2006 hicss - you are who you talk to - detecting roles in usenet newsgroups

a modified weight balanced algorithm for influential users community detectio...
a modified weight balanced algorithm for influential users community detectio...a modified weight balanced algorithm for influential users community detectio...
a modified weight balanced algorithm for influential users community detectio...INFOGAIN PUBLICATION
 
Current trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networksCurrent trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networkseSAT Publishing House
 
Sna based reasoning for multiagent
Sna based reasoning for multiagentSna based reasoning for multiagent
Sna based reasoning for multiagentijaia
 
2009-Social computing-Analyzing social media networks
2009-Social computing-Analyzing social media networks2009-Social computing-Analyzing social media networks
2009-Social computing-Analyzing social media networksMarc Smith
 
Networking Portfolio Term Paper
Networking Portfolio Term PaperNetworking Portfolio Term Paper
Networking Portfolio Term PaperWriters Per Hour
 
Network Structure For Social Network
Network Structure For Social NetworkNetwork Structure For Social Network
Network Structure For Social NetworkVaibhav Sathe
 
New Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives MeetupNew Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives MeetupTatyana Kanzaveli
 
Taxonomy and survey of community
Taxonomy and survey of communityTaxonomy and survey of community
Taxonomy and survey of communityIJCSES Journal
 
Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)Behrang Mehrparvar
 
Using the Framework of Networks to Enhance Learning and Social Interactions
Using the Framework of Networks to Enhance Learning and Social InteractionsUsing the Framework of Networks to Enhance Learning and Social Interactions
Using the Framework of Networks to Enhance Learning and Social InteractionsDmitry Paranyushkin
 
Multimode network based efficient and scalable learning of collective behavior
Multimode network based efficient and scalable learning of collective behaviorMultimode network based efficient and scalable learning of collective behavior
Multimode network based efficient and scalable learning of collective behaviorIAEME Publication
 
Mining and Analyzing Academic Social Networks
Mining and Analyzing Academic Social NetworksMining and Analyzing Academic Social Networks
Mining and Analyzing Academic Social NetworksEditor IJCATR
 
Improving Knowledge Handling by building intellegent social systems
Improving Knowledge Handling by building intellegent social systemsImproving Knowledge Handling by building intellegent social systems
Improving Knowledge Handling by building intellegent social systemsnazeeh
 
STRUCTURAL COUPLING IN WEB 2.0 APPLICATIONS
STRUCTURAL COUPLING IN WEB 2.0 APPLICATIONSSTRUCTURAL COUPLING IN WEB 2.0 APPLICATIONS
STRUCTURAL COUPLING IN WEB 2.0 APPLICATIONSijcsit
 
01 Introduction to Networks Methods and Measures (2016)
01 Introduction to Networks Methods and Measures (2016)01 Introduction to Networks Methods and Measures (2016)
01 Introduction to Networks Methods and Measures (2016)Duke Network Analysis Center
 
01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measures01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measuresdnac
 
KASW'08 - Invited Talk
KASW'08 - Invited TalkKASW'08 - Invited Talk
KASW'08 - Invited TalkRalf Klamma
 

Similar to 2006 hicss - you are who you talk to - detecting roles in usenet newsgroups (20)

a modified weight balanced algorithm for influential users community detectio...
a modified weight balanced algorithm for influential users community detectio...a modified weight balanced algorithm for influential users community detectio...
a modified weight balanced algorithm for influential users community detectio...
 
Current trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networksCurrent trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networks
 
Sna based reasoning for multiagent
Sna based reasoning for multiagentSna based reasoning for multiagent
Sna based reasoning for multiagent
 
2009-Social computing-Analyzing social media networks
2009-Social computing-Analyzing social media networks2009-Social computing-Analyzing social media networks
2009-Social computing-Analyzing social media networks
 
Networking Portfolio Term Paper
Networking Portfolio Term PaperNetworking Portfolio Term Paper
Networking Portfolio Term Paper
 
Network Structure For Social Network
Network Structure For Social NetworkNetwork Structure For Social Network
Network Structure For Social Network
 
New Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives MeetupNew Metrics for New Media Bay Area CIO IT Executives Meetup
New Metrics for New Media Bay Area CIO IT Executives Meetup
 
Taxonomy and survey of community
Taxonomy and survey of communityTaxonomy and survey of community
Taxonomy and survey of community
 
Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)
 
Using the Framework of Networks to Enhance Learning and Social Interactions
Using the Framework of Networks to Enhance Learning and Social InteractionsUsing the Framework of Networks to Enhance Learning and Social Interactions
Using the Framework of Networks to Enhance Learning and Social Interactions
 
Multimode network based efficient and scalable learning of collective behavior
Multimode network based efficient and scalable learning of collective behaviorMultimode network based efficient and scalable learning of collective behavior
Multimode network based efficient and scalable learning of collective behavior
 
Mining and Analyzing Academic Social Networks
Mining and Analyzing Academic Social NetworksMining and Analyzing Academic Social Networks
Mining and Analyzing Academic Social Networks
 
Q046049397
Q046049397Q046049397
Q046049397
 
Internet
InternetInternet
Internet
 
Internet
InternetInternet
Internet
 
Improving Knowledge Handling by building intellegent social systems
Improving Knowledge Handling by building intellegent social systemsImproving Knowledge Handling by building intellegent social systems
Improving Knowledge Handling by building intellegent social systems
 
STRUCTURAL COUPLING IN WEB 2.0 APPLICATIONS
STRUCTURAL COUPLING IN WEB 2.0 APPLICATIONSSTRUCTURAL COUPLING IN WEB 2.0 APPLICATIONS
STRUCTURAL COUPLING IN WEB 2.0 APPLICATIONS
 
01 Introduction to Networks Methods and Measures (2016)
01 Introduction to Networks Methods and Measures (2016)01 Introduction to Networks Methods and Measures (2016)
01 Introduction to Networks Methods and Measures (2016)
 
01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measures01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measures
 
KASW'08 - Invited Talk
KASW'08 - Invited TalkKASW'08 - Invited Talk
KASW'08 - Invited Talk
 

More from Marc Smith

How to use social media network analysis for amplification
How to use social media network analysis for amplificationHow to use social media network analysis for amplification
How to use social media network analysis for amplificationMarc Smith
 
Think link what is an edge - NodeXL
Think link   what is an edge - NodeXLThink link   what is an edge - NodeXL
Think link what is an edge - NodeXLMarc Smith
 
2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshowMarc Smith
 
2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA
2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA
2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNAMarc Smith
 
20151001 charles university prague - marc smith - node xl-picturing political...
20151001 charles university prague - marc smith - node xl-picturing political...20151001 charles university prague - marc smith - node xl-picturing political...
20151001 charles university prague - marc smith - node xl-picturing political...Marc Smith
 
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...Marc Smith
 
2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media sna2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media snaMarc Smith
 
2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXLMarc Smith
 
Think Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming SkillsThink Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming SkillsMarc Smith
 
20130724 ted x-marc smith-digital health futures empowerment or coercion
20130724 ted x-marc smith-digital health futures empowerment or coercion20130724 ted x-marc smith-digital health futures empowerment or coercion
20130724 ted x-marc smith-digital health futures empowerment or coercionMarc Smith
 
2013 passbac-marc smith-node xl-sna-social media-formatted
2013 passbac-marc smith-node xl-sna-social media-formatted2013 passbac-marc smith-node xl-sna-social media-formatted
2013 passbac-marc smith-node xl-sna-social media-formattedMarc Smith
 
2013 NodeXL Social Media Network Analysis
2013 NodeXL Social Media Network Analysis2013 NodeXL Social Media Network Analysis
2013 NodeXL Social Media Network AnalysisMarc Smith
 
20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...Marc Smith
 
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...Marc Smith
 
2012 ona practitioner-courseflyer
2012 ona practitioner-courseflyer2012 ona practitioner-courseflyer
2012 ona practitioner-courseflyerMarc Smith
 
20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …Marc Smith
 
20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...Marc Smith
 
20111123 mwa2011-marc smith
20111123 mwa2011-marc smith20111123 mwa2011-marc smith
20111123 mwa2011-marc smithMarc Smith
 
20111103 con tech2011-marc smith
20111103 con tech2011-marc smith20111103 con tech2011-marc smith
20111103 con tech2011-marc smithMarc Smith
 
20110830 Introducing the Social Media Research Foundation
20110830 Introducing the Social Media Research Foundation20110830 Introducing the Social Media Research Foundation
20110830 Introducing the Social Media Research FoundationMarc Smith
 

More from Marc Smith (20)

How to use social media network analysis for amplification
How to use social media network analysis for amplificationHow to use social media network analysis for amplification
How to use social media network analysis for amplification
 
Think link what is an edge - NodeXL
Think link   what is an edge - NodeXLThink link   what is an edge - NodeXL
Think link what is an edge - NodeXL
 
2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow2017 05-26 NodeXL Twitter search #shakeupshow
2017 05-26 NodeXL Twitter search #shakeupshow
 
2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA
2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA
2016 SocialMedia.Org Marc Smith-NodeXL-Social Media SNA
 
20151001 charles university prague - marc smith - node xl-picturing political...
20151001 charles university prague - marc smith - node xl-picturing political...20151001 charles university prague - marc smith - node xl-picturing political...
20151001 charles university prague - marc smith - node xl-picturing political...
 
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
 
2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media sna2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media sna
 
2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL2014 TheNextWeb-Mapping connections with NodeXL
2014 TheNextWeb-Mapping connections with NodeXL
 
Think Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming SkillsThink Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming Skills
 
20130724 ted x-marc smith-digital health futures empowerment or coercion
20130724 ted x-marc smith-digital health futures empowerment or coercion20130724 ted x-marc smith-digital health futures empowerment or coercion
20130724 ted x-marc smith-digital health futures empowerment or coercion
 
2013 passbac-marc smith-node xl-sna-social media-formatted
2013 passbac-marc smith-node xl-sna-social media-formatted2013 passbac-marc smith-node xl-sna-social media-formatted
2013 passbac-marc smith-node xl-sna-social media-formatted
 
2013 NodeXL Social Media Network Analysis
2013 NodeXL Social Media Network Analysis2013 NodeXL Social Media Network Analysis
2013 NodeXL Social Media Network Analysis
 
20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...
 
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
 
2012 ona practitioner-courseflyer
2012 ona practitioner-courseflyer2012 ona practitioner-courseflyer
2012 ona practitioner-courseflyer
 
20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …20120622 web sci12-won-marc smith-semantic and social network analysis of …
20120622 web sci12-won-marc smith-semantic and social network analysis of …
 
20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...
 
20111123 mwa2011-marc smith
20111123 mwa2011-marc smith20111123 mwa2011-marc smith
20111123 mwa2011-marc smith
 
20111103 con tech2011-marc smith
20111103 con tech2011-marc smith20111103 con tech2011-marc smith
20111103 con tech2011-marc smith
 
20110830 Introducing the Social Media Research Foundation
20110830 Introducing the Social Media Research Foundation20110830 Introducing the Social Media Research Foundation
20110830 Introducing the Social Media Research Foundation
 

Recently uploaded

What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Recently uploaded (20)

What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

2006 hicss - you are who you talk to - detecting roles in usenet newsgroups

  • 1. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 You Are Who You Talk To: Detecting Roles in Usenet Newsgroups Danyel Fisher, Marc Smith, and Howard T. Welser Microsoft Research danyelf@microsoft.com, masmith@microsoft.com, twelser@u.washington.edu Abstract Understanding the social roles of the members a understand how their social behavior suggests their group can help to understand the social context of the role within the space. By generating these networks group. We present a method of applying social for each author in a newsgroup, we can begin to network analysis to support the task of characterizing roughly characterize the newsgroup in terms of the authors in Usenet newsgroups. We compute and different types of participant authors. visualize networks created by patterns of replies for In this paper, we examine the social networks each author in selected newsgroups and find that created by patterns of reply between authors second-degree ego-centric networks give us clear contributing messages to nine Usenet newsgroups distinctions between different types of authors and during a one-month period (Table 1). These nine newsgroups. groups were hand-selected to provide a broad Results show that newsgroups vary in terms of the overview of Usenet topics. These newsgroups are of populations of participants and the roles that they medium size, relative to the rest of Usenet: they are play. Newsgroups can be characterized by not among either the biggest groups, or the smallest. populations that include question and answer We sought out groups that we already knew were newsgroups, conversational newsgroups, social of several general genres: question and answer, social support newsgroups, and flame newsgroups. support, discussion, and flame. We chose these This approach has applications for both newsgroups because they are exemplars of their types. researchers seeking to characterize different types of Microsoft.public.windows.server.general, for social cyberspaces as well as participants seeking to example, is low on discussion, but active with distinguish interaction partners and content authors. questions and answers; in contrast, alt.flame is uncluttered by rational discussion. There have been a number of studies that have 1. Introduction attempted to examine and describe the sets of social roles that occur in both online and offline Joining a new social group can be a disorienting conversation [6, 11]. In this paper, we expand on the experience. Understanding the social roles of different ideas of Turner et al [11] and Welser [14], in members of a group is a good way to understand both examining a small number of visibly different roles. the social context of the group, and information about A social role “refers to the behavior of status- the people within it. A number of different occupants that is oriented toward the patterned visualization tools exist [2, 10, 12] that are meant to expectations of others” (pg 41) [7]. This notion give some sense of how a conversation space is emphasizes structure – in terms of relations to others, constructed. These various visualizations have all and structure, in terms of meaningful expectations for shown different perspectives on metadata within systematic behavior. We focus on emergent roles that online social spaces [11]. grow from participation in the focal activities, as In this paper we present a way of applying social distinct from formal organizational roles like network techniques to online conversation spaces. In “manager”. particular, we focus on the question “who is this person with whom I am interacting?” looking to 0-7695-2507-5/06/$20.00 (C) 2006 IEEE 1
  • 2. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 Table 1. Nine newsgroups, with some basic statistics. Name authors replies % threads with % authors with January 2001 except as 2 messages 5+ messages degree > 10 1 message noted Technical Newsgroups comp.soft-sys.matlab 437 712 28% 20% 3% 41% microsoft.public.window 855 1489 33% 10% 3% 51% s.server.general, Jan. 2004 Discussion Newsgroups rec.kites 276 924 19% 32% 12% 32% rec.music.phish 1263 6105 23% 22% 20% 29% Political Newsgroup alt.politics 2187 16059 17% 40% 20% 26% Flame Newsgroups alt.flame 618 3802 16% 34% 13% 32% alt.alien- 755 6494 17% 50% 16% 33% vampire.flonk.flonk.flonk Social Support Newsgroups alt.support.divorce 339 2937 14% 53% 21% 23% alt.support.autism 188 3095 17% 60% 32% 20% We further focus our attention on the relationship persistent conversation spaces. All messages we between the communication behaviors of participants examine are publicly visible; thus, notions of in Usenet with their social structural position derived information transfer within the group are less from network analysis. In network analysis, we can confining than they might be in an organizational identify roles as positions that have a distinct pattern network. In short, we expect people to choose their of relations to other positions [13]. Our analysis takes correspondents, but not to be thoroughly-embedded fundamental features of an actor’s structural position within a broader network. (like their number of neighbors) and compares how Data to construct the networks presented here was those basic attributes vary across different generated by the Netscan system [8] at Microsoft conversational spaces. Research. Netscan stores message headers from Usenet newsgroups and constructs a range of 2. Methods measures of author, newsgroup and thread size and activity. Netscan was used to generate reports Our approach focuses on linking two scales of showing which authors sent messages in response to analysis between individual and collective structures messages from another author. This data was further in Usenet newsgroups. We alternately examine analyzed and graphically presented making extensive newsgroup behavior and the behavior of individual use of the JUNG [9] system for visualizing and authors. Each casts light upon the other. The key computing social network data. social structure used throughout is egocentric social Table 1 shows the nine newsgroups selected, and networks constructed through patterns of reply. An some overview data related to these newsgroups. egocentric network reports the other people and all We collected data from the “conversational subset” ties surrounding an actor to a limited distance. In this of the newsgroup’s messages: the set of messages that case, we have limited our distance to two: that is, we are either responded to by someone else, or are are looking at the neighbors of the neighbors of the themselves responses to someone else. (These do not, actors. then, include messages that are do not receive Phrased less formally, we are trying to understand replies.) This is examining only replies within the whether the people with whom the actor interacts are month of January, 2001. (For the more recent themselves well-connected or poorly-connected, and newsgroup microsoft.public.windows.server.general, so we look at their neighbors. we examine January of 2004). The choice of using egocentric measures is discussed more thoroughly in [6]. It is driven by the particular properties of network structure in public 2
  • 3. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 2.1. A Person’s Social Network groups. We briefly describe these three strategies here. The core of our data is composed of social A basic way to contrast different newsgroups is to networks built from the interactions present in a set of compare the percentage of authors who have appeared newsgroups. This network will, as with all social in only one message during the time period. In all networks, connect pairs of vertices with edges. newsgroups, appearing only once can be a result of Vertices represent individual authors within the many different things: some authors, for example, will newsgroup who have replied to at least one message, post infrequently, and so only be seen once. Some or been replied to at least once. Directed edges authors are only visiting the newsgroup, or are represent the replies: an edge from A to B means that unintentionally cross-posted into the newsgroup. A replied to B’s message. Edges are further annotated However, differences in newsgroup behavior can also with a weight, which indicates the number of times cause different reasons to appear only once. In which A has replied to B. Note that we use the word particular, we might believe that the more inclusive a “reply” slightly loosely; within newsgroups, one does newsgroup is, the more likely it is that authors will not directly reply to someone else, but rather posts a appear more than once. We can similarly examine message in response to another’s message. The thread length: newsgroups that have short threads are distinction is subtle, but important: while we discuss likely to be oriented toward question and answer, interpersonal relations in terms of who replied to while longer threads suggest discussion. whom, we acknowledge that it is possible to post a Another way to compare newsgroups is to compare follow-up message to a thread without giving any their in-degree distribution and out-degree thought to the person who is being responded to. distribution. Any large newsgroup’s participation None the less, we will—for simplicity—elide “A level reflects a rough power-law curve: there will be a posted a message, in direct response to a previous few very active participants, and a great many message posted by B” to “A replies to B.” As this is inactive participants. While the shape is the same, we interpreted as an edge in the graph, we will also speak can compare the rate at which these power laws drop of the graph properties as social interaction. Thus, “A off—that is, the degree of inequality in the group— is of high out-degree” means “A responded to many between different groups. We might expect a group different people,” while “B is of high in-degree” with a strong central core to have a very different means “B was responded to by many people.” distribution from a group that has broad participation. We visualize these networks by showing the In addition, we can compare the distribution of in- second-degree egocentric network around any given degree to the distribution of out-degree for any given actors: that is, we look at all of that actor’s neighbors, group. Intuitively, this is the process of seeing and their neighbors, as in Figure 1b. Directed ties are whether the most active participants are as replied-to depicted with arrowheads, while the weight of edges as they do replying: are they central as people who is illustrated with thickness. start conversations, or as ones who end them? We generated one- and two-degree egocentric Last, we can adapt the technique used with the networks for all 6918 authors in our data set, and our individual network visualizations of looking at the analysis summarizes insights from comparing these degree distribution of out-neighbors by combining the networks across the nine groups. In addition, to look distributions across the groups. By combining the at individual connectivity, we calculate a distribution distributions across the most active actors, we of the out-degree of each actor’s out-neighbors: that generate distributions which can be, again, compared. is, we look at whether the actor typically replies to well-connected or poorly-connected alters. We 3. Results examine the interconnectivity of members of newsgroups, and we look at the reciprocity of ties We discuss first the individual measures that within these limited networks. identify personal roles. For each newsgroup and each role, we present a typical person who exemplifies that 2.2. Group-Wide Comparisons role, with a discussion of that sort of person and some of the characteristic identifiers for locating them We have applied several techniques to compare within the group. groups to each other. We compare the percentage of Next, we discuss group-wide statistics, and authors who have appeared only once in the group, connect the group types we had previously identified and look at thread length. We analyze collective in- to these broader statistics. degree and out-degree statistics, and we compare degree distribution coefficients across the nine 3
  • 4. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 Figure 1: A question- and-answer, from the questioner’s perspective, at distance 1 (above) and 2 (right). The questioner is the large black dot; distance 1 is in blue, and distance 2 in gray. 3.1. Network Diagrams send most of his or her replies to people who are disconnected from each other. There is a small cluster We can refine our understanding of the differences in the bottom-right where several people of high between these newsgroups—and the individual degree are inter-connected, but that is dwarfed by the authors within them—by examining visualizations of number of individual messages. the network structures themselves. Let us begin with a To re-examine this from an answer person’s simple example: that of a typical question and answer perspective, we see an answer person in Figure 2a. (Figure 1). From the perspective of the questioner, this Note, again, the large number of connections to nodes is a simple transaction: they (the large dark dot) ask a that are not themselves interconnected. question, and get a response. They get only one When we expand to distance 2, we see that some response, and never connect to any other person. We few of the people to which this person interacted were gain a greater understanding of the social roles answer people – but the vast majority was not. To involved, when we examine the network at distance quantify this, we can examine the out-degree two. This allows us to see the network around the distribution of his neighbors: that is what was the out- person who responded. degree of the people to whom this person responded? We see that this response was not an unusual thing This question allows us to understand whether they for this answer person. Indeed, this person seems to preferentially reply to people who are well-connected, or people who are more isolated. 4
  • 5. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 Figure 2. Reply-to network at distance 1 (left) and distance 2 (center), and out-degree distribution histogram, for one representative member for each of five newsgroups. From top to bottom, (a) microsoft.public.windows.server.general, (b) rec.kites, (c) alt.flame, (d) alt.politics, and (e) alt.support.divorce. 5
  • 6. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 We see that this person responded largely to This suggests that the newsgroup’s core has an people of low out-degree. The graph at right is on a exclusive clique aspect to it. log-log scale, and so we see that virtually all his communication occurred with people of degree below We can confirm this by examining newsgroup- 3, with the most at 0. This, then, is a reasonable wide statistics. Across all of Usenet, according to characterization: someone who responds to many Netscan, 66% of posters post only one message. people of low degree, but infrequently to those who Ordinarily, 93% of those are initial turns: they are respond to many. from people who (try to) initiate a thread. In this We shall see that this structural pattern is a newsgroup, however, only 50% of those posts are distinguishing characteristic of question and answer initial turns: the rest are replying to messages. newsgroups. While we don’t have a direct explanation for how In figure 3, we show the average of the the newsgroup maintains its exclusion of outsiders, it distributions of out-degree for the most active posters is definitely the case that the most prolific posters within the newsgroup—those who have been involved clearly prefer to reply to other prolific posters. This is in ten or more replies—to emphasize that this shown, again, by the degree distribution. individual is indeed fairly representative. We contrast this with a very different newsgroup: Contrast the alt.flame newsgroup (2c). This alt.support.divorce (figure 2e), which is dedicated to newsgroup, dedicated to the art of the flame war, social support of people who are going through or involves authors flinging insults at each other on a have recently been through a divorce. We have broad variety of topics, interest areas, and degrees of already seen that alt.support.divorce is different from obscenity and personalization. We see, immediately, any of the newsgroups we have discussed so far: it that alt.flame looks significantly different from the has a smaller percentage of authors who post only one technical support newsgroup. message; it has a more equitable distribution of both Perhaps most visible is the density of the network. in-degree and out-degree across the group. Now we In alt.flame, many people connect to many others; the will see that its most active users are also less network is a tightly-knit one. Only slightly more exclusive then alt.politics. subtly, we see many thick edges. In alt.flame, users Figure 2e shows a not-atypical member of get into intense pair-wise connections: flame wars, as alt.support.divorce newsgroups with the by-now- the two find a topic that mutually interests them. familiar degree distribution. Note the radically While reading the newsgroup shows that there are different degree distribution: the strong bump at the often multiple participants in early stages of threads, front indicates a newsgroup-wide interest in threads often evolve (or devolve) into strong welcoming new participants, while the large bump at reciprocal ties. Indeed, a clear indicator of a flame the back indicates a continuing preference for well- newsgroup seems to be strong reciprocity. While linked in-group members. traditionally, reciprocal turn-taking is a sign of a good conversation, we should recall that newsgroups tend 3.2. Cumulative Degree Distribution to act as public spaces; the strong reciprocity is a sign of an inward focus and vigorous debate. We now begin to compare newsgroups to each Note that the alt.flame out-degree distribution is other. In the social networks section, we examined different: it peaks at a middle value. People in individual out-neighbor degree distribution. We now alt.flame preferentially reply to people who are fairly compare these degree distributions to each other, well-connected: people who have responded to few which allows us to compare newsgroups to each people in the newsgroup are not, in turn, linked. other. These groups were extremes, in opposite The histograms are shown in Figure 3. For each directions. rec.kites (figure 2b) is a happy medium: a group, we present the average of the number of out- discussion forum that has an in-group, but admits degree neighbors in each size bracket: 0, 1, 2-3, 4-7, outsiders. Note both the well-connected core, and the 8-15, 16-31, 32-63, and 64-127. In order to get only less-well-connected (but largely still of greater authors with non-trivial histograms, we focused degree, and actively participating) periphery. specifically on authors with more than ten neighbors alt.politics (figure 2d) is a very large newsgroup (that is, people who either they replied to, or who dedicated to discussion. The distance-1 graph visibly replied to them): an active minority, ranging from 3% shows, again, a tightly-connected newsgroup; the of the group’s membership (in the technical groups) distance-2 graph shows an interesting phenomenon: a to 30% of the group’s membership (in the flame population of outsiders who reply to messages within groups), as illustrated in Table 1. the newsgroup, but are largely not, in turn, replied to. 6
  • 7. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 We note that these patterns reinforce what was (for degree > 0) falls along a power law: there are observed above: there is a distinct preference for many at the low end, and far fewer at the high end. replying to people who had not replied to anyone for The exponent on a power law can be estimated by the technical support groups; there is an opposite examining the slope of the distribution, plotted on preference in the politics groups. Support groups have log-log axes; a negative exponent indicates a curve both a peak at the low end and high end, representing with lower y at higher x [1]. The nine groups are then an interest in both connecting to new people and in plotted on a scatter-plot, with in-degree distribution – maintaining ties with more senior ones. that is, the distribution of the number of replies that authors get – is compared to the out-degree 3.3. Degree Equality Within Newsgroups distribution, as shown in Figure 4. Recall that the exponent on this power law is Another dimension of difference that separates roughly a measure of equality of contribution: a newsgroups from one another compares the coefficient of 0 suggests a newsgroup which has a distribution of the degrees of nodes to each other. For completely uniform contribution, in which each all these newsgroups, the distribution of vertex degree member has the same in-degree and out-degree rec.music.phish rec.kites as each other member; a strongly negative 6 6 number is more inequitably distributed. 5 5 By this reading, most newsgroups fall nearly 4 4 along a balanced axis, with similar distributions 3 3 of in-degree and out-degree: that is, most people 2 2 1 1 have the same sort of reply rate as their being- 0 0 replied-to rate. The social support newsgroups, 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 alt.support.autism and alt.support.divorce, both microsoft.public.server. comp.software.systems. have a more balanced distribution, while the 14 general 6 matlab other newsgroups have radically different 12 5 distributions. 10 8 4 Note also that server.general and matlab both 3 6 2 fall substantially off of the main axis. For both 4 2 1 newsgroups, their out-degree is distributed less 0 0 equitably than their in-degree. This is, again, 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 consistent with the model of a small population alt.alien-vampire. alt.flame of answer people responding to a large 6 flonk.flonk.flonk 6 population of questions. Virtually everyone, no 5 5 matter their degree, gets a response – and so the 4 4 3 3 2 2 0 1 1 -2.5 -2 -1.5 -1 -0.5 0 0 0 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 -0.5 In-degree exponent alt.support.autism alt.support.divorce Autism 6 6 Divorce -1 5 Flonk 5 4 Flam e 4 Kites 3 Server 3 Matlab Phish -1.5 2 2 1 Politics 1 0 0 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 -2 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 alt.politics -2.5 6 Figure 3. Average out- degree distribution for Out-degree exponent 5 4 high participators in 3 each of the nine Figure 4. In-degree power-law exponent 2 1 groups. x-axis for each against out-degree power-law exponent 0 doubles in size for for the nine groups. Shapes indicate the 0 1,1 2,3 4,7 8,15 16,31 32,63 64,127 each bar. general group type. 7
  • 8. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 curve for in-degree is fairly balanced. In contrast, the 4.1. Roles within Newsgroups only people doing the answering are the answer people and so their out-degree is very high compared These techniques seem to have provided some to a population with extremely low (often 0) out- description of the behavior that occurs within Usenet degree. newsgroups. While these have are generally It should be noted that it is hard to ascribe accessible by “feel”—by reading the group for a long statistical significance for power-law measurements; period and coming to understand it—this is a first thus, while these techniques are intriguing, they will pass at a quantitative attempt to describe how require more development and precision before they welcoming a group is to strangers, how long can be said to be reliable. conversations run, and how connected members of the groups are. This, in turn, seems to partition between 3.4. Authors Appearing Once different types of groups nicely. In this paper, we have examined several types of One of the computationally simplest comparisons discussion-oriented newsgroups: social support, that can be made is to look at thread activity within hobbyist discussion, and political discussion. In the newsgroups. Counting the number of authors who addition, we have examined flame groups and appear only once gives a dramatic idea of how question-and-answer groups. committed to the group individual authors feel. Discussion-oriented newsgroups have several This is illustrated most vividly in the question-and- attributes that cause them to tend toward both longer answer newsgroups. In these, many authors will threads and toward more involvement from show up once to ask their question. If they get a individuals. Discussion newsgroups are places where satisfactory answer they may never be seen again. people attempt to establish reputations [3], get to Therefore, it might be considered broadly acceptable know each other, and exchange views. Conversations, to post a small number of times. therefore, run somewhat longer. In contrast, online As further evidence that this pattern of question question-and-answer newsgroups also have a and answer is prevalent within specific groups, we population aiming to develop a reputation, but they examined a distribution of the length of threads within develop their reputation by answering questions. the groups. As table 2 illustrates, technical support As such, the newsgroup has two noticeably newsgroups had a disproportionately high number of different populations: those who ask questions (who threads of length two, while a disproportionately start conversations, send few messages, and appear small number of threads of length five or greater. In infrequently in the newsgroup), and those who answer contrast, social support groups tended much longer. questions (who largely respond to messages that have been sent, post a great many messages, and appear 4. Discussion continuously in the newsgroup.) At the other end of the scale, social support We have presented several techniques that allow newsgroups (like alt.support.divorce) go out of their different social spaces to be compared to each other way to greet new people. Group members also seem based on patterns of social interaction within each to want to be more participatory -- and so we see that group. There is a particular, and interesting, the percentage of people who appear only once is advantage from using Usenet data: Usenet data comes small: 21%. from a largely uniform set of interfaces across groups. This categorization of newsgroups into “question While users may access Usenet in a number of and answer” against “discussion” is overbroad, in different clients, the underlying platform is uniform. several ways. First, of course, conversation spaces are In addition, clients are not directly linked to not monolithic: even flame groups engage in question newsgroups: thus, one would not expect everyone in a and answer. And, second, there are many styles of particular group to be using a specific client. This discussion: conversation and social support are very allows us an unusual degree of comparability: we can different. Thus, this conversation from alt.flame examine how one newsgroup develops against others, “Taht's [sic] like being a very juicy peace [sic] with no external cues or affordances other than the of poop” title of the group. Of course, the content of the group is radically different from that in a social support and the culture within it certainly is significant. newsgroup (in this case, discussing the emotional after-effects of therapy): Yes - I had hoped that it might help, but in the end it only gave me a few pieces of paper. Which was 8
  • 9. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 pretty much expected, but now I know it was right to have done their part to characterize the content and expect it.... makeup of a group. The contribution here is the start We have shown that these structural methods can of a new set of tools, perhaps more quantitative, help distinguish between types of authors involved in which can be used to break down participation in discussion: support, dialogue, and flame wars. more detail. 4.2. Analysis Tools for Newsgroups 4.3. Beyond Newsgroups We believe that the strategy of visualizing roles But newsgroups are not the only place where this from local networks and structural metrics can be work may be applied. While other communities may extended across the Usenet and to other threaded have different micro-dynamics, it seems at least discussion spaces. For example, these techniques reasonable to suspect that these sorts of patterns could be applied to online spaces that record response would recur within different online spaces. The back- patterns (such as Slashdot’s threads); they could also and-forth of a flamewar, the multiparty discussion, be applied to email lists. These types of structural and the hub-and-spoke of an authoritative answer analyses have important implications for social person can all be used more generally. Perhaps some research, and for the ways that people use and access of these analysis and visualization techniques could newsgroups. be extended as “social proxies” [4] to provide The underlying structural principles in this analysis indicators of the activity of different online spaces. can be extended to the design of user interfaces. There These patterns may also guide the community are two important foundations for building such maintainer in understanding better how to interpret interfaces: first, the general concepts explored in this their online community. Depending on their goals for paper must be expanded into more precise heuristics; the community development, a transition toward—for second, the heuristics that describe individual example—ignoring new members might be not be behavior must be aggregated to describe groups. interpreted as a pathology. Instead, it might be read as The pathways toward the former are clearly laid a cue that the group has formed a tight discussion out: we have demonstrated in this paper a number of core, and is moving away from a social support phase. both qualitative and quantitative approaches which can seem to discriminate between different types of 5. Bibliography groups and activities. Welser et al. [14] expands on these techniques with more detailed work on [1] Adamic, L. “Zipf, Power-laws, and Pareto: a ranking identifying answer people. tutorial.” Online, retrieved on 6/14/2005 from Characterizing groups might be best done by http://www.hpl.hp.com/research/idl/papers/ranking/ran king.html aggregating data about individuals. The latter suggests an analysis of groups in which the makeup of people [2] Donath, J., Karahalios, K., and Viegas, F. “Visualizing might be summarized: perhaps that a Q&A group is Conversations.” Journal of Computer Mediated “70% questioners, 20% answers, and 10% discussion, Communication. 4(4) 1999. by volume of posts.” Groups could, en masse, by summarized by these sorts of statistical abbreviations. [3] Donath, J. “Identity and Deception in the Virtual This is desirable for the analyst, of course, but also Community.” In Kollock, P. and Smith, M. (Eds). Communities in Cyberspace. London: Routledge. for the user. Consider a person searching for information in a field they know little in. The [4] Erickson, T. "Designing Visualizations of Social techniques in this paper, if automated, could point this Activity: Six Claims" The Proceedings of CHI 2003: user toward groups that have both discussed this topic Extended Abstracts. New York: ACM Press, 2003. and are of the appropriate term: is the user looking for technical support, or social support? Do they want a [5] Fisher, D. “Egocentric Networks To Understand Communiation.” IEEE Internet Computing. 9(5) 2005. brisk discussion, or a straightforward answer? Indeed, these techniques could lead not only toward the [6] Golder, S. and Donath, J. “Social Roles in Electronic particular newsgroup, but to the particular persons Communities.” Presented at the Association of Internet who are likely to be of greatest interest. Researchers (AoIR). Brighton, England. Sept. 19-22, This notion of characterizing groups by their 2004 emotional makeup is not a new one. This work shares it with projects like Loom [2], which—in part— [7] Merton, T. Social Theory and Social Structure, Enlarged Edition. The Free Press: New York. 1968. attempted to color groups by their emotional content. Similarly, predecessors to this project [5, 11, 12, 14] 9
  • 10. Proceedings of the 39th Hawaii International Conference on System Sciences - 2006 [8] Netscan, from Microsoft Research. Available at Collective Action.” Journal of Computer-Mediated http://netscan.research.microsoft.com Communication. 10(4) 2005. [9] O'Madadhain, J., Fisher, D., White, S., and Boey, Y.- [12] Viegas, F. and Smith, M. “Newsgroup Crowds and B. “The JUNG (Java Universal Network/Graph) AuthorLines: Visualizing the Activity of Individuals in Framework.” Technical Report UCI-ICS 03-17. School Conversational Cyberspaces.” HICSS 2004. of Information and Computer Science, UC Irvine. Irvine, California, 2003 [13] Wasserman, S. and Faust, K. Social Network Analysis. Cambridge: Cambridge University Press. 1994. [10] Sack, W. “Discourse Diagrams: Interface Design for Very Large Scale Conversations.” HICSS 2000. [14] Welser, H., Gleave, E., Fisher, D., and Smith, M. “Visualizing the Signatures of Social Roles in Online [11] Turner, T., Fisher, D., Smith, M., and Welser, T. Discussion Groups.” In preparation. “Picturing Usenet: Mapping Computer-Mediated 10