Mapping Influencers by Network Connections with Google Refine (Beth Granter, Brilliant Noise at Big Data Brighton)
1. Mapping influencers by
network connections with
Google Refine
Brilliant Noise - case study
thanks to NixonMcInnes
Beth Granter
November 2012
@bethgranter
bethgranter.com
@brilliantnoise
brilliantnoise.com
Thursday, 29 November 12
2. Background and brief
The client engages with individuals via an email list
in an internal database, and a LinkedIn group.
A client spokesperson is one of the ‘faces’ of the
department with a keen following on Twitter via
his personal account e.g. @bethgranter (!)
The brief was to look at the people in the three
groups & use that insight to create a list of similar
influencers they should be engaging with.
3. Approach
LinkedIn group: under the LinkedIn API terms and
conditions, exporting member names or any details
from the group is not allowed... no further action!
Email list: created a temporary Gmail account &
added users as contacts, then used a temporary
Twitter account, imported (via Gmail) contacts into
Twitter, copied list of matching Twitter accounts to
spreadsheet.
Output: list of 95 Twitter accounts w/ full details,
who we know also receive the client’s emails.
Data stored in shared Google Docs spreadsheet.
5. Approach: Twitter network
@bethgranter’s followers: exported a list of all
followers via the Twitter API, then used the API
again to gather a list of everybody else they follow.
This gave us a niche, ‘two tier network’ of ~600,000
people.
We then calculated a unique index - a ‘network
follower count’ - by calculating how many of
@bethgranter’s followers follow each person in the
network. This gave us a popularity figure.
Overall there were over 1 million connections mapped.
6. The network
Network follower count (diagram of four nodes: A, B, C, D)
A = 0: not followed by anyone in the network
B = 2: followed by 2 other followers of @bethgranter
C = 1: followed by 1 of @bethgranter’s followers
D = 2: followed by 2 other followers of @bethgranter
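The counting step can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the project: the function name and the "who follows whom" edges for A to D are assumptions, chosen so that the result reproduces the counts in the example above.

```python
from collections import Counter

def network_follower_counts(following_by_follower):
    """Count, for each account, how many level 1 followers follow it.

    following_by_follower maps each level 1 follower's ID to the IDs of
    the accounts that follower follows (hypothetical input shape).
    """
    counts = Counter()
    for followed_ids in following_by_follower.values():
        # set() guards against the same ID appearing twice in one list
        counts.update(set(followed_ids))
    # Level 1 followers that nobody in the network follows get a count of 0
    for follower_id in following_by_follower:
        counts.setdefault(follower_id, 0)
    return counts

# Assumed toy edges: A, B, C and D all follow the spokesperson
following = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["B", "D"],
    "D": [],
}
counts = network_follower_counts(following)
for account in sorted(counts):
    print(account, counts[account])
```

Only the follow lists of level 1 followers are needed as input, which is why the count can be computed without knowing who the level 2 accounts follow.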
7. Detail: method to get network
- Use Twitter API to get all followers of @bethgranter
= level 1 network follower
- For each level 1 network follower, get everyone else they follow
= level 2 network follower
- For everyone in level 1 & level 2, count how many level 1 followers they
have (we don’t know who level 2 follows).
= network follower count
- Twitter API limits rate of calls to do this...
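The collection steps above might look roughly like this in Python. It is a sketch under stated assumptions: `get_followers` and `get_following` are hypothetical caller-supplied functions wrapping whatever API client is in use, `pause_seconds` is an illustrative way to stay under a rate limit, and treating level 2 as "accounts not already in level 1" is a choice made here for clarity.

```python
import time

def collect_two_tier_network(seed, get_followers, get_following, pause_seconds=0.0):
    """Gather level 1 followers of `seed` and the accounts each one follows.

    get_followers(account) and get_following(account) are hypothetical
    callables that return iterables of account IDs.
    """
    level_1 = list(get_followers(seed))
    following_by_follower = {}
    for follower_id in level_1:
        following_by_follower[follower_id] = list(get_following(follower_id))
        # Pause between calls so the API's rate limit is not exceeded
        time.sleep(pause_seconds)
    # Level 2: everyone else followed by level 1, excluding the seed account
    level_2 = set()
    for followed_ids in following_by_follower.values():
        level_2.update(followed_ids)
    level_2 -= set(level_1)
    level_2.discard(seed)
    return level_1, level_2, following_by_follower

# Made-up stand-in data instead of real API calls
fake_followers = {"seed": ["A", "B"]}
fake_following = {"A": ["seed", "B", "X"], "B": ["seed", "Y"]}
level_1, level_2, following = collect_two_tier_network(
    "seed", fake_followers.__getitem__, fake_following.__getitem__
)
```

Passing the fetch functions in as arguments keeps the loop independent of any particular API version and makes it easy to try out on stand-in data first.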
8. Outputs: accounts by network follower count
(network popularity)
9. Approach: network influence/relevance
Filtered list to top 500 ppl by total follower count, so
only looking at ppl w/ minimum of ~250 followers total.
Calculate potential ‘influence’ figure for members in the
network: proportion of each person’s total followers
who were also followers of @bethgranter, i.e. their
network follower count as a percentage of their total
follower count.
= likelihood that a follower of that person, chosen at
random, also follows @bethgranter, i.e. how relevant are
their followers? We can use this figure as a network
influence/relevance metric.
10. Approach: network influence/relevance
% network follows vs total follows
@guardianeco is followed by 428 of
@bethgranter’s followers and 98933
people in total, so network influence =
0.43% (low)
@Siemens_Energy follows @bethgranter,
is followed by 101 of @bethgranter’s
followers and 32008 in total, so network
influence = 0.32% (low)
@SDStephDraper is followed by 73 of
@bethgranter’s followers and 269 in total,
so network influence = 27.14% (high)
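The three figures above can be checked with a short calculation. The function name is an assumption, and returning 0 for an account with no followers is a choice made here to avoid dividing by zero.

```python
def network_influence(network_followers, total_followers):
    """Network follower count as a percentage of total follower count."""
    if total_followers == 0:
        return 0.0
    return 100.0 * network_followers / total_followers

# (network followers, total followers) as quoted in the examples above
examples = {
    "@guardianeco": (428, 98933),
    "@Siemens_Energy": (101, 32008),
    "@SDStephDraper": (73, 269),
}
for name, (in_network, total) in examples.items():
    print(name, round(network_influence(in_network, total), 2))
```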
13. Summary of project outputs
List of 96 Twitter accounts w/ full details, which we know are also
subscribed to client’s email newsletter
List of 500 Twitter accounts in a newly mapped network, people
within two steps of @bethgranter which can be sorted by:
- overall popularity (total followers)
- network popularity (network followers) or by
- network influence/relevance (% network follows vs total follows)
Demographic and bio data about @bethgranter’s followers
Sorting list by relevance or popularity can be used to achieve different objectives.
Sorting by relevance identifies ppl who could amplify messages in the current
network, sorting by popularity identifies ppl who can extend the reach of
messages, although popular accounts will be harder to engage with.
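As an illustration of the three sort orders, here is a small Python sketch using the three accounts quoted earlier in the deck. The dictionary keys and the `ranked` helper are assumptions made for this example.

```python
# Figures taken from the three worked examples on the influence slide
accounts = [
    {"name": "@SDStephDraper", "total_followers": 269, "network_followers": 73},
    {"name": "@guardianeco", "total_followers": 98933, "network_followers": 428},
    {"name": "@Siemens_Energy", "total_followers": 32008, "network_followers": 101},
]
for account in accounts:
    # Network influence: network followers as a percentage of total followers
    account["influence_pct"] = (
        100.0 * account["network_followers"] / account["total_followers"]
    )

def ranked(accounts, key):
    """Return account names sorted by the given field, highest first."""
    return [a["name"] for a in sorted(accounts, key=lambda a: a[key], reverse=True)]

by_overall_popularity = ranked(accounts, "total_followers")
by_network_popularity = ranked(accounts, "network_followers")
by_relevance = ranked(accounts, "influence_pct")
```

On these three accounts the two popularity orderings agree, while the relevance ordering puts the smallest account first, which is the contrast the summary describes.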
14. Conclusions
This project used innovative data analysis techniques to explore a
bespoke network, based on relationships between people rather
than focusing on self-defined spokespeople on a topic.
The outputs of this project will only be effective if they are used by
the client to achieve their goals through building relationships with
the influencers identified.
The client will then need a strategic approach to engaging with
influencers online.
15. Next steps for the project
Writing up the project as a case study & publishing some of its outputs online would
attract the interest of those influencers we identified, and could therefore
be used as a PR asset in itself.
The approach could be re-applied to different spokespeople within and
beyond the department, and to different email lists.
Further research using the lists created in this project, such as:
- investigating ‘hubs’ within the network (core groups)
- creating an interactive visual map of the network as an asset
- looking at overlaps between different lists, to identify gaps, e.g. looking
at people on the email list who have a Twitter account, flagging
whether or not they follow @bethgranter, and then tailoring outgoing
comms with a relevant call to action (follow @bethgranter etc.)
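The overlap check in the last point could be sketched as a simple set lookup. The function name and the account names below are made up for illustration.

```python
def flag_email_contacts(email_list_accounts, spokesperson_followers):
    """Map each email-list Twitter account to True if it follows the spokesperson."""
    followers = set(spokesperson_followers)
    return {account: account in followers for account in email_list_accounts}

# Made-up example data
email_list_accounts = ["@alice", "@bob", "@carol"]
spokesperson_followers = ["@bob", "@dave"]

flags = flag_email_contacts(email_list_accounts, spokesperson_followers)
# Accounts that receive the emails but do not yet follow the spokesperson
to_invite = [account for account, follows in flags.items() if not follows]
```

The `to_invite` list is the group that would receive the "follow" call to action.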
17. Getting the Twitter user IDs for the two tier
network
import CSV
18. Getting the Twitter user IDs for the two tier
network
import CSV
19. Google Refine - from list of network follower
Twitter user ids & network follower count
import CSV
20. Google Refine - from list of network follower
Twitter user ids & network follower count
Create column based on twitter_user_id column by fetching URLs...
21. Google Refine - from list of network follower
Twitter user ids & network follower count
Create column based on twitter_user_id column by fetching URLs...
Use the Twitter API guide to get the URL for the data required
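Outside Refine, the same idea of turning each user ID into a lookup URL can be sketched in Python. The base URL below is a deliberate placeholder and the `user_id` parameter name is an assumption; the real endpoint and parameters should be taken from the current Twitter API guide.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: replace with the endpoint from the API guide
BASE_URL = "https://api.example.com/users/show.json"

def user_lookup_url(twitter_user_id, base_url=BASE_URL):
    """Build the URL to fetch for one row's twitter_user_id value."""
    return base_url + "?" + urlencode({"user_id": twitter_user_id})

# One URL per row of the twitter_user_id column (made-up IDs)
urls = [user_lookup_url(user_id) for user_id in [12345, 67890]]
```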
22. Google Refine - from list of network follower
Twitter user ids & network follower count
Now you have the Twitter user data, you can separate it out...
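As a rough equivalent of the separating-out step, the fetched data for each user can be parsed and split into one value per column. This sketch assumes the fetched cell holds a JSON object; the field names are modelled on Twitter user objects and the sample record is made up.

```python
import json

def split_user_record(raw_json, fields=("screen_name", "followers_count", "description")):
    """Pull selected fields out of one fetched JSON record, one per column."""
    record = json.loads(raw_json)
    # .get() leaves a column empty (None) if the field is missing
    return {field: record.get(field) for field in fields}

# Made-up sample of what one fetched cell might contain
sample = '{"screen_name": "example_user", "followers_count": 269, "description": "Made-up bio"}'
row = split_user_record(sample)
```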
23. Google Refine - from list of network follower
Twitter user ids & network follower count
Then export to CSV / Google Docs / Excel to sort & calculate influence metrics etc.
24. Thanks to NixonMcInnes!
Brilliant Noise
November 2012
@bethgranter
bethgranter.com
@brilliantnoise
brilliantnoise.com