This document summarizes a study applying social network analysis techniques to discover communities of politicians debating specific policy areas in Dutch parliamentary proceedings. The study represents debates as weighted, directed graphs and uses k-clique community detection and language modeling to identify groups discussing single policy topics. The results show most communities could be traced to a single policy area, demonstrating the potential of these methods for automatic discovery of issue-based communities in political networks. The study also discusses avenues for future related research.
Applying social network analysis to Parliamentary Proceedings
1. Applying social network analysis
to Parliamentary Proceedings
Automatic discovery of meaningful cliques
Author:
Justin van Wees
Supervisors:
Dr. Maarten Marx
Dr. Johan van Doornik
June 23, 2011
3. Research question
Can we discover communities of politicians
that debate on a specific policy area?
Motivation
• It’s unknown which member is responsible for a certain
policy area
• Discover what issues are discussed within a policy area
• Serve as example application of social network analysis
techniques
7. <root>
<docinfo>...</docinfo>
<meta>...</meta>
<proceedings>
<topic>
<scene type="speaker" speaker="Hamer" party="PvdA" function="Mevrouw"
role="mp" title="Mevrouw Hamer (PvdA)" MPid="02221">
<speech party="PvdA" speaker="Hamer" function="Mevrouw"
role="mp" MPid="02221">
<p>Dat is helemaal niet waar. U bewijst nu voor de derde keer
dat u niet ...</p>
</speech>
<speech type="interruption" party="Verdonk" speaker="Verdonk"
function="Mevrouw" role="mp" MPid="02995">
<p>Mag ik even uitpraten? Dank u. Zo werkt dat, gewoon fatsoen.
Dank u wel. [...]</p>
</speech>
</scene>
</topic>
</proceedings>
</root>
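A minimal sketch of how interruption counts could be read from markup like the sample above. It assumes that each `speech` with `type="interruption"` is directed at the speaker of the enclosing `scene`, and that `MPid` identifies a speaker; the function name `interruption_edges` and the trimmed sample are illustrative, not the study's code.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the sample structure; speech text is elided.
SAMPLE = """
<root>
  <proceedings>
    <topic>
      <scene type="speaker" speaker="Hamer" party="PvdA" role="mp" MPid="02221">
        <speech party="PvdA" speaker="Hamer" role="mp" MPid="02221">
          <p>...</p>
        </speech>
        <speech type="interruption" party="Verdonk" speaker="Verdonk"
                role="mp" MPid="02995">
          <p>...</p>
        </speech>
      </scene>
    </topic>
  </proceedings>
</root>
"""

def interruption_edges(xml_text):
    """Count interruptions as (interrupter MPid, interrupted MPid) pairs.

    Assumption: an interruption targets the speaker of the enclosing scene.
    """
    root = ET.fromstring(xml_text.strip())
    edges = Counter()
    for scene in root.iter("scene"):
        target = scene.get("MPid")
        for speech in scene.iter("speech"):
            if speech.get("type") != "interruption":
                continue
            source = speech.get("MPid")
            # Skip speeches without an id and self-interruptions.
            if source and target and source != target:
                edges[(source, target)] += 1
    return edges

edges = interruption_edges(SAMPLE)
print(edges)
```

The resulting counter maps each directed pair to its interruption count, which can serve directly as the edge weights of a weighted, directed debate graph.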
11. A single debate represented as a graph
[Figure: graph of a single debate; speaker nodes connected by weighted edges]
15. Finding issues that a community is discussing
• Retrieve all ‘community text’
• Tokenize at word level
• Lemmatize
• Use parsimonious language models to find the most
‘descriptive’ terms
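The last step can be sketched as follows. This is a simplified illustration of the usual EM formulation of a parsimonious language model, not the study's implementation: the mixing weight `lam`, the pruning threshold `eps`, the iteration count, and the toy tokens are all assumptions, and the input is assumed to be already tokenized and lemmatized.

```python
from collections import Counter

def parsimonious_lm(doc_tokens, corpus_tokens, lam=0.1, iterations=50, eps=1e-4):
    """Estimate P(term | community) with EM, discounting terms that the
    background corpus already explains well.

    Assumption: corpus_tokens includes the community text itself.
    """
    tf = Counter(doc_tokens)
    cf = Counter(corpus_tokens)
    doc_total = sum(tf.values())
    corpus_total = sum(cf.values())
    # Start from the maximum-likelihood estimate.
    p_doc = {t: n / doc_total for t, n in tf.items()}
    for _ in range(iterations):
        # E-step: expected count of each term under the community model.
        expected = {}
        for t, p in p_doc.items():
            p_corpus = cf[t] / corpus_total
            expected[t] = tf[t] * (lam * p) / ((1 - lam) * p_corpus + lam * p)
        total = sum(expected.values())
        # M-step: renormalize and prune terms with negligible probability.
        p_doc = {t: e / total for t, e in expected.items() if e / total >= eps}
        norm = sum(p_doc.values())
        p_doc = {t: p / norm for t, p in p_doc.items()}
    return p_doc

# Toy data: a hypothetical community text inside a larger background corpus.
community = "de zorg verzekering zorg premie de zorg".split()
background = community + "de begroting de school de minister de de het van het de".split()

model = parsimonious_lm(community, background)
top_terms = sorted(model, key=model.get, reverse=True)[:3]
print(top_terms)
```

Terms that are frequent everywhere (here the article "de") lose probability mass with each iteration, so the highest-ranked terms are those that set the community text apart from the background.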
17. General network statistics of Kok II
           No distinction between   With distinction between
           MP/MG roles              MP/MG roles
Nodes      211                      218
Edges      3594                     3615
Density    0.081                    0.076
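The density values are consistent with the standard definition for a directed graph without self-loops, edges / (nodes × (nodes − 1)). A short check, assuming that definition is the one used here:

```python
def directed_density(nodes, edges):
    """Fraction of possible directed edges (no self-loops) that are present."""
    return edges / (nodes * (nodes - 1))

# Node and edge counts taken from the table above.
without_distinction = directed_density(211, 3594)
with_distinction = directed_density(218, 3615)
print(round(without_distinction, 3), round(with_distinction, 3))
```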
18. Finding k-clique communities
• By default, found groups are not ‘cohesive’
• Filter out ‘noise’ by setting a threshold on edge weights
• At 15 interruptions: 197 nodes, 741 edges, 31 k-clique
communities
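The thresholding and community detection steps can be sketched in plain Python. Several choices below are assumptions, since the slides do not state them: the threshold is applied per directed edge with "at least" semantics, direction is ignored afterwards, and k = 3. The brute-force clique search is only practical for small graphs, and the toy weights are invented for illustration.

```python
from itertools import combinations

def threshold_edges(weights, minimum):
    """Keep pairs whose weight is at least `minimum`; drop edge direction."""
    adjacency = {}
    for (a, b), w in weights.items():
        if w >= minimum and a != b:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency

def k_clique_communities(adjacency, k=3):
    """Clique percolation: merge k-cliques that share k - 1 nodes."""
    nodes = sorted(adjacency)
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adjacency[u] for u, v in combinations(c, 2))]
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())

# Toy interruption counts between hypothetical speakers A..F.
weights = {("A", "B"): 20, ("B", "C"): 18, ("C", "A"): 16, ("C", "D"): 25,
           ("D", "B"): 15, ("D", "E"): 40, ("E", "F"): 3, ("A", "E"): 2}

communities = k_clique_communities(threshold_edges(weights, 15), k=3)
print(communities)
```

In the toy data the two low-weight edges are filtered out, the triangles A-B-C and B-C-D share two nodes and merge into one community, and E stays outside because it is part of no triangle.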
21. Finding k-clique communities
• All k-clique communities could be traced back to a single
policy area
• Except for more ‘general’ policy areas
• 92% of the community members directly related to the policy
area covered by the community
• 85% of top 20 ‘issue terms’ relevant to policy area
• K-clique community detection and parsimonious language
models are successful methods for automatic discovery of
communities within debate networks
23. • Method for setting edge weight threshold
• Review of k-clique communities was done by a single person
• Used four years of data, shorter time-window possible?
• Focused on Cabinet Kok II, what about other (earlier)
cabinets?
• Completely different data?