Effective Crowdsourcing for Software Feature
Ideation in Online Co-Creation Forums
Karthikeyan Rajasekharan, Aditya P Mathur, See-Kiong Ng
Information Systems Technology and Design
Singapore University of Technology and Design
karthikeyan@sutd.edu.sg, aditya_mathur@sutd.edu.sg, ngseekiong@sutd.edu.sg
Abstract—Many software companies are creating firm-centric
online forums for customer engagement. These forums can be an
effective crowdsourcing platform for software product feature
ideation and co-creation with the end users. We studied the
community interaction data from the ideation forums of two
software providers. Link analysis revealed that a small core
community was responsible for generating a large proportion of
the implemented ideas. This indicated the need to identify key
users in the online forum. Our analysis showed the applicability
of centrality measures such as betweenness in ranking key users.
We also found that commenting was likely to produce better
community formation amongst the participants than voting.
Keywords-co-creation; key users; ideation; link analysis;
crowdsourcing; social network analysis; expertise ranking;
software feature requirements
I. INTRODUCTION
Company-centric online user forums are an attractive
platform for company and end-user interactions and offer the
potential to co-opt customer knowledge as part of the
innovation process. Several consumer goods companies, such as
Dell and Nike, manage online participation communities that
help to strengthen their product portfolios through
customer-suggested features. In particular, for the Software-as-a-Service
(SaaS) arena, the new product development process carries
higher risks of market adoption relative to the risks of technical
failure. Such company-owned online user forums can be used
to help mitigate the market adoption risk by transferring
knowledge from the user to the company, thereby enabling
better decision making pertaining to creating new customer-
centric product features.
User-led innovation has been suggested to be a key part of
the ideation process that can lead to breakthrough product
features. Poetz and Schreier [1] found "that on average user ideas score
higher in novelty and customer benefit, but lower in feasibility.
Even more interestingly, user ideas are placed more frequently
than expected among the very best in terms of novelty and
customer benefit." Prandelli et al. [12] and Dahlander et al. [2] argued
in favor of taking advantage of online communities for generating ideas
and suggested that the
system needs to be open and social for it to be successful. In
this paper, we focus on the addition of new features to an
existing software product through crowdsourcing in firm-
centric online forums.
For a company to effectively extract value and manage
knowledge creation in such an online community, there are two
key questions that merit consideration:
1. How can a firm identify the key users for ideation in
the online ideation forum?
2. Which of the activities in the online ideation forums
are more effective in fostering community formation?
II. DATA GATHERING
To perform the analysis, the online ideation forums of
Salesforce.com (SFDC) and SAP were used. Salesforce.com is
a leading SaaS provider and as part of its online community
SFDC involves end users in its ideation process in a forum
entitled Ideaexchange. Given the nature of its business
(providing software services over the internet), the company
has an active ecosystem of partners and customers who
interact with each other and with SFDC in this online forum.
SAP also has an active ecosystem and has been pursuing Open
Innovation and Crowd-Sourcing as a means of generating new
customer insight. SAP’s ideation forum was called IdeaPlace.
A. Ideation Forum Structure
The screenshot in Figure 1 shows the structure of an
ideation forum using SFDC as an example.
Figure 1. Salesforce Idea Exchange. (forum structure)
The key activities that users can perform in such a forum
are the suggestion of ideas, voting on ideas (up and down),
commenting on ideas and annotating the ideas with meta-data
tags. Each idea belongs to a single user and users cannot vote
on an idea more than once. They can however, comment on a
single idea many times. Each user is uniquely identified by a
user identifier. Ideas and comments are linked to the users who
created them.
B. Crawling the Forum
The forums of Salesforce and SAP were crawled using the
Selenium and Scrapy toolkits for publicly available ideation
information and the data that was obtained were encapsulated
into PostgreSQL databases for further analysis.
C. Dataset Description
The datasets that were obtained are described in detail in Table I.
TABLE I. IDEATION FORUM DATA THAT WAS GATHERED
Forum #Ideas #Participants #Comments #Votes
SFDC 19,593 73,942 62,389 516,514
SAP 7,506 2,226 7,276 40,765
In the subsequent sections, the discussion will focus on the
SFDC dataset; similar results were obtained on the SAP dataset
and are summarized in the section on related work.
III. ACTIVITY GRAPH GENERATION
We construct activity graphs (one for voting and one for
commenting) from the dataset as follows. Each node in the
activity graph represents a unique user account. The edges in
this activity graph correspond to a particular communication
activity between two users.
An assumption made in this analysis is that each user
account refers to a unique individual. Each node is annotated
with properties such as the number of ideas and votes that were
contributed by that user.
Each edge in our graph is a reflection of communication
between two users in relation to a particular idea. Edges are
derived through the procedure illustrated by the example below:
1. User A makes an idea contribution to the community.
User A is identified as the originator and the node's
idea count is increased by 1.
2. User B then comments on the idea proposed by User
A. Thus, this indicates a communication interaction
from User B to User A on his or her idea. An edge is
created from User B to User A to capture this
interaction. The number of such interactions between
the two users will determine the strength of that edge.
3. User C comments on the same idea. Now an edge is
drawn from User C to User A.
4. User D introduces a new Idea. User D's Idea count is
incremented but no edges are drawn.
5. User C and User A comment on User D's idea. Edges
are drawn from User A and User C to User D.
This is illustrated in Figure 2. This process is repeated
with the voting activity data to obtain the voting activity graph.
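The construction steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name `build_activity_graph`, the tuple layouts, and the decision to ignore a user's interactions with his or her own idea are all assumptions made for the example.

```python
from collections import defaultdict


def build_activity_graph(ideas, interactions):
    """Build node idea counts and weighted directed edges.

    ideas:        iterable of (idea_id, author) pairs
    interactions: iterable of (user, idea_id) pairs, one per comment or vote
    Returns (idea_count, edge_weight) where edge_weight[(u, v)] is the number
    of interactions from user u on ideas authored by user v.
    """
    author_of = {}
    idea_count = defaultdict(int)
    for idea_id, author in ideas:
        author_of[idea_id] = author
        idea_count[author] += 1  # originator's idea count grows by 1

    edge_weight = defaultdict(int)
    for user, idea_id in interactions:
        target = author_of[idea_id]
        if user == target:
            continue  # assumption: self-interactions create no edge
        edge_weight[(user, target)] += 1  # repeated interactions strengthen the edge
    return dict(idea_count), dict(edge_weight)


# The worked example from the text: A and D post ideas; B, C and A comment.
ideas = [("idea1", "A"), ("idea2", "D")]
comments = [("B", "idea1"), ("C", "idea1"), ("C", "idea2"), ("A", "idea2")]
idea_count, edge_weight = build_activity_graph(ideas, comments)
```

Running the same function on the vote records instead of the comment records would give the voting activity graph.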
Figure 2. Activity Graph Construction
Figure 3. Degree Distribution of Activity Graphs (panels: vote graph degree distribution plot; comment graph degree distribution plot)
The activity graphs were visualized using [5] and were
found to exhibit a core-periphery structure. There is a highly
connected (relative to rest of the graph) core community whose
members have diverse interests and connect with the less active
periphery community of users. The degree distribution
plots of these activity graphs in Figure 3 suggest that
the activity networks’ degree distributions likely follow
power law distributions of the form
$$p(x) \propto x^{-\alpha}$$
Typically, for real world power law distributions, the value
of α is between 2 and 3. The values that were obtained were
2.12 and 2.05 for the vote graph and comment graph
respectively. The power law fitting libraries used in [3] were
used to make these calculations. This suggests that, within
empirical limits, these are scale-free networks that show
behavior similar to that observed in other empirical networks
in [3].
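As an illustration of how such an exponent can be estimated, the sketch below applies the continuous maximum-likelihood estimator described in [3], $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{min})$. It is a simplification and not the fitting library used above: the lower cutoff `x_min` is taken as given rather than selected from the data, degree data are discrete so the continuous estimator is only approximate, and the function names are hypothetical.

```python
import math
import random


def estimate_alpha(values, x_min):
    """Continuous MLE of the power-law exponent for values >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, x_min, rng):
    """Draw n samples from a continuous power law by inverse transform."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic check: samples drawn with alpha = 2.5 should give an estimate near 2.5.
rng = random.Random(0)
samples = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = estimate_alpha(samples, x_min=1.0)
```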
IV. ISOLATING THE CORE COMMUNITY
Given the observation that the user community structure is
that of a core-periphery type, we develop a heuristic algorithm
based on average degree of a sub-graph to isolate the core
ideation community. Intuitively, the sub-graph that forms the
core of the activity graph will have an average degree that is
maximal.
A. Core Community Isolation Results
The above algorithm was applied and the results are shown
in Figure 4. The Y axis tracks the value of the Average Degree
of the sub-graph and the X axis shows the degree cutoff. Based
on the maximal average degree of the sub-graph, we find the
degree cutoff points for the core were 150 and 61 for the Vote
graph and the Comment graph respectively.
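One plausible reading of the degree-cutoff heuristic is sketched below. It is an interpretation under stated assumptions, not the authors' implementation: the graph is treated as undirected and unweighted, a candidate core for cutoff k is taken to be the sub-graph induced by all nodes whose degree in the full graph is at least k, and the smallest cutoff that maximizes the average degree of that sub-graph is returned. The function names are hypothetical.

```python
def average_degree(adj, nodes):
    """Average degree of the sub-graph induced by `nodes`."""
    if not nodes:
        return 0.0
    # Each internal edge is seen from both endpoints, so this sum equals 2 * |E_sub|.
    endpoint_count = sum(1 for v in nodes for w in adj[v] if w in nodes)
    return endpoint_count / len(nodes)


def isolate_core(adj):
    """adj maps each node to the set of its neighbours (symmetric).

    Returns (cutoff, average_degree, core_nodes) for the smallest degree
    cutoff whose induced sub-graph has maximal average degree.
    """
    if not adj:
        raise ValueError("empty graph")
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    best = (None, -1.0, set())
    for cutoff in range(1, max(degree.values()) + 1):
        nodes = {v for v, d in degree.items() if d >= cutoff}
        avg = average_degree(adj, nodes)
        if avg > best[1]:
            best = (cutoff, avg, nodes)
    return best


# Toy core-periphery graph: a 5-clique core, each core node with 2 leaf neighbours.
core = [f"c{i}" for i in range(5)]
adj = {c: set(core) - {c} for c in core}
for c in core:
    for k in range(2):
        leaf = f"{c}_leaf{k}"
        adj[c].add(leaf)
        adj[leaf] = {c}
cutoff, avg, core_nodes = isolate_core(adj)
```

On this toy graph the leaves drop out at a cutoff of 2, leaving the clique as the core.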
Having obtained the degree cutoffs, the core community
can be isolated. We used the actual ideation output to evaluate
the detected core community. Table II shows that while the
core community comprises relatively few users, they
contribute a significant portion of the ideas that are
implemented. This result, when combined with the fact that
SFDC implemented only 4.3% of the total ideas put forth by
the users, suggests that it is important to identify the key users
in the community for effective ideation co-creation.
Figure 4. Core Community Isolation (panels: SFDC Vote Graph; SFDC Comment Graph)
TABLE II. SFDC IDEA EXCHANGE CORE COMMUNITY PERFORMANCE
Graph | % of total users in Core | Idea contribution fraction of Core | Implemented Idea fraction of Core
Vote Graph 0.35% =38%
Comment Graph 0.68%
V. KEY USER RANKING
We conduct link analysis to rank community users for their
relative importance. This can be done by calculating the
prestige of a node and also by looking at measures of centrality
of a node. Structural prestige in network analysis has been the
basis for analyzing many networks. [11] details the PageRank
algorithm that was used to rank web pages according to
structural prestige. [16] proposed the idea of betweenness
centrality as a measure of a node’s importance in the overall
graph.
Based on the original PageRank algorithm [11], we define
the community activity rank as follows:
$$C(i) = (1 - d) + d \sum_{(j,i) \in E} \frac{w_{ji}}{\sum_{k} w_{jk}} C(j)$$
where C(i) is the community activity rank of node i, E is
the set of all edges in the graph, d is the damping factor (set to
0.85), $w_{ji}$ is the weight of the outbound link from node j to i, and
$\sum_{k} w_{jk}$ is the sum of the weights of all outbound edges from
node j. Thus, the final activity rank of a user is dependent on
the activity ranks of the users who collaborate with the user in
question. The key difference is that the original page rank
algorithm didn’t cater for edge weights and in our formulation
we use a directed graph with weighted edges.
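A rank of this kind can be computed by fixed-point iteration. The sketch below assumes the recurrence C(i) = (1 - d) + d * sum over inbound edges (j, i) of (w_ji / total outbound weight of j) * C(j), without dividing the (1 - d) term by the number of nodes. That normalization choice is an assumption, so the values it produces are on a different scale from those reported by tools that normalize ranks to sum to 1. The function name and the convergence parameters are hypothetical.

```python
def community_activity_rank(edge_weight, d=0.85, tol=1e-12, max_iter=1000):
    """Weighted PageRank-style rank on a directed graph.

    edge_weight maps (j, i) to the weight of the edge from user j to user i.
    """
    nodes = set()
    out_total = {}
    for (j, i), w in edge_weight.items():
        nodes.update((j, i))
        out_total[j] = out_total.get(j, 0.0) + w  # sum of j's outbound weights

    rank = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        new_rank = {v: 1.0 - d for v in nodes}
        for (j, i), w in edge_weight.items():
            # j passes on its rank in proportion to the weight of each out-link
            new_rank[i] += d * (w / out_total[j]) * rank[j]
        change = max(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# b commented three times on a's idea and once on c's idea.
ranks = community_activity_rank({("b", "a"): 3, ("b", "c"): 1})
```

In this small example user a ends up ranked above user c because a receives three quarters of b's outbound weight.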
Betweenness centrality is defined as follows:
$$B(i) = \sum_{j \neq i \neq k} \frac{\sigma_{jk}(i)}{\sigma_{jk}}$$
where $\sigma_{jk}$ is the number of shortest paths between j and k
and $\sigma_{jk}(i)$ is the number of those shortest paths that have node i as
part of the path. Thus, Betweenness is a measure of the
number of times a node is part of the shortest path between
any two other nodes in the graph. The intuition that guides this
centrality measure is the idea that a node in the shortest path
between two other nodes can influence the flow of information
between those two nodes.
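For a small graph, betweenness can be computed directly with the accumulation technique of [6]. The sketch below is illustrative only and rests on assumptions: the graph is treated as undirected and unweighted, scores are left unnormalized (so they count node pairs rather than matching the normalized values in the result tables), and the function name is hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps each node to an iterable of its neighbours (symmetric).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in score.items()}


# A path a - b - c: only b lies between another pair of nodes.
path_scores = betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```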
A. Ranking Results
The two approaches to ranking users were applied using [5]
and the users were ranked. An abbreviated subset of the results
(due to space constraints) - the top 10 users - for the comment
graph are shown in Tables III and IV.
TABLE III. SFDC IDEAS COMMENT COMMUNITY RANK TOP 10
# User Name Community Activity Rank Community Recognition
1 Alexander Sutherland 0.019588813 MVP Winter 11
2 Christoph K 0.008686445 None
3 werewolf 0.007827351 MVP Winter 11
4 Andres G 0.007087102 MVP Winter 11
5 jcohen 0.006898484 None
6 TomaszO 0.006523066 None
7 ToddJanzen 0.005924399 SFDC
8 eyewellse 0.005483297 None
9 ErikM 0.005006845 None
10 chris925 0.004876349 None
TABLE IV. SFDC IDEAS COMMENT BETWEENNESS CENTRALITY TOP 10
# User Name Betweenness Centrality Community Recognition
1 Alexander Sutherland 0.019771398 MVP Winter 11
2 Rhonda Ross 0.015190872 MVP Winter 11,12,13
3 Scott J 0.013384961 SFDC
4 Andres G 0.00886717 MVP Winter 11
5 Matthew Lamb 0.008623408 MVP Spring 11
6 AMartin 0.007343319 MVP Spring 11
7 Mattias Nordin 0.005807566 MVP Winter 11,12
8 mattybme1 0.005726696 MVP Winter 11,12,13
9 Christoph K. 0.00516802 None
10 Jakester 0.004668099 None
B. Evaluation of the Ranking
To evaluate the ranking of nodes, a measurement of the
firm’s evaluation of the importance of a user is useful.
Salesforce runs a community recognition program called the
MVP program where it periodically chooses members from
the community for their outstanding achievements and
recognizes them with virtual badges as MVPs. The
Salesforce.com website describes the program as “This
program recognizes exceptional individuals within the
Salesforce community for their leadership, knowledge, and
ongoing contributions. These individuals represent the spirit
of the community and what it is all about!”
In the result tables, the Community Recognition column
shows whether the individual has been the recipient of any such
award. Where the contributor is part of Salesforce, the
employee is not eligible for recognition. Such members are
marked as SFDC in the tables. To evaluate the ranking approaches, the
MVP recognition of a user can be used as a qualitative
measure, i.e., to what extent can network prestige or centrality
be linked to the firm’s recognition of individual users.
If the firm's recognition of a community member's
contribution is the key criterion, then the betweenness measure
does much better than the Community Activity Rank measure.
Most of the people in the top 10 as ranked by the betweenness
measure are already members that the firm (SFDC) has also
recognized publicly. This implies that betweenness could be a
measure that can potentially be used to identify users who
have not yet been recognized. This measure could also be used
in a dynamic fashion (as the community grows) to identify
newer key users. It is interesting to note that the community
rank based approach didn’t perform as well as the
betweenness centrality measure. While the transfer of prestige
from one user to another through out-links has an intuitive
appeal, in this instance, it didn’t perform as well empirically.
[15] performed a similar analysis on a Java question and
answer forum and reported similar findings: in online
expertise networks, PageRank derivatives did not outperform
simpler measures.
The results also pose interesting qualitative questions for
analysis. For instance, the user Jakester (number 10 as per
betweenness ranking) has suggested 26 ideas, of which 10
have been implemented by SFDC. He has also contributed 534
comments and 771 votes on ideas. It would be of interest to
understand the reasons in the decision making process of the
firm that led to him not being recognized. In a similar fashion,
it would be interesting to understand the motivational impact
of having been granted an MVP badge. While the analysis
covered in this paper didn’t evaluate this, it presents an
interesting avenue for further research.
Thus, betweenness centrality is a potential tool to answer
the first question posed at the start of this paper. In an actual
implementation scenario, this metric could be calculated in an
offline batch mode for analysis. [6] has proposed a fast way of
calculating betweenness centrality that could be used to
perform this calculation.
VI. COMPARING VOTING AND COMMENTING
The next key question is which of the two online
forum activities (voting and commenting) encourages a tighter,
more close-knit community to form. This question is tied to
what motivates users to engage and participate in innovation
forums with the firm. If the activity fosters intrinsic
motivational factors, then it is likely to be self-sustaining. [13]
notes that in innovation communities a key motivating factor for
users is learning. In [9], Lakhani and von Hippel studied
the Apache Open Source community and reported that in their
study "98% of the effort expended by information providers in
fact returns direct learning benefits to those providers".
To evaluate the voting activity against the commenting
activity, a measure of community quality is required. [10] uses
the notion of conductance as a measure of community quality.
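According to [10], if A is the adjacency matrix of the graph G
= (V, E), then the conductance of a set of nodes S ⊂ V is
$$\phi(S) = \frac{\sum_{i \in S,\, j \notin S} A_{ij}}{\min\{A(S), A(\bar{S})\}}$$
where $A(S) = \sum_{i \in S} \sum_{j \in V} A_{ij}$ and $\bar{S} = V \setminus S$.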
Conductance is a measure of the intra-community
connections versus the inter-community connections. The
lower the value of conductance, the better the quality of the
community i.e. the community is densely connected internally
and sparsely connected to the rest of the graph.
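As a small worked illustration of this measure, the sketch below computes the conductance of a node set as the weight of the edges leaving the set divided by the smaller of the two total degree sums (the set and its complement), following the definition in [10]. The adjacency representation and the function names are assumptions made for the example.

```python
def make_adjacency(edges):
    """Build a symmetric weighted adjacency map from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def conductance(adj, community):
    """Conductance of `community` in an undirected weighted graph.

    adj[i][j] holds the adjacency weight A_ij.
    """
    inside = set(community)
    # Weight of edges crossing from the community to the rest of the graph.
    cut = sum(w for i in inside for j, w in adj[i].items() if j not in inside)
    # Total degree (volume) of the community and of its complement.
    vol_in = sum(w for i in inside for w in adj[i].values())
    vol_out = sum(w for i in adj if i not in inside for w in adj[i].values())
    smaller = min(vol_in, vol_out)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller


# Two triangles joined by a single bridge edge (3, 4).
graph = make_adjacency([
    (1, 2, 1), (2, 3, 1), (1, 3, 1),
    (4, 5, 1), (5, 6, 1), (4, 6, 1),
    (3, 4, 1),
])
phi_triangle = conductance(graph, {1, 2, 3})
phi_single = conductance(graph, {1})
```

The triangle has one outgoing edge against a total degree of 7, giving a conductance of 1/7, whereas a single node has every edge leaving it and scores 1.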
[10] also introduces the notion of a network community profile
(NCP) plot, which characterizes the best
possible community over a range of size scales. In this plot, the
number of nodes in a community (community size) is plotted
on the x axis, and the y axis tracks the conductance of the best
possible community of the given size. Both axes
are on a log scale. In real world networks, the value of
conductance decreases initially and then starts to increase. In
our analysis, the global minimum of the NCP plot is used as a
measure of the community formation tendencies of an activity
graph. A comparison of the community size at which the global
minimum occurred was used to draw conclusions on the
community formation characteristics of voting and
commenting activities.
Using this approach, the activity graphs constructed from
comment and voting data were treated as undirected graphs
and used to create separate network community profile plots.
The plots for the vote activity graph and the comment activity
graph are shown in Figures 5 and 6 respectively. The SNAP
(Stanford Network Analysis Project) toolkit [17] was used to
create these plots.
Both profile plots show the expected behavior of
initially decreasing conductance followed by increasing
conductance. That is, the quality of communities
increases with node count for a while and then starts to
degrade. The vote activity graph reaches its global conductance
minimum at a community size of 10 nodes, while
the comment activity graph reaches its
global minimum at around 33 nodes. In other words,
in the comment activity graph, the highest quality community
involves about 33 users, whilst in the vote activity
graph, the best community comprises only 10 users.
Figure 5. SFDC Vote Graph NCP Profile Plot
Figure 6. SFDC Comment Graph NCP Profile Plot
This comparison suggests that commenting activity has a
higher community creation effect than voting activity. This is
to be expected as, psychologically, there are higher intrinsic
motivations and rewards (through the knowledge gained) for
engaging in discourse as opposed to merely voting on an idea.
While this analysis has been based on a single ideation
community, it shows the distinction between voting and
commenting activity in objective and measurable terms.
Further work is required to analyze other ideation networks to
understand if similar characteristics are observed there. This
result is also in line with [9] and [13], which suggested that a
key motivating factor is learning through participation. Such
understanding will be important for designing suitable activity
features for online user forums to be effective ideation
co-creation platforms.
VII. RELATED WORK
A similar analysis was performed on the SAP dataset and the
following results were obtained. The activity graph also
displayed a power law distribution of node degree with an α of
2.57 and exhibited a similar core-periphery structure. The size of
the core community obtained by the heuristic algorithm was
5.4% of the overall community but accounted for 20% of the
suggested ideas and 46% of the implemented ideas (only 4% of
all suggested ideas were implemented). Qualitative analysis of
the ranking also demonstrated that betweenness performed
better than the PageRank derived community activity rank.
Many analyses of online networks have used the notions of
node prestige to rank and evaluate participants. [7] used a
PageRank based approach to identify key users in online
communities. [14], [15] applied activity based ranking
techniques to the study of expertise in online question and
answer forums. In these forums, one user poses a question and
other users contribute answers to the posed question. [15]
obtained similar results where the PageRank derivatives of
node importance did not outperform simpler measures. In their
analysis, they found that “z_score” and “z_num” (simple
metrics derived from a node’s in- and out-degree) performed
best in their dataset. [8] used the notion of out-links as a means
of identifying rising stars in bibliography networks. The
intuition here is that the nodes in this network (namely
researchers) have prestige which they confer on others through
their co-authorship and collaboration. [4] analyzed the online
ideation community of Dell and concluded that past success
likely has detrimental effects on the productivity of new ideas.
While much work has been done on online communities, the
study of ideation in online communities is still evolving and
presents an opportunity for continued research.
VIII. CONCLUSIONS
In this paper, we have performed link analyses on the
online ideation communities of two software providers for
crowdsourcing new product features. We found that a large
proportion of the implemented ideas originated from a small core
community in the forums. To identify the key users for product
feature ideation, we found that betweenness centrality is a
better measure for user ranking than the PageRank-derived
community activity rank. We also found
that the community cohesion tendencies of commenting
activity were higher than those of voting activity. These findings
will be useful for designing such company-centric user forums
for effective co-creation of new product features.
A. Limitations
The analysis in this paper adopted a static approach to the
network activity. In reality, collaborations in online
communities weaken or strengthen over time. If two users
communicated on a certain task once, it doesn't necessarily
imply that the link remains active for their entire lifetime in the
community. This could potentially be handled by varying the
edge weight as a function of time. This is a potential area for
further research.
The analysis of community formation required splitting the
community into sub-communities. Other approaches such as
those demonstrated by [18] could be used to measure
community quality of overlapping communities. These will be
evaluated in future work on the data set.
REFERENCES
[1] Marion K. Poetz and Martin Schreier. The value of crowdsourcing: can users
really compete with professionals in generating new product ideas?
Journal of Product Innovation Management, 29(2):245-256, 2012.
[2] Dahlander, Linus, Lars Frederiksen, and Francesco Rullani. "Online
communities and open innovation." Industry and innovation 15.2 (2008):
115-123.
[3] Clauset, Aaron, Cosma Rohilla Shalizi, and Mark EJ Newman. "Power-
law distributions in empirical data." SIAM review 51.4 (2009): 661-703.
[4] B. Bayus. Crowdsourcing and individual creativity over time: the
detrimental effects of past success. Available at SSRN 1667101, 2010.
[5] Mathieu Bastian, Sebastien Heymann, and M Jacomy. Gephi: An open
source software for exploring and manipulating networks. In International
AAAI Conference on Weblogs and Social Media. Association for
the Advancement of Artificial Intelligence, 361-362, 2009.
[6] Ulrik Brandes. A faster algorithm for betweenness centrality. Journal of
Mathematical Sociology, 25(2):163-177, 2001.
[7] Julia Heidemann, Mathias Klier, and Florian Probst. Identifying key
users in online social networks: A PageRank based approach.
Information Systems Journal, 4801(December):12-15, 2010.
[8] XL Li, C Foo, K Tew, and SK Ng. Searching for rising stars in
bibliography networks. In Database Systems for Advanced
Applications, pages 288-292, 2009.
[9] KR Lakhani and Eric von Hippel. How open source software works: free
user-to-user assistance. Research Policy, 32(6):923-943, 2003.
[10] Leskovec, Jure, et al. "Community structure in large networks: Natural
cluster sizes and the absence of large well-defined clusters." Internet
Mathematics 6.1 (2009): 29-123.
[11] L Page, S Brin, R Motwani, and T Winograd. The PageRank citation
ranking: bringing order to the web. pages 1-17, 1999.
[12] E. Prandelli, M. Sawhney, and G. Verona. Collaborating with customers
to innovate: conceiving and marketing products in the networking age.
Edward Elgar Publishing, 2008.
[13] Anna Stahlbrost and Birgitta Bergvall-Kareborn. Exploring users
motivation in innovation communities. International Journal of
Entrepreneurship and Innovation Management, 14(4):298-314, 2011.
[14] KK Nam, MS Ackerman, and LA Adamic. Questions in, knowledge in?:
a study of naver's question answering community. Human Factors, pages
779-788, 2009.
[15] Jun Zhang, MS Ackerman, and L Adamic. Expertise networks in online
communities: structure and algorithms. Proceedings of the 16th
international conference on World Wide Web, pages 221-230, 2007
[16] Freeman, Linton. "A set of measures of centrality based on
betweenness." Sociometry 40.1 (1977): 35-41.
[17] Stanford Network Analysis Project, http://snap.stanford.edu/index.html
[18] Palla, Gergely, Imre Derényi, Illés Farkas, and Tamás Vicsek.
"Uncovering the overlapping community structure of complex networks
in nature and society." Nature 435, no. 7043 (2005): 814-818.

Chaos magick condensedIspas Elena
 
fractions: 1 whole, 1/2, 1/3 and 1/4
fractions: 1 whole, 1/2, 1/3 and 1/4fractions: 1 whole, 1/2, 1/3 and 1/4
fractions: 1 whole, 1/2, 1/3 and 1/4Marykris Rivera
 
Heller's book of magic
Heller's book of magicHeller's book of magic
Heller's book of magicIspas Elena
 
Astral projection class on compu serve
Astral projection class on compu serveAstral projection class on compu serve
Astral projection class on compu serveIspas Elena
 

Destaque (13)

Security3 aut
Security3 autSecurity3 aut
Security3 aut
 
Presentation1
Presentation1Presentation1
Presentation1
 
576 charles dickens
576 charles dickens576 charles dickens
576 charles dickens
 
사진 앨범
사진 앨범사진 앨범
사진 앨범
 
Presentation1
Presentation1Presentation1
Presentation1
 
Texturas
TexturasTexturas
Texturas
 
Unit 3 anatomy and physiology (muscles)
Unit 3  anatomy and physiology (muscles)Unit 3  anatomy and physiology (muscles)
Unit 3 anatomy and physiology (muscles)
 
Powerpoint on Troy Davis
Powerpoint on Troy DavisPowerpoint on Troy Davis
Powerpoint on Troy Davis
 
Chaos magick condensed
Chaos magick condensedChaos magick condensed
Chaos magick condensed
 
Presentation1
Presentation1Presentation1
Presentation1
 
fractions: 1 whole, 1/2, 1/3 and 1/4
fractions: 1 whole, 1/2, 1/3 and 1/4fractions: 1 whole, 1/2, 1/3 and 1/4
fractions: 1 whole, 1/2, 1/3 and 1/4
 
Heller's book of magic
Heller's book of magicHeller's book of magic
Heller's book of magic
 
Astral projection class on compu serve
Astral projection class on compu serveAstral projection class on compu serve
Astral projection class on compu serve
 

Semelhante a Effective Crowdsourcing for Software Feature Ideation

Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...reshma reshu
 
An improvised model for identifying influential nodes in multi parameter soci...
An improvised model for identifying influential nodes in multi parameter soci...An improvised model for identifying influential nodes in multi parameter soci...
An improvised model for identifying influential nodes in multi parameter soci...csandit
 
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNINGSENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNINGIRJET Journal
 
Social media community using optimized algorithm by M. Gomathi / Lecturer
Social media community using optimized algorithm by M. Gomathi / LecturerSocial media community using optimized algorithm by M. Gomathi / Lecturer
Social media community using optimized algorithm by M. Gomathi / Lecturergomathi chlm
 
Navigation Cost Modeling Based On Ontology
Navigation Cost Modeling Based On OntologyNavigation Cost Modeling Based On Ontology
Navigation Cost Modeling Based On OntologyIOSR Journals
 
Experiences in the Design and Implementation of a Social Cloud for Volunteer ...
Experiences in the Design and Implementation of a Social Cloud for Volunteer ...Experiences in the Design and Implementation of a Social Cloud for Volunteer ...
Experiences in the Design and Implementation of a Social Cloud for Volunteer ...ryanchard
 
A Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine LearningA Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine LearningIRJET Journal
 
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...inventionjournals
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics DomainDrjabez
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...csandit
 
Local Service Search Engine Management System LSSEMS
Local Service Search Engine Management System LSSEMSLocal Service Search Engine Management System LSSEMS
Local Service Search Engine Management System LSSEMSYogeshIJTSRD
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
Visual Relation Identification Using BoFT Labels in Social Media Feeds
Visual Relation Identification Using BoFT Labels in Social Media FeedsVisual Relation Identification Using BoFT Labels in Social Media Feeds
Visual Relation Identification Using BoFT Labels in Social Media FeedsIRJET Journal
 
Modeling Object Oriented Applications by Using Dynamic Information for the I...
Modeling Object Oriented Applications by Using Dynamic  Information for the I...Modeling Object Oriented Applications by Using Dynamic  Information for the I...
Modeling Object Oriented Applications by Using Dynamic Information for the I...IOSR Journals
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Evolution of social developer network in oss survey
Evolution of social developer network in oss surveyEvolution of social developer network in oss survey
Evolution of social developer network in oss surveyeSAT Publishing House
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)paperpublications3
 

Semelhante a Effective Crowdsourcing for Software Feature Ideation (20)

Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
 
Q046049397
Q046049397Q046049397
Q046049397
 
An improvised model for identifying influential nodes in multi parameter soci...
An improvised model for identifying influential nodes in multi parameter soci...An improvised model for identifying influential nodes in multi parameter soci...
An improvised model for identifying influential nodes in multi parameter soci...
 
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNINGSENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
 
B017650510
B017650510B017650510
B017650510
 
Social media community using optimized algorithm by M. Gomathi / Lecturer
Social media community using optimized algorithm by M. Gomathi / LecturerSocial media community using optimized algorithm by M. Gomathi / Lecturer
Social media community using optimized algorithm by M. Gomathi / Lecturer
 
F0433439
F0433439F0433439
F0433439
 
Navigation Cost Modeling Based On Ontology
Navigation Cost Modeling Based On OntologyNavigation Cost Modeling Based On Ontology
Navigation Cost Modeling Based On Ontology
 
Experiences in the Design and Implementation of a Social Cloud for Volunteer ...
Experiences in the Design and Implementation of a Social Cloud for Volunteer ...Experiences in the Design and Implementation of a Social Cloud for Volunteer ...
Experiences in the Design and Implementation of a Social Cloud for Volunteer ...
 
A Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine LearningA Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine Learning
 
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics Domain
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
 
Local Service Search Engine Management System LSSEMS
Local Service Search Engine Management System LSSEMSLocal Service Search Engine Management System LSSEMS
Local Service Search Engine Management System LSSEMS
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
Visual Relation Identification Using BoFT Labels in Social Media Feeds
Visual Relation Identification Using BoFT Labels in Social Media FeedsVisual Relation Identification Using BoFT Labels in Social Media Feeds
Visual Relation Identification Using BoFT Labels in Social Media Feeds
 
Modeling Object Oriented Applications by Using Dynamic Information for the I...
Modeling Object Oriented Applications by Using Dynamic  Information for the I...Modeling Object Oriented Applications by Using Dynamic  Information for the I...
Modeling Object Oriented Applications by Using Dynamic Information for the I...
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Evolution of social developer network in oss survey
Evolution of social developer network in oss surveyEvolution of social developer network in oss survey
Evolution of social developer network in oss survey
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
 

Effective Crowdsourcing for Software Feature Ideation

Effective Crowdsourcing for Software Feature Ideation in Online Co-Creation Forums
Karthikeyan Rajasekharan, Aditya P Mathur, See-Kiong Ng
Information Systems Technology and Design
Singapore University of Technology and Design
karthikeyan@sutd.edu.sg, aditya_mathur@sutd.edu.sg, ngseekiong@sutd.edu.sg

Abstract: Many software companies are creating firm-centric online forums for customer engagement. These forums can be an effective crowdsourcing platform for software product feature ideation and co-creation with the end users. We studied the community interaction data from the ideation forums of two software providers. Link analysis revealed that a small core community was responsible for generating a large proportion of the implemented ideas. This indicated the need to identify key users in the online forum. Our analysis showed the applicability of centrality measures such as betweenness in ranking key users. We also found that commenting was likely to produce better community formation amongst the participants than voting.

Keywords: co-creation; key users; ideation; link analysis; crowdsourcing; social network analysis; expertise ranking; software feature requirements

I. INTRODUCTION

Company-centric online user forums are an attractive platform for company and end-user interactions and offer the potential to co-opt customer knowledge as part of the innovation process. Several consumer goods companies, such as Dell and Nike, manage online participation communities that help to strengthen their product portfolio through customer-suggested features. In the Software-as-a-Service (SaaS) arena in particular, the new product development process carries higher risks of market adoption relative to the risks of technical failure. Company-owned online user forums can help mitigate the market adoption risk by transferring knowledge from the user to the company, thereby enabling better decisions about creating new customer-centric product features.

User-led innovation has been suggested to be a key part of the ideation process that can lead to breakthrough product features. [1] found that "on average user ideas score higher in novelty and customer benefit, but lower in feasibility. Even more interestingly, user ideas are placed more frequently than expected among the very best in terms of novelty and customer benefit." [12] and [2] argued in favor of taking advantage of online communities for generating ideas and suggested that the system needs to be open and social for it to be successful.

In this paper, we focus on the addition of new features to an existing software product through crowdsourcing in firm-centric online forums. For a company to effectively extract value and manage knowledge creation in such an online community, two key questions merit consideration:

1. How can a firm identify the key users for ideation in the online ideation forum?
2. Which of the activities in the online ideation forums are more effective in fostering community formation?

II. DATA GATHERING

To perform the analysis, the online ideation forums of Salesforce.com (SFDC) and SAP were used. Salesforce.com is a leading SaaS provider, and as part of its online community SFDC involves end users in its ideation process in a forum entitled Ideaexchange. Given the nature of its business (providing software services over the internet), the company has an active ecosystem of partners and customers who interact with each other and with SFDC in this online forum. SAP also has an active ecosystem and has been pursuing Open Innovation and Crowd-Sourcing as a means of generating new customer insight. SAP's ideation forum was called IdeaPlace.

A. Ideation Forum Structure

The screenshot in Figure 1 shows the structure of an ideation forum using SFDC as an example.

Figure 1. Salesforce Idea Exchange (forum structure)
The key activities that users can perform in such a forum are suggesting ideas, voting on ideas (up or down), commenting on ideas, and annotating ideas with metadata tags. Each idea belongs to a single user, and users cannot vote on an idea more than once. They can, however, comment on a single idea many times. Each user is uniquely identified by a user identifier. Ideas and comments are linked to the users who created them.

B. Crawling the Forum

The forums of Salesforce and SAP were crawled for publicly available ideation information using the Selenium and Scrapy toolkits, and the data obtained were stored in PostgreSQL databases for further analysis.

C. Dataset Description

The datasets that were obtained are described in Table I.

TABLE I. IDEATION FORUM DATA THAT WAS GATHERED

Forum   #Ideas   #Participants   #Comments   #Votes
SFDC    19,593   73,942          62,389      516,514
SAP     7,506    2,226           7,276       40,765

In the subsequent sections, the discussion focuses on the SFDC dataset; similar results were obtained on the SAP dataset and are summarized in the section on related work.

III. ACTIVITY GRAPH GENERATION

We construct activity graphs (one for voting and one for commenting) from the dataset as follows. Each node in the activity graph represents a unique user account. The edges correspond to a particular communication activity between two users. An assumption made in this analysis is that each user account refers to a unique individual. Each node is annotated with properties such as the number of ideas and votes contributed by that user. Each edge reflects communication between two users in relation to a particular idea. Edges are derived through the procedure illustrated by the example below.

1. User A contributes an idea to the community. User A is identified as the originator and the node's idea count is increased by 1.
2. User B then comments on the idea proposed by User A. This indicates a communication interaction from User B to User A on his or her idea. An edge is created from User B to User A to capture this interaction. The number of such interactions between the two users determines the strength of that edge.
3. User C comments on the same idea. An edge is now drawn from User C to User A.
4. User D introduces a new idea. User D's idea count is incremented but no edges are drawn.
5. User C and User A comment on User D's idea. Edges are drawn from User A and User C to User D.

This is illustrated in Figure 2. The process is repeated with the voting activity data to obtain the voting activity graph.

Figure 2. Activity Graph Construction

Figure 3. Degree Distribution of Activity Graphs (panels: vote graph degree distribution plot; comment graph degree distribution plot)
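The edge-derivation procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `build_activity_graph`, the tuple layouts of the inputs, and the choice to skip a user's interactions with their own idea are all assumptions made for the example.

```python
from collections import defaultdict

def build_activity_graph(ideas, interactions):
    """Build a directed, weighted activity graph.

    ideas:        iterable of (idea_id, author) pairs (hypothetical layout)
    interactions: iterable of (user, idea_id) pairs, one per comment or vote
    Returns (idea_count, edge_weight), where edge_weight maps
    (interacting_user, idea_author) -> number of interactions.
    """
    idea_author = {}
    idea_count = defaultdict(int)
    for idea_id, author in ideas:
        idea_author[idea_id] = author
        idea_count[author] += 1          # originator's idea count goes up by 1

    edge_weight = defaultdict(int)
    for user, idea_id in interactions:
        author = idea_author[idea_id]
        if user != author:               # assumption: self-interactions add no edge
            edge_weight[(user, author)] += 1   # repeated interactions strengthen the edge
    return dict(idea_count), dict(edge_weight)

# The five-step example from the text: A and D post ideas,
# B and C comment on A's idea, C and A comment on D's idea.
ideas = [("idea1", "A"), ("idea2", "D")]
comments = [("B", "idea1"), ("C", "idea1"), ("C", "idea2"), ("A", "idea2")]
idea_count, edges = build_activity_graph(ideas, comments)
```

Running the same function over vote records instead of comment records would give the voting activity graph.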
The activity graphs were visualized using [5] and were found to exhibit a core-periphery structure. There is a highly connected (relative to the rest of the graph) core community whose members have diverse interests and connect with the less active periphery community of users. The degree distribution plots of these activity graphs, shown in Figure 3, suggest that the degree distributions likely follow a power law of the form

$$p(k) \propto k^{-\alpha}$$

where $k$ is the node degree. Typically, for real-world power law distributions, the value of α is between 2 and 3. The values obtained were 2.12 and 2.05 for the vote graph and comment graph respectively. The power law fitting libraries used in [3] were used to make these calculations. This suggests that these are scale-free networks within empirical limits and that they behave similarly to the other empirical networks observed in [3].

IV. ISOLATING THE CORE COMMUNITY

Given the observation that the user community has a core-periphery structure, we develop a heuristic algorithm based on the average degree of a sub-graph to isolate the core ideation community. Intuitively, the sub-graph that forms the core of the activity graph will have an average degree that is maximal.

A. Core Community Isolation Results

The heuristic was applied and the results are shown in Figure 4. The Y axis tracks the average degree of the sub-graph and the X axis shows the degree cutoff. Based on the maximal average degree of the sub-graph, the degree cutoff points for the core were 150 and 61 for the vote graph and the comment graph respectively.

Having obtained the degree cutoffs, the core community can be isolated. We used the actual ideation output to evaluate the detected core community. Table II shows that while the core community comprises relatively few users, they contribute a significant portion of the ideas that are implemented.
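One way to read the core-isolation heuristic is sketched below in Python. The text does not spell out the exact procedure, so the details here are assumptions: the graph is treated as undirected and unweighted, each candidate cutoff keeps the nodes whose degree in the full graph is at least the cutoff, and the cutoff whose induced sub-graph has the highest average degree is selected. The name `core_by_degree_cutoff` is hypothetical.

```python
def core_by_degree_cutoff(edges):
    """Return (best_cutoff, core_nodes, best_avg_degree).

    edges: iterable of undirected (u, v) pairs.
    Assumption: the core is the induced sub-graph on nodes with
    degree >= cutoff, for the cutoff that maximizes average degree.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue                      # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    degree = {n: len(nb) for n, nb in adj.items()}

    best = (0, set(), 0.0)
    for cutoff in sorted(set(degree.values())):
        nodes = {n for n, d in degree.items() if d >= cutoff}
        # each internal edge is seen from both endpoints, hence // 2
        internal = sum(1 for n in nodes for m in adj[n] if m in nodes) // 2
        avg_degree = 2.0 * internal / len(nodes)
        if avg_degree > best[2]:
            best = (cutoff, nodes, avg_degree)
    return best

# Toy core-periphery graph: a 4-clique core with five degree-1 periphery nodes.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
             ("p1", "a"), ("p2", "b"), ("p3", "c"), ("p4", "d"), ("p5", "a")]
cutoff, core, avg_degree = core_by_degree_cutoff(toy_edges)
```

On the toy graph the periphery nodes pull the average degree of the full graph down, so the selected core is the clique; the cutoffs of 150 and 61 reported in the text come from the real SFDC graphs, not from this sketch.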
The concentration of implemented ideas in the core, combined with the fact that SFDC implemented only 4.3% of the total ideas put forth by the users, suggests that it is important to identify the key users in the community for effective ideation co-creation.

Figure 4. Core Community Isolation (panels: SFDC Vote Graph; SFDC Comment Graph)

TABLE II. SFDC IDEA EXCHANGE CORE COMMUNITY PERFORMANCE

Columns: Graph; % of total users in core; idea contribution fraction of core; implemented idea fraction of core
Vote Graph 0.35% =38%
Comment Graph 0.68%

V. KEY USER RANKING

We conduct link analysis to rank community users by their relative importance. This can be done by calculating the prestige of a node and also by looking at measures of the centrality of a node. Structural prestige in network analysis has been the basis for analyzing many networks. [11] details the PageRank algorithm that was used to rank web pages according to structural prestige. [16] proposed the idea of betweenness centrality as a measure of a node's importance in the overall graph. Based on the original PageRank algorithm [11], we define the community activity rank as follows:

$$C(i) = (1 - d) + d \sum_{(j,i) \in E} \frac{w_{ji}\, C(j)}{\sum_{k} w_{jk}}$$

where $C(i)$ is the community activity rank of node $i$, $E$ is the set of all edges in the graph, $d$ is the damping factor (set to 0.85), $w_{ji}$ is the weight of the outbound link from node $j$ to node $i$, and $\sum_{k} w_{jk}$ is the sum of the weights of all outbound edges from node $j$. Thus, the final activity rank of a user depends on the activity ranks of the users who collaborate with that user. The key difference is that the original PageRank algorithm did not cater for edge weights, whereas our formulation uses a directed graph with weighted edges.

Betweenness centrality is defined as follows:

$$B(i) = \sum_{j \neq i \neq k} \frac{\sigma_{jk}(i)}{\sigma_{jk}}$$

where $\sigma_{jk}$ is the number of shortest paths between $j$ and $k$ and $\sigma_{jk}(i)$ is the number of those shortest paths that have node $i$ as part of the path. Thus, betweenness measures the number of times a node is part of the shortest path between any two other nodes in the graph. The intuition behind this centrality measure is that a node on the shortest path between two other nodes can influence the flow of information between those two nodes.

A. Ranking Results

The two approaches to ranking users were applied using [5] and the users were ranked. Due to space constraints, only an abbreviated subset of the results for the comment graph, the top 10 users, is shown in Tables III and IV.

TABLE III. SFDC IDEAS COMMENT COMMUNITY RANK TOP 10

#    User Name              Community Activity Rank   Community Recognition
1    Alexander Sutherland   0.019588813               MVP Winter 11
2    Christoph K            0.008686445               None
3    werewolf               0.007827351               MVP Winter 11
4    Andres G               0.007087102               MVP Winter 11
5    jcohen                 0.006898484               None
6    TomaszO                0.006523066               None
7    ToddJanzen             0.005924399               SFDC
8    eyewellse              0.005483297               None
9    ErikM                  0.005006845               None
10   chris925               0.004876349               None

TABLE IV. SFDC IDEAS COMMENT BETWEENNESS CENTRALITY TOP 10

#    User Name              Betweenness Centrality    Community Recognition
1    Alexander Sutherland   0.019771398               MVP Winter 11
2    Rhonda Ross            0.015190872               MVP Winter 11,12,13
3    Scott J                0.013384961               SFDC
4    Andres G               0.00886717                MVP Winter 11
5    Matthew Lamb           0.008623408               MVP Spring 11
6    AMartin                0.007343319               MVP Spring 11
7    Mattias Nordin         0.005807566               MVP Winter 11,12
8    mattybme1              0.005726696               MVP Winter 11,12,13
9    Christoph K.           0.00516802                None
10   Jakester               0.004668099               None

B. Evaluation of the Ranking

To evaluate the ranking of nodes, a measurement of the firm's evaluation of the importance of a user is useful. Salesforce runs a community recognition program called the MVP program, in which it periodically chooses members from the community for their outstanding achievements and recognizes them with virtual badges as MVPs. The Salesforce.com website describes the program as follows: "This program recognizes exceptional individuals within the Salesforce community for their leadership, knowledge, and ongoing contributions. These individuals represent the spirit of the community and what it is all about!" In the result tables, the Community Recognition column shows whether the individual has received any such award. Where the contributor is a Salesforce employee, the employee is not eligible for recognition. Such members have also been highlighted (marked SFDC in the tables).

To evaluate the ranking approaches, the MVP recognition of a user can be used as a qualitative measure, i.e., to what extent network prestige or centrality can be linked to the firm's recognition of individual users. If the firm's recognition of a community member's contribution is the key criterion, then the betweenness measure does much better than the community activity rank measure. Most of the people in the top 10 as ranked by betweenness are members that the firm (SFDC) has already recognized publicly.
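The two ranking measures can be illustrated with a small pure-Python sketch. This is not the tool used in the paper (the rankings there were produced with [5]); it is a minimal sketch under stated assumptions. `activity_rank` iterates a weighted PageRank-style update of the form C(i) = (1 - d) + d * sum over inbound edges of w_ji C(j) / (total outbound weight of j), without normalizing the scores. `betweenness` uses Brandes' accumulation scheme on an unweighted graph and returns raw, unnormalized counts. Both function names and the input layouts are hypothetical.

```python
from collections import defaultdict, deque

def activity_rank(edge_weight, d=0.85, iterations=100):
    """Weighted PageRank-style rank; edge_weight maps (source, target) -> weight."""
    nodes = {n for edge in edge_weight for n in edge}
    out_weight = defaultdict(float)
    for (j, _), w in edge_weight.items():
        out_weight[j] += w                     # total outbound weight of node j
    rank = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new_rank = {n: 1.0 - d for n in nodes}
        for (j, i), w in edge_weight.items():
            new_rank[i] += d * rank[j] * w / out_weight[j]
        rank = new_rank
    return rank

def betweenness(adj):
    """Unnormalized betweenness; adj maps node -> set of successor nodes.

    With a symmetric (undirected) adjacency, every unordered pair is
    counted twice, once in each direction.
    """
    nodes = set(adj) | {v for nb in adj.values() for v in nb}
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        sigma = {n: 0 for n in nodes}          # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {n: [] for n in nodes}
        order = []
        queue = deque([s])
        while queue:                           # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:     # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):              # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Comment edges: B->A (2 comments), C->A, C->D, A->D.
ranks = activity_rank({("B", "A"): 2, ("C", "A"): 1, ("C", "D"): 1, ("A", "D"): 1})
# Undirected path x - y - z: y sits on the only shortest path between x and z.
scores = betweenness({"x": {"y"}, "y": {"x", "z"}, "z": {"y"}})
```

In the toy comment graph, D ends up ranked above A because A's own rank flows on to D, which is the "transfer of prestige" behavior the activity rank is designed to capture.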
These results imply that betweenness could potentially be used to identify key users who have not yet been recognized. The measure could also be applied dynamically, as the community grows, to identify newer key users.

It is interesting to note that the community rank based approach did not perform as well as the betweenness centrality measure. While the transfer of prestige from one user to another through out-links has an intuitive appeal, in this instance it did not perform as well empirically. [15] performed a similar analysis on a Java question-and-answer forum and reported similar findings: in online expertise networks, PageRank derivatives did not outperform simpler measures.

The results also pose interesting qualitative questions for analysis. For instance, the user Jakester (number 10 in the betweenness ranking) has suggested 26 ideas, of which 10 have been implemented by SFDC. He has also contributed 534 comments and 771 votes on ideas. It would be of interest to understand the reasons in the firm's decision-making process that led to him not being recognized. Similarly, it would be interesting to understand the motivational impact of having been granted an MVP badge. While the analysis covered in this paper did not evaluate this, it presents an interesting avenue for further research.

Thus, betweenness centrality is a potential tool to answer the first question posed at the start of this paper. In an actual implementation scenario, this metric could be calculated in an offline batch mode for analysis. [6] has proposed a fast way of calculating betweenness centrality that could be used to perform this calculation.

VI. COMPARING VOTING AND COMMENTING

The next key question is which of the two online forum activities (voting and commenting) encourages a tighter, more close-knit community to form. This question is tied to what motivates users to engage and participate in innovation forums with the firm. If the activity fosters intrinsic motivational factors, then it is likely to be self-sustaining. [13] note that in innovation communities a key motivating factor for users is learning. In [9], Lakhani and von Hippel studied the Apache open source community and report that in their study "98% of the effort expended by information providers in fact returns direct learning benefits to those providers".

To evaluate the voting activity against the commenting activity, a measure of community quality is required. [10] uses the notion of conductance as a measure of community quality. According to [10], if $A$ is the adjacency matrix of the graph $G = (V, E)$, then the conductance of a set of nodes $S \subset V$ is

$$\phi(S) = \frac{\sum_{i \in S,\, j \notin S} A_{ij}}{\min\{A(S),\, A(\bar{S})\}}$$

where

$$A(S) = \sum_{i \in S} \sum_{j \in V} A_{ij}$$

Conductance is a measure of the inter-community connections relative to the intra-community connections. The lower the value of conductance, the better the quality of the community, i.e., the community is densely connected internally and sparsely connected to the rest of the graph. [10] also introduces the notion of a community profile plot. The Network Community Profile (NCP) plot characterizes the best possible community over a range of size scales.
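As a concrete illustration of the conductance measure, the Python sketch below evaluates it for one candidate community in an unweighted, undirected graph: the number of edges leaving the community divided by the smaller of the two degree sums (community versus rest of the graph). The function name `conductance` and the adjacency-dictionary layout are assumptions for the example, and the sketch scores a single given node set; it does not search for the best community of each size as an NCP plot requires.

```python
def conductance(adj, community):
    """Conductance of `community` in an undirected, unweighted graph.

    adj: dict mapping node -> set of neighbours (must be symmetric).
    Lower values mean a community that is dense inside and
    sparsely connected to the rest of the graph.
    """
    inside = set(community)
    # edges with exactly one endpoint inside the community
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    volume_inside = sum(len(adj[u]) for u in inside)
    volume_outside = sum(len(adj[u]) for u in adj if u not in inside)
    denominator = min(volume_inside, volume_outside)
    if denominator == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / denominator

# Two triangles {1, 2, 3} and {4, 5, 6} joined by the single edge 3-4.
toy_adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
phi = conductance(toy_adj, {1, 2, 3})
```

Here one edge crosses the boundary and each side has a degree sum of 7, so the triangle scores 1/7, a low conductance that reflects a well-separated community.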
In the NCP plot, the number of nodes in a community (community size) is plotted on the x axis, and the y axis tracks the conductance of the best possible community of that size. Both axes are on a log scale. In real-world networks, the value of conductance decreases initially and then starts to increase. In our analysis, the global minimum of the NCP plot serves as a measure of the community formation tendencies of an activity graph. A comparison of the community size at which the global minimum occurred was used to draw conclusions on the community formation characteristics of voting and commenting activities.

Using this approach, the activity graphs constructed from the comment and voting data were treated as undirected graphs and used to create separate network community profile plots. The plots for the vote activity graph and the comment activity graph are shown in Figures 5 and 6 respectively. The SNAP [17] (Stanford Network Analysis Project) toolkit was used to create these plots.

Both profile plots show the expected behavior of initially decreasing conductance followed by increasing conductance. That is, the quality of communities increases with node count for a while and then starts to degrade. The vote activity graph reaches its global minimum of conductance at a community size of 10 nodes, while for the comment activity graph the global minimum is found at a community size of approximately 33 nodes. In other words, in the comment activity graph the highest quality community involves up to about 33 users, whilst in the vote activity graph the best community comprises only 10 users.

Figure 5. SFDC Vote Graph NCP Profile Plot

Figure 6. SFDC Comment Graph NCP Profile Plot

This comparison suggests that commenting activity has a higher community creation effect than voting activity.
The stronger community effect of commenting is to be expected: psychologically, engaging in discourse offers greater intrinsic motivation and rewards (through the knowledge gained) than merely voting on an idea. While this analysis is based on a single ideation community, it shows the distinction between voting and commenting activity in objective and measurable terms. Further work is required to analyze other ideation networks and determine whether similar characteristics are observed there. This result is also in line with [9] and [13], which suggest that a key motivating factor is learning through participation. Such
understanding will be important for designing suitable activity features so that online user forums become effective ideation and co-creation platforms.

VII. RELATED WORK

A similar analysis was performed on the SAP dataset, with the following results. The activity graph also displayed a power-law distribution of node degree, with an α of 2.57, and exhibited a similar core-periphery structure. The core community obtained by the heuristic algorithm made up 5.4% of the overall community but accounted for 20% of the suggested ideas and 46% of the implemented ideas (only 4% of all suggested ideas were implemented). Qualitative analysis of the ranking also showed that betweenness performed better than the PageRank-derived community activity rank.

Many analyses of online networks have used notions of node prestige to rank and evaluate participants. [7] used a PageRank-based approach to identify key users in online communities. [14] and [15] applied activity-based ranking techniques to the study of expertise in online question-and-answer forums, in which one user poses a question and other users contribute answers. [15] obtained results similar to ours: the PageRank-derived measures of node importance did not outperform simpler measures. In their analysis, "z_score" and "z_num", simple metrics derived from a node's in-degree and out-degree, performed best on their dataset. [8] used out-links as a means of identifying rising stars in bibliography networks. The intuition is that the nodes in such a network (namely researchers) have prestige, which they confer on others through co-authorship and collaboration. [4] analyzed the online ideation community of Dell and concluded that past success likely has detrimental effects on the productivity of new ideas.
While much work has been done on online communities, the study of ideation in online communities is still evolving and presents an opportunity for continued research.

VIII. CONCLUSIONS

In this paper, we performed link analyses on the online ideation communities of two software providers that crowdsource new product features. We found that a large proportion of the implemented ideas originated from a small core community in the forums. For identifying the key users for product feature ideation, we found that betweenness centrality is a better measure for user ranking than PageRank. We also found that the community cohesion tendencies of commenting activity were higher than those of voting activity. These findings will be useful for designing company-centric user forums for effective co-creation of new product features.

A. Limitations

The analysis in this paper adopted a static view of network activity. In reality, collaborations in online communities weaken or strengthen over time. If two users communicated on a certain task once, it does not necessarily imply that the link remains active for their entire lifetime in the community. This could potentially be handled by varying the edge weight as a function of time, which is a potential area for further research. The analysis of community formation required splitting the community into sub-communities. Other approaches, such as the one demonstrated in [18], could be used to measure the community quality of overlapping communities. These will be evaluated in future work on the data set.

REFERENCES

[1] M. K. Poetz and M. Schreier. The value of crowdsourcing: can users really compete with professionals in generating new product ideas? Journal of Product Innovation Management, 29(2):245-256, 2012.
[2] L. Dahlander, L. Frederiksen, and F. Rullani. Online communities and open innovation. Industry and Innovation, 15(2):115-123, 2008.
[3] A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661-703, 2009.
[4] B. Bayus. Crowdsourcing and individual creativity over time: the detrimental effects of past success. Available at SSRN 1667101, 2010.
[5] M. Bastian, S. Heymann, and M. Jacomy. Gephi: an open source software for exploring and manipulating networks. In International AAAI Conference on Weblogs and Social Media. Association for the Advancement of Artificial Intelligence, pages 361-362, 2009.
[6] U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25(2):163-177, 2001.
[7] J. Heidemann, M. Klier, and F. Probst. Identifying key users in online social networks: a PageRank based approach. Information Systems Journal, 4801(December):12-15, 2010.
[8] X. L. Li, C. Foo, K. Tew, and S. K. Ng. Searching for rising stars in bibliography networks. In Database Systems for Advanced Applications, pages 288-292, 2009.
[9] K. R. Lakhani and E. von Hippel. How open source software works: "free" user-to-user assistance. Research Policy, 32(6):923-943, 2003.
[10] J. Leskovec et al. Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29-123, 2009.
[11] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: bringing order to the web. Pages 1-17, 1999.
[12] E. Prandelli, M. Sawhney, and G. Verona. Collaborating with customers to innovate: conceiving and marketing products in the networking age. Edward Elgar Publishing, 2008.
[13] A. Ståhlbröst and B. Bergvall-Kåreborn. Exploring users motivation in innovation communities. International Journal of Entrepreneurship and Innovation Management, 14(4):298-314, 2011.
[14] K. K. Nam, M. S. Ackerman, and L. A. Adamic. Questions in, knowledge in?: a study of Naver's question answering community. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 779-788, 2009.
[15] J. Zhang, M. S. Ackerman, and L. Adamic. Expertise networks in online communities: structure and algorithms. In Proceedings of the 16th International Conference on World Wide Web, pages 221-230, 2007.
[16] L. Freeman. A set of measures of centrality based on betweenness. Sociometry, 40(1):35-41, 1977.
[17] Stanford Network Analysis Project, http://snap.stanford.edu/index.html
[18] G. Palla, I. Derényi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814-818, 2005.