SPARJA: a Distributed Social Graph Partitioning and Replication Middleware
Stylianou Maria
KTH Royal Institute of Technology
Stockholm, Sweden
mariasty@kth.se
Girdzijauskas Šarūnas
KTH Royal Institute of Technology
Stockholm, Sweden
sarunas@sics.se
ABSTRACT
The rapid growth of Online Social Networks (OSNs) has led to the necessity of effective, low-cost scalability. Approaches such as vertical and horizontal scaling have proven inefficient due to the strong community structure of OSNs. We propose SPARJA, a distributed graph partitioning and replication middleware for scaling OSNs. SPARJA extends SPAR [8] with an improved partitioning algorithm that functions in a distributed manner and eliminates the requirement of a global view. We compare and evaluate SPARJA against a variant of SPAR on synthesized datasets and datasets from Facebook. Our results show that the proposed system is on par with, and depending on graph structure and clusterization even outperforms, SPAR.
Categories and Subject Descriptors
C.4 [Performance of Systems]: Miscellaneous;
D.4.8 [Performance]: Metrics—performance measures
General Terms
Online Social Networks, Scalability, Partitioning, Replication
Keywords
Online Social Networks, scalability, partitioning, replication,
SPAR, JA-BE-JA
1. INTRODUCTION
Recently, interest has shifted abruptly from traditional web applications to social applications and especially to Online Social Networks (OSNs), e.g. Facebook (http://www.facebook.com) and Twitter (http://www.twitter.com). Both have millions of active users who post, comment and update their status at very high rates. This trend makes OSNs a popular object of analysis and research.
Recent research showed that OSNs produce a non-traditional form of workloads [1, 12], mostly because of the different nature of their data. Data are highly personalized and interconnected due to the strong community structure [5–7]. These characteristics impose new challenges for OSN maintenance and scalability.
To address scalability, two approaches have been followed so far: vertical and horizontal scaling. The former, which implies replacing existing hardware with high-performance servers, tends to be very expensive and sometimes infeasible because of the very large size of OSNs. The latter partitions the load among several cheap commodity servers or virtual machines (VMs), the second option deriving from the emergence of cloud computing systems, e.g. Amazon EC2 (http://aws.amazon.com/ec2/) and Google AppEngine (http://cloud.google.com/appengine). With this approach, data are partitioned into disjoint components, offering horizontal scaling at low cost. However, problems arise when it comes to OSNs.
In OSNs, users can be members of several social communities [5–7], making clean partitioning unfeasible. Most operations in OSNs concern a user and her friends, i.e. her neighbors. Thus, if a user belongs to many communities, clean partitioning is effectively impossible, and queries are resolved with high inter-server traffic. One attempt to eliminate this traffic is to replicate all user data on multiple or all servers. However, this leads to increased replication overhead, which hinders consistency among replicas.
SPAR [8], a social partitioning and replication middleware, addresses the problem of partitioning - and consequently scaling - OSNs. However, it requires a global view of the entire network and, therefore, its use for extremely large-scale OSNs may be problematic. In this paper, we present a variant of the SPAR algorithm which uses the partitioning technique proposed in [10]. The new system has an improved - and distributed - partitioning phase which does not require the global view. We evaluate and compare our heuristic with the initial SPAR algorithm, showing that scalability can be improved with a distributed approach to partitioning.
In the next section, we discuss related work and the background of our research. In Section 3, we present our contribution and describe the deployed system. Section 4 presents our evaluation and the experiments conducted, and in Section 5 we draw our conclusions.
2. BACKGROUND AND RELATED WORK
Due to the recent emergence of OSNs, scaling and maintaining such networks constitutes a new area of research with limited work so far. In this section, we describe approaches followed in the past and relate them to SPAR and our work.
Scaling out web applications is achievable with the use of cloud providers such as Amazon EC2 and Google AppEngine. Developers can dynamically add or remove computing resources depending on the workload of their applications. This facility requires the applications to be stateless and the data to be independent and easily sharded into clean partitions. OSNs deal with highly interconnected and dependent data, and therefore scaling out alone is not a viable solution.
Nowadays, Key-Value stores have become the scaling solution for several popular OSNs. Key-Value stores are designed to scale, with the tradeoff of partitioning data randomly across servers. This random placement limits the performance of OSNs. Pujol et al. [8] have shown that SPAR performs better than Key-Value stores: by preserving data locality, SPAR minimizes inter-server traffic and thereby improves performance.
Another approach for scaling and maintaining applications is the use of Distributed File Systems. Such systems [4, 11] distribute and replicate data to achieve high availability. In the case of OSNs, most queries concern data from several users, which would imply fetching data from multiple servers. SPAR does not follow the Distributed File Systems approach; instead, it replicates data in such a manner that all necessary data can be found locally and served more efficiently.
SPAR is the initial work and motivation for our research. It is a partitioning and replication algorithm designed for social applications. SPAR offers transparent scalability [9] by preserving local semantics, i.e. storing all data relevant to a user on one server. Moreover, it aims to minimize the replication overhead to keep overall performance and system efficiency high. SPAR achieves load balancing by acquiring the global view of the network. However, having access to all data at all times can be very costly and impractical, especially for large-scale systems. Additionally, SPAR has a central partition manager, which constitutes a single point of failure. Both drawbacks are addressed in our implementation, described in the next section. Furthermore, we tackle the possibility of SPAR's partition manager falling into a local optimum while trying to preserve load balancing; this may result in increased replication overhead, which the proposed system also tries to reduce.
3. OUR CONTRIBUTION - SPARJA
Our main contribution is the implementation of SPARJA, a variant of SPAR based on JA-BE-JA [10]. JA-BE-JA is a distributed graph partitioning algorithm that does not require a global view of the system. SPARJA eliminates the single point of failure of the initial SPAR by replacing its main algorithm with JA-BE-JA. It also aims at minimizing replication overhead through a simple, straightforward technique.
3.1 System Architecture
Figure 1 depicts a three-tier web architecture with SPARJA, based on the architecture of SPAR. The application interacts with SPARJA through the Middleware (MW). When the application requests a read or write operation on a user's data, it calls the MW, which locates the back-end server containing that user's data. The MW returns the address of that server to the application, which then initiates a data-store interface such as MySQL or Cassandra.
Figure 1: SPARJA Architecture
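The lookup step described above can be sketched as follows. This is a minimal illustration; the class, method names and the example address are hypothetical, not taken from the SPARJA implementation.

```python
class Middleware:
    """Toy directory mapping each user to the back-end server
    holding her master replica (the MW's lookup role above)."""

    def __init__(self):
        self.location = {}  # user id -> back-end server address

    def register(self, user, server):
        # Record which server holds the user's master node.
        self.location[user] = server

    def locate(self, user):
        # The application calls this before opening its
        # data-store connection to the returned address.
        return self.location[user]

mw = Middleware()
mw.register("alice", "10.0.0.2:9160")
assert mw.locate("alice") == "10.0.0.2:9160"
```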
3.2 Description
SPARJA is a dynamic gossip-based local search algorithm. Its goal is to group and store connected users, i.e. friends, on the same server. In this way, SPARJA aims to reduce the replication overhead as well as the inter-server traffic. Initially, the system takes as input a partial graph and partitions it into k equal-size components. All components have the same number of users, thus achieving load balancing. Afterwards, each node behaves as an independent processing unit which periodically executes the algorithm based on local information about the graph topology. These periodic executions are essential for repartitioning the graph and minimizing the number of replica nodes. Nodes can work in parallel; nevertheless, SPARJA can also run as a centralized system.
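The initial step above - splitting the partial graph into k balanced components - can be sketched as below. Round-robin assignment is our own simplification; SPARJA only requires that the components end up the same size.

```python
def initial_partition(nodes, k):
    """Assign each node to one of k servers, keeping sizes
    balanced (round-robin over the node list)."""
    return {node: i % k for i, node in enumerate(nodes)}

placement = initial_partition(range(10), 4)
sizes = [list(placement.values()).count(s) for s in range(4)]
# With 10 nodes on 4 servers, sizes differ by at most one node.
assert max(sizes) - min(sizes) <= 1
```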
3.3 System Operations
SPARJA is responsible for preserving the scalability and transparency of the application by distributing users, partitioning the network and creating replicas under certain conditions. Below, we describe the operations SPARJA executes to achieve these goals.
3.3.1 Data Distribution and Partitioning
SPARJA guarantees that users are distributed equally among all servers. When a new user joins the network, a node - called the master node - is created and stored on the server with the minimum number of master nodes. Hence, data distribution is fair and the load balanced. Recall that in SPAR users may move from one server to another. In contrast, users in SPARJA may exchange positions, i.e. user A can move to the server of user B and user B to the server of user A, in order to be co-located with their friends.
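The two rules above - least-loaded placement for a joining user's master, and position exchange rather than one-way moves - can be sketched as follows; the function and server names are ours.

```python
def place_master(user, masters_per_server):
    """Put the new user's master node on the server currently
    holding the fewest master nodes."""
    server = min(masters_per_server, key=masters_per_server.get)
    masters_per_server[server] += 1
    return server

def swap(placement, user_a, user_b):
    """Exchange the servers of two users. Unlike a one-way move,
    a swap leaves every server's master count unchanged."""
    placement[user_a], placement[user_b] = placement[user_b], placement[user_a]

load = {"s1": 5, "s2": 3, "s3": 4}
assert place_master("new_user", load) == "s2"  # least loaded
assert load["s2"] == 4

placement = {"a": "s1", "b": "s3"}
swap(placement, "a", "b")
assert placement == {"a": "s3", "b": "s1"}
```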
3.3.2 Data Replication
Data replication is an important function of SPARJA. By replicating master nodes, two requirements are satisfied: local semantics and fault tolerance. When a new user joins the network, replica nodes are created and stored on servers along with the master node. The number of replicas is a custom value, set before the execution of the algorithm, and serves to preserve fault tolerance. When a new friendship is established, new replicas may be created, if needed, for data locality. SPARJA attempts to keep the number of replicas to a minimum by solving the set-cover problem [2]. In particular, when additional replicas are created for data locality, some of the fault-tolerance replicas may be removed. Listing 1 presents the replication algorithm in pseudocode, which each node executes periodically. The first part of the algorithm guarantees locality by creating replica nodes of a user - if they do not already exist - on the servers of her friends. The second part guarantees fault tolerance by creating additional replica nodes for a user on servers that do not already hold the user.
for user in graph:
    # Part 1: locality - replicate the user on every friend's server
    for friend in get_friends(user):
        if server_of(friend) != server_of(user):
            if not replica_exists(user, server_of(friend)):
                create_replica(user, server_of(friend))

for user in graph:
    # Part 2: fault tolerance - ensure at least k replicas exist
    missing = k_replicas - len(get_replicas(user))
    for server in all_servers:
        if missing <= 0:
            break
        if server != master_server_of(user) and not replica_exists(user, server):
            create_replica(user, server)
            missing -= 1

Listing 1: Algorithm for Data Replication in SPARJA
3.3.3 Sampling and Swapping Policies
Each node runs the SPARJA algorithm as a processing unit. It periodically samples a node and applies the swapping policy, which measures the benefit of exchanging its server with the server of the sampled node. The benefit of swapping is measured in terms of energy, as introduced in JA-BE-JA [10]. Each node has an energy and, therefore, the system has a global energy, which becomes low when nodes are placed close to their neighbors. SPARJA uses the same energy function to measure the swapping benefit. If the energy decreases for both nodes - the desired behavior - the swap is performed; otherwise it is abandoned. A hybrid node selection policy is followed [10], consisting of two parts. First, the node selects, at random, one direct neighbor and calculates the benefit. If the energy function does not improve, the node performs a random walk and selects another node from its walk [3].
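The swap test can be sketched in JA-BE-JA's terms: count each node's neighbors in a given partition, raise the count to the energy-function parameter (n = 2 in Table 2, written alpha below), and swap only if the combined value strictly improves. The function names are ours, and this is a sketch of the rule rather than SPARJA's exact code.

```python
def neighbors_in(graph, node, partition, placement):
    """Number of the node's neighbors placed in `partition`."""
    return sum(1 for v in graph[node] if placement[v] == partition)

def swap_is_beneficial(graph, placement, u, v, alpha=2):
    """True if exchanging the partitions of u and v increases
    their combined neighbor-locality utility."""
    pu, pv = placement[u], placement[v]
    before = (neighbors_in(graph, u, pu, placement) ** alpha
              + neighbors_in(graph, v, pv, placement) ** alpha)
    after = (neighbors_in(graph, u, pv, placement) ** alpha
             + neighbors_in(graph, v, pu, placement) ** alpha)
    return after > before

# u's friends sit in partition 0, v's friend in partition 1,
# but u and v are placed the other way round: swapping helps.
graph = {"u": ["a", "b"], "v": ["c"], "a": ["u"], "b": ["u"], "c": ["v"]}
placement = {"u": 1, "v": 0, "a": 0, "b": 0, "c": 1}
assert swap_is_beneficial(graph, placement, "u", "v")
```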
3.3.4 Simulated Annealing
Local search algorithms tend to get stuck in local optima. SPARJA is likewise vulnerable to this hazard, which would lead to higher replication overhead. To address this possibility, we employ the Simulated Annealing technique described in [13]. Initially, noise - analogous to temperature - is introduced to the system, causing it to deviate from locally optimal moves. After a number of iterations, the system starts to stabilize and eventually converges to a good solution rather than a local optimum.
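A possible sketch of this schedule, following the JA-BE-JA-style acceptance rule: the temperature starts at the initial value, multiplies the gain of a candidate swap (so early rounds also accept locally worse moves), and cools by delta each round until it reaches the final temperature of 1, after which only strictly improving swaps pass. SPARJA's exact acceptance rule may differ; the values match Table 2.

```python
def accept_swap(before, after, temperature):
    """Temperature-scaled acceptance: noisy while temperature > 1,
    purely greedy once it has cooled to 1."""
    return after * temperature > before

def cooling_schedule(t_initial=2.0, t_final=1.0, delta=0.003):
    """Yield the temperature for each round until it cools to t_final."""
    t = t_initial
    while t > t_final:
        yield t
        t = max(t_final, t - delta)

temps = list(cooling_schedule())
assert abs(temps[0] - 2.0) < 1e-9
assert accept_swap(before=4, after=3, temperature=2.0)      # noisy phase
assert not accept_swap(before=4, after=3, temperature=1.0)  # greedy phase
```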
4. EVALUATION
For evaluating SPARJA, we implemented both the SPARJA and SPAR algorithms in Python, using Cassandra as the data store.
4.1 Metrics
The principal evaluation metric we use is the replication overhead, which counts both the replicas created for local semantics and the replicas created for fault tolerance.
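The paper does not spell out a formula for this metric; following SPAR's usage, we take the replication overhead to be the number of replica nodes divided by the number of master nodes, i.e. the average number of replicas per user. This normalization is our assumption.

```python
def replication_overhead(num_replicas, num_masters):
    """Average replicas per master node (assumed normalization;
    counts both locality and fault-tolerance replicas)."""
    return num_replicas / num_masters

# 2500 replicas spread over 1000 users -> overhead of 2.5
assert replication_overhead(num_replicas=2500, num_masters=1000) == 2.5
```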
4.2 Datasets
We use six datasets for evaluating the replication overhead: three synthesized datasets and three Facebook datasets.
Synthesized Datasets
We generated three synthesized datasets with different clusterization levels in order to study the impact of clusterization on the replication overhead. All three datasets contain 1000 nodes, each with node degree equal to 10. The Randomized graph (Synth-R) has no clusterization policy, while the Clustered (Synth-C) and Highly Clustered (Synth-HC) graphs have 75% and 95% clusterization respectively.
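One possible way to synthesize such graphs (our own construction, not necessarily the authors'): split the nodes into clusters and let each of the 5 edges a node initiates stay inside its cluster with probability equal to the clusterization level, otherwise go to a random node, giving an average degree of about 10.

```python
import random

def synthesize(num_nodes=1000, clusters=4, edges_per_node=5,
               clusterization=0.75, seed=42):
    """Generate an edge set with a tunable fraction of
    intra-cluster edges (a hypothetical generator)."""
    rng = random.Random(seed)
    cluster_of = {v: v % clusters for v in range(num_nodes)}
    members = {c: [v for v in range(num_nodes) if cluster_of[v] == c]
               for c in range(clusters)}
    edges = set()
    for v in range(num_nodes):
        for _ in range(edges_per_node):
            if rng.random() < clusterization:
                u = rng.choice(members[cluster_of[v]])  # same cluster
            else:
                u = rng.randrange(num_nodes)            # anywhere
            if u != v:
                edges.add((min(u, v), max(u, v)))
    return edges

edges = synthesize()
assert 0 < len(edges) <= 5000  # each node initiates 5 edges
```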
Facebook Datasets
The three Facebook datasets were acquired from the Stanford Large Network Dataset Collection (http://snap.stanford.edu/), with approximately 3,000, 6,000 and 60,000 edges respectively. All details - including nodes, edges and clusterization levels - can be found in Table 1.
Dataset      Nodes   Edges    Clusterization (%)
Synth-R      1000    10,000   0
Synth-C      1000    10,000   75
Synth-HC     1000    10,000   95
Facebook-1   150     3,386    n/a
Facebook-2   224     6,384    n/a
Facebook-3   786     60,050   n/a

Table 1: Description of Datasets
4.3 Environment Preparation
Before conducting any experiments, we set the cooling rate (δ), i.e. the rate of change of the temperature used in the simulated annealing technique. As stated in [13], the number of iterations equals the temperature difference divided by the cooling rate:

    number of iterations = (To − 1) / δ
To is the initial temperature of the network, which declines according to the cooling rate. Assuming a network with four servers and a fixed initial temperature (To) equal to 2, we run the algorithm for different numbers of iterations in order to tune the cooling rate. The datasets used are the synthesized graphs with 0%, 75% and 95% clusterization. Figure 2 shows how the number of non-local nodes changes as the number of iterations increases. From the number of non-local nodes we can deduce how much replication overhead is incurred and how well the clusterization is preserved. As illustrated, the number of non-local nodes decreases as iterations increase, eventually stabilizing at 200 iterations for the randomized graph and at 300 iterations for both the clustered and highly clustered graphs.
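Plugging the chosen values (To = 2, final temperature 1, δ = 0.003 from Table 2) into the formula above gives a schedule of roughly 333 iterations, consistent with the stabilization observed by 200-300 iterations:

```python
# number of iterations = (To − 1) / δ, with Table 2's fixed values
t_initial, t_final, delta = 2.0, 1.0, 0.003
iterations = (t_initial - t_final) / delta
assert round(iterations) == 333
```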
Figure 2: Number of Iterations vs Number of Non-Local Nodes
In Table 2, we summarize all the parameters fixed before running the experiments.

Parameter                        Value
Initial Temperature (To)         2
Final Temperature (T)            1
Cooling Factor (δ)               0.003
Energy Function Parameter (n)    2

Table 2: Parameters for SPARJA

4.4 Experiments
In our experiments, we compare SPARJA and SPAR in terms of replication overhead. We designed three scenarios for testing the impact of different datasets, fault-tolerance replication, and the number of servers.
4.4.1 Replication Overhead on Different Datasets
In the first experiment, we study how different datasets and topologies affect the replication overhead of the system. Figures 3 and 4 show the replication overhead of both SPAR and SPARJA on synthesized graphs and Facebook graphs respectively. The number of servers is set to four (S=4) and the replication factor for fault tolerance is set to zero (K=0). As shown in Figure 3, higher clusterization leads to lower replication overhead in SPARJA. As expected, SPARJA takes advantage of the existing graph clusterization and continues redistributing nodes based on this divided topology. As a result, SPARJA outperforms SPAR on the clustered graphs while giving worse results than SPAR on the random graph. Similarly, Figure 4 shows how SPARJA and SPAR perform on Facebook graphs. Again, SPARJA gives better results than SPAR.
Figure 3: Replication Overhead on Synthesized Datasets

Figure 4: Replication Overhead on Facebook Datasets
4.4.2 Replication Overhead vs Replication Factor
Next, we turn our attention to the replication factor and whether it affects the replication overhead. Figure 5 shows the replication overhead of SPARJA on all datasets with the fault-tolerance replication factor set to zero (K=0) and to two (K=2). The number of servers is set to four (S=4). As can be seen, the fault-tolerance replication factor can dramatically decrease the replication overhead. As expected, fault-tolerance replica nodes are also used for preserving data locality.
Figure 5: Replication Overhead for Different Number of Fault Tolerance Replicas
4.4.3 Replication Overhead with Different Number of Servers
In our final experiment, we measure the replication overhead of both SPARJA and SPAR for different numbers of servers, S = 4, 8 and 16. The fault-tolerance replication factor is set to two (K=2) and all datasets are used.
In Figure 6, we plot the results collected from the algorithms, divided into six graphs, one per dataset. As expected, in all datasets the replication overhead increases with the number of servers.
5. CONCLUSIONS
Online Social Networks have seen steep growth over the last decade. This popularity has led companies to study the nature of OSNs and offer scalability and maintenance services. However, none of the scalability approaches proposed so far has solved all the scalability issues. The strong community structure of such systems makes Key-Value stores and relational databases inefficient.
We proposed SPARJA, a distributed graph partitioning and
replication middleware for scaling OSNs. SPARJA parti-
tions the graph into k balanced components and maintains
them without obtaining the global view of the system. It
relies on data replication for preserving fault tolerance and
locality semantics, while aiming to keep the replication over-
head as low as possible.
The evaluation of SPARJA was carried out using synthesized graphs as well as real datasets from Facebook. Our comparisons with SPAR showed that SPARJA offers significant gains in replication overhead, especially when the graph is clustered. Moreover, with its low replication overhead, it covers both goals of locality semantics and fault tolerance.
We implemented and tested an initial version of SPARJA. We leave the integration of SPARJA into a real system with a three-tier architecture as future work.
6. ACKNOWLEDGEMENTS
We would like to thank our colleague Muhammad Anis uddin Nasir for his valuable contribution and help in the project. We also thank Fatemeh Rahimian for providing sources for the datasets used in the evaluation.
7. REFERENCES
[1] F. Benevenuto, T. Rodrigues, M. Cha, and
V. Almeida. Characterizing user behavior in online
social networks. In Proceedings of the 9th ACM
SIGCOMM conference on Internet measurement
conference, pages 49–62. ACM, 2009.
[2] R. Carr, S. Doddi, G. Konjevod, and M. Marathe. On
the red-blue set cover problem. In Proceedings of the
eleventh annual ACM-SIAM symposium on Discrete
algorithms, pages 345–353. Society for Industrial and
Applied Mathematics, 2000.
[3] M. Gjoka, M. Kurant, C. Butts, and A. Markopoulou.
Walking in facebook: A case study of unbiased
sampling of osns. In INFOCOM, 2010 Proceedings
IEEE, pages 1–9. IEEE, 2010.
[4] R. Guy, J. Heidemann, W. Mak, T. Page Jr,
G. Popek, D. Rothmeier, et al. Implementation of the
ficus replicated file system. In USENIX Conference
Proceedings, volume 74, pages 63–71. Citeseer, 1990.
[5] J. Leskovec, K. Lang, A. Dasgupta, and M. Mahoney.
Community structure in large networks: Natural
cluster sizes and the absence of large well-defined
clusters. Internet Mathematics, 6(1):29–123, 2009.
[6] M. Newman. Modularity and community structure in
networks. Proceedings of the National Academy of
Sciences, 103(23):8577–8582, 2006.
[7] M. Newman and J. Park. Why social networks are
different from other types of networks. Physical
Review E, 68(3):036122, 2003.
[8] J. Pujol, V. Erramilli, G. Siganos, X. Yang,
N. Laoutaris, P. Chhabra, and P. Rodriguez. The little
engine (s) that could: scaling online social networks.
In ACM SIGCOMM Computer Communication
Review, volume 40, pages 375–386. ACM, 2010.
[9] J. Pujol, G. Siganos, V. Erramilli, and P. Rodriguez.
Scaling online social networks without pains. In Proc
of NETDB. Citeseer, 2009.
[10] F. Rahimian, A. H. Payberah, S. Girdzijauskas,
M. Jelasity, and S. Haridi. JA-BE-JA: a distributed
algorithm for balanced graph partitioning.
forthcoming.
[11] M. Satyanarayanan, J. Kistler, P. Kumar, M. Okasaki,
E. Siegel, and D. Steere. Coda: A highly available file
system for a distributed workstation environment.
Computers, IEEE Transactions on, 39(4):447–459,
1990.
[12] F. Schneider, A. Feldmann, B. Krishnamurthy, and
W. Willinger. Understanding online social network
usage from a network perspective. In Proceedings of
the 9th ACM SIGCOMM conference on Internet
measurement conference, pages 35–48. ACM, 2009.

Figure 6: Replication Overhead with Different Number of Servers
[13] E. Talbi. Metaheuristics: from design to
implementation. Wiley, 2009.