Key-Key-Value Stores for Efficiently Processing Graph Data in the Cloud

Alexander G. Connor, Panos K. Chrysanthis, Alexandros Labrinidis
Advanced Data Management Technologies Laboratory
Department of Computer Science, University of Pittsburgh

Data in social networks
A social network manages user profiles, updates and connections
How can this data be managed in a scalable way?
  Key-value stores offer performance under high load
Some observations about social networks
  A profile view usually includes data from a user's friends
    Spatial locality
  A friend's profile is often visited next
    Temporal locality
  Requests might ask for updates from several users
  Web pages might include pieces of several user profiles
  A single request requires connecting to many machines

Connections in a Social Network
(Figure: a graph of users and their connections; one vertex is labeled Alice)

Leveraging Locality
Can we take advantage of the connections?
What if we stored connected users' profiles and data in the same place?
  Locality can be leveraged
  The number of machines a request must connect to is reduced
  User data can be pre-fetched
We can think of this as a graph partitioning problem
  Partitions = machines
  Vertices = user profiles, including updates
  Edges = connections
  Objective: minimize the number of edges that cross partitions

Example – graph partitioning

Accessing a vertex’s neighbors requires accessing many partitions

In a social network, requesting updates from followed users requires connecting to many machines

Far fewer edges cross partitions

Accessing a vertex’s neighbors requires accessing few partitions

In a social network, fewer connections are made and related user data can be pre-fetched
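The contrast in this example can be made concrete by counting crossing edges. The following is a minimal sketch, not the authors' code: the toy graph, the two placements, and the function names `edge_cut` and `partitions_touched` are all assumptions made for illustration.

```python
def edge_cut(edges, partition_of):
    """Count edges whose two endpoints sit in different partitions."""
    return sum(1 for u, v in edges if partition_of[u] != partition_of[v])


def partitions_touched(vertex, edges, partition_of):
    """Partitions that hold the vertex itself or any of its neighbors."""
    touched = {partition_of[vertex]}
    for u, v in edges:
        if u == vertex:
            touched.add(partition_of[v])
        elif v == vertex:
            touched.add(partition_of[u])
    return touched


# Hypothetical toy graph: two groups of friends joined by a single edge.
edges = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("dave", "erin"), ("dave", "frank"), ("erin", "frank"),
    ("carol", "dave"),
]

# Placement that ignores connections (vertices spread across 3 machines).
scattered = {"alice": 0, "bob": 1, "carol": 2, "dave": 0, "erin": 1, "frank": 2}
# Placement that keeps connected users together (2 machines).
clustered = {"alice": 0, "bob": 0, "carol": 0, "dave": 1, "erin": 1, "frank": 1}

print(edge_cut(edges, scattered), edge_cut(edges, clustered))
print(len(partitions_touched("alice", edges, scattered)),
      len(partitions_touched("alice", edges, clustered)))
```

In this toy case every edge crosses partitions under the scattered placement, while only the edge between the two groups crosses under the clustered one, so reading Alice's neighbors touches one machine instead of three.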

Outline
Introduction
  Data in Social Networks
  Leveraging Locality
Key-Key-Value Stores
  System Model
  Client API
  Adding a Key-Key-Value
  Load management
On-line partitioning algorithm
Simulation
  Parameters
  Results
Conclusion

Address Table: Mapping Store
  maps keys to virtual machines
Physical Layer: Physical machines
Logical Layer: Virtual machines
  Can be moved between physical machines as needed
Application Layer: Client API
  cached data
(Figure labels, top to bottom: Application Sessions, Address table, Virtual hosts, Physical hosts)

Client API and Sessions
Clients use a simple API that includes the get, put and sync commands
Data is pulled from the logical layer in blocks
  Groups of related keys
The client API keeps data in an in-memory cache
Data is pushed out asynchronously to virtual nodes in blocks
Push/pull can be done synchronously if requested by the client
  Offers stronger consistency at the cost of performance
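The session behavior on this slide can be sketched as a small write-back cache. This is an illustration under stated assumptions, not the system's actual client: the classes `InMemoryStore` and `Session`, the `block_of` mapping, and the method names `pull_block` and `push` are hypothetical, and the asynchronous push is modeled simply as a deferred write that `sync` forces out.

```python
class InMemoryStore:
    """Hypothetical stand-in for the logical layer (virtual nodes)."""

    def __init__(self):
        self.data = {}      # key -> value
        self.block_of = {}  # key -> block id; a block is a group of related keys

    def pull_block(self, key):
        """Return the requested key together with the other keys in its block."""
        block = self.block_of.get(key)
        if block is None:
            return {key: self.data[key]} if key in self.data else {}
        return {k: v for k, v in self.data.items()
                if self.block_of.get(k) == block}

    def push(self, items):
        self.data.update(items)


class Session:
    """Client-side session with an in-memory cache (a sketch)."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()  # keys written locally but not yet pushed

    def get(self, key):
        if key not in self.cache:
            # Pull the whole block, so related keys are pre-fetched.
            for k, v in self.store.pull_block(key).items():
                self.cache.setdefault(k, v)  # never clobber local writes
        return self.cache.get(key)

    def put(self, key, value):
        # Writes land in the cache first; the push is deferred.
        self.cache[key] = value
        self.dirty.add(key)

    def sync(self):
        # Synchronous push of all pending writes.
        self.store.push({k: self.cache[k] for k in self.dirty})
        self.dirty.clear()


store = InMemoryStore()
store.data = {"alice": "profile A", "bob": "profile B"}
store.block_of = {"alice": 0, "bob": 0}

session = Session(store)
session.get("alice")                  # also pre-fetches bob's profile
session.put("alice", "profile A v2")  # visible locally, not yet in the store
stale = store.data["alice"]
session.sync()                        # now the store sees the update
```

Calling `sync` after every `put` would give the stronger consistency the slide mentions, at the cost of one round trip per write.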

Adding a key-key-value
put(alice, bob, follows)
Two users: Alice and Bob
Use the address table to determine the virtual machine (node) that hosts Alice's data
  Write the data to that node
Use the address table to determine the node that hosts Bob's data
  Write the same data to that node
The on-line partitioning algorithm moves Alice's data to Bob's node because they are connected
(Figure: the address table lists alice at node 1,1 and bob at node 8,8. Among the virtual hosts, one node holds kv(alice, ...) and kkv(alice, bob, follows); the other holds kv(bob, ...) and a copy of kkv(alice, bob, follows).)
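The write path on this slide can be sketched as follows. This is a rough illustration, not the authors' implementation: the `Node` class and the functions `put_kkv` and `move_key` are hypothetical names, and `move_key` only carries out a relocation that the on-line partitioning algorithm would decide on; the decision logic itself is not shown in the slides and is not modeled here.

```python
class Node:
    """Hypothetical virtual node, identified by its grid address."""

    def __init__(self, address):
        self.address = address
        self.kv = {}   # key -> value
        self.kkv = {}  # (key1, key2) -> value


def put_kkv(address_table, nodes, key1, key2, value):
    """Write kkv(key1, key2, value) to the node hosting each of the two keys."""
    for key in (key1, key2):
        node = nodes[address_table[key]]  # address table lookup
        node.kkv[(key1, key2)] = value


def move_key(address_table, nodes, key, target_address):
    """Relocate one key's data to another node and update the address table."""
    source_address = address_table[key]
    if source_address == target_address:
        return
    source, target = nodes[source_address], nodes[target_address]
    address_table[key] = target_address
    if key in source.kv:
        target.kv[key] = source.kv.pop(key)
    for pair in [p for p in source.kkv if key in p]:
        target.kkv[pair] = source.kkv[pair]
        other = pair[1] if pair[0] == key else pair[0]
        # Keep the source copy only if the other key still lives there.
        if address_table.get(other) != source_address:
            del source.kkv[pair]


# Addresses taken from the slide's figure: alice at 1,1 and bob at 8,8.
address_table = {"alice": (1, 1), "bob": (8, 8)}
nodes = {(1, 1): Node((1, 1)), (8, 8): Node((8, 8))}
nodes[(1, 1)].kv["alice"] = "alice's profile"
nodes[(8, 8)].kv["bob"] = "bob's profile"

put_kkv(address_table, nodes, "alice", "bob", "follows")
copies_after_put = [("alice", "bob") in n.kkv for n in nodes.values()]

# Assumed outcome of the partitioning step: co-locate alice with bob.
move_key(address_table, nodes, "alice", address_table["bob"])
```

After the move, both profiles and the connection between them sit on one node, so a request for Alice's followed users needs a single machine.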

Splitting a Node
If one node becomes overloaded, it can initiate a split
To maintain the grid structure, nodes in the same row and column must also split
Once the split is complete, new physical machines can be turned on
(Figure label: Virtual hosts)
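The row-and-column rule can be illustrated with a short helper that lists which virtual nodes take part in a split. This is only a sketch: the function name `nodes_that_split`, the zero-based (row, column) addressing, and the rectangular grid are assumptions, and how each node divides its data is not described in the slides, so it is left out.

```python
def nodes_that_split(rows, cols, overloaded):
    """Return the grid addresses that must split when one node is overloaded.

    Assumes a rows x cols grid of virtual nodes addressed as (row, col),
    zero-based. The overloaded node's whole row and whole column split
    with it so that the grid structure is preserved.
    """
    r, c = overloaded
    if not (0 <= r < rows and 0 <= c < cols):
        raise ValueError("overloaded node is outside the grid")
    same_row = {(r, j) for j in range(cols)}
    same_col = {(i, c) for i in range(rows)}
    return same_row | same_col


affected = nodes_that_split(3, 3, (1, 1))
print(sorted(affected))
```

Under these assumptions a split in a grid with R rows and C columns involves R + C - 1 nodes, which is why a single hot node can trigger work across the grid.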

