Neo4j at the heart of the social graph of 45 million members, or how Viadeo moved from an in-house technology that had become limiting to a graph database full of future prospects for modeling its social graph...
http://fr.viadeo.com/fr/profile/nicolas.tricot
Neo4j at the heart of the social graph of 45 million members, by Nicolas Tricot
1. Your network is more powerful than you think
Neo4J at the heart of the social graph
of 45 million members
Viadeo Tech Days
November 20, 21 and 22, 2012
2. ABOUT THE VIADEO GROUP
• 1 million new members / month
• 10 million connections / month
• 100 million profiles viewed / month
Your network is more powerful than you think 2 / 36
14. PREHISTORY 2006-2011
• In-house algorithm
• Network storage in MySQL Database
CREATE TABLE `Network` (
`memberId` int(11) NOT NULL DEFAULT '0',
`L1` mediumblob NOT NULL,
`L2` mediumblob NOT NULL,
PRIMARY KEY (`memberId`)
) ENGINE=InnoDB;
15. PREHISTORY 2006-2011
Updating the network (old-fashioned style)
Member A and Member B are now contacts
Update A.L1 and B.L1, then A.L2 and B.L2
Retrieve A.L1 and B.L1, then update L2 for every member listed in them
Example:
• A has 500 contacts
• B has 150 contacts
500 + 150 + 2 = 652 updates!
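The arithmetic above can be reproduced with a minimal Python sketch. It assumes (the slide does not say so explicitly) that L1 holds a member's direct contacts and L2 the contacts of those contacts, and that A and B have no contacts in common. The function name `rows_touched` and the member ids are hypothetical.

```python
def rows_touched(a, b, l1):
    """Return the ids of members whose Network row is rewritten when a and b connect.

    l1 maps a member id to the set of that member's direct contacts
    (assumed meaning of the L1 column).
    """
    # A and B themselves: both their L1 and L2 values change.
    touched = {a, b}
    # Every existing contact of A gets B in its second level, and vice versa.
    touched |= l1[a] | l1[b]
    return touched


# Hypothetical data: A (id 0) has 500 contacts, B (id 1) has 150, no overlap.
l1 = {0: set(range(2, 502)), 1: set(range(502, 652))}
print(len(rows_touched(0, 1, l1)))  # 652
```

Under these assumptions a single new connection rewrites one row per existing contact of either member, which is why the cost grows with the size of both contact lists.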
16. PREHISTORY 2006-2011
Good performance on:
• Computation of paths
• Computation of distances
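The slides do not describe the in-house algorithm. As an illustration only, the following Python sketch shows one plausible reason why precomputed levels make distance lookups cheap: if L1 is assumed to hold members at exactly one hop and L2 members at exactly two hops, distances up to 4 need only set lookups and intersections. The function `distance` and the sample data are hypothetical.

```python
def distance(a, b, l1, l2):
    """Return the degree of separation between a and b (0 to 4), or None if farther."""
    if a == b:
        return 0
    if b in l1[a]:
        return 1
    if b in l2[a]:
        return 2
    # Some direct contact of b sits two hops from a: 2 + 1 = 3.
    if l1[b] & l2[a]:
        return 3
    # The two second-level sets share a member: 2 + 2 = 4.
    if l2[a] & l2[b]:
        return 4
    return None


# Hypothetical chain of members 0 - 1 - 2 - 3 - 4.
l1 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
l2 = {0: {2}, 1: {3}, 2: {0, 4}, 3: {1}, 4: {2}}
print(distance(0, 3, l1, l2))  # 3
print(distance(0, 4, l1, l2))  # 4
```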
18. PREHISTORY 2006-2011
LIMITATIONS
1) High latency for a complete update
2) Massive bandwidth impact on the internal network
3) 48 hours to restart from scratch
25. WHY Neo4J
Findings after POC on 3 other tools:
• Old technology with add-on for graph management
• No user communities
• Bad performance
• “Black Box” code
Why?
• OpenSource project
• Good documentation
• User community
• Excellent performance
• ACID
• Very simple
• (How better to model a social graph than with a graph database?!)
26. WHY Neo4J
1 node = 1 member
1 relationship = 1 direct contact
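The node/relationship model can be illustrated with a small in-memory stand-in written in Python. This is not the Neo4j API; the class `SocialGraph` and its methods are hypothetical and only show the shape of the model: a new contact is a single relationship, and distances are found by traversal rather than stored in advance.

```python
from collections import deque


class SocialGraph:
    """Toy graph: one node per member, one undirected relationship per contact."""

    def __init__(self):
        self.contacts = {}  # member id -> set of direct contacts

    def connect(self, a, b):
        # Adding a contact creates one relationship; nothing derived needs refreshing.
        self.contacts.setdefault(a, set()).add(b)
        self.contacts.setdefault(b, set()).add(a)

    def distance(self, start, goal):
        """Breadth-first search; returns the number of hops, or None if unreachable."""
        if start not in self.contacts or goal not in self.contacts:
            return None
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            member, hops = queue.popleft()
            if member == goal:
                return hops
            for neighbour in self.contacts[member]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, hops + 1))
        return None


graph = SocialGraph()
graph.connect("A", "B")
graph.connect("B", "C")
print(graph.distance("A", "C"))  # 2
```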
27. WHY Neo4J
BENEFITS
• Very easy to integrate (less than 2 months)
• Instantaneous graph updates
• High availability
• Backup / restore
29. LIMITATION
Doesn’t handle SHARDING!
(Splitting one graph across several servers)
“Size doesn’t matter…”, but…
[Diagram: one graph split between Server 1 and Server 2]
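To give a feel for why splitting a social graph is awkward, here is a minimal Python sketch that counts the relationships that would span two servers under a naive placement rule. The edge list, the `by_parity` rule, and the function names are all hypothetical; the slide does not say how a split would be done.

```python
def cross_server_edges(edges, server_of):
    """Return the relationships whose two members live on different servers."""
    return [(a, b) for a, b in edges if server_of(a) != server_of(b)]


def by_parity(member_id):
    # Hypothetical placement rule: even ids on server 0, odd ids on server 1.
    return member_id % 2


# Hypothetical contact relationships between members 1 to 4.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
cut = cross_server_edges(edges, by_parity)
print(len(cut), "of", len(edges), "relationships cross servers")
```

In this toy case most relationships cross the boundary, so a traversal that follows them would have to hop between servers.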
31. EXPLORATION MODE
What about the future?
Store various kinds of objects
Change the development paradigm
35. CONCLUSION
Neo4J:
• Has replaced a 5-year-old in-house technology in only 2 months
• Supports the core system of the Viadeo Professional Social Network
• Has been in production for 1½ years
• Deals smoothly with Viadeo’s usage growth
Think about how Neo4J will improve your own business!