Using MongoDB for IGN’s Social Platform
SF Bay Area MongoDB User Group
Tuesday Feb 15th, 2011
About Me
  • Manish Pandit
  • @lobster1234
  • http://about.me/mpandit
About IGN’s Social Platform
  • An API to connect the gamer community with editors, games, and other gamers, and to help lay the foundation for premium content discovery as well as UGC
  • In beta since Sept 2010
  • 5M+ activities
  • 20K UVs a day, ~100K PVs a day
Architecture
  • REST-based API, built in Java
  • Entities are People, MediaItems, Activities, Comments, Notifications, Status
  • Interfaces across IGN.com as well as other social networks
  • Caching tier based on memcached
  • MySQL and MongoDB as persistence
  • PHP/Zend front end
MongoDB Usage
  • Activity Streams: ActivityStrea.ms standard
  • Activity Caching (more on this later!)
  • Activity Commenting
  • Points: also extend to badges
  • Blocklists, ban lists
  • Notifications: system notifications
  • Analytics: activity snapshot for a user
Alternatives
  • MySQL
      • Obvious alternative, being used for storing person data, game data, relationships
      • Did not work for activities
          • Massive joins to filter newsfeeds, i.e. activities from friends
          • Fairly normalized schema for activities
          • Too many schema changes as requirements changed and new types of activities came into the picture. ALTER TABLE started to take hours.
          • Optimization led to a large number of indexes, slowing down the writes
Alternatives
  • Voldemort
      • Used for the initial release, Sept 2010
      • Fast and simple implementation of Amazon Dynamo
      • Did not work out for long
          • We needed the ability to query the data
          • Needed more than key-value pairs
          • No in-place updates out of the box; had to write custom code to handle concurrent update conflicts (read-repair)
          • Not a lot of developer velocity when compared to MongoDB
Other alternatives
  • Cassandra
      • Learning curve, lack of querying
      • Did not want to bite off more than we could chew
  • CouchDB
      • Map-reduce queries, views
      • REST-based API is good, but performance is affected by a chatty HTTP interface for a database
Configuration
  • Server
      • 1 master, 2 slaves (load balanced through NetScaler)
      • 2 extra slaves which are not queried (replicate!!)
      • Version 1.6.1
  • Client
      • Java Driver (2.1)
      • Ruby Driver (1.2)
      • Mappers: Morphia for Java
      • Connections per host: 200, #hosts = 4
  • Oplog size: 1 GB, about 2.5 hours
  • Syncdelay: 60s (default)
  • Hardware: 2 core, 6 GB virtualized machine
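As a rough back-of-the-envelope reading of the numbers on this slide (a sketch under stated assumptions, not a measurement from the talk): a 1 GB oplog that covers about 2.5 hours implies an average oplog write rate of roughly 410 MB per hour, and 200 connections per host across 4 hosts gives an upper bound of 800 connections if every pool is fully used. The function names are hypothetical.

```python
# Back-of-the-envelope numbers derived from the configuration slide.
# Function and variable names are illustrative, not from the talk.

def oplog_rate_mb_per_hour(oplog_size_mb: float, window_hours: float) -> float:
    """Average oplog write rate implied by an oplog size and the window it covers."""
    return oplog_size_mb / window_hours


def max_client_connections(per_host: int, hosts: int) -> int:
    """Upper bound on driver connections if every host's pool is fully used."""
    return per_host * hosts


rate = oplog_rate_mb_per_hour(1024, 2.5)  # 1 GB oplog, about 2.5 hours
total = max_client_connections(200, 4)    # 200 per host, 4 hosts

print(f"~{rate:.0f} MB of oplog per hour")
print(f"up to {total} connections")
```

The oplog window is the figure to watch here: a slave that falls further behind than the window covers can no longer catch up from the oplog alone.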
Maintenance
  • Data defragmentation
      • Slaves: by running it on a different port
      • Master: by having a downtime
  • Collection trimming
      • The scripts block during remove
      • Bulk removes kill the slaves, spiking CPU to 100%
Monitoring
  • Nagios
      • TCP port monitoring
      • Disk space monitoring
      • CPU monitoring
  • Munin
      • Mongo connections
      • Memory usage
      • Ops/second
      • Write lock %
      • Collection sizes (in terms of # of documents)
Backup, or prepping for O Shit!
  • NetApp filer-based snapshots
      • Make sure to do {fsync:1} and {lock:1} on one slave
  • Hourly dumps via cronjob
      • Using mongodump
  • Incremental backup via the oplog
      • Replay the oplog instead of relying on a snapshot
  • Delayed slaves
      • Not recommended, as it almost guarantees data loss proportional to the delay, which is inversely proportional to the time-to-react
Tools to be familiar with
  • mongostat: look at queue lengths, memory, connections and operation mix
  • db.serverStatus(): server status with sync, page faults, locks, index misses
  • atop
  • iostat
  • db.stats(): overall info at the database level
  • db.<coll_name>.stats(): overall info at the collection level
  • db.printReplicationInfo(): info about the oplog size and time
  • db.printSlaveReplicationInfo(): info about the master, the last sync timestamp, and how far behind the master the slave is
Challenges with ActivityStreams
  • Lots of data!
      • Large amount of data coming out as a result
  • Reverse sorting
      • The data has to be sorted in reverse natural order ($natural : -1), and we do not use capped collections
  • Aggregation of similar activities
      • Impacts pagination
  • Fetching self activities (profile), and newsfeed (self + others)
  • Filtering based on the activity type
      • People want to see Game Updates or Blog updates from their friends
  • Hydration of activities for dynamic data
      • The thumbnail and level of the actor may change
  • Comments
      • When an activity is rendered, the initial comments and count have to be pulled ($slice)
      • TODO: Rant about missing $size operator
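To make the reverse sort, type filter and $slice points concrete, here is a minimal sketch of how the pieces of one newsfeed read might be assembled. It builds plain dictionaries so it runs without a MongoDB driver. The field names (actorId, verb, comments), the page size and the function name are assumptions for illustration; $in, $slice and $natural are real MongoDB operators, but the talk does not show the actual queries.

```python
# Sketch of the query pieces for a newsfeed read, written as plain dicts.
# Field names (actorId, verb, comments) are hypothetical, not from the talk.

def newsfeed_query(friend_ids, verbs=None, initial_comments=3, page_size=20):
    """Build filter, projection, sort and limit for one page of a newsfeed."""
    query_filter = {"actorId": {"$in": list(friend_ids)}}
    if verbs:
        # Filtering based on the activity type
        query_filter["verb"] = {"$in": list(verbs)}
    # Only pull the first few embedded comments when rendering an activity
    projection = {"comments": {"$slice": initial_comments}}
    # Reverse natural order, as described on the slide
    sort = [("$natural", -1)]
    return {
        "filter": query_filter,
        "projection": projection,
        "sort": sort,
        "limit": page_size,
    }


q = newsfeed_query(["a", "b"], verbs=["FOLLOW"])
print(q)
```

Keeping the limit in the same structure as the filter makes it harder to issue an unbounded find by accident.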
ActivityStreams
  • Each activity has an ACTOR
  • Each actor has a TYPE
  • Each actor performs an action; that action is called a VERB
  • Each VERB can act upon many objects, called ACTIVITYOBJECTS
  • Some VERBs may involve a target, called ACTIVITYTARGET
  • Every entity (Actor, ActivityObject, ActivityTarget) has links to define it
  • Examples:
      • A writes ‘Hello!’ on B’s wall
          Actor => A, ActivityObject => ‘Hello!’ of type WALL_POST, ActivityTarget => B, VERB => POST
      • A follows a game B
          Actor => A, ActivityObject => B of type MEDIA_ITEM, ActivityTarget => null, VERB => FOLLOW
  • ...and it gets complicated as we go down the rabbit hole!
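The two examples on this slide can be written out as documents. The following is a minimal sketch, assuming a simple shape with an actor, a verb, a list of objects and an optional target; the class names, the PERSON type and the link values are hypothetical and are not taken from IGN's schema.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Entity:
    """An actor, activity object or activity target (hypothetical shape)."""
    id: str
    type: str
    links: List[str]


@dataclass
class Activity:
    actor: Entity
    verb: str
    objects: List[Entity]
    target: Optional[Entity] = None  # not every verb involves a target

    def to_document(self) -> dict:
        """Convert to a nested dict, the shape a document store would persist."""
        return asdict(self)


a = Entity("A", "PERSON", ["/people/A"])
b = Entity("B", "PERSON", ["/people/B"])

# A writes 'Hello!' on B's wall
wall_post = Activity(actor=a, verb="POST",
                     objects=[Entity("Hello!", "WALL_POST", [])], target=b)

# A follows a game B
game = Entity("B", "MEDIA_ITEM", ["/media/B"])
follow = Activity(actor=a, verb="FOLLOW", objects=[game])

print(wall_post.to_document())
print(follow.to_document())
```

Because the target is optional and the objects are a list, new activity types can be added without a schema migration, which is the flexibility the MySQL slide says was missing.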
Caching using MongoDB
  • Caching the entire streams
      • A bad idea (or bad implementation?)
      • The expired objects sat in the db, bloating the database
      • The removal did not free up space, so we ran out
  • Use Mongo as a cache-key-index
      • Cache the streams in Memcached
      • For invalidation, keep the index of the memcached keys in MongoDB
      • Works!
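The cache-key-index idea can be sketched with in-memory stand-ins: one dictionary plays memcached and another plays the MongoDB collection that records which cache keys depend on which user. How the real index was keyed is not stated in the talk, so the per-user keying, the class name and the method names below are assumptions.

```python
from collections import defaultdict


class StreamCache:
    """In-memory stand-ins: `cache` plays memcached, `key_index` plays MongoDB."""

    def __init__(self):
        self.cache = {}
        # user id -> cache keys whose streams contain that user's activities
        self.key_index = defaultdict(set)

    def put_stream(self, cache_key, stream, contributing_user_ids):
        """Cache a rendered stream and index its key under every contributing user."""
        self.cache[cache_key] = stream
        for user_id in contributing_user_ids:
            self.key_index[user_id].add(cache_key)

    def get_stream(self, cache_key):
        return self.cache.get(cache_key)

    def invalidate_for(self, user_id):
        """Drop every cached stream that includes this user's activities."""
        keys = self.key_index.pop(user_id, set())
        for key in keys:
            # Keys still listed under other users may already be gone; that is harmless.
            self.cache.pop(key, None)
        return keys


c = StreamCache()
c.put_stream("newsfeed:alice", ["bob followed a game"], ["alice", "bob"])
c.put_stream("profile:bob", ["bob followed a game"], ["bob"])
dropped = c.invalidate_for("bob")  # bob posts something new
print(sorted(dropped))
```

The index stays small because it stores only keys, not the streams themselves, which avoids the bloat described in the first approach.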
What we’ve learned
  • Keep an eye on
      • Page faults
      • Index misses
      • Queue lengths
      • Database sizes on disk due to reuse vs. release
  • Use .explain()
      • Watch for nscanned and indexBounds
  • Use limit() when using find
  • While updating, try to load that object in memory so that it’s in the working set (findAndModify)
  • Try to keep the fields being selected at a minimum
  • Replicate and denormalize instead of using write concerns
Near term Plans
  • Move to replica sets
  • Move relationship graphs to MongoDB
      • Shard the relationships based on the userId
  • Run multiple mongo processes, splitting out collections among multiple databases
Wishlist
  • Respect indexes in $or queries
  • A $size operator for arrays
  • $inc when doing $addToSet
  • Defragmentation when removing data
  • Concurrency: too many write lock conditions
  • A decent start/stop script
  • Load balancing in the driver (round robin) for reads
We are hiring
  • Software Engineers to help us with exciting initiatives at IGN
  • Technologies we use
      • RoR, Java (no J2EE!), Spring, PHP/Zend
      • JQuery, HTML5, CSS3, Sencha Touch, PhoneGap
      • MongoDB, memcached, Solr
  • http://corp.ign.com
Questions
References
  • IGN’s Social Platform
      • http://my.ign.com
      • http://people.ign.com/ign-labs
  • Mongo Munin plugins
      • https://github.com/erh/mongo-munin
      • https://github.com/lobster1234/munin-mongo-collections
  • Morphia
      • http://code.google.com/p/morphia/
