WordSquared is a massively multiplayer online scrabble crossword that uses MongoDB to store location data. Players place tiles in realtime, and compete to build the longest chain of words and earn points. WordSquared leverages MongoDB's built-in geospatial indexing, storing an x and y coordinate with each tile and using bounding box queries to display a view of the board.
This presentation covers the architecture of the game, with a specific focus on the use of MongoDB, storing and querying location data, and learning how to structure and even shard geo data through the unlikely use case of an infinitely large board game.
7. A bit of history
Originally written for Node Knockout 2010
8. MongoDB - An Introduction
(briefly)
Fast, schemaless, document-oriented database
Download at http://mongodb.org/downloads
Speaks BSON (Binary JSON) - drivers in many languages
9. MongoDB - An Introduction
Documents can be nested and contain arrays
14. Structuring your data
tile = {
  _id : BSON::ObjectId(...),
  position : [0,0],
  letter : "A",
  wildcard : false
}
15. Structuring your data
tile = {
  _id : BSON::ObjectId(...),
  position : {x: 0, y: 0},
  letter : "A",
  wildcard : false
}
16. Watch your language
> db['tiles'].insert({
  "position" => {"y" => 50, "x" => 20},
  "letter" => "A",
  "wildcard" => false
})
=> BSON::ObjectId('4dd06d037a70183256000004')
> db['tiles'].find_one()
=> {"_id"=>BSON::ObjectId('4dd06d037a70183256000004'),
"letter"=>"A", "position"=>{"x"=>20, "y"=>50},
"wildcard"=>false}
17. Be safe!
Use array notation; guaranteed ordering = WIN
C++: BSONObjBuilder
Ruby: Use 1.9.x or OrderedHash in 1.8.x
Python: Use OrderedDict (introduced in 2.7) or SON (in the BSON package)
Javascript: Did I mention arrays?
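One way to follow the array advice when your own code hands points around as hashes is to normalise them just before they reach the driver. The sketch below is only an illustration: `position_array` is a hypothetical helper, not part of any MongoDB driver, and it assumes points use `x`/`y` keys (symbols or strings).

```ruby
# Hypothetical helper (not a driver API): convert a point given as a Hash
# into the [x, y] array form, so the stored order never depends on the
# order in which the Hash happens to enumerate its keys.
def position_array(point)
  x = point.key?(:x) ? point[:x] : point["x"]
  y = point.key?(:y) ? point[:y] : point["y"]
  raise ArgumentError, "point needs both x and y" if x.nil? || y.nil?
  [x, y]
end

# The keys are given y-first here, but the array is always x-first.
tile = { "letter" => "A", "position" => position_array(:y => 50, :x => 20) }
```

The resulting `tile` hash can then be inserted as usual; only the `position` value has changed shape.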
19. Creating the index
> db['tiles'].create_index(
  [["position", Mongo::GEO2D]],
  :min => -500, :max => 500, :bits => 32
)
=> "position_2d"
20. More index fun
Only one Geo2D index per collection (SERVER-2331)
But it can be a compound index:
> db['tiles'].create_index([
  ["position", Mongo::GEO2D],
  ["letter", Mongo::ASCENDING]
])
=> "position_2d_letter_1"
Queries are prefix-matched on indexes, so put Geo2D first (or use hinting)
21. New 2.0 feature
Geo2d indices across an array field!
> db['words'].insert({
  "word" => "QI",
  "tiles" => [
    {"letter" => "Q", "position" => [1,1]},
    {"letter" => "I", "position" => [2,1]}
  ]
})
=> BSON::ObjectId('4dd074927a70183256000006')
> db['words'].create_index([[
  "tiles.position",
  Mongo::GEO2D
]])
=> "tiles.position_2d"
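A word document like the one above can be generated from the word and its starting square. This is a minimal sketch, not the game's actual code: `word_document` is a hypothetical helper, and the convention that `:across` increases x while `:down` increases y is an assumption.

```ruby
# Hypothetical helper: build a document with one embedded tile per letter,
# each carrying its own [x, y] position for the array-field geo index.
# Assumption: :across steps along x, :down steps along y.
def word_document(word, start, direction = :across)
  dx, dy = direction == :across ? [1, 0] : [0, 1]
  tiles = word.each_char.each_with_index.map do |letter, i|
    { "letter" => letter,
      "position" => [start[0] + dx * i, start[1] + dy * i] }
  end
  { "word" => word, "tiles" => tiles }
end

qi = word_document("QI", [1, 1])
# qi["tiles"] holds Q at [1, 1] and I at [2, 1], as in the slide.
```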
25. Problems we don’t have
Projection issues
Great Circle distance calculation
Polar coordinate systems
Pirates
http://www.flickr.com/photos/jmd41280/4501410061/
26. Querying real location data
Search by proximity: $near
Uses native units (degrees for [-180, 180])
Use $maxDistance to bound query
> db['tiles'].find(:position => {"$near" => [10,10]}).to_a
=> [{"_id"=>BSON::ObjectId('4dd084ca7a70183256000007'),
"letter"=>"A", "position"=>[12,9]}]
> db['tiles'].find(:position => {"$near" => [10,10],
"$maxDistance" => 1}).to_a
=> []
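To see why the second query comes back empty, it helps to work out the distance by hand. The sketch below assumes that on a flat 2D index the distance is the ordinary straight-line distance in index units; both helper names are made up for illustration.

```ruby
# Straight-line (Euclidean) distance between two [x, y] points,
# measured in the same native units as the index.
def planar_distance(a, b)
  Math.hypot(a[0] - b[0], a[1] - b[1])
end

# Would a point survive a $maxDistance bound around the given center?
def within_max_distance?(center, point, max_distance)
  planar_distance(center, point) <= max_distance
end

distance = planar_distance([10, 10], [12, 9])  # sqrt(2**2 + 1**2), about 2.24
```

The tile at [12,9] is about 2.24 units from [10,10], so a `$maxDistance` of 1 excludes it, while a bound of 3 would keep it.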
27. Querying real location data
Need distance to center as well? Use $geoNear
Also includes fun stats
> db.command('geoNear' => 'tiles', 'near' => [1830, 2002],
:maxDistance => 10)
=> {"ns"=>"test.tiles",
"near"=>"1100000000000011000110001100101010100010000010111111",
"results"=>[{"dis"=>3.999471664428711,
"obj"=>{"_id"=>BSON::ObjectId('4dd0b0957a701852bc02bf67'),
"position"=>{"x"=>1830, "y"=>2006}, "letter"=>"A"}}],
"stats"=>{"time"=>0, "btreelocs"=>3, "nscanned"=>2,
"objectsLoaded"=>1, "avgDistance"=>3.999471664428711,
"maxDistance"=>3.999471664428711}, "ok"=>1.0}
28. Querying real location data
Region queries: $within
Example: $box (rectangle)
> db['tiles'].find(:position => {"$within" => {"$box" =>
[[10,10], [30,30]]}}).to_a
=> [{"_id"=>BSON::ObjectId('4dd084ca7a70183256000007'),
"letter"=>"A", "position"=>[12,9]}]
(Diagram: rectangle with lower-left corner [10,10] and upper-right corner [30,30])
30. Querying real location data
New in 2.0: $polygon!
> db['tiles'].find(:position => {"$within" => {"$polygon"
=> [[5,5], [5,15], [15,5]]}}).to_a
=> [{"_id"=>BSON::ObjectId('4dd084ca7a70183256000007'),
"letter"=>"A", "position"=>[12,9]}]
(Diagram: triangle with vertices [5,5], [5,15] and [15,5])
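If you want to sanity-check polygon results on the client, or filter candidates without a round trip, a point-in-polygon test is easy to write. The sketch below uses the standard ray-casting technique; it is not a description of how MongoDB evaluates `$polygon`, and the method name is made up for illustration.

```ruby
# Ray casting: count how many polygon edges a horizontal ray starting at
# the point crosses. An odd number of crossings means the point is inside.
# polygon is an array of [x, y] vertices; point is [x, y].
def point_in_polygon?(polygon, point)
  x, y = point
  inside = false
  polygon.each_with_index do |(x1, y1), i|
    x2, y2 = polygon[(i + 1) % polygon.size]
    # Only edges that straddle the ray's y value can be crossed.
    next unless (y1 > y) != (y2 > y)
    x_cross = x1 + (y - y1).to_f * (x2 - x1) / (y2 - y1)
    inside = !inside if x < x_cross
  end
  inside
end

triangle = [[5, 5], [5, 15], [15, 5]]
point_in_polygon?(triangle, [6, 6])    # inside the triangle
point_in_polygon?(triangle, [16, 16])  # outside the triangle
```

Points that sit exactly on an edge are a judgment call with this technique, so treat boundary cases with care.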
31. Querying real location data
Spherical equivalents: $nearSphere and $centerSphere
Uses radians, not native units
position must be in [long, lat] order!
> earthRadius = 6378 #km
=> 6378
> db['restaurants'].find(:position => {"$nearSphere" =>
[-122.03,36.97], "$maxDistance" => 25.0/earthRadius}).to_a
=> [{"_id"=>BSON::ObjectId('4dd084ca7a70183256000007'),
"name"=>"Crow's Nest", "position"=>[-122.0,36.96]}]
32. MapReduce
MapReduce queries can use Geo2D indices when querying data
Great for MMO analytics:
‘What events did user x trigger within this region’
‘Which users visited this region in the last 24 hours’
40. Gotchas
Query engine assumes a regular grid (possibly mapped onto a sphere using a standard sinusoidal projection)
If you're using non-square region units, expect to perform secondary processing on the results
42. Again: we’re weird.
Big index, but no need for it all to be in memory
Large numbers of tiny documents
Large regions of the world where activity → 0 as density → 1
Single box scaling limit determined by # of active sections of the world at a time
43. Our setup
Master/Slave (Nowadays: use a Replica Set)
Slaves used for:
- Backup
- Map image generation
Next stop (at some point): geoSharding
44. Sharding
Yes, you can shard on a geo-indexed field
Not recommended due to query performance (SERVER-1982). Vote it up if you care (and you should).
Can't use $near in queries, only $geoNear and therefore runCommand(). (SERVER-1981)
Don't have a lot of time - trying to pack in as much useful information as possible. Still, in case you're wondering where the title came from: a book from the 19th century in which a Sphere comes to 2D Flatland in an effort to convince them of the existence of the third dimension (and then scoffs when the Square posits 4th, 5th etc. dimensions).
I run a game company named Massively Fun. We're based in Seattle and are relatively new to the scene. Our first game is WordSquared. Anyone heard of it? Anyone? Bueller?
Massively multiplayer crossword game, played on an infinitely large board.
47.18MM words played so far.
Play surface: 108MM+ tiles, covering an area of 63510 x 130629.
Assuming a 'standard' 18.5mm x 21mm x 4.5mm tile, the play surface covers an area of:
 * 3.22 MM square meters
 * 796.44 acres
Stacked vertically, the tiles would be 489.5 km tall (roughly 100 km higher than the orbit of the ISS).
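The area figures can be reproduced from the board dimensions and tile footprint quoted in the notes. This is just a worked check of the arithmetic; the method names are made up, and the acre conversion factor (4046.8564224 square meters per acre) is the standard international value rather than something from the talk.

```ruby
TILE_WIDTH_MM  = 18.5
TILE_HEIGHT_MM = 21.0
BOARD_COLUMNS  = 63_510
BOARD_ROWS     = 130_629
SQUARE_METERS_PER_ACRE = 4046.8564224

# Area of the bounding rectangle of the play surface, in square meters:
# number of grid cells times the footprint of one tile.
def board_area_m2
  cells = BOARD_COLUMNS * BOARD_ROWS
  cells * (TILE_WIDTH_MM / 1000.0) * (TILE_HEIGHT_MM / 1000.0)
end

def board_area_acres
  board_area_m2 / SQUARE_METERS_PER_ACRE
end

puts format("%.2f million square meters", board_area_m2 / 1e6)
puts format("%.2f acres", board_area_acres)
```

Note that the area is that of the whole 63510 x 130629 rectangle, not just the squares that hold a tile.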
One grid square = 15x15 (a size that should be familiar from other, smaller word games). Each white pixel = 1 tile.
Wrote the original version in 48 hours as part of Node Knockout. The app is pure HTML/JavaScript/CSS3. The backend was Node talking to MongoDB.
MongoHQ was offering free hosted MongoDB instances for NK teams. No fixed schema = less time wrangling with the schema while rapid prototyping. We track every tile in the game world using a single Geo2D indexed collection.
We'll be looking at 1.8.1 for today. Some things won't work in 1.6.x. Some things get better in 1.9.x (use at your own risk!). Examples are in Ruby for brevity and clarity.
MongoDB stores Documents - arbitrarily structured JSON. Here's our basic tile document. Not too different from what we store. We're going to generate a 2d index across the position field. Our position data is in an array. Doesn't matter whether x or y comes first, as long as it's consistent.
You can also store your position data as a sub-object. There's a gotcha here...
Let's use Ruby 1.8.7 as an example. If you specify the position as a Hash, there's no guarantee the ordering of keys will be preserved. This is bad news for geo2d and will result in strange query results.
Not much you can do in JavaScript. Of course, your ORM may take care of this for you. Test to be sure. We use the array syntax.
Creates a basic lat/long geo2d index over the position field. Range by default is -180 to 180, with 26 bits' precision. If you're indexing a larger space (like us!), you'll want to increase all 3. Min/Max can be really, really big. :)
We can modify the defaults by passing additional options to the call to create_index. Note that in versions less than 1.9, you cannot insert records at the limit of the index (±500 here).
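A rough way to pick `:min`, `:max` and `:bits` is to work out how fine a grid they give you. The sketch below assumes that `bits` is the precision per axis, so each axis is divided into 2**bits cells; that reading is an assumption, and both helper names are made up for illustration.

```ruby
# Approximate size of one index cell along a single axis, in native units,
# assuming the range [min, max] is split into 2**bits equal cells.
def index_cell_size(min, max, bits)
  (max - min).to_f / 2**bits
end

# Pre-1.9 servers reject points that sit exactly on the index limits,
# so this check treats the bounds as exclusive.
def insertable?(position, min, max)
  position.all? { |coord| coord > min && coord < max }
end

index_cell_size(-180, 180, 26)  # default settings: a few millionths of a degree
index_cell_size(-500, 500, 32)  # the settings used on the slide
insertable?([500, 0], -500, 500)  # false: sits on the limit
insertable?([499, 0], -500, 500)  # true
```

For an integer tile grid, any cell size well below 1 unit is already enough to tell neighbouring tiles apart.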
Useful if you're frequently querying on parameters other than location. Geo doesn't have to be first, but probably should be unless you never query purely on location (remember, only one Geo2D index per collection!). Alternately you can use MongoDB's hinting mechanism to help the query planner.
Great for storing things that appear at multiple locations. For example:
 * Everywhere on the board a word has been played
The world isn't flat. (Inorite?) Our (and likely, your) world is. Any guess which world is easier to deal with?
Big surprise.
$near is a simple proximity search. $maxDistance can be a float and < 1.
Remember, Ruby 1.9.x or use OrderedHash! Things won't work otherwise!
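The ordering matters because the server reads the first key of a command document as the command name. A minimal sketch of building the geoNear command with the keys in a fixed order, relying on the fact that a plain Hash keeps insertion order from Ruby 1.9 onwards; `geo_near_command` is a hypothetical helper, not a driver method.

```ruby
# Hypothetical helper: assemble the geoNear command so that "geoNear"
# is always the first key. On Ruby 1.9+ a Hash enumerates its keys in
# insertion order; on 1.8.x an ordered hash class is needed instead.
def geo_near_command(collection, point, max_distance)
  command = {}
  command["geoNear"]     = collection    # command name must come first
  command["near"]        = point
  command["maxDistance"] = max_distance
  command
end

command = geo_near_command("tiles", [1830, 2002], 10)
# command.keys.first is "geoNear"; pass the hash to db.command as usual.
```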
This is our primary use: fetching $box queries that represent the user's viewport on the world.
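A viewport query of that kind can be built from the point the player is looking at and the size of their screen in tiles. This is a sketch under assumptions, not the game's actual code: `viewport_box_query` is a made-up name, and it assumes the viewport is described by a center point plus a width and height.

```ruby
# Hypothetical helper: build a $within/$box query selector for the
# rectangle of the board that is currently on screen.
# $box takes the lower-left corner followed by the upper-right corner.
def viewport_box_query(center, width, height)
  cx, cy = center
  half_w = width / 2.0
  half_h = height / 2.0
  lower_left  = [cx - half_w, cy - half_h]
  upper_right = [cx + half_w, cy + half_h]
  { "position" => { "$within" => { "$box" => [lower_left, upper_right] } } }
end

query = viewport_box_query([20, 20], 20, 20)
# The $box here spans [10, 10] to [30, 30]; pass query to find().
```

Fetching a box slightly larger than the screen is a common way to hide latency while the player scrolls, at the cost of pulling more tiles per request.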
You can also do centerpoint/radius queries.
Okay, technically it was there in 1.9. Store your mesh and query within it - great for political regions, for example.
Works like $near, but we need to adjust for the fact that it uses radians and not native units. No $boxSphere or $polygonSphere, in case you were wondering.
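The radians adjustment is a division by the sphere's radius, and the reverse conversion is a multiplication. The sketch below uses the 6378 km Earth radius from the slide and the haversine formula for the great-circle angle; the haversine formula is a standard choice for illustration, not necessarily what the server uses internally, and the method names are made up.

```ruby
EARTH_RADIUS_KM = 6378.0  # value used on the slide

# Convert a distance in km to the radians expected by $maxDistance
# in spherical queries.
def km_to_radians(km)
  km / EARTH_RADIUS_KM
end

# Great-circle angle in radians between two [longitude, latitude] points
# given in degrees, using the haversine formula.
def haversine_radians(a, b)
  lon1, lat1, lon2, lat2 = [*a, *b].map { |deg| deg * Math::PI / 180 }
  h = Math.sin((lat2 - lat1) / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin((lon2 - lon1) / 2)**2
  2 * Math.asin(Math.sqrt(h))
end

max_distance = km_to_radians(25.0)
angle = haversine_radians([-122.03, 36.97], [-122.0, 36.96])
distance_km = angle * EARTH_RADIUS_KM  # a few km, well inside the 25 km bound
```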
Easysauce. Treat like a normal grid, then do the skew math in the client.
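As one illustration of "skew math in the client", consider a hex map stored on an ordinary integer grid. The sketch below assumes axial (q, r) hex coordinates and a pointy-top layout, both of which are assumptions for the example and not something the talk specifies; the database only ever sees the regular (q, r) grid.

```ruby
# Convert axial hex coordinates (q, r), as stored in the regular grid,
# into the center point of the hex for drawing.
# Assumption: pointy-top hexes, where size is the center-to-corner distance.
def hex_to_pixel(q, r, size = 1.0)
  x = size * Math.sqrt(3) * (q + r / 2.0)  # each row is shifted half a hex
  y = size * 1.5 * r
  [x, y]
end

hex_to_pixel(0, 0)  # the origin stays at the origin
hex_to_pixel(1, 0)  # one hex to the right
hex_to_pixel(0, 1)  # one row down, shifted half a hex sideways
```

A box query on (q, r) returns a parallelogram of hexes on screen rather than a rectangle, which is where the secondary processing mentioned in the gotchas comes in.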
"Squares are great!" you say. "But what about other shapes?" I'm glad you asked. Our engine on top of MongoDB can handle persistence and region calculations with non-square region units. (Side note: Battletech rawks. I loved the Marauder. Can anyone name that 'mech?)
Massively Multiplayer Triominos, anyone? Or is it a flattened polygon mesh?
The Word2 world is a bit like the universe. All the interesting stuff is happening further and further apart.
We build map images on the slave because they pull tile data into memory that's a superset of what's necessary to show players; this minimizes in-memory cache thrashing.
Geo queries currently get routed to every shard for execution. We don't do it (yet). Experimenting with it though.
What does that mean for me, the person on the street?