31. Data Collection in Cloud
[Diagram: Game, Message Bus, Summarization, Large Scale Storage]
• Game is fully instrumented
• Messages sent to server via message bus architecture (in background and batched)
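A minimal sketch of the batching idea above, assuming a hypothetical `send` callable that stands in for whatever publish call the message bus exposes. The class name, the `batch_size` default, and the event field names are illustrative and not from the slide. A real client would likely also call `flush` from a background thread or timer so that gameplay is never blocked.

```python
import json
import threading
from typing import Callable, Dict, List


class BatchingEventSender:
    """Buffers game events and hands them to `send` one batch at a time."""

    def __init__(self, send: Callable[[str], None], batch_size: int = 50) -> None:
        self._send = send  # hypothetical transport, e.g. a message bus publish call
        self._batch_size = batch_size
        self._buffer: List[Dict] = []
        self._lock = threading.Lock()

    def track(self, name: str, **fields) -> None:
        """Record one event; flush once the buffer reaches batch_size."""
        with self._lock:
            self._buffer.append({"event": name, **fields})
            ready = len(self._buffer) >= self._batch_size
        if ready:
            self.flush()

    def flush(self) -> None:
        """Send everything buffered so far as a single JSON payload."""
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._send(json.dumps(batch))
```

With `batch_size=2`, two calls to `track` produce exactly one call to `send`, carrying both events in one payload.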
Data Ingestion
• Standard “NoSQL” stack (storage with Hadoop/Pig layered on top)
• Streaming event aggregation is often done by aggregating into an in-memory cache (memcached or ElastiCache)
• Often the large-scale storage is simply JSON files stored in S3
• Large-scale storage can run into tens of gigabytes a day, which can become a significant cost over time
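The ingestion bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the architecture's actual code: a `Counter` stands in for the in-memory cache, the raw events are serialized one JSON object per line before being written to bulk storage, and the function names and the `event` field are hypothetical.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def aggregate(events: Iterable[Dict], counts: Counter) -> None:
    """Increment a per-event-name counter for each incoming event.

    `counts` stands in for an in-memory cache; with memcached the same
    idea would use an atomic increment on one key per event name.
    """
    for event in events:
        counts[event["event"]] += 1


def to_json_lines(events: Iterable[Dict]) -> str:
    """Serialize raw events as one JSON object per line for bulk storage."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)


def stored_gb_after(days: int, gb_per_day: float) -> float:
    """Total data held if nothing is deleted: volume grows linearly with time."""
    return days * gb_per_day
```

As a rough sense of scale, an assumed 20 GB per day with no deletion gives `stored_gb_after(365, 20)`, which is 7,300 GB after one year. That accumulation is why the storage bill keeps growing.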
• What we know and love
Example Architecture