This document summarizes Telltale Games' migration of their story analytics data from Apache CouchDB to Amazon DynamoDB. Some key points:
- Telltale Games collects large amounts of choice data from player interactions in their story-based games to better understand player behaviors.
- Their early infrastructure using SQL and CouchDB struggled to scale adequately to handle terabytes of data and required significant maintenance.
- Migrating to the managed DynamoDB allowed their data to scale elastically without limits. It reduced maintenance needs and improved their ability to process and analyze the data.
- Telltale now uses the analytics to cluster players into personas based on their choices. This helps them personalize stories.
2. What to Expect from the Session
● Intro to DynamoDB
● Telltale story analytics
● Early infrastructure (SQL and Apache CouchDB)
● Migration to DynamoDB
● Better data, better stories
4. Why did AWS Build Amazon DynamoDB?
It’s hard to engineer for the performance you need.
Traditional NoSQL databases run into challenges as they scale.
Managing non-relational databases is hard.
5. Quick Intro on Amazon DynamoDB
Document or Key-Value
Scales to Any Workload
Fully Managed NoSQL
Access Control
Event Driven Programming
Fast and Consistent
6. Fast, Consistent Performance
Single-digit millisecond latency
• At any scale
Data stored on Solid State Drives (SSDs)
Automatic partitioning means no need for hotspot management
7. Highly Scalable
Simply specify each table’s read and write throughput capacity
Increase and decrease capacity as needed
• No upper limit
DynamoDB manages all the scaling behind the scenes
8. Flexible
Key-value store model
• Each item in a DynamoDB table is a list of attributes (fields) and values
• No need for every item to have the same attributes
• Add attributes at will
Document store
• Place JSON-formatted data into DynamoDB items for robust, nested data structures
9. Amazon DynamoDB is a schemaless database
table items
Attributes (name/value pairs or JSON documents)
10. Each item includes a key
Partition key
(DynamoDB maintains an unordered index)
11. Each item includes a key
Partition Key
Sort Key
(DynamoDB maintains a sorted index)
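The two key layouts above can be pictured with a small in-memory model. This is only a sketch of the idea (partition key selects a group of items, sort key orders items within the group), not the DynamoDB API or its internals. The class `ToyTable`, the player IDs, and the item attributes are all hypothetical.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """In-memory model of a table with a partition key and a sort key.

    Illustration only: this is not how DynamoDB is implemented.
    """

    def __init__(self):
        # partition key -> list of (sort key, item), kept ordered by sort key
        self._partitions = defaultdict(list)

    def put_item(self, partition_key, sort_key, item):
        rows = self._partitions[partition_key]
        keys = [k for k, _ in rows]
        pos = bisect.bisect_left(keys, sort_key)
        if pos < len(rows) and rows[pos][0] == sort_key:
            rows[pos] = (sort_key, item)  # same full key: overwrite
        else:
            rows.insert(pos, (sort_key, item))

    def query(self, partition_key, low, high):
        """Return items in one partition with low <= sort key <= high, in order."""
        rows = self._partitions.get(partition_key, [])
        keys = [k for k, _ in rows]
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [item for _, item in rows[start:end]]


table = ToyTable()
# Items need not share the same attributes (hypothetical choice events).
table.put_item("player-1", 30, {"choice": "help"})
table.put_item("player-1", 10, {"choice": "lie", "episode": 2})
table.put_item("player-1", 20, {"choice": "run"})
table.put_item("player-2", 15, {"choice": "help"})

result = table.query("player-1", 10, 20)
# result == [{"choice": "lie", "episode": 2}, {"choice": "run"}]
```

The range query touches only one partition and returns items in sort-key order, which is the property the sorted index provides.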
12. Integration capabilities
DynamoDB Triggers
❑ Implemented as AWS Lambda functions
❑ Your code scales automatically
❑ Java, Node.js, and Python
DynamoDB Streams
❑ Stream of table updates
❑ Asynchronous
❑ Exactly once
❑ Strictly ordered
❑ 24-hr lifetime per item
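As a rough sketch of what a trigger might look like in Python, the handler below counts newly inserted choice events per episode from one batch of stream records. The `episode` and `choice` attributes are hypothetical, and the sketch assumes the stream is configured to include new item images. It does not reflect Telltale's actual code.

```python
def handler(event, context):
    """Count inserted choice events per episode in one batch of stream records."""
    counts = {}
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # this sketch ignores MODIFY and REMOVE records
        image = record["dynamodb"].get("NewImage", {})
        # Stream images use DynamoDB's typed JSON, e.g. {"S": "..."} for strings.
        episode = image.get("episode", {}).get("S")
        if episode is not None:
            counts[episode] = counts.get(episode, 0) + 1
    return counts


# Hand-built sample batch in the shape Lambda passes to the handler.
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"episode": {"S": "ep1"},
                                   "choice": {"S": "help"}}}},
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"episode": {"S": "ep1"},
                                   "choice": {"S": "lie"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"episode": {"S": "ep2"}}}},
    ]
}

result = handler(sample_event, None)
# result == {"ep1": 2}
```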
22. Choice data
Large amounts of diverse data
● Episodes contain over 2000 nodes plus additional data
● Millions of worldwide users
● 21TB of events stored
● >1 million parsed ‘sessions’ daily
● 10x spikes for release, free episodes, & advertising
23. How we use the data
Aggregated back to players
24. How we use the data
Player heat map and evaluations
29. Apache CouchDB (Processing)
● Processing nearly impossible at this data size
● Limited mostly to aggregation rather than analysis
● Couldn’t scale up easily for ‘speedy’ processing
● New queries impractical
31. Managed & Scale
● Immediately ended our maintenance
● No storage limitations (200 billion event peak)
● 21TB of events, 10GB/day
● 1M session uploads per day with 900ms response
● Automation scripts to adjust to spikes
● Start at 50 r/w per second, up to 20K writes per second during spikes
● Autoscaling using Dynamic DynamoDB
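The kind of decision an automation script has to make for these spikes can be sketched as a small function. The 50 and 20,000 bounds come from the slide. The utilization thresholds and the doubling step are hypothetical tuning values, and this is not the logic of Dynamic DynamoDB or of Telltale's own scripts.

```python
MIN_UNITS = 50       # starting capacity quoted on the slide
MAX_UNITS = 20_000   # write-spike ceiling quoted on the slide


def next_capacity(provisioned, consumed, high=0.8, low=0.3, step=2.0):
    """Suggest a new provisioned capacity from current utilization.

    high, low, and step are hypothetical tuning values.
    """
    utilization = consumed / provisioned
    if utilization >= high:
        target = provisioned * step   # scale up before requests are throttled
    elif utilization <= low:
        target = provisioned / step   # scale down to stop paying for idle capacity
    else:
        target = provisioned          # inside the comfortable band: no change
    return int(min(MAX_UNITS, max(MIN_UNITS, target)))


print(next_capacity(50, 45))         # 100: a release-day spike begins
print(next_capacity(16000, 15000))   # 20000: clamped at the ceiling
print(next_capacity(60, 5))          # 50: clamped at the floor
```

A real script would read consumed capacity from monitoring data and apply the result to the table. Those calls are omitted here so the sketch stays self-contained.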
33. Processing
● Separate tables per game for independent processing
● Reading only the data needed
● Export entire tables to S3 in 24 hours with no loss
● Capable of adjusting to new queries
34. Costs
● 1 server handles what 12 did before
● Costs roughly equivalent, but load is 10x
● Reading only the data needed
● Pay-per-use pricing: no longer paying for static capacity
37. Understanding player behavior
● Internal tools read aggregated data as player heat maps
● Know which characters and lines are working
● Episodes personalize to the audience in a way no other entertainment medium can
But we wanted to take this a step further. . .
38. User Playstyle Clustering
● K-means clustering based on 2,200 Walking Dead story choice nodes
○ Algorithm determines number of means and initial seed value
○ 8 clusters that represented 88% of our players.
● Analyzed each cluster on two metrics:
○ Which story choices a single cluster endorsed at high rates
○ Which story choices are effective at splitting apart two or more clusters
● Developed player personas
○ Highlighted general preferences of a cluster (e.g., protecting resources over helping people)
○ Identified minor characters that were influential to play style
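The clustering and the two per-cluster metrics above can be illustrated with a toy K-means over 0/1 choice vectors. This is a sketch only. The six players and four choices are invented. The number of clusters is fixed at 2 and seeding uses a simple deterministic farthest-point rule, both assumptions made for the example, whereas the slide says an algorithm determined the number of means and the seed.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def init_centroids(points, k):
    """Deterministic farthest-point seeding (an assumption for this sketch)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return labels, centroids


# Hypothetical players: one row per player, 1 = picked that story choice.
players = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]

labels, centroids = kmeans(players, k=2)
# labels == [0, 0, 0, 1, 1, 1]

# With 0/1 vectors, a centroid is the cluster's endorsement rate per choice.
# A large gap between two clusters' rates marks a choice that splits them.
split_power = [abs(a - b) for a, b in zip(centroids[0], centroids[1])]
# Choices 0 and 3 separate the two clusters completely (gap of 1.0).
```

Reading the centroids as endorsement rates gives the first metric, and the per-choice gap between centroids gives a simple version of the second.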
40. “Amoral and Ambivalent”
The second biggest cluster, representing 22% of our players, seems to value independence. They offer peace first to the Russians but lie about Jane's whereabouts, potentially because they think they are bluffing for survival's sake. They don’t start with force, but are likely to follow up with force if not complied with. They highly favor pointing out to the Russians during the showdown that they have a baby, perhaps appealing to their humanity. While this cluster helps others or offers peace when convenient, they don’t hesitate to react with violence once pushed.
41. “My Best Self”
This cluster is reasonable and logical; they may be even-tempered individuals, or players who feel comfortable being a little distanced from the content of the game they are playing. Their responses in conversation generally seem to pick out the most responsive/reactive threads. This cluster is conventionally compassionate and frequently chose offers of condolences and sympathy when appropriate. Their two most common endings can perhaps be read as either disillusionment at Howe’s, when Clem turns away the family, or loyalty, as when Clem keeps on with Kenny to Wellington and then leaves with him.