The document provides an overview of analytical techniques for answering business questions. It discusses the four pillars of analytics: data munging, reporting and visualization, analysis and insights, and applied analytics. Specific topics covered include A/B testing best practices, reporting and visualization tools like Tableau, using multiple data sources for analysis, and best practices for data analysis and communication. The document is intended as a practical guide for those working in analytics to help tackle business issues.
4. Topics
1. Intro
2. The Four Pillars of Analytics
3. A/B testing
4. Reporting and Visualization
5. Data Analysis
6. Communication
7. Q&A
5. Who are we?
Space Ape Games is an award-winning UK independent game studio
Game of the Year - TIGA 2015
Best Indie Studio - Develop 2015
Combined KPIs: 20mm downloads, $44mm gross revenue
Apple Editor’s Choice, 4.7 average app store rating
7. The four pillars of analytics
1. Data Munging: Event Generation, Aggregation, Multiple Data Sources
2. Reporting and Visualization: Dashboards, Slice & Dice Tools, Data Viz
3. Analysis and Insights: Ad-hoc Analysis, Deep Dives, A/B Testing (Hypothesis Testing)
4. Applied Analytics: Predictive Modelling, User Segmentation, Targeted Content
10. ● Primacy Effect
○ When changes are made to a website, app, etc., users will sometimes react to the “novelty” of seeing something different, but only for a short period. This can confound A/B tests, biasing results against control
● Examine the test vs control time-series - is the uplift uniform or front-loaded? (see the sketch below)
● Sometimes the opposite effect occurs - eg changes to pricing can take time to sink in
● Interesting side-note: continuous change may be optimal, rather than a “one-and-done” A/B test
A/B Testing - Primacy effect
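A rough R sketch of the time-series check above. The data frame daily_kpi (columns: date, group, revenue_per_user) is a hypothetical example, not data from the talk:

library(ggplot2)
library(dplyr)
library(tidyr)

# daily_kpi: one row per day per group ("test" / "control")
uplift <- daily_kpi %>%
  pivot_wider(names_from = group, values_from = revenue_per_user) %>%
  mutate(uplift_pct = 100 * (test - control) / control)

# A primacy effect shows up as front-loaded uplift: large in the first
# days after launch, then decaying towards zero
ggplot(uplift, aes(x = date, y = uplift_pct)) +
  geom_line() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Daily uplift: test vs control", x = NULL, y = "Uplift (%)")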
11. ● Bootstrapping
○ The t-test relies on data being normally distributed
○ For mobile F2P games, data is often heavily skewed and high-variance, especially revenue
○ Bootstrapping is an alternative to a t-test
○ Re-sample with replacement to generate a distribution of sample means
○ Compare the test group’s distribution to control’s to determine whether the test mean differs from control - the Central Limit Theorem means the distributions of sample means are approximately normal (see the sketch below)
A/B Testing - Bootstrapping
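A minimal bootstrap sketch in R, assuming test and control are numeric vectors of per-user revenue (hypothetical names, not from the talk):

set.seed(42)
n_boot <- 10000

# Re-sample with replacement n_boot times, recording each re-sample's mean
boot_means <- function(x, n) {
  replicate(n, mean(sample(x, length(x), replace = TRUE)))
}

diff_means <- boot_means(test, n_boot) - boot_means(control, n_boot)

# 95% interval for the difference in means; if it excludes 0, the test
# mean is credibly different from control
quantile(diff_means, c(0.025, 0.975))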
12. ● Decide on target metrics before starting the test (helps avoid Type I errors from measuring too many metrics, and confirmation bias)
● When running optimization tests, only change 1 variable at a time (otherwise you won’t know which variable caused the uplift!)
● Calculate how long the test will need to run to detect a difference between test and control (avoid ending the test too early or running it for too long) - see the power calculation sketch below
○ It is bad practice to wait until you get a significant result - this can result in Type I errors
● If possible, run a dummy control along with the actual control (eg have a “test group” that is the same as control). This is insurance in case the assigning of users to a group affects the result somehow
A/B Testing - best practices
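A hedged sketch of the run-length calculation using base R’s power.t.test; the effect size and standard deviation here are hypothetical figures, not numbers from the talk:

# Smallest uplift worth detecting, expressed in the metric's own units
baseline_mean <- 1.00   # eg current revenue per user
uplift        <- 0.05   # 5% minimum detectable effect
sd_estimate   <- 4.00   # per-user standard deviation (skewed KPIs run high)

pt <- power.t.test(delta     = baseline_mean * uplift,
                   sd        = sd_estimate,
                   sig.level = 0.05,
                   power     = 0.80)

# Users needed per group; divide by the daily assignment rate to get the
# number of days the test must run
ceiling(pt$n)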
14. Tableau is awesome!
● As a lifelong Excel user - Tableau is superior for dashboards and slice/dice tools
○ Very flexible and fast - you can quickly drill / filter / slice in real-time during meetings. No need for “let me go back to my desk and check that”
● The TOTAL function is the equivalent of window functions in SQL, allowing the same functionality in a report (example: a taps report - divide by DAU rather than just by the users who used that tap)
● Works best when pointed at user / date level tables, rather than rolled-up tables, as you can then calculate “per user” metrics on the fly
15. Beware being caught out by Y-axis scaling
[Chart: yellow sales appear to be declining much faster than the other types]
16. Beware being caught out by Y-axis scaling
[Chart: in fact the share of sales is unchanged]
Can also index values against the starting amount, or calculate period-on-period change (see the sketch below)
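A small dplyr sketch of both fixes, assuming a hypothetical data frame sales with columns date, type, and amount:

library(dplyr)

indexed <- sales %>%
  group_by(type) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(index   = 100 * amount / first(amount),        # indexed: start = 100
         pop_chg = 100 * (amount / lag(amount) - 1))    # period-on-period % change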
17. Truncated Y-axes are misleading - do not use them!* (some BI tools add them by default)
Beware being caught out by Y-axis scaling
* Unless you want to over-emphasise the differences in something
18. ● Make sure your graph is clearly understandable
○ Add axis labels, a legend and a title where needed
○ Are font sizes big enough? (Will this be shown as a presentation or emailed to someone?)
● Too many series on a graph can be confusing - filter out or roll up the long tail - a country split, for example
● R + ggplot2 is good if you need to make a lot of similar graphs (see the faceting sketch below)
Data Viz best practices
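A short ggplot2 faceting sketch of “a lot of similar graphs”, assuming a hypothetical data frame kpis with columns date, country, and dau:

library(ggplot2)
library(dplyr)
library(forcats)

kpis %>%
  # Roll up the long tail: keep the top 5 countries, lump the rest as "Other"
  mutate(country = fct_lump_n(country, n = 5, w = dau)) %>%
  group_by(date, country) %>%
  summarise(dau = sum(dau), .groups = "drop") %>%
  ggplot(aes(x = date, y = dau)) +
  geom_line() +
  facet_wrap(~ country, scales = "free_y") +
  labs(title = "DAU by country", x = NULL, y = "DAU")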
20. Eat your own dogfood
● Dogfooding is the practice of using your own product
● Put yourself in the shoes of the customer - make sure that your experience is as close to theirs as possible - no god mode, no free premium currency
● This gives you a big advantage when analyzing player behaviour or interpreting KPIs
21. ● Not everything will be captured in tracking events + the data warehouse
○ Do you need to add additional hooks?
○ Use Charles Proxy to see what else the client is sending (eg for us - outside of Swrve)
● “System” tables (for us: DynamoDB)
● Dev tools - server devs often have additional tools and data you may not know about (for us: logstash)
● Spot when data is broken (eg a hacked client)
● Competitor tracking (App Annie)
● Marketing data aggregators (Singular)
● Platform reports (iTunes, Google, Facebook)
● 3rd-party user trackers (Slice, SimilarWeb, SuperFly)
Use all the data sources!
22. ● The mean does not tell the whole story
● Look at distributions using tools like R (see the sketch below)
● Use median/percentile measurements (for example, when measuring FPS, use the 95th percentile)
● In F2P games we often see long-tailed, heavily skewed distributions
○ Outliers can heavily influence means - consider removing outliers
● Break users into segments (eg by spend) to analyze features etc
Beware of only looking at means
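A minimal sketch, assuming revenue is a hypothetical numeric vector of per-user revenue. With skewed data the mean sits far above the median, which a distribution plot makes obvious:

library(ggplot2)

summary(revenue)                        # min / quartiles / median / mean / max
quantile(revenue, c(0.5, 0.95, 0.99))   # median and upper percentiles

# A histogram shows the long tail that the mean alone hides
ggplot(data.frame(revenue), aes(x = revenue)) +
  geom_histogram(bins = 50) +
  labs(title = "Per-user revenue distribution")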
23. ● Be careful to avoid confirmation bias
● Correlation does not imply causation! Eg PvE vs retention (A/B testing is good here)
● Talking a problem through with someone will often yield good results - the rubber-duck effect
● Peer review of analysis is great for picking up mistakes and spotting additional avenues of investigation
● Effort vs business benefit - sometimes the simple version is “good enough” (ie engineering tolerance)
● A good analyst should be thinking about solutions as well as looking for the smoking gun - this is the problem, and here are suggestions for how we fix it (you are in a unique position of having the most info - use it!)
Data Analysis best practices
25. ● Use a “reverse brief”: when you receive a brief for some analysis work, write your own brief for how you will tackle the issue and then run through it with the originator
○ A good way to avoid going too deep on the wrong areas, or not deep enough in key areas
● Sometimes it’s easier / quicker to go lo-fi on output and run through it with someone face-to-face, rather than spending time on a polished presentation
● For presenting work: there is a big difference between a presentation you send out to people and a presentation you present (try to avoid the “wall of text” - yes, I appreciate the irony of saying that on this slide!)
Communication best practices