Main Takeaways:
- We'll discuss using proxy metrics to estimate the impact of future features, without having to pull any data.
- We'll deep-dive into principled A/B testing techniques and give a high-level overview of the statistics that power them. Then we'll work to understand the tradeoff between A/B testing early and completing analyses before investing engineering resources.
- We'll discuss how a top-line metric can skew data, and a framework for choosing the best north star metric.
17. Counter metrics
E.g., you're a marketplace. What can go wrong, even if you improve revenue?
○ 2-week A/B tests can recommend the wrong long-term move
○ Over-indexing on short-term gains
○ Items can be inappropriate and/or illegal
○ Increasing the ratio of paid listings to organic listings → long-term loss
○ Overall revenue improvements can hurt subsets of businesses
18. Proxy metrics can help with prioritization
Ex: a website with job postings whose top-line metric is revenue. Do we prioritize
decreasing sign-up time, or decreasing recruiter time?
Project 1 - job seekers:
• Decreasing W minutes to sign up → X% increase in job seekers
• Increasing job seekers by X% → Y more paid listings → Z% revenue
Project 2 - job posters:
• Decreasing A minutes of recruiter time → B more paid listings
• B more paid listings → C% revenue
→ Build Project 1 if Z% > C%
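The comparison above can be sketched as a small calculation. All the rates below are illustrative assumptions standing in for W, X, Y, Z, A, B, and C, which you'd estimate from whatever proxy data you have:

```python
# Hypothetical proxy-metric comparison; every rate here is an assumption.

def revenue_lift_from_time_saved(minutes_saved, lift_per_minute,
                                 downstream_multiplier=1.0):
    """Chain proxy metrics: minutes saved -> driver lift -> revenue lift.

    lift_per_minute: fractional lift in the driver metric per minute saved.
    downstream_multiplier: how much of that lift survives to revenue.
    """
    return minutes_saved * lift_per_minute * downstream_multiplier

# Project 1: W minutes of sign-up time -> X% more job seekers -> Z% revenue
z = revenue_lift_from_time_saved(minutes_saved=3, lift_per_minute=0.01,
                                 downstream_multiplier=0.5)

# Project 2: A minutes of recruiter time -> B more paid listings -> C% revenue
c = revenue_lift_from_time_saved(minutes_saved=10, lift_per_minute=0.002)

choice = "Project 1" if z > c else "Project 2"
```

With these made-up rates, z = 1.5% and c = 2%, so the rule picks Project 2; the point is the structure of the comparison, not the numbers.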
19. Prioritization caveats - long-term strategy
What if you have three times as many job seekers as job postings?
20. Prioritization caveats - long-term strategy
Trending cheaper
• e.g., continually launching revenue-positive features that nonetheless decrease
the average price of items on the marketplace
Saturation
• e.g., small businesses in a marketplace
Knowledge
• e.g., personalization
25. A/B testing early example 1
• Build voice interaction with your app
• Start by putting a microphone on your app
• Then manually parse top X queries (grouped by synonyms)
• Build the top commands as "If X, then Y"
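The "if X, then Y" stage can be as simple as a lookup over manually curated synonym groups. A minimal sketch, where the synonym groups and actions are invented for illustration:

```python
# Manually curated synonym groups for the top queries (assumed examples).
SYNONYMS = {
    "play": {"play", "start", "resume"},
    "pause": {"pause", "stop", "hold"},
}

# The "then Y" half: the action each command group triggers.
ACTIONS = {
    "play": lambda: "playing",
    "pause": lambda: "paused",
}

def handle(query: str) -> str:
    """If the query contains a word from a known group (X), run its action (Y)."""
    words = set(query.lower().split())
    for command, synonyms in SYNONYMS.items():
        if words & synonyms:
            return ACTIONS[command]()
    return "unrecognized"  # log these for the next round of manual parsing
```

Everything that falls through to "unrecognized" is exactly the data you'd review to decide which command to build next, before investing in real speech understanding.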
26. A/B testing early example 2
Developing Gmail. Example areas for ML:
• Identify if someone has forgotten an attachment
• Auto-generate a calendar invite
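For the attachment case, a team could ship a rule-based baseline and A/B test it long before any ML model exists. A sketch, with the trigger phrases as assumptions:

```python
import re

# Assumed trigger phrases; a real list would come from mining sent emails.
ATTACHMENT_PHRASES = re.compile(
    r"\b(attached|attachment|attaching|enclosed)\b",
    re.IGNORECASE,
)

def maybe_forgot_attachment(body: str, has_attachment: bool) -> bool:
    """Warn when the text mentions an attachment but none is present."""
    return bool(ATTACHMENT_PHRASES.search(body)) and not has_attachment
```

Measuring how often this heuristic fires, and how often users act on the warning, tells you whether the ML version is worth building.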
27. Product Tradeoffs
● Knowing which data to pull to make the decision
● Bringing the customer into the decision
28. Tradeoffs - autonomous cars
E.g., you're releasing an autonomous car product. How many people would you allow to
die using your product?
○ At first: zero
○ Then: better than baseline (average driver)
○ Imagine the media + public perception → growth stagnated. People continue driving.
31. Tradeoffs - steps
1. Graph the tradeoff.
2. Find your inflection point where there are diminishing returns.
3. Find two points that represent 1) the conservative but expensive
solution, and 2) the riskier, faster-moving one.
4. Test them in isolated markets.
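Steps 1-2 can be sketched numerically: walk the tradeoff curve and flag where the marginal benefit per unit of cost drops below a bar you choose. The curve values and threshold below are illustrative:

```python
# Illustrative sketch: find the point of diminishing returns on a tradeoff curve.

def diminishing_returns_point(xs, ys, threshold):
    """Return the first x where the marginal gain dy/dx falls below threshold."""
    for i in range(1, len(xs)):
        slope = (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1])
        if slope < threshold:
            return xs[i]
    return xs[-1]

# e.g. safety-review spend (x) vs. incidents prevented (y) -- made-up numbers
spend = [1, 2, 3, 4, 5]
benefit = [10, 18, 22, 23, 23.5]
knee = diminishing_returns_point(spend, benefit, threshold=2.0)
```

The conservative and risky candidate points from step 3 would then sit on either side of `knee`, and step 4 tests each in an isolated market.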
32. Tradeoffs - spam
How much do people mind losing a legitimate email to their spam folder, vs. seeing extra spam in the inbox?
• Additional data
• How often do people check their spam folders? (average + distribution)
• Precision: Of all the emails that we classified as spam, what percentage were spam
• Recall: Of all the spam emails, what percentage have we classified as spam
• Per (non-spam) email in inbox, likelihood that it's important
• Per spam email in inbox, how quickly people unsubscribe
• Per spam email in inbox, the negative impact on perception of the email engine
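The precision and recall definitions above translate directly into code, given labeled examples; the email IDs below are made up:

```python
def precision_recall(predicted_spam, actual_spam):
    """Compute precision and recall from two sets of email ids.

    precision: of everything we classified as spam, what fraction was spam?
    recall:    of all the spam, what fraction did we classify as spam?
    """
    true_positives = len(predicted_spam & actual_spam)
    precision = true_positives / len(predicted_spam) if predicted_spam else 0.0
    recall = true_positives / len(actual_spam) if actual_spam else 0.0
    return precision, recall

predicted = {1, 2, 3, 4}     # emails we sent to the spam folder
actual = {2, 3, 4, 5, 6}     # emails that were truly spam
p, r = precision_recall(predicted, actual)  # p = 3/4, r = 3/5
```

Low precision costs you the lost-legitimate-email side of the tradeoff; low recall costs you the extra-spam-in-inbox side.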
34. Tying it all together: relevance @ eBay
• Long-term correlation with revenue; oftentimes a short-term negative correlation
• ML classification:
• Get many examples of relevant items and many examples of irrelevant items
• Write features that could help determine if it's relevant or not
• Train and validate your model on the data
• Solvable problem if you have many examples of (ir)relevant items
• But how do you get many examples of “relevant” items?
35. Tying it all together: relevance @ eBay
• First step - define “relevant” and “irrelevant”
• Optimize for quantity or for quality?
• Hire and train human labelers
• Quantity, but judge each (query, item) pair multiple times and only accept
judgements where two judges agree
• Ground-truth tiered system with expensive super judges
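The agreement rule above can be sketched as a filter over judgements; labels that fail the agreement bar would escalate to the expensive super judges. The label names are illustrative:

```python
from collections import Counter

def accepted_label(judgements, min_agreement=2):
    """Return the majority label if at least min_agreement judges gave it,
    else None (meaning: escalate or discard)."""
    if not judgements:
        return None
    label, count = Counter(judgements).most_common(1)[0]
    return label if count >= min_agreement else None
```

This trades some labeled volume for quality: disagreements are dropped or escalated rather than fed into the model as noise.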
36. Tying it all together: relevance @ eBay
• Is relevance a problem worth solving? → Go back to top-line goal
Really important example of where PMs can add great value if they’re savvy with data.
• Come up with a system / strategy to get the best-quality data to feed into the
ML models → practice a data mindset
• How can we trade this off with projects targeting revenue? → Proxy metrics
• ML modelling which was previously impossible → 4th revolution
37. Wrap up
Being a data PM is not about knowing how to pull data, writing
SQL queries, or understanding schema structures.
It’s knowing what data is important to answer the questions that
will help you build the best product for the customer.
Many tradeoffs are very tough problems with no single right
answer.
But aggregating data up front allows you to make the most
informed decision.