- 1. LinkedIn’s STREAM EXPERIMENTATION FRAMEWORK
Joseph Adler, Bee-Chung Chen, and Xin Fu
O’Reilly Strata Conference
February 12 2014
©2014 LinkedIn Corporation. All Rights Reserved.
- 3. The LinkedIn Stream
Like many social networks, the
centerpiece of LinkedIn’s home
page is a news stream.
It contains
• Updates about users’ networks
• News stories and shares
• Recommendations
- 4. The LinkedIn Stream
We operate at a large scale:
• 277+ million members
• 75+ million monthly unique users
• 5000+ employees
- 5. The LinkedIn Stream
Today, we’ll tell you how we
experiment with new content in
the stream:
• Creating new content
• Maximizing relevance
• Managing tests
- 6. History of the LinkedIn Stream
Network updates were
introduced in 2006
Back then, LinkedIn had
• 5mm members
• 875k monthly uniques
• 70 employees
- 7. History of the LinkedIn Stream
In practice this meant:
• Slow-changing content, a small number of updates, a weekly visit rate
  ‣ No ranking/optimization
• A small number of active tests, limited analytics resources
  ‣ Primitive resources for A/B tests
• Limited engineering resources
  ‣ A hacky solution for testing new content...
- 8. History of the LinkedIn Stream
We experimented with new
content using a system called
the Analytics Prototype Engine,
or APE. It was implemented as
an ad slot on the home page.
Big wins included:
• People You May Know
• Groups You Might Like
• Jobs You Might Be Interested In
- 9. History of the LinkedIn Stream
We added more content over
the next couple of years:
•Status updates
•Twitter content
•Group discussions
•OpenSocial content (TripIt,
GitHub, and more...)
- 10. History of the LinkedIn Stream
By 2009, the stream looked
very similar to the stream
today.
LinkedIn was much bigger than
when we first added a news
stream...
• 55mm members
• 36mm monthly uniques
• 500 employees (end of year)
- 11. History of the LinkedIn Stream
… but the infrastructure hadn’t changed much, and we were experiencing growing pains:
• No system for ranking and optimization
  ‣ Users were overwhelmed with low-relevance updates
• No system for A/B testing
  ‣ Overlapping A/B tests, poor experiment design, difficult analysis
• No system for rapid prototyping/testing
  ‣ APE was making the site slow and unstable, and was shut down
- 12. History of the Stream
In the rest of this talk, we’ll tell
you how we’ve addressed
these challenges (and used a
lot of data science to make this
happen).
- 13. Content Insertion
In the beginning (2006),
experiments happened outside
the stream through APE:
• Easy data uploads
• Management UI
• Templates
- 14. Content Insertion
Most new content experiments
boil down to one thing: creating
experimental data.
We wanted the data experts to
be able to create experiments
easily by focusing on data, not
on writing production code (and
wrestling with build systems,
deployment processes, etc).
We created a system that lets
data scientists push new
content into the stream by
writing scripts (in Pig, Hive, etc).
- 15. Content Insertion
Project Gorilla brought the spirit
of APE back to the home page,
inside the stream.
[Architecture diagram: nhome → USCP federator → Gorilla first-pass ranker, backed by the Gorilla Voldemort store, which is populated offline by the Gorilla batch jobs]
- 16. Content Insertion
What does this consist of?
•An Apache Pig UDF for
pushing content
•A batch process that filters,
consolidates, and ranks
updates
•A process that pushes data
from Hadoop into Voldemort
(our NoSQL key/value store)
•An online system that fetches
updates from the store and
mixes them into the stream
- 17. Content Insertion
Our implementation is very simple:
•LinkedIn production systems use
rest.li as an API (JSON data +
schema)
•We create data offline on Hadoop,
put it in Voldemort, and surface it
through an API
This means that we can experiment
easily using existing templates,
tracking, etc; we just have to change
the data that’s rendered.
(We’re also experimenting with a
similar real time system based on
Apache Samza.)
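The slides above describe the flow end to end: batch jobs compute experimental updates offline, the results land in a key/value store, and an online system fetches and mixes them into the stream. A minimal sketch of that flow, with a plain dict standing in for Voldemort; all names (`kv_store`, `push_batch`, `mix_into_stream`) are illustrative, not LinkedIn’s actual API:

```python
# Offline/online split of the Gorilla-style pipeline. Updates computed
# offline (by Pig/Hive jobs) are stored per member in a key/value store
# (Voldemort in the talk; a dict stands in here), then fetched online
# and mixed into the organic stream.

kv_store = {}  # member_id -> list of precomputed experimental updates

def push_batch(batch):
    """Offline step: load the output of a batch job into the store."""
    for member_id, updates in batch.items():
        kv_store[member_id] = updates

def mix_into_stream(member_id, organic_items, max_experimental=2):
    """Online step: fetch stored updates and interleave a few of them."""
    experimental = kv_store.get(member_id, [])[:max_experimental]
    # A real mixer would rank everything jointly; here we simply prepend.
    return experimental + organic_items

push_batch({"m1": ["jobs_you_may_like", "groups_you_might_like"]})
stream = mix_into_stream("m1", ["connection_update", "news_share"])
```

Because the online side only reads precomputed values, a data scientist can change what appears in the stream by rewriting a batch script, without touching serving code.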
- 18. Relevance Optimization
Bring each individual user the most relevant items from different sources, optimizing for one or more measurable objectives.
- 19. Relevance Optimization
• Maximize users’ clicks on items in the stream
• Rank items according to their click rates
  ‣ Click rate: the probability that a user would click an item
• Predict the click rate based on:
  ‣ User features: profile, visit pattern, interests, …
  ‣ Item features: type, topics, keywords, …
  ‣ User–item interaction features
  ‣ Context: device, time of day, previous page, …
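The prediction-and-rank step above can be sketched with a toy logistic model. Every feature name and weight here is made up for illustration; the talk only states the feature families, not the actual model:

```python
import math

# Score stream items by predicted click probability with a logistic
# model over (hypothetical) user, item, interaction, and context
# features, then rank by that score.

WEIGHTS = {
    "item_type=JobChange": 0.8,   # item feature
    "user_interest_match": 1.5,   # user-item interaction feature
    "device=mobile": 0.3,         # context feature
}
BIAS = -2.0

def predict_click_rate(features):
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid -> probability in (0, 1)

def rank_items(items):
    """items: list of (item_id, feature dict); highest predicted CTR first."""
    return sorted(items, key=lambda it: predict_click_rate(it[1]), reverse=True)

items = [
    ("article_123", {"device=mobile": 1.0}),
    ("job_change_456", {"item_type=JobChange": 1.0, "user_interest_match": 1.0}),
]
ranked = rank_items(items)
```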
- 20. Relevance Optimization
Large scale logistic regression
•Input: A set of past users’ responses to items

  Response | Feature Vector
  1        | (Gender=M, JobTitle=CEO, ItemType=JobChange, ...)
  0        | (Gender=F, JobTitle=Engineer, ItemType=Article, ...)
  …        | …
•Output: Model parameters
•Challenge: Data too large to fit in a single machine
•Solution: Train a model using MapReduce on Hadoop
- 21. Relevance Optimization
Large scale Logistic Regression with ADMM
[Diagram: a large input data set is split into partitions 1…K; a logistic regression is fit on each partition in parallel, and a consensus computation combines the K local models]
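The partition/consensus loop pictured above can be sketched as a toy consensus ADMM for logistic regression. The local solver here is a few plain gradient steps (a real system would solve each subproblem to tolerance on Hadoop); rho, the step counts, and the tiny data set are all illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def local_fit(data, z, u, rho, steps=50, lr=0.1):
    """x-update: minimize local logistic loss + (rho/2)||x - z + u||^2."""
    x = list(z)
    for _ in range(steps):
        # Gradient of the proximal (consensus) term...
        grad = [rho * (x[j] - z[j] + u[j]) for j in range(len(x))]
        # ...plus the gradient of the local logistic loss.
        for features, label in data:
            p = sigmoid(sum(w * f for w, f in zip(x, features)))
            for j in range(len(x)):
                grad[j] += (p - label) * features[j]
        x = [x[j] - lr * grad[j] for j in range(len(x))]
    return x

def admm_logreg(partitions, dim, rho=1.0, iters=20):
    z = [0.0] * dim                         # consensus weights
    us = [[0.0] * dim for _ in partitions]  # scaled dual variables
    for _ in range(iters):
        # Each partition fits its local model in parallel (a map step).
        xs = [local_fit(part, z, us[k], rho)
              for k, part in enumerate(partitions)]
        # z-update: average of local models plus duals (the reduce step).
        z = [sum(x[j] + u[j] for x, u in zip(xs, us)) / len(xs)
             for j in range(dim)]
        # u-update: accumulate each partition's disagreement with z.
        for x, u in zip(xs, us):
            for j in range(dim):
                u[j] += x[j] - z[j]
    return z

# Two partitions of (features, label) rows; feature 1 is a constant bias.
partitions = [
    [([1.0, 1.0], 1), ([-1.0, 1.0], 0)],
    [([2.0, 1.0], 1), ([-2.0, 1.0], 0)],
]
weights = admm_logreg(partitions, dim=2)
```

On Hadoop the per-partition fits become mappers and the consensus computation a reducer, which is how the loss of a single-machine solver is avoided.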
- 29. Relevance Optimization
Diversity
Users get tired when seeing items of the same type many times in the
stream.
Example: Group discussions

  Consecutive discussions | Drop in click rate
  2                       | 21%
  3                       | 48%
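One simple way to act on this effect is a greedy re-rank that caps runs of a single item type. This is an illustrative sketch, not LinkedIn’s actual diversity logic; the scores, types, and `max_run` value are made up:

```python
# Greedy diversity re-ranker: never place more than max_run items of
# the same type in a row when an alternative is available.

def trailing_run(out, item_type):
    """Length of the run of item_type at the end of the current list."""
    n = 0
    for _, t in reversed(out):
        if t != item_type:
            break
        n += 1
    return n

def diversify(items, max_run=1):
    """items: list of (score, item_type) pairs."""
    remaining = sorted(items, reverse=True)  # best score first
    out = []
    while remaining:
        # Take the best item that does not extend a run past max_run;
        # fall back to the best remaining item if every candidate would.
        pick = next((it for it in remaining
                     if trailing_run(out, it[1]) < max_run),
                    remaining[0])
        out.append(pick)
        remaining.remove(pick)
    return out

stream = diversify([(0.9, "discussion"), (0.8, "discussion"),
                    (0.7, "article"), (0.6, "discussion")])
```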
- 30. Relevance Optimization
Multi-Objective Optimization
• Different items in the stream generate different kinds of value
• Click
• Social actions: Like, share, comment, …
• Revenue from sponsored items
• One approach:
Maximize revenue s.t. clicks and social actions are
still within ε% of optimal
• This approach requires extensive experimentation!
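The ε-constraint idea above can be sketched as a selection over candidate ranking configurations, each with measured clicks and revenue (e.g. from A/B tests). The configuration names and numbers are purely illustrative:

```python
# Pick the most revenue-positive configuration whose clicks stay
# within eps of the click-optimal configuration.

def pick_config(candidates, eps=0.05):
    """candidates: dict name -> (clicks, revenue)."""
    best_clicks = max(c for c, _ in candidates.values())
    feasible = {n: cr for n, cr in candidates.items()
                if cr[0] >= (1 - eps) * best_clicks}
    # Among configurations meeting the click constraint, maximize revenue.
    return max(feasible, key=lambda n: feasible[n][1])

configs = {"clicks_only":   (100.0, 1.0),
           "balanced":      (97.0, 4.0),
           "revenue_heavy": (88.0, 9.0)}
choice = pick_config(configs, eps=0.05)
```

The "extensive experiments" the slide mentions are what produce the measured (clicks, revenue) pairs in the first place.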
- 31. Experimentation Framework
Stream experiments are carried
out on LinkedIn’s central
experimentation platform:
• A one-stop solution for feature A/B testing, ramping, and advanced targeting needs
• Built-in power calculation to aid experiment design
• Automated reporting and analysis capabilities
(Mockup of UI)
- 32. Experimentation Framework
• History: assign members to test groups based on the modulo of member IDs
  ‣ A very high likelihood of range overlaps between tests
  ‣ Just one experiment could negatively affect the results of other tests executed on the same page
• Now: a deterministic pseudo-random algorithm for treatment assignment
  ‣ Improved logging of treatment assignment
  ‣ Automated scorecards
  ‣ Record of historical experiments
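A common way to implement deterministic pseudo-random assignment, and a sketch of why it avoids the modulo problem: hash the member ID together with a per-experiment salt, so buckets are stable within an experiment but statistically independent across experiments. The talk does not specify LinkedIn’s exact scheme; this is an assumed, typical construction:

```python
import hashlib

def bucket(member_id, experiment_id, num_buckets=1000):
    """Deterministic bucket in [0, num_buckets): same member + experiment
    always hashes to the same bucket; different experiments are salted
    differently, so their bucket assignments don't overlap systematically."""
    key = f"{experiment_id}:{member_id}".encode()
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % num_buckets

def in_treatment(member_id, experiment_id, ramp_pct):
    # With 1000 buckets, each bucket covers 0.1% of members.
    return bucket(member_id, experiment_id) < ramp_pct * 10
```

Under plain `member_id % N` assignment, two experiments with overlapping ID ranges test the same members; salting the hash per experiment removes that correlation.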
- 33. Experimentation Framework
• History: focus on product-specific metrics
  ‣ Stream relevance change ⇒ CTR
  ‣ Profile redesign ⇒ # of profile views
• Now: standardized, tiered metric system
  ‣ Sitewide Tier 1 metrics
  ‣ Product-specific Tier 2 / Tier 3 metrics
  ‣ Comprehensive understanding of feature impact
(Mockup of UI)
- 34. Conclusions
LinkedIn has always experimented with site content. As we’ve
grown, we’ve had to rethink how we experiment.
Key lessons:
• Managing experimentation at scale is hard
• Scale means users, content volume, and employees
• Invest in platforms when they save time, money, and labor