9. Content Ranking Problems
Most Popular: most engaging overall, based on objective metrics
Most Popular + Per User History: rotate stories I've already seen
Light Personalization: more relevant to me based on my age, gender, location, and property usage
Deep Personalization: most relevant to me based on my deep interests (entities, sources, categories, keywords)
Related Items and Context-Sensitive Models: behavioral affinity (people who did X, did Y); most engaging in this page/section/property/device/referral context?
Layout Optimization: which modules/ad units should be shown to this user in this context?
Revenue Optimization
Voice and Business Rules
Real-time Dashboard
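The first two ranking problems on this slide can be sketched in a few lines. This is a minimal illustration, not the system described in the talk: it assumes "most engaging" means click-through rate (CTR), and the `Item` record, its field names, and the `rank_most_popular` helper are hypothetical.

```python
from collections import namedtuple

# Hypothetical item record; the field names are illustrative, not from the deck.
Item = namedtuple("Item", ["item_id", "clicks", "views"])


def ctr(item):
    """Click-through rate; items with no views score 0.0."""
    return item.clicks / item.views if item.views else 0.0


def rank_most_popular(items, seen_ids=frozenset(), k=3):
    """Rank by CTR, placing stories the user has not seen ahead of seen ones."""
    unseen = [i for i in items if i.item_id not in seen_ids]
    seen = [i for i in items if i.item_id in seen_ids]
    ranked = sorted(unseen, key=ctr, reverse=True) + sorted(seen, key=ctr, reverse=True)
    return [i.item_id for i in ranked[:k]]


pool = [
    Item("a", 50, 1000),  # CTR 0.050
    Item("b", 30, 400),   # CTR 0.075
    Item("c", 5, 500),    # CTR 0.010
    Item("d", 12, 200),   # CTR 0.060
]

print(rank_most_popular(pool))                  # Most Popular
print(rank_most_popular(pool, seen_ids={"b"}))  # Most Popular + Per User History
```

With an empty history the ranking is purely by CTR; once story "b" is in the user's history it is rotated behind the unseen stories and drops out of the top three.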
10. Yahoo Frontpage
Trending Now (Most Popular)
Today Module (Light Personalization)
Personal Assistant (Light Personalization)
National News (Most Popular + User History bucket)
Deals (Most Popular)
14. …
Opportunity: users, queries, pages, …
Item Inventory: articles, web pages, ads, …
Use an automated algorithm to select item(s) to show
Get feedback (click, time spent, …)
Refine the models
Repeat (large number of times)
Measure metric(s) of interest (total clicks, total revenue, …)
15. Problem Characteristics: Today module
Traffic obtained from a controlled randomized experiment
Things to note: a) short lifetimes, b) temporal effects, c) often breaking news stories
16. Scale: Why use Hadoop?
Million events per second (user view/click, content update)
Hundreds of GB of data collected and modeled per run
Millions of items in pool
Millions of user profiles
Tens of thousands of features (content and/or user)
17. Data Flow
Content feed with biz rules
Rules Engine
Content Metadata
Optimization Engine: Exploit ~99%, Explore ~1%
Near Real-time Feedback
Real-time Insights Dashboard
Optimized Module
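The ~99% exploit / ~1% explore split with near-real-time feedback can be illustrated with a small simulation. The slide does not say which algorithm the optimization engine uses; the sketch below assumes a plain epsilon-greedy policy, and the item names, true click rates, and function names are all made up for illustration.

```python
import random


def observed_ctr(stats, item_id):
    """Observed click-through rate; unviewed items score 0.0."""
    clicks, views = stats[item_id]
    return clicks / views if views else 0.0


def choose_item(stats, explore_rate, rng):
    """Return (item_id, explored). stats maps item_id -> [clicks, views]."""
    items = sorted(stats)
    if rng.random() < explore_rate:
        return rng.choice(items), True  # explore: uniform random item
    # exploit: item with the best observed CTR so far
    return max(items, key=lambda i: observed_ctr(stats, i)), False


def record_feedback(stats, item_id, clicked):
    """Feedback step: update the click and view counts for the shown item."""
    stats[item_id][1] += 1
    stats[item_id][0] += int(clicked)


def simulate(true_ctr, requests=20000, explore_rate=0.01, seed=0):
    """Serve simulated requests; clicks are drawn from assumed true CTRs."""
    rng = random.Random(seed)
    stats = {item: [0, 0] for item in true_ctr}
    explored = 0
    total_clicks = 0
    for _ in range(requests):
        item, was_explore = choose_item(stats, explore_rate, rng)
        clicked = rng.random() < true_ctr[item]
        record_feedback(stats, item, clicked)
        explored += int(was_explore)
        total_clicks += int(clicked)
    return stats, explored, total_clicks


# Hypothetical items and click rates, for illustration only.
stats, explored, total_clicks = simulate({"a": 0.02, "b": 0.05, "c": 0.08})
print("explore share:", explored / 20000)
print("total clicks:", total_clicks)
```

A purely greedy exploit step with only 1% exploration can stay on a suboptimal item for a long time, so a production system would likely use a more careful policy; the sketch only shows how the traffic split and the feedback update fit together.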
18. How it happens?
Components: Request, User Events, Feature Generation, Additional Content & User Feature Generation, Modeling, Ranking, B-Rules
Stores: ITEM Model (STORE: HBASE), USER Model (STORE: PNUTS)
Latencies labeled on the slide: 5 min latency; 5 – 30 min latency; SLA 50 ms – 200 ms
User event: at time 't', user 'u' (user attributes: age, gender, location) interacted with content 'id' at position 'o', property/site 'p', section 's', module 'm', international 'i'
Item metadata: content 'id' has associated metadata 'meta', where meta = {entity, keyword, geo, topic, category}
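The event tuple on this slide maps naturally onto a small record type. The sketch below is one possible encoding, not the schema used in the talk: the field names, the `action` field, the use of epoch seconds, and the `aggregate_ctr` helper are assumptions. The 300-second window mirrors the 5-minute latency label on the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserEvent:
    t: int           # time 't' (seconds; the unit is an assumption)
    user: str        # user 'u'
    age: int         # user attribute
    gender: str      # user attribute
    loc: str         # user attribute
    content_id: str  # content 'id'
    position: int    # position 'o'
    prop: str        # property/site 'p'
    section: str     # section 's'
    module: str      # module 'm'
    intl: str        # international 'i'
    action: str      # "view" or "click" (assumed; not named on the slide)


def aggregate_ctr(events, window_start, window_end):
    """Per-(content, position) CTR over the window [window_start, window_end)."""
    counts = defaultdict(lambda: {"view": 0, "click": 0})
    for e in events:
        if window_start <= e.t < window_end and e.action in ("view", "click"):
            counts[(e.content_id, e.position)][e.action] += 1
    return {
        key: (c["click"] / c["view"] if c["view"] else 0.0)
        for key, c in counts.items()
    }


def make_event(t, content_id, position, action):
    """Build an event with fixed, made-up user and context attributes."""
    return UserEvent(t, "u1", 30, "f", "US", content_id, position,
                     "frontpage", "news", "today", "us", action)


events = [
    make_event(10, "story1", 1, "view"),
    make_event(20, "story1", 1, "click"),
    make_event(30, "story1", 1, "view"),
    make_event(40, "story2", 2, "view"),
    make_event(400, "story2", 2, "view"),  # falls outside the first window
]

print(aggregate_ctr(events, 0, 300))
```

Keying the aggregate on position as well as content keeps clicks at different slots separate, which matters because click rates usually depend on where an item is shown.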
66. Hive has given us a great way to analyze the results
Editor's Notes
This is the title slide. Please use the name of the presentation that was used in the abstract submission.
This is the agenda slide. There is only one of these in the deck.
NOTES: What does "X stories to run" mean? Can we be clearer on that? Also, this should be more of a punch line of what we do. This slide, to me, is very broad and not clear. The following are the things that I would describe:
Problem: matching the best content to the interests of a user
Scale: millions of content slices, millions of users
This is the final slide, generally for questions at the end of the talk. Please post your contact information here.