Data Science for Online Services:
Problems & Frontiers
Jinyoung Kim
Data & Analytics
Naver Search US
2021 Changbal Conference
About Me
Head of Data Science @ Naver Search
Executive Director of Naver Search US
Ex-MSFT / Ex-Snap (Search & RecSys)
Co-founder / 1st President of Changbal
* We are hiring Data Scientists & Engineers in Korea and the US! (jin.y.kim@navercorp.com)
https://medium.com/naver-dna-tech-blog
Mission: Making Data Science as Cool as AI
Challenges in Building
Modern Online Services
What constitutes a modern online service?
• AI-enabled: powerful algorithms
• Mobile-first: contextual awareness, omnipresence
• Cloud-backend: limitless computing infrastructure
Challenge#1: Dynamic Environment
• Societal and environmental changes: COVID-19, climate change, elections
• New apps and services launch daily: Yelp, Pinterest, Airbnb, TikTok, DoorDash
• Underpinning technologies evolve: GPT-3 and large-scale language models, breakthroughs in computer vision
Solution: Metric Regression Detection
Source: KDD’20 Paper from Microsoft
Challenge#2: Ecosystem / Social Impact
• Online services need to look beyond user satisfaction
• Search / RecSys results can instill and reinforce social bias
• Algorithmic ranking can mean life or death for content providers
• Manipulating algorithmic results has become an industry
Solutions to Bias / Fairness in Ranking
Source: http://naversearchconf.naver.com/
Challenge#3: Internal Alignment
• Major online service companies have dozens of teams, each with a different focus
• Blind optimization in one area can lead to a negative user experience
Modern AB Test for Naver Search (DeView’21)
Data Science Problems
for Online Services
Problems & Solutions by Service Lifecycle
• Each stage presents different analysis and decision problems, with corresponding data science solutions
Stages: PLAN → DEVELOP → LAUNCH → MONITOR
Techniques shown across the stages: Competitive Analysis, Opportunity Analysis, Offline Experiment, Crowdsourced Evaluation, A/B Experiment, Parameter Optimization, KPI Monitoring, Defect Monitoring
DS Problems during Planning Stage
Questions
• How can we acquire & onboard new users?
• What are the pain points of existing customers?
• Where are we lagging behind competing services?
Solutions
• User funnel / journey analysis
• Side-by-side comparison
• In-depth user study
Where do users come & go? Funnel Analysis
• New Users for E-Commerce: Discovery (Search / App Store) → Connect / Install → Create Account → Checkout → Revisit
• Existing Users for Contents: App Open → Page View (browsing) → Page View → Consume → Revisit
Where do most users drop off the funnel?
Which part of the funnel has the biggest leverage?
How is the funnel shifting over time?
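The first two questions can be answered with simple arithmetic once per-stage user counts are available. The sketch below is a minimal illustration, assuming an ordered list of (stage, users) pairs. The counts and the function names `funnel_report` and `biggest_drop_off` are hypothetical and not from the talk.

```python
from typing import Dict, List, Tuple

Funnel = List[Tuple[str, int]]  # ordered (stage name, users reaching the stage)

def funnel_report(stage_counts: Funnel) -> List[Dict[str, object]]:
    """Compute step-to-step and overall conversion for an ordered funnel."""
    report = []
    top = stage_counts[0][1]
    prev = top
    for name, users in stage_counts:
        report.append({
            "stage": name,
            "users": users,
            "step_conversion": users / prev if prev else 0.0,
            "overall_conversion": users / top if top else 0.0,
        })
        prev = users
    return report

def biggest_drop_off(stage_counts: Funnel):
    """Return the (from, to) transition losing the largest share of users, and that share."""
    worst, worst_rate = None, -1.0
    for (a, n_a), (b, n_b) in zip(stage_counts, stage_counts[1:]):
        drop = 1 - n_b / n_a if n_a else 0.0
        if drop > worst_rate:
            worst, worst_rate = (a, b), drop
    return worst, worst_rate

# Hypothetical counts for the e-commerce funnel stages named on the slide.
funnel = [
    ("Discovery", 10000),
    ("Connect / Install", 4000),
    ("Create Account", 2000),
    ("Checkout", 500),
    ("Revisit", 250),
]

for row in funnel_report(funnel):
    print(f'{row["stage"]:<20} {row["users"]:>6} '
          f'step={row["step_conversion"]:.0%} overall={row["overall_conversion"]:.1%}')
print(biggest_drop_off(funnel))
```

Running the same report on successive time windows and comparing the conversion columns is one way to approach the third question, how the funnel shifts over time.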
Beyond Funnel: User Journey Analysis
• A Sankey chart can visualize diverging / converging user journeys
Source: Medium
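A Sankey chart is drawn from weighted links between states, so the data preparation amounts to counting transitions between consecutive events. The sketch below is one possible way to derive those links, assuming each session is an ordered list of event names. The sessions, the "Search" and "Exit" events, and the function name `journey_links` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

def journey_links(sessions: Iterable[List[str]]) -> List[Tuple[str, str, int]]:
    """Count transitions between consecutive events across all sessions.

    Returns (source, target, count) triples, largest flows first; this is
    the link list a Sankey chart needs.
    """
    counts: Counter = Counter()
    for events in sessions:
        for src, dst in zip(events, events[1:]):
            counts[(src, dst)] += 1
    return sorted(
        ((s, d, n) for (s, d), n in counts.items()),
        key=lambda link: (-link[2], link[0], link[1]),
    )

# Hypothetical sessions, for illustration only.
sessions = [
    ["App Open", "Page View", "Consume"],
    ["App Open", "Page View", "Exit"],
    ["App Open", "Search", "Page View", "Consume"],
]

links = journey_links(sessions)
for source, target, count in links:
    print(f"{source} -> {target}: {count}")
```

Links where one source fans out to several targets show diverging journeys, and several sources feeding one target show converging ones.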
Example: Analyzing Snapchat User Journey
• Use app events to build & predict a user journey graph
Source: Characterizing and Forecasting User Engagement with In-app Action Graph: A Case Study of Snapchat
Example: Analyzing Snapchat User Journey
• Use these to predict and optimize user satisfaction / retention
Applications for Naver Mobile App
• What does the overall user journey look like?
• Can we understand which factors improve / hinder user satisfaction?
• Can we optimize user experience, thereby improving KPI (DAU)?
DS Problems during Dev. / Launch Stage
Questions
• Is the new design/ranking better than the old one?
• How can we choose the best design/ranking parameters?
Solutions
• Online (A/B) experiment
• Multi-armed Bandit
Before/After Comparison vs. AB Test
• Hard to measure true impact of given feature
• High user impact in case of full launch & roll-back
[Diagram: before/after comparison on 100% of traffic. Control, then Full Launch of Treatment, then Roll-back to Control; External Factors (day-of-week / seasonality / ...) act throughout.]
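Why external factors make a before/after comparison hard to trust can be seen in a noise-free toy calculation. The sketch below assumes a metric made of a day-of-week baseline plus a fixed feature lift. All numbers and names (`WEEKDAY_BASELINE`, `TRUE_LIFT`, `metric`) are hypothetical and chosen only for illustration.

```python
# Toy, noise-free illustration. All numbers are hypothetical.
WEEKDAY_BASELINE = [100.0, 100.0, 100.0, 100.0, 100.0, 120.0, 120.0]  # Mon..Sun
TRUE_LIFT = 2.0  # effect the new feature really has on the metric

def metric(day: int, treated: bool) -> float:
    """Daily metric = day-of-week baseline + feature lift (if treated)."""
    return WEEKDAY_BASELINE[day % 7] + (TRUE_LIFT if treated else 0.0)

def mean(values) -> float:
    values = list(values)
    return sum(values) / len(values)

# Before/after: control runs Mon-Fri, the feature fully launches on the weekend.
before = mean(metric(d, treated=False) for d in range(0, 5))
after = mean(metric(d, treated=True) for d in range(5, 7))
before_after_estimate = after - before

# Concurrent control: both arms run on the same seven days,
# so the day-of-week baseline cancels out of the difference.
control = mean(metric(d, treated=False) for d in range(7))
treatment = mean(metric(d, treated=True) for d in range(7))
ab_estimate = treatment - control

print(f"true lift:             {TRUE_LIFT:.1f}")
print(f"before/after estimate: {before_after_estimate:.1f}")
print(f"A/B estimate:          {ab_estimate:.1f}")
```

In this toy setup the before/after estimate mixes the weekend effect into the feature effect, while the concurrent comparison recovers the true lift.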
Before/After Comparison vs. AB Test
• AB testing allows measurement w/o impact of external factors
• Multiple treatments with versioning & roll-out support
[Diagram: on 10~20% of traffic, AB Test V1 compares Control with Treatment1 (V1) and Treatment2 (V1); after problem solving, AB Test V2 compares Control with Treatment1 (V2) and Treatment2 (V2); Treatment2 then moves to gradual roll-out & monitoring. External Factors (*) are present throughout. (* no impact under randomized controlled experiment)]
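Once users are randomly split, deciding whether a treatment beats control is a statistical comparison of the two arms. The sketch below uses a standard two-proportion z-test on a conversion-style metric. This is a generic textbook method, not necessarily what the Naver Search platform uses, and the counts and the function name `two_proportion_ztest` are hypothetical.

```python
import math
from typing import Tuple

def two_proportion_ztest(conv_c: int, n_c: int, conv_t: int, n_t: int) -> Tuple[float, float]:
    """Two-sided z-test for a difference in conversion rate between two arms.

    conv_c / n_c: conversions and users in control
    conv_t / n_t: conversions and users in treatment
    Returns (z statistic, two-sided p-value).
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p_t - p_c) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: 10,000 users per arm.
z, p = two_proportion_ztest(conv_c=1000, n_c=10000, conv_t=1100, n_t=10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The test assumes independent users and a fixed sample size decided in advance; repeatedly checking results during the experiment inflates false positives.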
Pitfalls in Real-world AB Testing (DeView’21)
Modern AB Test for Naver Search (DeView’21)
Can we use the user bucketing idea to dynamically optimize AI model / UI parameters?
But I have hundreds of parameters to choose from. I’d also like to minimize user impact.
Contextual Bandits for Parameter Optimization
• Converge to the best action (max. reward) given a context
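The idea of converging to the best action per context can be sketched with a tabular epsilon-greedy policy, one of the simplest contextual bandit strategies. The slide does not name a specific algorithm, so this is only an illustrative stand-in. The contexts, actions, reward function and class name `EpsilonGreedyContextualBandit` are hypothetical.

```python
import random
from collections import defaultdict
from typing import Hashable, Iterable, Optional

class EpsilonGreedyContextualBandit:
    """Keeps a mean-reward estimate per (context, action) pair."""

    def __init__(self, actions: Iterable[Hashable], epsilon: float = 0.1,
                 seed: Optional[int] = None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, action) -> times chosen
        self.totals = defaultdict(float)  # (context, action) -> summed reward

    def mean_reward(self, context, action) -> float:
        n = self.counts[(context, action)]
        return self.totals[(context, action)] / n if n else 0.0

    def best_action(self, context):
        """Action with the highest estimated reward in this context."""
        return max(self.actions, key=lambda a: self.mean_reward(context, a))

    def select(self, context):
        # Try every action once per context before trusting the estimates.
        for action in self.actions:
            if self.counts[(context, action)] == 0:
                return action
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(context)

    def update(self, context, action, reward: float) -> None:
        self.counts[(context, action)] += 1
        self.totals[(context, action)] += reward

# Hypothetical setup: the best UI layout differs by device type.
TRUE_BEST = {"mobile": "compact", "desktop": "detailed"}
bandit = EpsilonGreedyContextualBandit(["compact", "standard", "detailed"],
                                       epsilon=0.1, seed=0)
for step in range(200):
    context = "mobile" if step % 2 == 0 else "desktop"
    action = bandit.select(context)
    reward = 1.0 if action == TRUE_BEST[context] else 0.0  # simulated feedback
    bandit.update(context, action, reward)

print({c: bandit.best_action(c) for c in TRUE_BEST})
```

Because most traffic goes to the currently best-looking action, fewer users are exposed to poor settings than in a fixed equal split. A table like this only works for a small, discrete set of contexts and actions; with hundreds of parameters a model that generalizes across them would be needed.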
DS Problems during Monitoring Stage
Questions
• Has there been any shift in user behavior / metrics?
• Have the algorithm’s results shown any new defects?
Solutions
• Metric monitoring / detection framework
• Results defect detection framework
Metric Regression Detection Framework
• Generating alerts for metric regression is easy
• The hard part is minimizing false positives
• Solutions to control the false discovery rate are available
Source: KDD’20 Paper from Microsoft
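One widely used way to control the false discovery rate when many metrics are checked at once is the Benjamini-Hochberg procedure. The sketch below shows that standard procedure; it is not taken from the cited paper, which may use a different method. The p-values and the function name `benjamini_hochberg` are hypothetical.

```python
from typing import List, Sequence

def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is flagged at the given FDR level.

    Step-up rule: sort the p-values, find the largest rank k with
    p_(k) <= (k / m) * fdr, and flag the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    flagged = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            flagged[idx] = True
    return flagged

# Hypothetical p-values from regression checks on six metrics.
p_values = [0.001, 0.008, 0.039, 0.041, 0.2, 0.6]
alerts = benjamini_hochberg(p_values, fdr=0.05)
naive = [p <= 0.05 for p in p_values]

print("naive alerts:", sum(naive))
print("BH alerts:   ", sum(alerts))
```

In this example a plain 0.05 threshold would raise four alerts, while the procedure keeps only the two strongest signals.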
Defect Monitoring for Naver Search Results
Continuous Defect Monitoring and Improvement:
• Input Query Set: manual reporting, sample from traffic
• Scraping & Annotation: SRP scraping & parsing, query taxonomy
• Automated Quality Estimate: SRP layout signals, engagement signals
• Collect Human Quality Rating: SRP side-by-side rating, snippet rating
• Defect Reporting: query defect alerts, defect weekly discussion
Closing Thoughts
• Modern / AI-powered online services are more powerful, but that power brings greater challenges and responsibilities
• Data Science provides various solutions throughout the lifecycle of online services
• The choice of techniques depends on the specifics of the service (Contents / Commerce / Social network / …)
• Learning data science and starting a career in it are also more accessible than ever (bootcamps, better tools and guides)
We’re hiring @ Naver Search US!
• Exciting Data Science & Artificial Intelligence problems across Naver & Line
• Best of both Korean and US tech working culture
• Huge growth opportunities (150+ people in 3-5 years)
• Locations in Seattle & Bay Area (+remote options)
https://naver-career.gitbook.io/en/

Data Science for Online Services: Problems & Frontiers (Changbal Conference 2021)

  • 1. Data Science for Online Services: Problems & Frontiers Jinyoung Kim Data & Analytics Naver Search US 2021 Changbal Conference
  • 2. About Me Head of Data Science @ Naver Search Executive Director of Naver Search US Ex-MSFT / Ex-Snap (Search & RecSys) Co-founder / 1st President of Changbal * 한국/미국에서 Data Scientist & Engineer 채용중입니다! (jin.y.kim@navercorp.com) https://medium.com/naver-dna-tech-blog
  • 3. Mission: Making Data Science as Cool as AI
  • 4. Mission: Making Data Science as Cool as AI
  • 6. What constitutes a modern online service? AI- enabled Mobile- first Cloud- backend Powerful algorithms Limitless Computing Infrastructure Contextual awareness, Omnipresence
  • 7. Challenge#1: Dynamic Environment Societal and environment changes New apps and services launch daily Underpinning technologies evolve COVID-19 Climate Change GPT-3 and large- scale language models Yelp Pinterest AirBnB Election Breakthrough in Computer Vision TicTok DoorDash
  • 8. Solution: Metric Regression Detection Source: KDD’20 Paper from Microsoft
  • 9. Challenge#2: Ecosystem / Social Impact • Online services need to look beyond user satisfaction • Search / RecSys results can instill and reinforce social bias • Algorithmic ranking means life/death for content providers • Manipulating algorithmic results became an industry
  • 10. Solutions to Bias / Fairness in Ranking Source: http://naversearchconf.naver.com/
  • 11. Challenge#3: Internal Alignment • Major online service companies have dozens of teams, each with a different focus • Blind optimization in one area can lead to a negative user experience
  • 12. Modern AB Test for Naver Search (DeView’21)
  • 13. Data Science Problems for Online Services
  • 14. Problems & Solutions by Service Lifecycle • Each stage presents different analysis and decision problems, with corresponding data science solutions Stages: PLAN, DEVELOP, LAUNCH, MONITOR Solutions: Defect Monitoring, Competitive Analysis, Offline Experiment, A/B Experiment, KPI Monitoring, Opportunity Analysis, Parameter Optimization, Crowdsourced Evaluation
  • 15. DS Problems during Planning Stage Questions • How can we acquire & onboard new users? • What are the pain points with existing customers? • Where are we lagging behind competing services? Solutions • User funnel / journey analysis • Side-by-side comparison • In-depth user study
  • 16. Where do users come & go? Funnel Analysis • New Users for E-Commerce • Existing Users for Contents Discovery (Search / App Store) Connect / Install Create Account Checkout Revisit App Open Page View (browsing) Page View Consume Revisit Where do most users drop off the funnel? Which part of the funnel has the biggest leverage? How is the funnel shifting over time?
  • 17. Beyond Funnel: User Journey Analysis • Sankey charts can visualize diverging / converging user journeys Source: Medium
  • 18. Example: Analyzing Snapchat User Journey • Use app events to build & predict user journey graph Source: Characterizing and Forecasting User Engagement with In-app Action Graph: A Case Study of Snapchat
  • 19. Example: Analyzing Snapchat User Journey • Use these to predict and optimize user satisfaction / retention
  • 20. Applications for Naver Mobile App • What does the overall user journey look like? • Can we understand which factors improve / hinder user satisfaction? • Can we optimize the user experience, thereby improving the KPI (DAU)?
  • 21. DS Problems during Dev. / Launch Stage Questions • Is the new design/ranking better than the old one? • How can we choose the best design/ranking parameters? Solutions • Online (A/B) experiment • Multi-armed Bandit
  • 22. Before/After Comparison vs. AB Test • Hard to measure the true impact of a given feature • High user impact in case of full launch & roll-back Control Treatment External Factors (day-of-week / seasonality / ...) Full Launch Roll-back Control 100% Traffic
  • 23. Before/After Comparison vs. AB Test • AB testing allows measurement w/o impact of external factors • Multiple treatments with versioning & roll-out support Control Treatment1 (V1) AB Test V1 External Factors (*) (* no impact under randomized controlled experiment) Roll-out & Monitoring Control Treatment1 (V2) AB Test V2 Treatment2 (gradual roll-out) Problem Solving 10~20% Traffic Treatment2 (V1) Treatment2 (V2)
  • 24. Pitfalls in Real-world AB Testing (DeView’21)
  • 25. Modern AB Test for Naver Search (DeView’21)
  • 26. Can we use the user bucketing idea to dynamically optimize AI model / UI parameters? But I have hundreds of parameters to choose from. I’d also like to minimize user impact
  • 27. Contextual Bandits for Parameter Optimization • Converge to the best action (max. reward) given a context
  • 28. DS Problems during Monitoring Stage Questions • Has there been any shift in user behavior / metrics? • Have the algorithm results shown any new defects? Solutions • Metric monitoring / detection framework • Results defect detection framework
  • 29. Metric Regression Detection Framework • Generating alerts for metric regression is easy • The hard part is minimizing false positives • Solutions to control the false discovery rate are available Source: KDD’20 Paper from Microsoft
  • 30. Defect Monitoring for Naver Search Results Input Query Set Scraping & Annotation Automated Quality Estimate Collect Human Quality Rating Defect Reporting Continuous Defect Monitoring and Improvement Manual reporting Sample from traffic SRP side-by-side rating Snippet rating SRP scraping & parsing Query taxonomy SRP layout signals Engagement signals Query defect alerts Defect weekly discussion
  • 32. Closing Thoughts • Modern / AI-powered online services are more powerful, but with that power come greater challenges and responsibilities. • Data Science provides various solutions throughout the lifecycle of online services • The choice of techniques depends on the specifics of the service (Contents / Commerce / Social network / …) • Learning data science and starting a career in it are also more accessible than ever (bootcamps, better tools and guides) AI-enabled Mobile-first Cloud-backend
  • 33. We’re hiring @ Naver Search US! • Exciting Data Science & Artificial Intelligence problems across Naver & Line • Best of both Korean and US tech working culture • Huge growth opportunities (150+ people in 3-5 years) • Locations in Seattle & Bay Area (+remote options) https://naver-career.gitbook.io/en/
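
Slide 27 describes contextual bandits as converging to the best action (maximum reward) given a context. The sketch below illustrates that idea with a tabular epsilon-greedy policy, which is only one simple option and not necessarily the approach used at Naver. The contexts ("mobile", "desktop"), the actions ("compact", "rich"), and the click probabilities are all hypothetical values made up for the simulation.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy bandit: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, action) -> number of pulls
        self.means = defaultdict(float)   # (context, action) -> mean observed reward

    def best_action(self, context):
        # Action with the highest estimated reward for this context.
        return max(self.actions, key=lambda a: self.means[(context, a)])

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the current best.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(context)

    def update(self, context, action, reward):
        # Incremental update of the running mean for (context, action).
        key = (context, action)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def simulate(rounds=20000, seed=0):
    # Hypothetical click probabilities per (device context, layout action).
    click_prob = {
        ("mobile", "compact"): 0.50, ("mobile", "rich"): 0.05,
        ("desktop", "compact"): 0.05, ("desktop", "rich"): 0.50,
    }
    env = random.Random(seed + 1)
    bandit = EpsilonGreedyContextualBandit(["compact", "rich"], epsilon=0.1, seed=seed)
    for _ in range(rounds):
        context = env.choice(["mobile", "desktop"])
        action = bandit.choose(context)
        reward = 1.0 if env.random() < click_prob[(context, action)] else 0.0
        bandit.update(context, action, reward)
    return bandit


if __name__ == "__main__":
    bandit = simulate()
    for context in ("mobile", "desktop"):
        print(context, "->", bandit.best_action(context))
```

Only a small share of traffic (here 10%) is spent on exploration, which is one way to address the concern on slide 26 about minimizing user impact. With hundreds of parameters, a tabular policy like this would likely need to be replaced by a model that generalizes across contexts and actions.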

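Slide 29 notes that the hard part of metric regression alerting is minimizing false positives, and that solutions for controlling the false discovery rate exist. As a minimal sketch of one such solution, the code below applies the standard Benjamini-Hochberg procedure to a batch of per-metric p-values. This is an assumption chosen for illustration, not a description of the method in the cited KDD'20 paper, and the metric names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from comparing today's metrics to a baseline.
    metric_p_values = {
        "ctr": 0.001,
        "latency": 0.008,
        "dwell_time": 0.039,
        "abandonment": 0.041,
        "revenue": 0.60,
    }
    flags = benjamini_hochberg(list(metric_p_values.values()), alpha=0.05)
    for (metric, p), alert in zip(metric_p_values.items(), flags):
        print(f"{metric:12s} p={p:<6} alert={alert}")
```

In this made-up example, a naive per-metric threshold of p < 0.05 would raise four alerts, while the procedure keeps only the two strongest ones (ctr and latency).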
Editor's Notes

  1. Data-driven decision making and optimization is feasible throughout