DATA SCIENCE POP UP AUSTIN
Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center
Jordana Heller
Data Scientist, Mattersight
jheller
#datapopupaustin
April 13, 2016
Galvanize, Austin Campus
Lightning Talk: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center
Jordana Heller @jheller
Data Science Pop-up Austin, April 13, 2016
©2016 Mattersight Corporation. Mattersight Restricted Confidential Information.
What We Do
Our goal: Topic Trends
Identifying contents and prevalence of multiword topics present in conversation in an unsupervised way
[Figure: monthly timeline, 3/31/2016 to 7/31/2016. Panels: Unexpected Prevalence, Critical Spikes, Escalating Frequency]
Our goals, continued
Manageable number of topics
Track expected and unexpected topics
Go deep: Contextualize topic usage
Short text: Keywords, hashtags, n-grams
Long text: Could use predetermined topics
Image credit: IBM Watson Concept Insights
Long text: Or discover themes
Image credit: Blei, 2012, Communications of the ACM
Latent Dirichlet Allocation (LDA) (Blei et al., 2003)
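To make the idea of discovered themes concrete, here is a minimal sketch of the generative story that LDA assumes: each document draws a topic mixture from a Dirichlet prior, and each word is drawn by first picking a topic and then picking a word from that topic. The two toy topics, the `alpha` values, and the function names are invented for illustration; fitting a real model runs this story in reverse and is not shown.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, n_words, rng):
    """Generate one document under the LDA generative story.

    topics: list of dicts mapping word -> probability (one dict per topic)
    alpha:  Dirichlet concentration parameters, one per topic
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    words = []
    for _ in range(n_words):
        # Pick a topic for this word position, then a word from that topic.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab = list(topics[z])
        weights = [topics[z][v] for v in vocab]
        words.append(rng.choices(vocab, weights=weights)[0])
    return theta, words

# Hypothetical toy topics, invented for illustration only.
TOPICS = [
    {"beach": 0.4, "ocean": 0.3, "view": 0.3},
    {"confirm": 0.5, "email": 0.3, "arrival": 0.2},
]

rng = random.Random(0)
theta, words = generate_document(TOPICS, alpha=[0.5, 0.5], n_words=20, rng=rng)
print([round(t, 3) for t in theta], words[:5])
```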
Great! How about contextualizing trends?
• Where are topics trending?
• Structural Topic Modeling (Roberts et al., 2013)
– Instead of relying on post-hoc comparisons, includes covariates in LDA model
• Specifies priors as GLMs
• Word distribution determined by topic, covariates, topic-covariate interaction
– Authors’ implementation: R package stm (available via CRAN; all code on GitHub!)
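The phrase "priors as GLMs" can be illustrated with a small sketch of the topic-prevalence side of the model: a call's covariates feed a linear predictor per topic, and a softmax turns those scores into topic proportions. This is a simplified illustration, assuming made-up coefficients and omitting the noise term of the logistic-normal prior; it is not the stm package's code, and `GAMMA` and `expected_prevalence` are hypothetical names.

```python
import math

def softmax(scores):
    """Convert real-valued scores into proportions that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_prevalence(covariates, gamma):
    """Map one call's covariates to expected topic proportions.

    covariates: list of floats with a leading 1.0 for the intercept
    gamma:      one coefficient list per topic (hypothetical values)
    """
    scores = [sum(g * x for g, x in zip(row, covariates)) for row in gamma]
    return softmax(scores)

# Hypothetical coefficients, ordered [intercept, booking, distress].
GAMMA = [
    [0.0, 1.0, -0.5],  # topic 0: more prevalent on booking calls
    [0.0, -0.5, 1.5],  # topic 1: more prevalent on distressed calls
    [0.0, 0.0, 0.0],   # topic 2: reference topic
]

booking_call = expected_prevalence([1.0, 1.0, 0.0], GAMMA)
distress_call = expected_prevalence([1.0, 0.0, 1.0], GAMMA)
print([round(p, 3) for p in booking_call])
print([round(p, 3) for p in distress_call])
```

Because the covariates sit inside the model, a shift in topic proportions between booking and non-booking calls is estimated directly rather than by comparing topic assignments after the fact.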
Ready to talk pipeline!
Data Collection and Preprocessing
• Read transcripts
• Add call-level covariates
• Preprocess text
– Collocations
– Remove stop words
– Stem/completion
– Remove low-frequency terms
• Create term-document matrix
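The preprocessing steps above can be sketched on a toy corpus. This is a minimal illustration, assuming a tiny hand-written stop-word list and three invented transcripts; collocation detection and stemming are left out, and the function names are hypothetical rather than taken from the talk's pipeline.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "to", "i", "my", "is", "and", "for"}

def preprocess(text):
    """Lowercase, strip basic punctuation, and drop stop words."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]

def term_document_matrix(docs, min_doc_freq=2):
    """Build a term-document count matrix (rows = terms, columns = documents).

    Terms appearing in fewer than min_doc_freq documents are dropped.
    """
    counts = [Counter(preprocess(d)) for d in docs]
    doc_freq = Counter(term for c in counts for term in c)
    vocab = sorted(t for t, n in doc_freq.items() if n >= min_doc_freq)
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix

# Invented transcripts for illustration only.
TRANSCRIPTS = [
    "I need to confirm my reservation for the beach hotel.",
    "Is the beach view room available?",
    "Please confirm the email for my reservation.",
]

vocab, tdm = term_document_matrix(TRANSCRIPTS)
for term, row in zip(vocab, tdm):
    print(term, row)
```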
Topic Model Creation
• Retrieve last topic model
– For comparison
• Create current topic model
– Detect number of topics, or specify
• Create topic labels
Topic Model Comparison
• Inspect overall topic prevalence
• Compare overall topic prevalence across periods
– Topics change!
• Measure change in word probability distributions for each new topic with respect to each old topic
– Match new to closest previous match below change threshold (otherwise new topic)
– Evaluate trends!
• Estimate and inspect effects of covariates
• Compare effects of covariates across periods
– Output can be interpreted similarly to regression
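The matching step can be sketched as follows: compute a distance between each new topic's word distribution and each old topic's, link the new topic to its closest predecessor if that distance is below a threshold, and otherwise treat it as a new topic. The slides do not say which distance or threshold was used, so the Hellinger distance and the 0.5 cutoff here are assumptions, as are the toy topics and function names.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two word distributions (0 = same, 1 = disjoint)."""
    vocab = set(p) | set(q)
    s = sum((math.sqrt(p.get(w, 0.0)) - math.sqrt(q.get(w, 0.0))) ** 2 for w in vocab)
    return math.sqrt(s / 2.0)

def match_topics(new_topics, old_topics, threshold=0.5):
    """Map each new topic to its closest old topic, or None if it looks new."""
    matches = {}
    for new_name, new_dist in new_topics.items():
        best_name, best_dist = None, float("inf")
        for old_name, old_dist in old_topics.items():
            d = hellinger(new_dist, old_dist)
            if d < best_dist:
                best_name, best_dist = old_name, d
        # Above the threshold, the topic is treated as new.
        matches[new_name] = best_name if best_dist < threshold else None
    return matches

# Hypothetical word distributions for the previous and current period.
OLD = {
    "views": {"beach": 0.5, "ocean": 0.3, "view": 0.2},
    "confirmations": {"confirm": 0.6, "email": 0.4},
}
NEW = {
    "t1": {"beach": 0.45, "ocean": 0.35, "view": 0.1, "balcony": 0.1},
    "t2": {"convention": 0.7, "center": 0.3},
}

matches = match_topics(NEW, OLD)
print(matches)
```

Here `t1` links back to the earlier "views" topic, while `t2` shares no vocabulary with any old topic and is flagged as new.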
Example results: Hotel reservations
Covariates: booking, caller distress
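To show what reading covariate effects "similarly to regression" can look like, here is a minimal sketch that regresses one topic's per-call proportion on a binary booking flag. With a single binary covariate, the least-squares slope equals the difference in mean topic share between booking and non-booking calls. The data are invented, and this plain regression stands in for the effect estimation that the stm package performs; it is not that package's method.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: 1 = call resulted in a booking, 0 = it did not.
booking = [1, 1, 1, 0, 0, 0]
# Hypothetical share of each call's words assigned to one topic.
topic_share = [0.30, 0.25, 0.35, 0.10, 0.05, 0.15]

slope, intercept = ols_slope_intercept(booking, topic_share)
print(f"baseline share: {intercept:.2f}, change on booking calls: {slope:+.2f}")
```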
Trend Contextualization: Booking
Hit: > 1% of words on call assigned to a given topic
Chart legend: New, Decreasing, Increasing
Topics highlighted, one per slide:
• convention, center, mind, worry, philadelphia, inventory
• school, college, graduate, medical, clinic
• 30% beach, balcony, ocean, view
• 10% back, next, receive, listen, cash future
• back, minute, system, run, inconvenience
• 42% confirm, email, arrival, local
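The hit definition used on these slides can be written as a small check: a call counts as a hit for a topic when more than 1% of its words are assigned to that topic. This is a minimal sketch, assuming each word on the call already carries a topic index from the fitted model; the function name and the toy call are hypothetical.

```python
def topic_hits(word_topics, n_topics, threshold=0.01):
    """Flag which topics a call 'hits'.

    word_topics: one topic index per word on the call
    Returns a list of booleans, one per topic.
    """
    n_words = len(word_topics)
    if n_words == 0:
        return [False] * n_topics
    counts = [0] * n_topics
    for z in word_topics:
        counts[z] += 1
    # A hit requires strictly more than the threshold share of words.
    return [c / n_words > threshold for c in counts]

# Hypothetical 200-word call: 150 words in topic 0, 49 in topic 1, 1 in topic 2.
call = [0] * 150 + [1] * 49 + [2] * 1
hits = topic_hits(call, n_topics=3)
print(hits)
```

Topic 2 covers only 0.5% of the call's words, so it falls under the 1% cutoff and is not counted as a hit.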
Trend Contextualization: Caller Distress
Distress: > 30 seconds of linguistically identified dissatisfaction or negative emotion
Chart legend: New, Decreasing, Increasing
Topics highlighted, one per slide:
• square, city, price, hotel, manhattan, central
• 12% online, website, cancel, purchase, advance
Nice!
Our goals, revisited
Manageable number of topics
Track expected and unexpected topics
Go deep: Contextualize topic usage
Topic trends using structural topic models
Thank you!
@datapopup
#datapopupaustin

 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxolyaivanovalion
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 

Último (20)

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 

Data Science Popup Austin: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center

  • 1. Data Science Pop-up Austin: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center. Jordana Heller, Data Scientist, Mattersight (jheller)
  • 4. Lightning Talk: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center. Jordana Heller (@jheller), Data Science Pop-up Austin, April 13, 2016
  • 5. ©2016 Mattersight Corporation. Mattersight Restricted Confidential Information. What We Do
  • 6. Our goal: Topic Trends. Identifying contents and prevalence of multiword topics present in conversation in an unsupervised way. [Timeline chart, 3/31/2016 to 7/31/2016, annotated: Unexpected Prevalence, Critical Spikes, Escalating Frequency]
  • 7. Our goals, continued: manageable number of topics; track expected and unexpected topics; go deep: contextualize topic usage
  • 8. Short text: Keywords, hashtags, ngrams
  • 9. Long text: Could use predetermined topics. Image credit: IBM Watson Concept Insights
  • 10. Long text: Or discover themes. Latent Dirichlet Allocation (LDA) (Blei et al., 2003). Image credit: Blei, 2012, Communications of the ACM
  • 11. Great! How about contextualizing trends? Where are topics trending? Structural Topic Modeling (Roberts et al., 2013): instead of relying on post-hoc comparisons, it includes covariates in the LDA model. It specifies priors as GLMs, and the word distribution is determined by topic, covariates, and topic-covariate interaction. Authors' implementation: R package stm (available via CRAN; all code on GitHub!)
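A minimal sketch of the "word distribution determined by topic, covariates, topic-covariate interaction" idea, assuming the usual log-linear form in which a baseline log-frequency and three sets of deviations are summed and then normalized. The function name, the three-word vocabulary, and all numbers are hypothetical; this is an illustration in Python, not the stm package's estimation code.

```python
import math

def word_distribution(baseline, topic_dev, covariate_dev, interaction_dev):
    """Softmax over summed log-scale terms for one (topic, covariate level) pair."""
    scores = [m + t + c + i
              for m, t, c, i in zip(baseline, topic_dev, covariate_dev, interaction_dev)]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical vocabulary and baseline word frequencies (assumed values).
vocab = ["room", "cancel", "beach"]
baseline = [math.log(0.5), math.log(0.3), math.log(0.2)]
no_shift = [0.0, 0.0, 0.0]

# With no deviations, the baseline frequencies are recovered.
probs_plain = word_distribution(baseline, no_shift, no_shift, no_shift)

# A positive topic deviation on "beach" raises its probability within that topic.
probs_shifted = word_distribution(baseline, [0.0, 0.0, 1.0], no_shift, no_shift)
```

Setting a deviation to zero leaves the corresponding word at its baseline rate, which is why sparse deviations keep topics interpretable.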
  • 12. Ready to talk pipeline!
  • 13. Data Collection and Preprocessing: read transcripts; add call-level covariates; preprocess text (find collocations, remove stop words, stem/completion, remove low-frequency terms); create term-document matrix
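The preprocessing steps on this slide can be sketched roughly as follows, assuming whitespace tokenization, a tiny hand-written stop-word list, and a document-frequency cutoff for low-frequency terms. All names, the stop-word list, and the sample transcripts are hypothetical; stemming and stem completion are left out to keep the sketch short.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "to", "i", "is", "for", "my"}

def preprocess(transcript, collocations):
    """Lowercase, join known collocations with '_', and drop stop words."""
    text = transcript.lower()
    for phrase in collocations:
        # Naive substring replacement; a real pipeline would match on token boundaries.
        text = text.replace(phrase, phrase.replace(" ", "_"))
    return [tok for tok in text.split() if tok not in STOP_WORDS]

def term_document_matrix(docs_tokens, min_doc_freq=2):
    """Return (vocab, matrix) with one row per term and one column per document."""
    counts = [Counter(tokens) for tokens in docs_tokens]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    # Drop terms that appear in fewer than min_doc_freq documents.
    vocab = sorted(term for term, n in doc_freq.items() if n >= min_doc_freq)
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix

transcripts = [
    "I need to cancel my reservation",
    "cancel the ocean view room",
    "is the ocean view room free",
]
docs = [preprocess(t, collocations=["ocean view"]) for t in transcripts]
vocab, matrix = term_document_matrix(docs)
```

Call-level covariates would be kept in a parallel table indexed by the same document order as the matrix columns.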
  • 14. Topic Model Creation: retrieve last topic model (for comparison); create current topic model (detect number of topics, or specify); create topic labels
  • 15. Topic Model Comparison: inspect overall topic prevalence; compare overall topic prevalence across periods (topics change!); measure change in word probability distributions for each new topic with respect to each old topic (match new to closest previous match below change threshold, otherwise new topic; evaluate trends!); estimate and inspect effects of covariates; compare effects of covariates across periods (output can be interpreted similarly to regression)
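One way the new-to-old topic matching could look is sketched below. The slide does not name the change measure, so Jensen-Shannon divergence is an assumption here, as are the threshold value, the function names, and the toy distributions (each list is a topic's word probabilities over a shared vocabulary).

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

def match_topics(new_topics, old_topics, threshold=0.05):
    """Map each new topic to its closest old topic, or None if none is below threshold."""
    matches = {}
    for new_name, new_dist in new_topics.items():
        best_name, best_score = None, float("inf")
        for old_name, old_dist in old_topics.items():
            score = js_divergence(new_dist, old_dist)
            if score < best_score:
                best_name, best_score = old_name, score
        # Above the change threshold, the topic is treated as new.
        matches[new_name] = best_name if best_score < threshold else None
    return matches

# Hypothetical word distributions over a three-word vocabulary.
old_topics = {"beach": [0.7, 0.2, 0.1], "billing": [0.1, 0.2, 0.7]}
new_topics = {"t1": [0.65, 0.25, 0.10], "t2": [1 / 3, 1 / 3, 1 / 3]}
matches = match_topics(new_topics, old_topics)
```

Here `t1` stays close to the old "beach" topic, while `t2` is far from both old topics and is flagged as new. The threshold would need tuning on real models, and both models must share a vocabulary for the comparison to make sense.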
  • 16. Example results: Hotel reservations. Covariates: booking, caller distress
  • 17. Trend Contextualization: Booking. Topic words: convention, center, mind, worry, philadelphia, inventory. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
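The "hit" definition on this slide can be turned into a small check, sketched here under the assumption that per-call word-to-topic assignment counts are already available. The record layout, field names, and numbers are hypothetical.

```python
def is_hit(topic_word_count, total_word_count, min_share=0.01):
    """A call is a hit when more than min_share of its words are assigned to the topic."""
    if total_word_count == 0:
        return False
    return topic_word_count / total_word_count > min_share

def hit_rate_by_group(calls, topic, covariate):
    """Share of calls that are hits for the topic, split by a call-level covariate."""
    totals, hits = {}, {}
    for call in calls:
        group = call[covariate]
        totals[group] = totals.get(group, 0) + 1
        if is_hit(call["topic_words"].get(topic, 0), call["total_words"]):
            hits[group] = hits.get(group, 0) + 1
    return {group: hits.get(group, 0) / totals[group] for group in totals}

# Hypothetical calls: whether a booking was made, call length, words per topic.
calls = [
    {"booking": True, "total_words": 400, "topic_words": {"beach": 12}},  # 3% of words
    {"booking": True, "total_words": 500, "topic_words": {"beach": 4}},   # 0.8% of words
    {"booking": False, "total_words": 300, "topic_words": {"beach": 0}},
    {"booking": False, "total_words": 200, "topic_words": {}},
]
rates = hit_rate_by_group(calls, topic="beach", covariate="booking")
```

This is only a descriptive comparison; the talk's approach estimates covariate effects inside the structural topic model rather than post hoc.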
  • 18. Trend Contextualization: Booking. Topic words: school, college, graduate, medical, clinic. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 19. Trend Contextualization: Booking. Callout: 30%. Topic words: beach, balcony, ocean, view. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 20. Trend Contextualization: Booking. Callout: 10%. Topic words: back, next, receive, listen, cash future. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 21. Trend Contextualization: Booking. Topic words: back, minute, system, run, inconvenience. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 22. Trend Contextualization: Booking. Callout: 42%. Topic words: confirm, email, arrival, local. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 23. Trend Contextualization: Caller Distress. Legend: New, Decreasing, Increasing. Distress: > 30 seconds of linguistically identified dissatisfaction or negative emotion
  • 24. Trend Contextualization: Caller Distress. Topic words: square, city, price, hotel, manhattan, central. Legend: New, Decreasing, Increasing. Distress: > 30 seconds of linguistically identified dissatisfaction or negative emotion
  • 25. Trend Contextualization: Caller Distress. Callout: 12%. Topic words: online, website, cancel, purchase, advance. Legend: New, Decreasing, Increasing. Distress: > 30 seconds of linguistically identified dissatisfaction or negative emotion
  • 26. Nice!
  • 27. Our goals, revisited: manageable number of topics; track expected and unexpected topics; go deep: contextualize topic usage
  • 28. Topic trends using structural topic models. Thank you!