Intent Classification of Voice Queries on Mobile Devices


Intent Classification of Voice Queries on Mobile Devices, Subhabrata Mukherjee, Ashish Verma and Kenneth W. Church, In Proceedings of the 22nd International World Wide Web Conference (WWW 2013), Rio De Janeiro, Brazil, May 13 - May 17, 2013 (Poster)


Intent Classification of Voice Queries on Mobile Devices
Subhabrata Mukherjee†, Ashish Verma†, Kenneth W. Church‡
†IBM India Research Lab, ‡IBM T.J. Watson Research Center
Introduction
• Mobile query classification is made difficult by short, noisy queries and by interactive and personalized queries such as map (where is the closest restaurant), command-and-control (open facebook), dialogue (how are you?), and joke (will you marry me?) queries.
• Voice queries are more difficult than typed queries because of the errors introduced by the automatic speech recognizer (ASR).
• This work brings the complexities of voice search and intent classification together by proposing a multi-stage classifier for intent classification.
Evaluation
• 52,282 unique queries, with a total of 104,950 impressions, were collected from an Android voice search application and manually tagged. The average number of words per query is 2.3, and the average word error rate of the ASR engine is 20%. The figure below shows the F1 score comparison of different models over the 13 most frequent query classes.
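The per-class F1 score used in the evaluation is the harmonic mean of precision and recall for each query class. A minimal sketch of that computation follows; the function name and the toy labels are illustrative assumptions, not data from the poster.

```python
from collections import Counter


def per_class_f1(gold, predicted):
    """Return {class: F1} from parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = {}
    for label in set(gold) | set(predicted):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


# Toy labels, for illustration only.
gold = ["map", "map", "command", "knowledge"]
predicted = ["map", "command", "command", "knowledge"]
scores = per_class_f1(gold, predicted)
```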
Intent Classification
• Stage 1 - A multi-class Naïve Bayes classifier uses the bag-of-words, part-of-speech (POS) tag information, and domain words of the query as features and predicts the top K probable classes for a query.
• Domain words (DWs) of a query are extracted from the URLs of the top-ranked pages retrieved by a search engine. For example, the search engine retrieves the DWs imdb, marvel, en.wikipedia, youtube, etc. for the query the avengers.
• DWs introduce external knowledge into the classifier (Movies, Music, Sports, etc.) and keep the online data requirement low.
• POS tags detect patterns for Command-and-Control, Map, Knowledge, Dialogue, broken queries (Endpoint), etc.
• Stage 2 - The top K ranked classes are passed, together with some additional features, to a logistic regression classifier.
• Features such as DW and URL ranking, DW and query overlap (Navigational), and the substrings with maximum information gain for a class help differentiate between close query categories from Stage 1.
• Stage 3 - The predictions of the independent binary classifiers are combined into a single prediction.
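Stage 1 and the domain-word features can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the URL-to-DW rule (drop "www." and the top-level domain), the feature prefixes, the class and function names, the precomputed POS tags, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from urllib.parse import urlparse


def domain_words(urls):
    """Extract domain words from result URLs, keeping rank order (assumed rule)."""
    words = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        word = host.rsplit(".", 1)[0]  # drop the top-level domain label
        if word and word not in words:
            words.append(word)
    return words


def featurize(query, pos_tags, urls):
    """Bag-of-words, POS tags (supplied by an external tagger) and domain words."""
    return (["w=" + t for t in query.lower().split()]
            + ["pos=" + t for t in pos_tags]
            + ["dw=" + d for d in domain_words(urls)])


class TopKNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing that returns the K best classes."""

    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for features, label in examples:
            self.class_counts[label] += 1
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        return self

    def top_k(self, features, k):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)  # log prior
            for f in features:
                score += math.log((self.feature_counts[label][f] + 1) / denom)
            scores[label] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Illustrative URLs and POS tags; a real system would call a search engine and a tagger.
dws = domain_words(["https://www.imdb.com/", "https://marvel.com/",
                    "https://en.wikipedia.org/", "https://www.youtube.com/"])

train = [
    (featurize("the avengers", ["DT", "NNS"],
               ["https://www.imdb.com/", "https://en.wikipedia.org/"]), "movies"),
    (featurize("call mom", ["VB", "NN"], []), "command"),
    (featurize("find nearest subway", ["VB", "JJS", "NN"],
               ["https://www.subway.com/"]), "map"),
]
model = TopKNaiveBayes().fit(train)
top2 = model.top_k(featurize("avengers movie", ["NNS", "NN"],
                             ["https://www.imdb.com/"]), k=2)
```

The top K classes returned here would then be the candidates handed to Stage 2.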
[Figure: F1 score per query class (weather, map, nav, command, knowledge, movies, dialogue, restaurant, web, music, sports, acceptor, endpoint) for the models Reg+Vocab+POS+Yahoo, Vocab+POS+Yahoo, Vocab+Yahoo, Vocab+POS, and Vocab.]
• Navigational - Overlap between the query and the DWs
• Map - Presence of the substrings find, close, direction, near, where in the query
• Movie - Presence of IMDB as the topmost DW
• Command-and-Control - Query starting with a verb
• Websearch - Presence of Wikipedia in the topmost two DWs
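The class-specific cues above can be expressed as binary features. The sketch below is an illustration only: the overlap test (a DW appearing in the query with spaces removed), the tiny verb lexicon standing in for a POS tagger, and all names are assumptions rather than details from the poster.

```python
MAP_SUBSTRINGS = ("find", "close", "direction", "near", "where")

# Hypothetical stand-in for a POS tagger: a tiny verb lexicon.
VERBS = {"call", "open", "play", "send"}


def stage2_features(query, dws):
    """Binary cue features for one query and its ranked domain words (DWs)."""
    text = query.lower()
    tokens = text.split()
    squashed = "".join(tokens)  # "best buy com" -> "bestbuycom"
    dws = [d.lower() for d in dws]
    return {
        # Navigational: a DW occurs inside the query once spaces are removed.
        "query_dw_overlap": any(d in squashed for d in dws),
        # Map: any of the listed substrings occurs in the query.
        "map_substring": any(s in text for s in MAP_SUBSTRINGS),
        # Movie: IMDB is the topmost DW.
        "imdb_top": bool(dws) and dws[0] == "imdb",
        # Command-and-Control: the query starts with a verb.
        "starts_with_verb": bool(tokens) and tokens[0] in VERBS,
        # Websearch: Wikipedia is among the topmost two DWs.
        "wikipedia_top2": any("wikipedia" in d for d in dws[:2]),
    }


nav = stage2_features("best buy com", ["bestbuy", "en.wikipedia"])
cmd = stage2_features("call mom", [])
mov = stage2_features("the avengers", ["imdb", "marvel", "en.wikipedia", "youtube"])
loc = stage2_features("find nearest subway", ["subway"])
```

Features of this kind could be appended to the Stage 1 output before training the logistic regression classifier.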
Example Queries
• Knowledge - What is pi
• Endpoint - How to spell
• Navigational - Best buy com
• Command-and-Control - Call mom
• Map - Find nearest Subway
