[Kim+ ICML2012] Dirichlet Process with Mixed Random Measures : A Nonparametric Topic Model for Labeled Data


Shuyo Nakatani

Supervised Nonparametric Topic Model



2012/07/28
Nakatani Shuyo @ Cybozu Labs, Inc
twitter : @shuyo

LDA (Latent Dirichlet Allocation) [Blei+ 03]
• Unsupervised Topic Model
– Each word has an unobserved topic
• Parametric
– The number of topics K is given in advance

(Figure via Wikipedia)

Labeled LDA [Ramage+ 09]

• Supervised Topic Model
– Each document has observed labels
• Parametric

(Figure via [Ramage+ 09])

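Generative Process for L-LDA
• $\boldsymbol{\beta}_k \sim \mathrm{Dir}(\boldsymbol{\eta})$ (topics corresponding to observed labels)
• $\Lambda_k^{(d)} \sim \mathrm{Bernoulli}(\Phi_k)$
• $\boldsymbol{\theta}^{(d)} \sim \mathrm{Dir}(\boldsymbol{\alpha}^{(d)})$ (restricted to labeled parameters)
– where $\boldsymbol{\alpha}^{(d)} = (\alpha_k)_{k : \Lambda_k^{(d)} = 1}$
• $z_i^{(d)} \sim \mathrm{Multi}(\boldsymbol{\theta}^{(d)})$
• $w_i^{(d)} \sim \mathrm{Multi}(\boldsymbol{\beta}_{z_i^{(d)}})$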

via [Ramage+ 09]
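
The generative process for L-LDA can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not the implementation linked from these slides; the function names (`sample_dirichlet`, `generate_llda_document`) and all hyperparameter values are assumptions chosen for the example, and the labels are taken as given rather than drawn from the Bernoulli prior.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_llda_document(labels, beta, alpha, n_words, rng):
    """Generate one document under L-LDA (hypothetical helper).

    labels : indices k with Lambda_k^(d) = 1 (the observed labels)
    beta   : beta[k] is the word distribution of topic k
    alpha  : alpha[k] is the Dirichlet parameter of topic k
    """
    # theta^(d) ~ Dir(alpha^(d)), restricted to the document's labels
    theta = sample_dirichlet([alpha[k] for k in labels], rng)
    vocab = list(range(len(beta[0])))
    words, topics = [], []
    for _ in range(n_words):
        z = rng.choices(labels, weights=theta)[0]   # z_i ~ Multi(theta^(d))
        w = rng.choices(vocab, weights=beta[z])[0]  # w_i ~ Multi(beta_z)
        topics.append(z)
        words.append(w)
    return words, topics

# Toy setting (assumed values): 4 labels/topics, vocabulary of 6 words.
rng = random.Random(0)
K, V = 4, 6
beta = [sample_dirichlet([0.5] * V, rng) for _ in range(K)]  # beta_k ~ Dir(eta)
alpha = [0.5] * K
words, topics = generate_llda_document([0, 2], beta, alpha, 20, rng)
```

Because $\boldsymbol{\theta}^{(d)}$ only has entries for the observed labels, every word's topic is one of the document's labels, which is exactly the one-to-one label-topic correspondence that L-LDA assumes.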

Pros/Cons of L-LDA
• Pros
– Easy to implement

• Cons
– The label-topic correspondence must be specified manually
• Performance depends on that correspondence

(Figure via [Ramage+ 09])

※ My implementation is here: https://github.com/shuyo/iir/blob/master/lda/llda.py

DP-MRM [Kim+ 12]
– Dirichlet Process with Mixed Random Measures

• Supervised Topic Model
• Nonparametric
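– K is the number of labels, not the number of topics

(Figure: graphical model of DP-MRM with nodes $H$, $G_0^k$, $G_j$, $\theta_{ji}$, $x_{ji}$, $\lambda_j$, $r_j$, hyperparameters $\alpha$, $\beta$, $\gamma_k$, $\eta$, and plates $N_j$, $D$, $K$)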

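Generative Process for DP-MRM
Each label has a random measure as its topic space.
• $H = \mathrm{Dir}(\beta)$
• $G_0^k \sim \mathrm{DP}(\gamma_k, H)$
• $\boldsymbol{\lambda}_j \sim \mathrm{Dir}(\boldsymbol{r}_j \eta)$, where $\boldsymbol{r}_j = \big(I(k \in \mathrm{label}(j))\big)_k$
• $G_j \sim \mathrm{DP}\big(\alpha, \sum_{k \in \mathrm{label}(j)} \lambda_{jk} G_0^k\big)$ (mixed random measures)
• $\theta_{ji} \sim G_j$, $x_{ji} \sim F(\theta_{ji}) = \mathrm{Multi}(\theta_{ji})$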

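Stick Breaking Process
• $v_l^k \sim \mathrm{Beta}(1, \gamma_k)$, $\pi_l^k = v_l^k \prod_{d=0}^{l-1} (1 - v_d^k)$
• $\phi_l^k \sim H$, $G_0^k = \sum_{l=0}^{\infty} \pi_l^k \delta_{\phi_l^k}$
• $\boldsymbol{\lambda}_j \sim \mathrm{Dir}(\boldsymbol{r}_j \eta)$, $w_{jt} \sim \mathrm{Beta}(1, \alpha)$, $\pi_{jt} = w_{jt} \prod_{d=0}^{t-1} (1 - w_{jd})$
• $k_{jt} \sim \mathrm{Multi}(\boldsymbol{\lambda}_j)$, $\psi_{jt} \sim G_0^{k_{jt}}$, $G_j = \sum_{t=0}^{\infty} \pi_{jt} \delta_{\psi_{jt}}$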
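
The stick-breaking construction can be simulated directly if the infinite sums are truncated at a finite number of atoms. The sketch below is an illustration under that truncation assumption, using only the Python standard library; the function names, the truncation level `T`, and all hyperparameter values are made up for the example and are not taken from [Kim+ 12].

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def stick_breaking(concentration, truncation, rng):
    """Truncated weights pi_l = v_l * prod_{d<l} (1 - v_d), v_l ~ Beta(1, concentration)."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, concentration)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # truncation: the last atom takes the leftover stick
    return weights

def sample_label_measure(gamma_k, beta, vocab_size, truncation, rng):
    """Truncated G_0^k: weights pi^k and atoms phi_l^k ~ H = Dir(beta)."""
    pi_k = stick_breaking(gamma_k, truncation, rng)
    phi_k = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(truncation)]
    return pi_k, phi_k

def sample_document(doc_labels, label_measures, alpha, eta, n_words, truncation, rng):
    """Truncated G_j for one document, then words x_ji drawn through it."""
    # lambda_j ~ Dir(r_j * eta): only the document's own labels get mass
    lam = sample_dirichlet([eta] * len(doc_labels), rng)
    pi_j = stick_breaking(alpha, truncation, rng)
    atoms = []
    for _ in range(truncation):
        k = rng.choices(doc_labels, weights=lam)[0]                 # k_jt ~ Multi(lambda_j)
        pi_k, _ = label_measures[k]
        l = rng.choices(list(range(truncation)), weights=pi_k)[0]   # psi_jt ~ G_0^{k_jt}
        atoms.append((k, l))
    words = []
    for _ in range(n_words):
        k, l = rng.choices(atoms, weights=pi_j)[0]                  # theta_ji ~ G_j
        phi = label_measures[k][1][l]
        words.append(rng.choices(list(range(len(phi))), weights=phi)[0])  # x_ji ~ Multi(theta_ji)
    return words, atoms, pi_j

# Toy setting (assumed values): 3 labels, vocabulary of 8 words, truncation at 20 atoms.
rng = random.Random(0)
K, V, T = 3, 8, 20
label_measures = [sample_label_measure(1.0, 0.5, V, T, rng) for _ in range(K)]
words, atoms, pi_j = sample_document([0, 2], label_measures, 1.0, 1.0, 30, T, rng)
```

Each atom of $G_j$ is stored as a pair $(k, l)$ pointing into the label-level measure, which mirrors how the dish indexes $k_{jt}, l_{jt}$ are used in the Chinese restaurant franchise representation.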

Chinese Restaurant Franchise
• $t_{ji}$: table index of the $i$-th term in the $j$-th document
• $k_{jt}, l_{jt}$: dish indexes on the $t$-th table of the $j$-th document
– In a standard HDP, this layer consists of only a single DP $G_0$
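
To make the roles of $t_{ji}$, $k_{jt}$ and $l_{jt}$ concrete, the seating process can be simulated from the prior alone. The sketch below is a simplified illustration, not the inference procedure of [Kim+ 12]: it ignores the word likelihood entirely, takes $\boldsymbol{\lambda}_j$ as a fixed input, and uses hypothetical names (`crp_choice`, `seat_document`) and made-up parameter values.

```python
import random

def crp_choice(counts, concentration, rng):
    """Pick an existing index with probability proportional to its count,
    or return len(counts) (a new table/dish) with probability proportional
    to the concentration parameter."""
    weights = list(counts) + [concentration]
    return rng.choices(list(range(len(weights))), weights=weights)[0]

def seat_document(n_words, doc_labels, lam, dish_counts, alpha, gamma, rng):
    """Seat the words of one document (prior only, no likelihood term).

    dish_counts[k][l] counts the tables, over all documents, that serve
    dish l of label k; it is updated in place.
    """
    table_counts = []  # number of words at each table t
    table_dishes = []  # (k_jt, l_jt) for each table t
    t_assign = []      # t_ji for each word i
    for _ in range(n_words):
        t = crp_choice(table_counts, alpha, rng)
        if t == len(table_counts):                         # open a new table
            k = rng.choices(doc_labels, weights=lam)[0]    # label chosen via lambda_j
            l = crp_choice(dish_counts[k], gamma[k], rng)  # dish within label k
            if l == len(dish_counts[k]):                   # brand-new dish for label k
                dish_counts[k].append(0)
            dish_counts[k][l] += 1
            table_counts.append(0)
            table_dishes.append((k, l))
        table_counts[t] += 1
        t_assign.append(t)
    return t_assign, table_dishes

# Toy setting (assumed values): 3 labels, one document carrying labels 0 and 2.
rng = random.Random(0)
gamma = [1.0, 1.0, 1.0]
dish_counts = [[], [], []]
t_assign, table_dishes = seat_document(50, [0, 2], [0.5, 0.5], dish_counts, 1.0, gamma, rng)
```

With a single label shared by every document, the label-level step always picks the same restaurant, and the process reduces to the usual two-level franchise of the HDP.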

Experiments
• DP-MRM automatically learns a probabilistic correspondence between labels and topics.

(Figure via [Kim+ 12])

(Figure via [Kim+ 12])

• L-LDA can also make predictions for single-labeled documents if a common second label is assigned to every document.

References
• [Kim+ ICML2012] Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data
• [Ramage+ EMNLP2009] Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-labeled Corpora
• [Blei+ 2003] Latent Dirichlet Allocation

Inference (1)
• Sampling $t$

Inference (2)
• Sampling $k$ and $l$
