Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
Matthias Bröcheler and Lise Getoor (http://www.cs.umd.edu/linqs) · Supported by NSF Grant No. 0937094

Continuous Markov random fields are a general formalism to model joint probability distributions over events with continuous outcomes. We prove that marginal computation for constrained continuous MRFs is #P-hard in general and present a polynomial-time approximation scheme under mild assumptions on the structure of the random field. Moreover, we introduce a sampling algorithm to compute marginal distributions and develop novel techniques to increase its efficiency. Continuous MRFs are a general-purpose probabilistic modeling tool, and we demonstrate how they can be applied to statistical relational learning. On the problem of collective classification, we evaluate our algorithm and show that the standard deviation of marginals serves as a useful measure of confidence.
Problem?

Computing marginal distributions in constrained continuous MRFs (CCMRFs).

Motivation?

Many applications of CCMRFs, probabilistic soft logic being one of them.

Contributions?

Analysis of the theoretical and practical aspects of computing marginals in CCMRFs.
What’s a CCMRF?
A constrained continuous Markov random field is defined by:

  X = {X1, .., Xn} :  Xi has domain Di ⊂ R,  D = ×_{i=1..n} Di
  φ = {φ1, .., φm} :  φj : D → [0, M]
  Λ = {λ1, .., λm}
  Equality constraints:    A : D → R^kA,  a ∈ R^kA
  Inequality constraints:  B : D → R^kB,  b ∈ R^kB
  D̃ = D ∩ {x | A(x) = a ∧ B(x) ≤ b}

The probability measure P over X is defined through the density

  f(x) = (1/Z(Λ)) exp[ −Σ_{j=1..m} λj φj(x) ],   Z(Λ) = ∫_{D̃} exp[ −Σ_{j=1..m} λj φj(x) ] dx,   f(x) = 0  ∀ x ∉ D̃.

What does it look like?

  X = {X1, X2, X3},  Λ = {1, 2, 1}
  φ1(x) = x1
  φ2(x) = max(0, x1 − x2)
  φ3(x) = max(0, x2 − x3)
  Constraint: x1 + x3 ≤ 1

[Figure: density over (X1, X2, X3), highlighting the highest-probability region and the marginal query P(0.4 ≤ X2 ≤ 0.6).]
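As a minimal illustration (not part of the poster), the sketch below evaluates the unnormalized density exp(−Σj λj φj(x)) of the example above and enforces the constrained domain D̃. The unit-box domain [0, 1]³, the function names, and the use of NumPy are assumptions of the sketch; the partition function Z(Λ) is left out.

```python
import numpy as np

# Example CCMRF from the panel above: Lambda = (1, 2, 1),
# phi_1(x) = x1, phi_2(x) = max(0, x1 - x2), phi_3(x) = max(0, x2 - x3),
# inequality constraint x1 + x3 <= 1. The box domain [0, 1]^3 is an assumption.
LAMBDA = np.array([1.0, 2.0, 1.0])

def potentials(x):
    """Potential values phi_j(x) for a point x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return np.array([x1, max(0.0, x1 - x2), max(0.0, x2 - x3)])

def feasible(x):
    """Membership test for the constrained domain D-tilde."""
    in_box = np.all((x >= 0.0) & (x <= 1.0))
    return bool(in_box and (x[0] + x[2] <= 1.0))

def unnormalized_density(x):
    """exp(-sum_j lambda_j * phi_j(x)) on D-tilde, 0 elsewhere (Z(Lambda) omitted)."""
    x = np.asarray(x, dtype=float)
    if not feasible(x):
        return 0.0
    return float(np.exp(-LAMBDA @ potentials(x)))

print(unnormalized_density([0.2, 0.5, 0.7]))   # inside D-tilde: exp(-0.2)
print(unnormalized_density([0.8, 0.5, 0.7]))   # violates x1 + x3 <= 1 -> 0.0
```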
Computing the marginal probability density function

For a subset X̃ ⊂ X,

  f_X̃(x̃) = ∫_{y ∈ ×Di, s.t. Xi ∉ X̃} f(x̃, y) dy.

In Theory…

Computing the marginal probability density function under the probability measure defined by a CCMRF is #P-hard in the worst case.

Let's approximate! (Lovász & Vempala '04)

The complexity of computing an approximate distribution σ* using hit-and-run sampling, such that the total variation distance of σ* and P is less than ε, is

  Õ( ñ³ (kB + ñ + m) )

where ñ = n − kA, under the assumption that we start from an initial distribution σ such that the density function dσ/dP is bounded by M except on a set S with σ(S) ≤ ε/2.
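Because exact marginalization is intractable in general, marginal probabilities such as P(0.4 ≤ X2 ≤ 0.6) in the example can only be approximated. The sketch below uses plain self-normalized importance sampling with a uniform proposal over the assumed unit box; this is not the poster's method (that is hit-and-run, next panel) and it scales poorly with dimension, but it shows concretely what quantity is being estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
LAMBDA = np.array([1.0, 2.0, 1.0])

def unnorm_density(x):
    """Unnormalized density of the running example (assumed unit-box domain)."""
    x1, x2, x3 = x
    if not (0 <= x1 <= 1 and 0 <= x2 <= 1 and 0 <= x3 <= 1 and x1 + x3 <= 1):
        return 0.0
    phi = np.array([x1, max(0.0, x1 - x2), max(0.0, x2 - x3)])
    return float(np.exp(-LAMBDA @ phi))

# Self-normalized importance sampling with a uniform proposal over the box:
# P(0.4 <= X2 <= 0.6) ~= sum_i w_i * 1{0.4 <= x2_i <= 0.6} / sum_i w_i,
# where w_i is the unnormalized density at the i-th uniform draw.
samples = rng.uniform(size=(100_000, 3))
weights = np.array([unnorm_density(x) for x in samples])
in_event = (samples[:, 1] >= 0.4) & (samples[:, 1] <= 0.6)
print(f"P(0.4 <= X2 <= 0.6) ~= {weights[in_event].sum() / weights.sum():.3f}")
```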
Hit-and-Run Sampling

In Theory…
1. Sample random direction
2. Compute line segment
3. Induce density on line
4. Sample from induced density

[Figure: illustration of a single hit-and-run step.]
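A minimal sketch of one hit-and-run step following the four steps above, assuming NumPy, a box domain, linear inequality constraints Bx ≤ b, and no equality constraints (the poster removes those beforehand via dimensionality reduction). Sampling the induced line density by discretization is an implementation shortcut of this sketch, not the poster's technique.

```python
import numpy as np

rng = np.random.default_rng(1)

def hit_and_run_step(x, log_density, B, b, lo=0.0, hi=1.0, grid=256):
    """One hit-and-run step: (1) sample a random direction, (2) intersect the line
    through x with the box [lo, hi]^n and the inequalities B @ x <= b, (3) evaluate
    the induced density on that segment, (4) sample the next point from it."""
    # 1. Uniformly random direction on the unit sphere.
    d = rng.normal(size=x.size)
    d /= np.linalg.norm(d)
    # 2. Feasible segment x + t * d with t in [t_lo, t_hi].
    t_lo, t_hi = -np.inf, np.inf
    for xi, di in zip(x, d):                 # box constraints lo <= xi + t*di <= hi
        if abs(di) < 1e-12:
            continue
        t1, t2 = (lo - xi) / di, (hi - xi) / di
        t_lo, t_hi = max(t_lo, min(t1, t2)), min(t_hi, max(t1, t2))
    for bd, s in zip(B @ d, b - B @ x):      # linear inequalities B(x + t*d) <= b
        if abs(bd) < 1e-12:
            continue
        if bd > 0:
            t_hi = min(t_hi, s / bd)
        else:
            t_lo = max(t_lo, s / bd)
    # 3./4. Discretize the segment and sample from the normalized induced density.
    ts = np.linspace(t_lo, t_hi, grid)
    logw = np.array([log_density(x + t * d) for t in ts])
    w = np.exp(logw - logw.max())
    return x + rng.choice(ts, p=w / w.sum()) * d

# Usage on the running example (x1 + x3 <= 1 as the single linear inequality).
LAMBDA = np.array([1.0, 2.0, 1.0])
def log_density(x):
    x1, x2, x3 = x
    return -(LAMBDA @ np.array([x1, max(0.0, x1 - x2), max(0.0, x2 - x3)]))

B, b = np.array([[1.0, 0.0, 1.0]]), np.array([1.0])
x = np.array([0.2, 0.2, 0.2])                        # feasible starting point
chain = []
for _ in range(2000):
    x = hit_and_run_step(x, log_density, B, b)
    chain.append(x)
print(np.mean([0.4 <= s[1] <= 0.6 for s in chain]))  # estimate of P(0.4 <= X2 <= 0.6)
```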
In Practice…

Algorithm:
1. Start = MAP state.
2. Dimensionality reduction and linear algebra (the equality constraints reduce the dimension to ñ = n − kA).
3. How do we get out of corners? Corner heuristic: reflect di across the violated constraint hyperplane Wk^T x = zk,

   di+1 = di + 2 (zk − Wk^T di) / ‖Wk‖² · Wk

4. Induce f efficiently.
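The reflection step of the corner heuristic is a one-line computation; below is a direct transcription of the formula above, where the `reflect` helper name and the example hyperplane are illustrative assumptions.

```python
import numpy as np

def reflect(d, W_k, z_k):
    """Reflect d across the hyperplane {x : W_k^T x = z_k}:
    d' = d + 2 * (z_k - W_k^T d) / ||W_k||^2 * W_k (the formula above)."""
    d, W_k = np.asarray(d, dtype=float), np.asarray(W_k, dtype=float)
    return d + 2.0 * (z_k - W_k @ d) / (W_k @ W_k) * W_k

# Example: reflecting across the plane x1 + x3 = 1 of the running example.
d = np.array([0.9, 0.5, 0.4])          # violates x1 + x3 <= 1 (sum = 1.3)
d_ref = reflect(d, [1.0, 0.0, 1.0], 1.0)
print(d_ref, d_ref[0] + d_ref[2])      # reflected point: [0.6, 0.5, 0.1], sum 0.7 <= 1
```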
Why CCMRF?

Probabilistic soft logic (PSL) is a declarative language for collective probabilistic reasoning about similarity or uncertainty in relational domains. PSL focuses on statistical relational learning problems with continuous RVs and supports sets and aggregation. PSL programs get grounded into CCMRFs for inference.

  w1 : class(B,C) ∧ A.text ≈ B.text → class(A,C)
  w2 : class(B,C) ∧ link(A,B) → class(A,C)
  Constraint: functional(class)
Experimental Results

Collective classification of 1717 Wikipedia articles with 20% seed documents. Setup uses tf/idf-weighted cosine similarity as the baseline, comparing against a PSL program with learned weights over K-fold cross validation.

  Folds | Improvement over baseline | P(Null Hypothesis) | Relative Confidence Difference ∆(σ)
  20    | 41.4%                     | 1.95E-09           | 38.3%
  25    | 31.7%                     | 2.40E-13           | 41.2%
  30    | 39.1%                     | 1.00E-16           | 43.5%
  35    | 46.1%                     | 4.54E-08           | 39.0%

Std. Deviation as Indicator of Confidence

  ∆(σ) = 2 (σ− − σ+) / (σ+ + σ−),   Hypothesis: ∆(σ) > 0.
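The relative confidence difference is straightforward to compute from the two standard deviations; a small helper with illustrative inputs:

```python
def relative_confidence_difference(sigma_plus, sigma_minus):
    """Delta(sigma) = 2 * (sigma_minus - sigma_plus) / (sigma_plus + sigma_minus).
    Positive values mean the '+' group has the smaller standard deviation, i.e.
    its marginals are estimated with higher confidence."""
    return 2.0 * (sigma_minus - sigma_plus) / (sigma_plus + sigma_minus)

print(relative_confidence_difference(0.05, 0.08))  # ~0.46, i.e. about a 46% relative difference
```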
Convergence Analysis

[Figure: KL divergence (log scale, roughly 0.05 to 5) versus number of samples (30,000 to 3,000,000), showing the average KL divergence, the lowest-quartile KL divergence (322-413 RVs), and the highest-quartile KL divergence (174-224 RVs).]