Discriminative probabilistic models for relational data

TLDR

Supervised learning often involves entities whose labels are interdependent, yet conventional methods treat them independently, ignoring strong correlations such as those among linked hypertext pages. This work proposes a discriminative relational framework based on conditional Markov networks to overcome limitations of prior probabilistic relational models. By employing undirected models, the approach removes acyclicity constraints and enables discriminative training of the conditional likelihood, followed by approximate inference for collective classification. Experiments on webpage classification demonstrate that incorporating relational dependencies via this framework yields a significant accuracy improvement.

Abstract

In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent. For example, in hypertext classification, the labels of linked pages are highly correlated. A standard approach is to classify each entity independently, ignoring the correlations between them. Recently, Probabilistic Relational Models, a relational version of Bayesian networks, were used to define a joint probabilistic model for a collection of related entities. In this paper, we present an alternative framework that builds on (conditional) Markov networks and addresses two limitations of the previous approach. First, undirected models do not impose the acyclicity constraint that hinders representation of many important relational dependencies in directed models. Second, undirected models are well suited for discriminative training, where we optimize the conditional likelihood of the labels given the features, which generally improves classification accuracy. We show how to train these models effectively, and how to use approximate probabilistic inference over the learned model for collective classification of multiple related entities. We provide experimental results on a webpage classification task, showing that accuracy can be significantly improved by modeling relational dependencies.
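To make the contrast between independent and collective classification concrete, here is a minimal sketch of a pairwise conditional Markov network over three linked pages. Everything in it is an illustrative assumption rather than the paper's setup: the label names, the local scores, the `same_label_bonus` edge potential, and the use of exact enumeration, which stands in for the approximate inference a realistically sized link graph would need.

```python
import itertools
import math

# Hypothetical toy data: three pages, two labels (0 = "course", 1 = "faculty").
# node_scores[i][y] is an assumed local log-potential from page i's own features.
node_scores = [
    [2.0, 0.0],   # page 0: strong local evidence for label 0
    [0.0, 0.2],   # page 1: weak local evidence for label 1
    [1.5, 0.0],   # page 2: fairly strong local evidence for label 0
]
edges = [(0, 1), (1, 2)]   # hyperlinks between pages (undirected)
same_label_bonus = 1.0     # assumed log-potential when linked pages agree


def log_score(labels):
    """Unnormalized log-probability of a joint labeling given the features."""
    total = sum(node_scores[i][y] for i, y in enumerate(labels))
    total += sum(same_label_bonus for i, j in edges if labels[i] == labels[j])
    return total


def marginals():
    """Exact per-page label marginals by enumerating all joint labelings."""
    n = len(node_scores)
    weights = {
        ys: math.exp(log_score(ys))
        for ys in itertools.product((0, 1), repeat=n)
    }
    z = sum(weights.values())  # partition function Z(x)
    result = [[0.0, 0.0] for _ in range(n)]
    for ys, w in weights.items():
        for i, y in enumerate(ys):
            result[i][y] += w / z
    return result


# Independent classification: each page uses only its own local scores.
independent = [max((0, 1), key=lambda y: s[y]) for s in node_scores]
# Collective classification: each page takes its most probable label
# under the joint model, so linked neighbors influence the decision.
collective = [max((0, 1), key=lambda y: m[y]) for m in marginals()]

print("independent:", independent)
print("collective: ", collective)
```

In this toy setting the weakly classified middle page changes its label once its two confidently labeled neighbors are taken into account, which is the kind of effect the relational model is meant to capture. In a discriminatively trained model the node and edge potentials would be learned by maximizing the conditional likelihood of the labels given the features, rather than set by hand as they are here.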
