Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples

TLDR

The authors propose a family of learning algorithms that exploit the geometry of the marginal distribution through a new form of regularization. Working in a semi-supervised setting, they use properties of reproducing kernel Hilbert spaces to prove new Representer theorems, which give a theoretical basis for a general-purpose learner that incorporates both labeled and unlabeled data. Some transductive graph learning algorithms, along with standard methods such as support vector machines and regularized least squares, arise as special cases of the framework. Unlike purely graph-based approaches, the algorithms provide a natural out-of-sample extension to novel examples, so they handle both transductive and truly semi-supervised settings. Experimental evidence suggests that they use unlabeled data effectively. The paper also briefly discusses unsupervised and fully supervised learning within the same framework.

Abstract

We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning algorithms and standard methods, including support vector machines and regularized least squares, can be obtained as special cases. We use properties of reproducing kernel Hilbert spaces to prove new Representer theorems that provide a theoretical basis for the algorithms. As a result (in contrast to purely graph-based approaches), we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally, we briefly discuss unsupervised and fully supervised learning within our general framework.
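
As an illustration of how a regularized least squares learner can use both labeled and unlabeled data, the following is a minimal sketch of Laplacian-regularized least squares. None of the specifics come from the text above: the RBF kernel, the binary k-nearest-neighbor graph, the function names, and the default values of gamma_A (ambient regularization), gamma_I (graph regularization), sigma, and k are all assumptions made for illustration. The sketch fits kernel expansion coefficients over all points by solving one linear system, and because the result is a kernel expansion, it can be evaluated at points not seen during training.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B (assumed kernel)."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized binary k-NN graph (assumed graph)."""
    n = X.shape[0]
    d2 = (X ** 2).sum(axis=1)[:, None] + (X ** 2).sum(axis=1)[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)            # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]     # indices of the k nearest neighbors per point
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                  # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W


def fit_laprls(X_lab, y_lab, X_unlab, gamma_A=1e-2, gamma_I=1e-1, sigma=1.0, k=5):
    """Solve for expansion coefficients alpha over labeled + unlabeled points."""
    X = np.vstack([X_lab, X_unlab])
    l, n = X_lab.shape[0], X.shape[0]
    K = rbf_kernel(X, X, sigma)
    L = knn_graph_laplacian(X, k)
    J = np.zeros((n, n))
    J[:l, :l] = np.eye(l)                   # selects the labeled points in the loss
    Y = np.zeros(n)
    Y[:l] = y_lab                           # labels, zero-padded for unlabeled points
    M = J @ K + gamma_A * l * np.eye(n) + (gamma_I * l / n ** 2) * (L @ K)
    alpha = np.linalg.solve(M, Y)
    return X, alpha


def predict(X_train, alpha, X_new, sigma=1.0):
    """Out-of-sample evaluation f(x) = sum_j alpha_j K(x, x_j)."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_lab = np.array([[-3.0, 0.0], [3.0, 0.0]])
    y_lab = np.array([-1.0, 1.0])
    X_unlab = np.vstack([rng.normal([-3, 0], 0.5, (20, 2)),
                         rng.normal([3, 0], 0.5, (20, 2))])
    X_all, alpha = fit_laprls(X_lab, y_lab, X_unlab)
    print(predict(X_all, alpha, np.array([[-2.5, 0.2], [2.5, -0.2]])))
```

In this sketch, setting gamma_I to zero removes the graph term, so the coefficients of the unlabeled points vanish and the labeled coefficients coincide with ordinary kernel regularized least squares. This is one concrete way to see a standard method appear as a special case.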