Solving Multiclass Learning Problems via Error-Correcting Output Codes

TLDR

Multiclass learning seeks a function mapping inputs to one of k > 2 discrete classes. Prior methods include direct multiclass algorithms, per-class binary learners, and binary learners with distributed output representations. The study compares these three approaches to a new technique that uses error-correcting codes as the distributed output representation, applied with both C4.5 and backpropagation on a range of multiclass tasks. The error-correcting output-code method improves generalization for both learners, remains robust to changes in training-sample size, in the assignment of codewords to classes, and in overfitting-avoidance techniques such as decision-tree pruning, and yields reliable class probability estimates. The authors conclude that it is a general-purpose method for improving inductive learning on multiclass problems.

Abstract

Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form [x_i, f(x_i)]. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that, like the other methods, the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
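
The idea of using an error-correcting code as a distributed output representation can be illustrated with a minimal sketch of the decoding step. Everything below is an assumption made for illustration: the 7-bit codebook for k = 4 classes is hand-picked rather than taken from the paper, and the helper names (`hamming`, `min_distance`, `decode`, `flip`) are hypothetical. The sketch assumes that each bit position would be predicted by a separately trained binary learner and that the predicted bit string is assigned to the class whose codeword is nearest in Hamming distance.

```python
from itertools import combinations

# Hypothetical 7-bit codebook for k = 4 classes (illustrative only,
# not taken from the paper). Each row is the codeword of one class;
# each column would be the target of one binary learner.
CODEBOOK = {
    0: (1, 1, 1, 1, 1, 1, 1),
    1: (0, 0, 0, 0, 1, 1, 1),
    2: (0, 0, 1, 1, 0, 0, 1),
    3: (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(codebook):
    """Smallest Hamming distance between any two codewords."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def decode(bits, codebook=CODEBOOK):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(codebook, key=lambda c: hamming(bits, codebook[c]))


def flip(bits, i):
    """Simulate one binary learner making an error at bit position i."""
    return tuple(b ^ 1 if j == i else b for j, b in enumerate(bits))
```

In this codebook every pair of codewords differs in 4 positions, so a single wrong bit still leaves the prediction closer to the correct codeword than to any other. For example, `decode(flip(CODEBOOK[2], 3))` returns `2`. A one-per-class (one-hot) representation has minimum distance 2 and cannot recover from a single bit error in this way.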
