Learning hard concepts through constructive induction: framework and rationale

Abstract

The intrinsic accuracy of an inductive problem is the accuracy achieved by exhaustive table look-up. Intrinsic accuracy is an upper bound on the accuracy of any inductive method. Hard concepts are concepts that have high intrinsic accuracy but cannot be learned effectively with traditional inductive methods. To learn hard concepts, we must use constructive induction, i.e., methods that create new features. We use measures of concept dispersion to explore, conceptually and empirically, the inherent weaknesses of traditional inductive approaches. These structural defects are built into the design of the algorithms and prevent the learning of hard concepts. After studying some examples of successful and unsuccessful feature construction (“success” being defined here in terms of accuracy), we introduce a single measure of inductive difficulty that we call variation. We argue for a specific approach to constructive induction that reduces variation by incorporating various kinds of domain knowledge. All of these kinds of domain knowledge reduce to utility invariants, i.e., transformations that group together non-contiguous portions of feature space having similar class-membership values. Utility invariants manifest themselves in various ways: in some cases they exist in the user's stock of domain knowledge; in other cases they may be discovered via methods we describe.
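
As a rough illustration of the table look-up definition, the following sketch estimates intrinsic accuracy on a labeled sample. It assumes that "exhaustive table look-up" means predicting, for each distinct feature vector, the class most often observed with it. The function name `intrinsic_accuracy` and the two-bit parity data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def intrinsic_accuracy(examples):
    """Estimate the accuracy of exhaustive table look-up.

    `examples` is an iterable of (features, label) pairs, where
    `features` is a sequence of hashable values. Assumption: the
    look-up table predicts the majority label seen for each distinct
    feature vector.
    """
    # One counter of labels per distinct feature vector.
    table = defaultdict(Counter)
    for features, label in examples:
        table[tuple(features)][label] += 1

    total = sum(sum(counts.values()) for counts in table.values())
    if total == 0:
        raise ValueError("at least one example is required")

    # The majority label in each cell is the best a table can do.
    correct = sum(max(counts.values()) for counts in table.values())
    return correct / total


# Hypothetical data: two-bit parity. Every feature vector has a single
# label, so table look-up is perfect even though neither feature is
# informative on its own.
parity = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
parity_score = intrinsic_accuracy(parity)

# Hypothetical noisy data: the vector (0,) appears with both labels,
# so no method that sees only these features can be perfect.
noisy = [((0,), 1), ((0,), 1), ((0,), 0), ((1,), 0)]
noisy_score = intrinsic_accuracy(noisy)
```

Under these assumptions the parity sample has an intrinsic accuracy of 1.0 and the noisy sample 0.75. Parity is a standard case of a concept with high intrinsic accuracy whose class values are scattered across feature space.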
