Text Classification using String Kernels

Abstract

1 Introduction

Standard learning systems (like neural networks or decision trees) operate on input data after they have been transformed into feature vectors $x_1, \ldots, x_\ell \in X$ from an n-dimensional space. There are cases, however, where the input data cannot be readily described by explicit feature vectors: for example, biosequences, images, graphs and text documents. For such datasets, the construction of a feature extraction module can be as complex and expensive as solving the entire problem. An effective alternative to explicit feature extraction is provided by kernel methods. Kernel-based learning methods use an implicit mapping of the input data into a high-dimensional feature space defined by a kernel function, i.e., a function returning the inner product between the images of two data points in the feature space. The learning then takes place in the feature space, provided the learning algorithm can be entirely rewritten so that the data points only appear inside dot products with other data points.
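The two ideas in this paragraph can be illustrated with a minimal sketch. It is not taken from the paper: the degree-2 polynomial kernel, the perceptron learner, and all function names (explicit_features, poly2_kernel, kernel_perceptron, predict) are assumptions chosen for illustration. The first part checks that a kernel returns the same inner product as an explicit feature map. The second part shows a learner written in dual form, so that the data points appear only inside kernel evaluations.

```python
from itertools import product


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def explicit_features(x):
    """Explicit degree-2 feature map: all ordered products x_i * x_j."""
    return [xi * xj for xi, xj in product(x, repeat=2)]


def poly2_kernel(x, z):
    """Illustrative kernel: equals dot(explicit_features(x), explicit_features(z))
    but never builds the feature vectors."""
    return dot(x, z) ** 2


def decision_score(x, points, labels, alpha, kernel):
    """Dual-form score: the input x is touched only through kernel(x_j, x)."""
    return sum(a * y * kernel(p, x) for a, p, y in zip(alpha, points, labels))


def kernel_perceptron(points, labels, kernel, epochs=10):
    """Perceptron in dual form; alpha[i] counts the mistakes made on point i."""
    alpha = [0] * len(points)
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(points, labels)):
            if y * decision_score(x, points, labels, alpha, kernel) <= 0:
                alpha[i] += 1
    return alpha


def predict(x, points, labels, alpha, kernel):
    """Classify x as +1 or -1 from the sign of the dual-form score."""
    return 1 if decision_score(x, points, labels, alpha, kernel) > 0 else -1


# XOR-style toy data (hypothetical): not linearly separable in the input
# space, but separable in the degree-2 feature space.
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, -1, -1]
alpha = kernel_perceptron(points, labels, poly2_kernel)
predictions = [predict(p, points, labels, alpha, poly2_kernel) for p in points]
```

Because the learner calls only kernel(x_j, x), replacing poly2_kernel with a kernel defined on strings would leave the training loop unchanged. This is the property that makes the approach attractive for text documents.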
