Concepedia

Abstract

The representation of documents and queries as vectors in a high-dimensional space is well-established in information retrieval. The author proposes that the semantics of words and contexts in a text be represented as vectors. The dimensions of the space are words and the initial vectors are determined by the words occurring close to the entity to be represented, which implies that the space has several thousand dimensions (words). This makes the vector representations (which are dense) too cumbersome to use directly. Therefore, dimensionality reduction by means of a singular value decomposition is employed. The author analyzes the structure of the vector representations and applies them to word sense disambiguation and thesaurus induction.< <ETX xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">&gt;</ETX>

References

YearCitations

Page 1