Publication | Closed Access
Using latent semantic analysis to improve access to textual information
Citations: 608
References: 11
Year: 1988
Venue: Unknown
Retrieval systems typically rely on lexical matching between the words in user queries and the terms in, or assigned to, database objects. Because people use widely varying words to describe the same object, such matching is necessarily incomplete and imprecise. The paper proposes latent semantic indexing to address this vocabulary problem in human-computer interaction: text objects are automatically organized into a semantic structure better suited to matching user requests. The method applies singular-value decomposition to a large term-by-text-object matrix, yielding about 50 to 150 orthogonal factors. Terms and objects are represented as vectors of that dimensionality, and user queries are matched against them in this semantic space. Initial tests suggest that the fully automatic method is widely applicable and a promising way to improve users' access to diverse textual materials.
This paper describes a new approach for dealing with the vocabulary problem in human-computer interaction. Most approaches to retrieving textual materials depend on a lexical match between words in users' requests and those in or assigned to database objects. Because of the tremendous diversity in the words people use to describe the same object, lexical matching methods are necessarily incomplete and imprecise [5]. The latent semantic indexing approach tries to overcome these problems by automatically organizing text objects into a semantic structure more appropriate for matching user requests. This is done by taking advantage of implicit higher-order structure in the association of terms with text objects. The particular technique used is singular-value decomposition, in which a large term by text-object matrix is decomposed into a set of about 50 to 150 orthogonal factors from which the original matrix can be approximated by linear combination. Terms and objects are represented by 50 to 150 dimensional vectors and matched against user queries in this "semantic" space. Initial tests find this completely automatic method widely applicable and a promising way to improve users' access to many kinds of textual materials, or to objects and services for which textual descriptions are available.
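The pipeline described in the abstract (build a term by text-object matrix, decompose it with singular-value decomposition, keep a small number of factors, and match queries in the reduced space) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy corpus, the raw term counts, the choice of k = 2, the "folding-in" projection of the query, and the cosine similarity measure are all assumptions made for illustration. The abstract itself specifies only the decomposition and the use of about 50 to 150 factors on real collections.

```python
import numpy as np

# Hypothetical toy corpus; real collections are far larger.
docs = [
    "human computer interaction",
    "user interface for computer systems",
    "graph of trees",
    "trees and graph minors",
]

def build_term_doc_matrix(docs):
    """Return a (terms x documents) matrix of raw counts and a term index."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return A, index

def lsi_fit(A, k):
    """Truncated SVD: keep the k largest factors of A = U S V^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of U_k are term vectors, rows of V_k are document vectors.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(query, index, U_k, s_k):
    """Project a query into the factor space as q^T U_k S_k^{-1} (assumed convention)."""
    q = np.zeros(U_k.shape[0])
    for w in query.split():
        if w in index:  # words outside the vocabulary are ignored
            q[index[w]] += 1.0
    return (q @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_documents(query, index, U_k, s_k, V_k):
    """Score every document against the query; return (ranking, scores)."""
    q_hat = fold_in_query(query, index, U_k, s_k)
    scores = [cosine(q_hat, V_k[j]) for j in range(V_k.shape[0])]
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return order, scores

if __name__ == "__main__":
    A, index = build_term_doc_matrix(docs)
    U_k, s_k, V_k = lsi_fit(A, k=2)  # k=2 only because the corpus is tiny
    order, scores = rank_documents("human", index, U_k, s_k, V_k)
    for j in order:
        print(f"{scores[j]:+.3f}  {docs[j]}")
```

In this toy setting the query "human" shares no word with the second document, so a purely lexical match would miss it. In the reduced space, that document still scores close to the first one, because both load on the same factor through the shared term "computer". The two graph-related documents score near zero.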