Publication | Open Access
Mathematical Foundations for a Compositional Distributional Model of Meaning
Citations: 90
References: 26
Year: 2010
Engineering, Lexical Semantics, Semantics, Natural Language Processing, Syntax, Computational Linguistics, Grammar, Language Studies, Vector Space Models, Formal Semantics, Principle Of Compositionality, Distributional Semantics, Categorial Grammar, Philosophy Of Language, Automated Reasoning, Diagrammatic Calculus, Mathematical Foundations, Unification Grammar, Linguistics, Computational Semantics, Mathematical Framework
The authors propose a mathematical framework that unifies distributional vector space models of meaning with a compositional theory of grammatical types based on Lambek’s Pregroup algebra. They lift Pregroup type reductions to morphisms in a category, which lets constituent meanings be combined into a sentence meaning, lets sentences be compared via inner products, and supports a diagrammatic calculus that visualizes information flow. The framework computes the meaning of a sentence from its parts and places all sentence meanings in a single space regardless of syntax; a variant that restricts the scalars to the Booleans yields a Montague‑style Boolean‑valued semantics.
We propose a mathematical framework that unifies the distributional theory of meaning, expressed in terms of vector space models, with a compositional theory of grammatical types, for which we rely on the algebra of Pregroups introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are ‘lifted’ to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner product can be used to compare meanings of arbitrary sentences, just as it is used to compare the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic calculus which exposes how information flows between the words in a sentence in order to make up the meaning of the whole sentence. A variation of our ‘categorical model’, which constrains the scalars of the vector spaces to the semiring of Booleans, results in a Montague-style Boolean-valued semantics.
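The lifting of a type reduction can be illustrated on a transitive sentence. In a Pregroup grammar the subject and object have type n and the verb has type n^r s n^l, so the sentence reduces to s; the corresponding map contracts the verb's two noun indices against the subject and object vectors and leaves a vector in the sentence space. What follows is a minimal sketch under stated assumptions, not the authors' implementation: the three-word noun basis, the two-dimensional sentence space, the toy verb "chase", and all function names are hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical toy setup (an assumption for illustration only):
# noun space N has basis (dogs, cats, mice); sentence space S has dimension 2.
DOGS, CATS, MICE = 0, 1, 2
N_DIM, S_DIM = 3, 2


def one_hot(index, size):
    """Basis vector with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def verb_tensor(entries, n, s):
    """Build verb[i][k][j] (an element of N (x) S (x) N) from {(i, k, j): weight}."""
    t = [[[0.0] * n for _ in range(s)] for _ in range(n)]
    for (i, k, j), w in entries.items():
        t[i][k][j] = w
    return t


def transitive_meaning(subj, verb, obj):
    """Contract both noun indices of the verb; the result is a vector in S."""
    n = len(subj)
    return [
        sum(subj[i] * verb[i][k][j] * obj[j]
            for i, j in product(range(n), repeat=2))
        for k in range(len(verb[0]))
    ]


def transitive_meaning_bool(subj, verb, obj):
    """Same contraction over the Boolean semiring: 'or' for sum, 'and' for product."""
    n = len(subj)
    return [
        any(subj[i] and verb[i][k][j] and obj[j]
            for i, j in product(range(n), repeat=2))
        for k in range(len(verb[0]))
    ]


def inner(u, v):
    """Inner product used to compare two sentence meanings in S."""
    return sum(a * b for a, b in zip(u, v))


# Toy verb: "dogs chase cats" and "cats chase mice" both point along S-axis 0.
chase = verb_tensor({(DOGS, 0, CATS): 1.0, (CATS, 0, MICE): 1.0}, N_DIM, S_DIM)

dogs_chase_cats = transitive_meaning(one_hot(DOGS, N_DIM), chase, one_hot(CATS, N_DIM))
cats_chase_mice = transitive_meaning(one_hot(CATS, N_DIM), chase, one_hot(MICE, N_DIM))
mice_chase_dogs = transitive_meaning(one_hot(MICE, N_DIM), chase, one_hot(DOGS, N_DIM))

similarity = inner(dogs_chase_cats, cats_chase_mice)
```

Because every sentence meaning lands in the same space S, `inner` applies to any pair of sentences regardless of how they were parsed. The Boolean variant replaces sums and products with "or" and "and", so each component of the result is a truth value.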