
TLDR

Lexicalized statistical parsing models have become increasingly complex and require thorough scrutiny, both to understand what has been built and to use that understanding for scientific and engineering progress. The thesis examines a dominant type of lexicalized parsing model through a macro analysis, which reveals its core structure, and a detailed analysis, which shows the efficacy of specific parameters. It treats the “model as data”, a paradigm rather than a single method, and designs a modular software engine that separates the model's periphery from its core. The analyses yielded insight into the core model and identified inefficiencies in the baseline model, which can be reduced to form a more compact model, exploited to find a better-estimated model with higher accuracy, or both.

Abstract

In this thesis, we both apply and develop techniques and methodologies for examining the complex systems that are lexicalized statistical parsing models. The primary idea is that of treating the “model as data”, which is not a particular method but a paradigm and a research methodology. Our argument is that lexicalized statistical parsing models have become increasingly complex and therefore require thorough scrutiny, both to achieve the scientific aim of understanding what has been built thus far and to achieve the scientific and engineering goal of using that understanding for progress. We take a particular, dominant type of parsing model and perform a macro analysis to reveal its core (and design a software engine that modularizes the periphery). Crucially, we also perform a detailed analysis, which provides for the first time a window onto the efficacy of specific parameters. These analyses have not only yielded insight into the core model but have also enabled the identification of “inefficiencies” in our baseline model, such that those inefficiencies can be reduced to form a more compact model, or exploited for finding a better-estimated model with higher accuracy, or both.
