TLDR

Stochastic methods can benefit some generation applications, but developing them requires a quick way to assess the relative merits of competing approaches or models. The paper presents several intrinsic (system-internal) metrics for baseline quantitative assessment and describes an experiment that tests how well they correlate with human qualitative judgments. The experiment confirms that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability, which makes them useful during development.

Abstract

Certain generation applications may profit from the use of stochastic methods. In developing stochastic methods, it is crucial to be able to quickly assess the relative merits of different approaches or models. In this paper, we present several types of intrinsic (system internal) metrics which we have used for baseline quantitative assessment. This quantitative assessment should then be augmented to a fuller evaluation that examines qualitative aspects. To this end, we describe an experiment that tests correlation between the quantitative metrics and human qualitative judgment. The experiment confirms that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability and can be used for evaluation during development.
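The correlation test described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient the authors used; the Pearson coefficient below is an assumption, and the function name `pearson` and the sample values are hypothetical, made up purely for illustration rather than taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: one intrinsic metric score and one human quality
# rating per generated sentence. These numbers are not from the paper.
metric_scores = [0.42, 0.55, 0.61, 0.70, 0.83]
human_ratings = [2, 3, 3, 4, 5]

r = pearson(metric_scores, human_ratings)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 or -1 would suggest that the metric tracks the human judgments closely. Whether a given value counts as significant also depends on the sample size, which this sketch does not test.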
