Publication | Open Access
Evaluation metrics for generation
Citations: 127
References: 10
Year: 2000
Venue: Unknown
Engineering, Quality Metric, Software Engineering, Evaluation Metrics, System Metric, Software Analysis, Social Sciences, Program Evaluation, Quantitative Metrics, Evaluation Methodology, Decision Theory, Statistics, Quantitative Management, Performance Metric, Reliability, Cognitive Science, Design, User Experience, Intrinsic Metrics, Certain Generation Applications, Evaluation Measure, Software Testing, Software Metric, Evaluation Technique
Stochastic methods can benefit certain generation applications, but developing them requires a quick way to assess the relative merits of competing approaches. The paper introduces intrinsic metrics for baseline quantitative assessment and describes an experiment measuring how well they correlate with human qualitative judgments. The experiment shows that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability, which makes them useful during development.
Certain generation applications may profit from the use of stochastic methods. In developing stochastic methods, it is crucial to be able to quickly assess the relative merits of different approaches or models. In this paper, we present several types of intrinsic (system-internal) metrics that we have used for baseline quantitative assessment. This quantitative assessment should then be extended into a fuller evaluation that examines qualitative aspects. To this end, we describe an experiment that tests the correlation between the quantitative metrics and human qualitative judgments. The experiment confirms that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability and can be used for evaluation during development.
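The correlation test described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation statistic or which metrics were used, so the Pearson coefficient, the `pearson` helper, and all of the scores below are assumptions made purely for illustration; they are not the paper's method or data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one intrinsic metric score and one averaged human
# quality rating per generated sentence. Invented for illustration only.
metric_scores = [0.91, 0.75, 0.62, 0.88, 0.40, 0.55]
human_quality = [4.5, 3.5, 3.0, 4.0, 2.0, 2.5]

r = pearson(metric_scores, human_quality)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 would suggest that the metric tracks the human ratings closely on this sample. Whether such a value is statistically significant still needs a separate test, and a rank-based statistic may be a better fit when the human ratings are ordinal.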