Publication | Closed Access
Comparing Internal and External Standards in Voice Quality Judgments
Citations: 281 | References: 23 | Year: 1993
A new descriptive framework for voice quality perception (Kreiman, Gerratt, Kempster, Erman, & Berke, 1993) states that when listeners rate a voice on some quality dimension (e.g., roughness), they compare the presented stimulus to an internal standard or scale. The framework further suggests that these internal standards are inherently unstable and may be influenced by factors other than the physical signal being judged; in particular, context effects may cause listeners' voice ratings to drift by shifting the internal standard against which judgments are made. Hypothetically, substituting explicit, external standards for these unstable internal standards should improve listener reliability. To test these hypotheses, we asked 12 clinicians to judge the roughness of 22 synthetic stimuli using two scales: a traditional 5-point equal-appearing interval (EAI) scale and a scale with explicit anchor stimuli for each scale point. Because the stimulus set included a relatively large number of normal and mildly rough voices, we predicted that the perceived roughness of moderately rough stimuli would increase over time for the EAI ratings, but not for the explicitly anchored ratings. Ratings made with the anchored scale were significantly more reliable than those gathered with the unanchored paradigm. Further, as predicted, ratings on the unanchored EAI scale drifted significantly within a listening session in the expected direction, whereas ratings on the anchored scale did not. These results are consistent with our framework and suggest that explicitly anchored paradigms for voice quality evaluation might improve both research and clinical practice.