Increasing Video Accessibility for Visually Impaired Users with Human-in-the-Loop Machine Learning

TLDR

Video accessibility is crucial for blind and visually impaired individuals for education, employment, and entertainment, yet professional video descriptions are costly and time-consuming, and volunteer-created descriptions vary in quality, while creating them can be intimidating for novice describers. The study develops a Human-in-the-Loop Machine Learning (HILML) approach that automates video text generation and scene segmentation and lets humans edit the output through an editing interface. For first-time describers, the HILML system was significantly faster and easier to use than a human-only control, and blind and visually impaired users rated its descriptions, and their understanding of the topic, significantly higher.

Abstract

Video accessibility is crucial for blind and visually impaired individuals for education, employment, and entertainment purposes. However, professional video descriptions are costly and time-consuming. Volunteer-created video descriptions could be a promising alternative; however, they can vary in quality, and creating them can be intimidating for novice describers. We developed a Human-in-the-Loop Machine Learning (HILML) approach to video description by automating video text generation and scene segmentation while allowing humans to edit the output. Our HILML system was significantly faster and easier to use for first-time video describers compared to a human-only control condition with no machine learning assistance. Blind and visually impaired users rated the quality of the video descriptions created with the HILML system, and their understanding of the topic, significantly higher than for the human-only condition.
