Publication | Closed Access
ImageExplorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards Imperfect AI-Generated Image Captions
Citations: 51
References: 40
Year: 2022
Artificial Intelligence, Natural User Interface, Engineering, Spatial Information Presentation, Communication, Multimodal LLM, Interactive Machine Learning, Text-to-image Retrieval, Visual Grounding, Multi-layered Touch Exploration, Multimodal Interaction, Robot Learning, Cognitive Science, Blind Users, User Experience, Vision Language Model, Computer Science, Alternative Text, Human-computer Interaction
Blind users rely on alternative text (alt-text) to understand an image; however, alt-text is often missing. AI-generated captions are a more scalable alternative, but they often miss crucial details or are completely incorrect, which users may still falsely trust. In this work, we sought to determine how additional information could help users better judge the correctness of AI-generated captions. We developed ImageExplorer, a touch-based multi-layered image exploration system that allows users to explore the spatial layout and information hierarchies of images, and compared it with popular text-based (Facebook) and touch-based (Seeing AI) image exploration systems in a study with 12 blind participants. We found that exploration was generally successful in encouraging skepticism towards imperfect captions. Moreover, many participants preferred ImageExplorer for its multi-layered and spatial information presentation, and Facebook for its summary and ease of use. Finally, we identify design improvements for effective and explainable image exploration systems for blind users.