TLDR

Statistical machine learning is increasingly viewed as a tool for software development, yet prior research has largely studied software development in general rather than developers applying machine learning. This study combines semi-structured interviews with eleven researchers experienced in applying statistical machine learning to human-computer interaction problems and an observation of ten participants applying it to a realistic problem during a five-hour session. The authors identify three main difficulties: treating machine learning as an iterative, exploratory process; understanding relationships between data and algorithm behavior; and evaluating performance in application contexts. These findings highlight the need for better development tools.

Abstract

As statistical machine learning algorithms and techniques continue to mature, many researchers and developers see statistical machine learning not only as a topic of expert study, but also as a tool for software development. Extensive prior work has studied software development, but little prior work has studied software developers applying statistical machine learning. This paper presents interviews of eleven researchers experienced in applying statistical machine learning algorithms and techniques to human-computer interaction problems, as well as a study of ten participants working during a five-hour study to apply statistical machine learning algorithms and techniques to a realistic problem. We distill three related categories of difficulties that arise in applying statistical machine learning as a tool for software development: (1) difficulty pursuing statistical machine learning as an iterative and exploratory process, (2) difficulty understanding relationships between data and the behavior of statistical machine learning algorithms, and (3) difficulty evaluating the performance of statistical machine learning algorithms and techniques in the context of applications. This paper provides important new insight into these difficulties and the need for development tools that better support the application of statistical machine learning.
