Publication | Closed Access
How Data Science Workers Work with Data
Citations: 248
References: 62
Year: 2019
Unknown Venue
Data Modeling, Engineering, Data Science, Data Engineering, Data Practice, Management, Collaborative Data Analysis, Data Integration, Health Data Science, Computer Science, Data Science Workers, Collaborative Data Science, Research Data Management, Data Literacy, Data Management, Big Data, HCI Researchers
The rise of big data has increased demand for practitioners and research into their workflows, yet existing descriptions focus on abstract processes rather than how workers engage with data. The study aims to describe how data science professionals work with data and propose analytic interventions to better understand their complex data practices. The authors interviewed 21 professionals and defined five intervention-based data approaches—data as given, captured, curated, designed, and created—to analyze and improve data practices. Workers develop an intuitive sense of their data and actively shape it.
With the rise of big data, there has been an increasing need for practitioners in this space and an increasing opportunity for researchers to understand their workflows and design new tools to improve them. Data science is often described as data-driven, comprising unambiguous data and proceeding through regularized steps of analysis. However, this view focuses more on abstract processes, pipelines, and workflows, and less on how data science workers engage with the data. In this paper, we build on the work of other CSCW and HCI researchers in describing the ways that scientists, scholars, engineers, and others work with their data, through analyses of interviews with 21 data science professionals. We set five approaches to data along a dimension of interventions: data as given, as captured, as curated, as designed, and as created. Data science workers develop an intuitive sense of their data and processes, and actively shape their data. We propose new ways to apply these interventions analytically, to make sense of the complex activities around data practices.