Publication | Open Access
Manufacturing process data analysis pipelines: a requirements analysis and survey
Citations: 297
References: 47
Year: 2019
Keywords: Engineering, Business Intelligence, Industrial Engineering, Digital Manufacturing, Smart Manufacturing, Big Data Infrastructure, Big Data Model, Data Science, Data-intensive Platform, Management, Systems Engineering, Data Integration, Data Management, Requirements Analysis, Process Mining, Data Modeling, Process Specification, Process Analysis, Data-intensive Computing, Manufacturing Activities, Industrial Design, Data Processing, Process Planning, Industrial Informatics, Massive Data Processing, Big Data
Smart manufacturing is strongly correlated with the digitization of all manufacturing activities. This digitization increases the amount of data available to drive productivity and profit through data-driven decision-making programs. The goal of this article is to assist data engineers in designing big data analysis pipelines for manufacturing process data. To that end, this paper characterizes the requirements for process data analysis pipelines and surveys existing platforms from the academic literature. The results demonstrate a stronger focus on the storage and analysis phases of pipelines than on the ingestion, communication, and visualization phases. Results also show a tendency towards custom tools for ingestion and visualization, and towards relational data tools for storage and analysis. Tools for handling heterogeneous data are generally well represented throughout the pipeline. Finally, batch processing tools are more widely adopted than real-time stream processing frameworks, and most pipelines opt for a common script-based data processing approach. Based on these results, recommendations are offered for each phase of the pipeline.