Machine learning and deep learning based predictive quality in manufacturing: a systematic review
TL;DR Summary
This review systematically analyzes 2012-2021 studies on ML and DL for predictive quality in manufacturing, categorizing methods and data usage, identifying challenges, and outlining future research to advance data-driven quality assurance.
Abstract
Journal of Intelligent Manufacturing (2022) 33:1879–1905, https://doi.org/10.1007/s10845-022-01963-8. Hasan Tercan · Tobias Meisen. Received: 22 December 2021 / Accepted: 5 May 2022 / Published online: 28 May 2022. © The Author(s) 2022.
With the ongoing digitization of the manufacturing industry and the ability to bring together data from manufacturing processes and quality measurements, there is enormous potential to use machine learning and deep learning techniques for quality assurance. In this context, predictive quality enables manufacturing companies to make data-driven estimations about the product quality based on process data. In the current state of research, numerous approaches to predictive quality exist in a wide variety of use cases and domains. Their applications range from quality predictions during production using sensor data to automated quality inspection in the field based on measurement data. However, there is currently a lack of an overall view of where predictive quality research stands as a whole, what approaches are currently being investi
1. Bibliographic Information
1.1. Title
Machine learning and deep learning based predictive quality in manufacturing: a systematic review
1.2. Authors
Hasan Tercan and Tobias Meisen
1.3. Journal/Conference
The paper was published in the Journal of Intelligent Manufacturing, volume 33 (2022), pages 1879–1905 (https://doi.org/10.1007/s10845-022-01963-8), and appeared online on 28 May 2022. The Publisher's Note at the end mentions "Springer Nature", indicating that the journal belongs to a major academic publisher.
1.4. Publication Year
Received: 22 December 2021 / Accepted: 5 May 2022 / Published online: 28 May 2022. The publication year is 2022. The review covers literature from 2012 to 2021.
1.5. Abstract
This systematic review investigates the application of machine learning (ML) and deep learning (DL) techniques for predictive quality in manufacturing. Motivated by the increasing digitization of manufacturing and the availability of process and quality data, the paper highlights the potential for data-driven quality assurance. Despite numerous existing approaches across diverse use cases—from in-production quality predictions using sensor data to automated quality inspection via measurement data—a comprehensive overview of the field's current state, prevalent approaches, and existing challenges is lacking. To address this, the authors conducted a systematic review of scientific publications from 2012 to 2021. They categorized these publications based on the manufacturing processes addressed, the data bases utilized, and the ML models employed. The review aims to provide key insights into the field's scope, identify gaps and similarities in solution approaches, and derive open challenges. Finally, it offers an outlook on future research directions to overcome these challenges.
1.6. Original Source Link
The original source link is provided as /files/papers/6903060f59708f78ec6faed6/paper.pdf. Based on the abstract and the provided text content, this appears to be the officially published version of the paper.
2. Executive Summary
2.1. Background & Motivation
The core problem the paper aims to solve is the lack of a comprehensive and up-to-date overview of the research landscape concerning predictive quality in manufacturing, particularly using machine learning (ML) and deep learning (DL).
This problem is important because:
- The manufacturing industry is undergoing significant digitization (Industry 4.0), leading to an abundance of process data and quality measurements.
- This data presents an enormous potential for quality assurance through ML and DL techniques.
- Predictive quality allows manufacturing companies to make data-driven estimations about product quality based on process data, enabling proactive decision-making to avoid defects and improve efficiency.

Specific challenges or gaps in prior research include:
- While numerous predictive quality approaches exist across various use cases and domains (e.g., inline fault predictions, automated quality inspection), they are often considered in isolation.
- This isolated view makes it difficult to compare proposed approaches and understand the overall state of predictive quality research.
- Existing reviews are either too broad (e.g., general ML applications in production) or outdated, failing to cover recent advancements in ML and DL.

The paper's entry point or innovative idea is to conduct a systematic, comprehensive, and up-to-date review specifically focused on ML and DL based predictive quality in manufacturing over the last decade (2012-2021), categorizing publications by manufacturing processes, data bases, and ML models to identify trends, gaps, and future directions.
2.2. Main Contributions / Findings
The paper's primary contributions are:
- Comprehensive Overview: Providing a systematic review of 81 scientific publications from 2012 to 2021 on ML and DL based predictive quality in manufacturing.
- Categorization Framework: Establishing a clear categorization scheme based on manufacturing processes (DIN 8580), quality criteria, data sources, input variables, data modality, learning tasks, and ML/DL models.
- Key Insights: Identifying the scope of predictive quality applications, common data sources and characteristics, and prevalent ML/DL models used across different manufacturing domains.
- Identification of Gaps and Challenges: Pinpointing significant research gaps, such as imbalances in covered manufacturing processes, lack of integration into real production, small data amounts, scarcity of benchmark datasets, and underutilization of novel DL methods.
- Future Research Directions: Proposing concrete future research directions, including synthetic data generation, benchmark data sets, exploration of novel deep learning methods (e.g., Transformer networks, Graph Neural Networks), advanced time series classification and forecasting, transfer learning and continual learning, and strategies for integration and deployment in industrial settings with MLOps and certification.

Key conclusions or findings reached by the paper include:
- Predictive quality is a versatile and powerful tool, primarily validated for prognostic quality and accuracy.
- Cutting and joining processes dominate the research, while others like coating and changing material properties are underrepresented.
- The majority of studies rely on real manufacturing data, often generated experimentally with small sample sizes (median of 144 samples).
- Process parameters and sensor data are the most common input variables, sometimes combined for better performance. Product design and material properties are often overlooked.
- Numerical/continuous and image data are the most frequent data modalities. Time series data is often transformed into scalar features.
- Multilayer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) are the most popular prime models, especially in recent years, with CNNs excelling in image data tasks.
- There's a significant need for larger, more representative benchmark datasets and better methods for synthetic data generation.
- The process integration and real-time capability of predictive quality solutions in actual manufacturing environments remain a significant challenge, with a lack of evaluation using quality-oriented metrics.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To understand this paper, a reader should be familiar with the following fundamental concepts:
- Industry 4.0 (I4.0): This refers to the fourth industrial revolution, characterized by the digitalization and integration of manufacturing processes. It involves the use of cyber-physical systems, the Internet of Things (IoT), cloud computing, and artificial intelligence to create "smart factories." The goal is to achieve greater automation, real-time data exchange, and decentralized decision-making, leading to increased efficiency, flexibility, and productivity in manufacturing.
- Digitization of Manufacturing: The process of converting information from analog to digital form and integrating digital technologies into manufacturing operations. This includes collecting data from machines and sensors, using digital twins, and implementing advanced analytics and automation.
- Quality Assurance (QA): A system of activities designed to ensure that products or services meet specified quality requirements. In manufacturing, it involves monitoring and verifying processes and products throughout their lifecycle to prevent defects and ensure customer satisfaction.
- Predictive Quality: A data-driven approach to quality assurance that uses machine learning and deep learning models to estimate the quality of a product or process based on various input data (e.g., process parameters, sensor readings). The goal is to predict potential quality issues before they occur or are fully realized, enabling proactive intervention to prevent defects, reduce waste, and optimize production. The paper defines it as: "Predictive quality comprises the use of machine learning and deep learning methods in production to estimate product-related quality based on process and product data with the goal of deriving quality-enhancing insights."
- Machine Learning (ML): A subfield of artificial intelligence that enables systems to learn from data without being explicitly programmed. ML algorithms build a mathematical model based on sample data, known as "training data," to make predictions or decisions without being specifically programmed to perform the task. Key types of ML relevant here are:
  - Supervised Learning: This is the primary focus of predictive quality. In supervised learning, the algorithm learns from labeled data, meaning the input data is paired with the correct output (target variable). The goal is to learn a mapping function from input to output so that it can predict the output for new, unseen input data.
    - Classification: A supervised learning task where the model predicts a categorical (discrete) output. For example, classifying a product as "OK" or "Not OK" (OK/NOK), or identifying different types of defects (e.g., "crack," "porosity," "roughness").
    - Regression: A supervised learning task where the model predicts a numerical (continuous) output. For example, predicting the exact value of surface roughness, tensile strength, or product dimensions.
  - Unsupervised Learning: Algorithms learn from unlabeled data, identifying patterns or structures within the data without prior knowledge of correct outputs. While not the main focus, anomaly detection is a related concept often addressed by unsupervised learning, though explicitly excluded from this review's scope as anomalies are not initially associated with a known defect.
  - Reinforcement Learning: Algorithms learn to make decisions by performing actions in an environment and receiving rewards or penalties based on their outcomes. Not a primary focus of this paper.
- Deep Learning (DL): A subfield of ML that uses artificial neural networks with multiple layers (hence "deep") to learn representations of data with multiple levels of abstraction. DL models have shown remarkable success in tasks involving large amounts of data, especially for image, speech, and text processing.
- Artificial Neural Networks (ANNs): Computational models inspired by the structure and function of biological neural networks. They consist of interconnected nodes (neurons) organized in layers (input, hidden, output). Each connection has a weight, and neurons apply an activation function to their weighted sum of inputs. ANNs can learn complex, non-linear relationships.
  - Multilayer Perceptron (MLP): A type of feed-forward ANN characterized by multiple layers of neurons (at least three: an input layer, one or more hidden layers, and an output layer). MLPs are versatile and widely used for both classification and regression tasks.
  - Convolutional Neural Network (CNN): A specialized type of ANN primarily used for processing data with a grid-like topology, such as image data. CNNs employ convolutional layers that automatically learn spatial hierarchies of features from the input, making them highly effective for image recognition, object detection, and visual inspection tasks.
  - Recurrent Neural Network (RNN): A type of ANN designed to process sequential or time series data. Unlike feed-forward ANNs, RNNs have connections that form directed cycles, allowing them to maintain an internal state (memory) and use information from previous inputs in the sequence to influence the processing of current inputs.
    - Long Short-Term Memory (LSTM): A special kind of RNN capable of learning long-term dependencies. LSTMs have internal memory cells and gating mechanisms (input, output, and forget gates) that regulate the flow of information, effectively addressing the vanishing gradient problem common in traditional RNNs and enabling them to capture patterns over extended sequences.
  - Transformer Networks: A DL model, primarily used in natural language processing (NLP), that relies entirely on self-attention mechanisms, eschewing recurrence and convolutions. They are highly effective for sequential data and have recently been adapted for computer vision (e.g., Vision Transformers).
- Support Vector Machine (SVM): A supervised learning model used for classification and regression. SVMs find an optimal hyperplane in a high-dimensional space that best separates different classes (or fits data points in regression), maximizing the margin between them. They are effective for non-linear classification by using kernel tricks.
- Random Forest: An ensemble learning method for classification and regression. It operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (for classification) or mean prediction (for regression) of the individual trees. It reduces overfitting and improves accuracy compared to single decision trees.
- Decision Tree: A non-parametric supervised learning method used for classification and regression. It creates a model that predicts the value of a target variable by learning simple decision rules inferred from the data features. It structures decisions in a tree-like model of choices and their possible consequences.
- Data Modality: Refers to the type or format of data collected. Common modalities in manufacturing include:
  - Numerical/Continuous Data: Quantitative data with scalar values that can take any value within a given range (e.g., temperature, pressure, dimensions).
  - Categorical/Discrete Data: Qualitative data that can take on a limited number of values, often representing types or categories (e.g., tool type, material batch, OK/NOK status).
  - Time Series Data: A sequence of data points indexed in time order, typically collected from sensors over a period (e.g., vibration signals, welding current over time).
  - Image Data: Visual data, often 2D images captured by cameras or other imaging sensors, used for visual inspection and defect detection.
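To make two of the concepts above more concrete, the following minimal sketch shows how a time series can be collapsed into scalar (numerical) features, and how the same quality measurement can serve either as a regression target or as a classification label. The function name `scalar_features`, the vibration readings, the roughness value, and the tolerance limit are all illustrative assumptions and do not come from the reviewed paper.

```python
from statistics import mean, pstdev


def scalar_features(signal):
    """Collapse one sensor time series into a few scalar features."""
    return {
        "mean": mean(signal),
        "std": pstdev(signal),  # population standard deviation
        "peak": max(signal),
        "rms": (sum(x * x for x in signal) / len(signal)) ** 0.5,
    }


# Hypothetical vibration readings recorded while producing one part.
vibration = [0.0, 3.0, 4.0, 3.0, 0.0]
features = scalar_features(vibration)

# Regression target: a continuous quality value (e.g. surface roughness).
measured_roughness = 1.9  # hypothetical measurement

# Classification target: a discrete label derived from a tolerance limit.
ROUGHNESS_LIMIT = 1.6  # hypothetical tolerance
label = "OK" if measured_roughness <= ROUGHNESS_LIMIT else "NOK"
```

The feature dictionary is what a model for numerical data would receive as input, whereas a model for sequential data would consume the raw `vibration` list directly.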
3.2. Previous Works
The paper explicitly mentions and differentiates itself from several related survey papers:
- Köksal et al. (2011): This study conducted an extensive literature review of data mining applications for quality improvement tasks in manufacturing.
  - Differentiation: While relevant, this study dates back to 2011, meaning it does not cover the significant advancements in ML and DL that have occurred since, especially with the rise of deep learning. The current paper aims to provide an up-to-date view.
- Rostami et al. (2015): This paper proposed a similar approach but with a specific focus on applications of Support Vector Machines (SVMs) in manufacturing quality assessment.
  - Differentiation: Similar to Köksal et al., this review is also several years old and narrow in scope, focusing only on SVMs. It does not encompass the broader range of ML and DL methods, particularly newer deep learning architectures.
- Weichert et al. (2019): This review focused on machine learning applications for production process optimization with regard to product- or process-specific metrics, often based on root-cause analysis and fault diagnosis.
  - Differentiation: While there are overlaps, Weichert et al. primarily addressed process optimization approaches. The current paper, in contrast, focuses specifically on quality estimation (prediction and classification) and evaluates approaches based on the data and methods used for that purpose.
- Broader ML/AI in Production Surveys: The paper also acknowledges other extensive studies that cover AI and ML techniques in the production and manufacturing context but with a broader scope:
  - Shang and You (2019): Overview of data analytics for various application task areas like process monitoring and optimization, also discussing usability and interpretability.
  - Fahle et al. (2020) and Mayr et al. (2019): Studied ML applications in different task scenarios such as process planning and control.
  - Sharp et al. (2018): Focused on cross-domain applications in the product lifecycle.
  - Related Fields: Surveys on ML-based predictive maintenance (Dalzochio et al., 2020; Zonta et al., 2020), condition monitoring (Serin et al., 2020b), and machine fault diagnosis (Ademujimi et al., 2017).
  - Differentiation: The current paper distinguishes itself by explicitly focusing on approaches that primarily address the quality of the products produced, rather than broader ML applications in manufacturing, process optimization, predictive maintenance, or fault diagnosis for machinery.
3.3. Technological Evolution
The evolution of technologies in this field can be traced through several stages:
- Early Data-Driven Quality (Pre-2010s): Initial attempts at using data for quality involved traditional statistical process control (SPC), Six Sigma, and basic data mining techniques. These methods often relied on simpler linear models or expert-rule systems. Reviews like Köksal et al. (2011) represent this era, focusing on data mining for quality improvement.
- Rise of Traditional Machine Learning (Early 2010s): With increased computational power and algorithm development, more sophisticated ML models like SVMs, Random Forests, and MLPs became accessible. Rostami et al. (2015) focusing on SVMs highlights the growing interest in specific ML algorithms for quality applications.
- Deep Learning Revolution (Mid-2010s onwards): The breakthrough of deep learning, particularly CNNs for image data and RNNs/LSTMs for sequential data, transformed AI capabilities. This period saw DL models achieve state-of-the-art performance in complex tasks like visual inspection and time series prediction. The current paper's timeframe (2012-2021) directly captures this shift, showing a significant increase in CNN and LSTM usage.
- Industry 4.0 Integration: Parallel to ML/DL advancements, the Industry 4.0 paradigm has driven the integration of sensor technologies, data collection infrastructure, and digital twins into manufacturing. This creates the necessary data ecosystem for predictive quality solutions to thrive.

This paper fits within the technological timeline by documenting the shift from earlier data mining and traditional ML approaches to the widespread adoption and exploration of deep learning models within the context of Industry 4.0 for the specific application of predictive quality.
3.4. Differentiation Analysis
Compared to the main methods in related work, the core differences and innovations of this paper's approach are:
- Specific Focus: Unlike broader surveys of ML in manufacturing, this paper has a precise and narrow focus: machine learning and deep learning specifically for predictive quality of products (prediction/classification of product-related quality based on process/product data). This excludes related but distinct fields like predictive maintenance, fault diagnosis (for machines), and general process optimization.
- Time Scope: By analyzing publications from 2012 to 2021, it captures the most recent decade of research, encompassing the rapid advancements in deep learning that older reviews missed.
- Systematic Categorization: The review employs a structured methodology (defined guiding questions, detailed categories like DIN 8580 for processes, various data characteristics, and model types) to provide a granular and comparable analysis across studies. This structured approach allows for the identification of specific trends, commonalities, and gaps that a less systematic review might overlook.
- Identification of Gaps and Future Directions: Beyond summarizing existing work, the paper actively derives open challenges and future research directions grounded in the systematic analysis, offering a forward-looking perspective crucial for guiding future academic and industrial efforts. This includes highlighting the need for synthetic data, benchmark datasets, and the adoption of novel deep learning architectures like Transformers.
4. Methodology
4.1. Principles
The core idea of the method used in this paper is to conduct a systematic review of scientific literature to provide a comprehensive, structured overview of machine learning and deep learning applications for predictive quality in manufacturing. The theoretical basis or intuition behind this approach is that by systematically collecting, categorizing, and analyzing a defined body of literature, one can gain a clear understanding of the current state of research, identify prevalent trends, uncover existing gaps, and formulate informed directions for future work. This methodology ensures rigor and reduces bias compared to anecdotal or less structured literature reviews.
The authors define predictive quality as: "Predictive quality comprises the use of machine learning and deep learning methods in production to estimate product-related quality based on process and product data with the goal of deriving quality-enhancing insights."
The common approach to predictive quality includes four main steps (schematically illustrated in Fig. 1):
- Formulation of the manufacturing process and target quality: Clearly defining what process is being analyzed and what specific quality aspect (e.g., surface roughness, defect types) is to be predicted.
- Selection and collection of process and quality data: Gathering relevant data from the manufacturing environment, which could include process parameters, sensor data, or product measurements.
- Training of an ML/DL model: Using the collected data to train a machine learning or deep learning model to learn the relationship between input data and product quality.
- Use of the model for estimations and decision support: Deploying the trained model to make predictions about product quality, which can then inform decisions for quality enhancement, process adjustment, or automated inspection.

The following figure (Figure 1 from the original paper) illustrates the four main steps of the common approach to predictive quality:
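The image is a schematic diagram of the ML and DL based predictive quality workflow in manufacturing, comprising three main stages: the manufacturing process, the data basis, and model training and prediction.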
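The four steps can be sketched end to end with a deliberately tiny example. The snippet below is only an illustration under stated assumptions: the use case (predicting surface roughness from feed rate), the data values, the tolerance limit, and the helper names `fit_line`, `predict`, and `decision` are hypothetical, and ordinary least squares stands in for the ML/DL models discussed in the paper.

```python
# Step 1: formulate the process and the target quality.
# Hypothetical use case: predict surface roughness from feed rate in cutting.

# Step 2: collect paired process and quality data (illustrative values only).
feed_rate = [0.10, 0.20, 0.30, 0.40]  # input: process parameter
roughness = [1.0, 2.0, 3.0, 4.0]      # target: measured quality


# Step 3: train a model; here ordinary least squares for y = a * x + b.
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


a, b = fit_line(feed_rate, roughness)


# Step 4: use the model for estimation and decision support.
def predict(x):
    return a * x + b


ROUGHNESS_LIMIT = 2.5  # hypothetical tolerance


def decision(x):
    return "adjust process" if predict(x) > ROUGHNESS_LIMIT else "continue"
```

In a real setting, step 3 would typically involve a train/test split and one of the model families surveyed in the review; the linear fit is used here only because it keeps the whole chain visible in a few lines.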
4.2. Core Methodology In-depth (Layer by Layer)
The systematic review methodology employed in this paper follows a structured process to ensure comprehensiveness and relevance.
4.2.1. Guiding Questions (Q1, Q2, Q3)
The review is driven by three main questions, which serve as the framework for data extraction and analysis:
- Q1: What are the addressed manufacturing processes and quality criteria? This question aims to understand the scope of predictive quality applications, identify common manufacturing processes and quality criteria being studied, and uncover similarities or gaps in domains.
- Q2: What are the characteristics of the data used for model training? This focuses on the data sources (e.g., running production, simulation), input variables (e.g., sensor data), data modality (e.g., time series, categorical data), and data amount utilized.
- Q3: Which machine learning models of supervised learning are commonly trained? This question explores the supervised learning tasks (e.g., classification, regression), specific ML and DL models employed for quality estimations, and whether they are compared against other models.
4.2.2. Categorization Scheme
To answer the guiding questions, the authors developed a set of categories for reviewing and summarizing the publications. The following table (Table 1 from the original paper) lists these categories:
| Question | Category | Description |
|---|---|---|
| Q1 | Use case | Main use case of paper and purpose of predictive quality |
| | Process | Addressed manufacturing process (e.g. laser welding, deep drawing) |
| | Category | Category of process according to DIN 8580 (e.g. cutting, forming) |
| | Criterion | Estimated quality criterion (e.g. product dimensions, OK/NOK quality) |
| Q2 | Data source | Main source of process data (e.g. running production, simulation) |
| | Input variables | Data parameters used for the model training (e.g. sensor data) |
| | Data modality | Data types of gathered data (e.g. time series, categorical data) |
| | Data amount | Number of observations used for model training |
| Q3 | Learning task | Formulated learning task (e.g. classification, regression) |
| | Prime model | Primarily used (or best performing) ML/DL model (e.g. SVM, CNN) |
| | Baselines | ML/DL-models used for comparison to prime model (e.g. SVM, CNN) |
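One way to picture how a single publication is coded against the Table 1 categories is as a typed record. The sketch below is an assumption for illustration: the class name `ReviewRecord`, the field names, and the example values are hypothetical and are not taken from the paper's data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReviewRecord:
    # Q1: manufacturing process and quality criterion
    use_case: str
    process: str
    din_8580_category: str
    criterion: str
    # Q2: characteristics of the training data
    data_source: str
    input_variables: List[str]
    data_modality: List[str]
    data_amount: Optional[int]  # None if the publication does not report it
    # Q3: learning task and models
    learning_task: str
    prime_model: str
    baselines: List[str] = field(default_factory=list)


# Hypothetical entry, invented purely to show the shape of one record.
example = ReviewRecord(
    use_case="inline quality prediction",
    process="laser welding",
    din_8580_category="joining",
    criterion="OK/NOK quality",
    data_source="running production",
    input_variables=["sensor data"],
    data_modality=["time series"],
    data_amount=500,
    learning_task="classification",
    prime_model="CNN",
    baselines=["SVM"],
)
```

Keeping `data_amount` optional reflects that a category may be left empty when a publication does not state the corresponding information.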
4.2.3. Literature Search Strategy
The literature search was performed in the Web of Science and ScienceDirect databases. To cover the broad scope of predictive quality and manufacturing, a comprehensive set of search terms was defined and categorized into three groups:
- Predictive Quality: Terms directly related to quality prediction (predictive quality, predictive analytics, fault prediction, fault classification, defect prediction, quality prediction, smart manufacturing).
- Learning: Terms related to the core methodologies (deep learning, neural network, machine learning).
- Domain: Terms specifying the application area (manufacturing, production, industrial, engineering, automation, assembly).

The following are the results from Table 2 of the original paper:
| Category | Search terms |
|---|---|
| Predictive Quality | Predictive quality, predictive analytics, fault prediction, fault classification, defect prediction, quality prediction, smart manufacturing |
| Learning | Deep learning, neural network, machine learning |
| Domain | Manufacturing, production, industrial, engineering, automation, assembly |
Search queries were formulated to find publications containing at least one term from each of these three categories. The search was restricted to publications from 2012 to 2021 (conducted on June 29, 2021). This initial search yielded 1,261 potentially relevant publications.
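The rule "at least one term from each category" corresponds to a conjunction of disjunctions. The following sketch shows that logic with the search terms from Table 2; the query string syntax produced by `build_query` and the helper `matches` are assumptions for illustration and do not reproduce the exact queries submitted to Web of Science or ScienceDirect.

```python
SEARCH_TERMS = {
    "predictive_quality": [
        "predictive quality", "predictive analytics", "fault prediction",
        "fault classification", "defect prediction", "quality prediction",
        "smart manufacturing",
    ],
    "learning": ["deep learning", "neural network", "machine learning"],
    "domain": [
        "manufacturing", "production", "industrial",
        "engineering", "automation", "assembly",
    ],
}


def build_query(groups):
    """OR the terms within a group, AND the groups together."""
    clauses = []
    for terms in groups.values():
        clauses.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
    return " AND ".join(clauses)


def matches(text, groups):
    """True if the text contains at least one term from every group."""
    lowered = text.lower()
    return all(any(t in lowered for t in terms) for terms in groups.values())


query = build_query(SEARCH_TERMS)
```

The `matches` helper uses plain substring checks, so it is cruder than a database search engine (for example, "production" would also match inside "reproduction"); it is meant only to show the AND-of-ORs structure.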
4.2.4. Screening and Exclusion Criteria
A multi-stage screening process was applied:
- Initial Screening (Title and Abstract): Publications were first screened based on their titles and abstracts. Many were discarded if they fell into unrelated fields such as predictive maintenance, fault diagnosis, remaining useful lifetime prediction, software defect prediction, water quality prediction, process engineering, or civil engineering. This step reduced the pool to 144 publications.
- Detailed Review (Full Text) and Exclusion Criteria: The remaining 144 publications were then read in detail and categorized according to the scheme in Table 1. During this process, publications were excluded if they met any of the following criteria:
  - Do not contain information about the addressed manufacturing process or the data basis.
  - Are survey papers or literature studies (as this review aims to analyze primary research).
  - Do not perform any development, implementation, or evaluation of methods.
  - Are not accessible to the reviewers.

This rigorous selection process resulted in a final corpus of 81 publications for the systematic review. The overall methodology is summarized in the following figure (Figure 2 from the original paper):
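The image is a flow diagram from the paper illustrating the literature search methodology. It shows the initial number of publications retrieved from the two databases, Web of Science and ScienceDirect, and, after pre-screening and application of the exclusion criteria, the final number of publications selected for the review.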
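The screening funnel reported above (1,261 publications from the search, 144 after title and abstract screening, 81 in the final corpus) can be checked with simple arithmetic, and the four exclusion criteria can be read as a single boolean rule. In the sketch below the stage counts come from the text, while the function names and the dictionary keys used to describe a publication are hypothetical.

```python
funnel = [
    ("initial search", 1261),
    ("after title/abstract screening", 144),
    ("final corpus", 81),
]


def retention(stages):
    """Share of publications kept at each stage, relative to the previous one."""
    rows = []
    for (_, previous), (name, count) in zip(stages, stages[1:]):
        rows.append((name, count, round(100 * count / previous, 2)))
    return rows


def is_excluded(pub):
    """Apply the four exclusion criteria; any single hit excludes the paper."""
    return (
        not pub["has_process_and_data_info"]
        or pub["is_survey"]
        or not pub["evaluates_methods"]
        or not pub["accessible"]
    )


rows = retention(funnel)
overall = round(100 * funnel[-1][1] / funnel[0][1], 2)
```

Under these numbers, roughly one in nine search hits survives the first screening and a little over half of those survive the full-text review, so about 6% of the initial hits end up in the corpus.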
4.2.5. Final Corpus
The final corpus consisted of 81 selected publications. The majority (69%) were published in journals, and 31% in scientific conference proceedings. The number of publications per year showed a consistent increase, particularly between 2012 and 2020.
The following figure (Figure 3 from the original paper) illustrates the number of publications per year:
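The image is a bar chart showing the annual number of predictive quality publications from 2012 to 2021: the count rises gradually from 2012 to 2019, peaks at 30 publications in 2020, and declines in 2021. The chart reflects the growth of research in this field.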
5. Experimental Setup
As this paper is a systematic review, it does not involve traditional experimental setups with datasets, models, and evaluation metrics in the same way an empirical research paper would. Instead, its "experimental setup" is its methodology for conducting the review itself.
5.1. Datasets
The "dataset" for this systematic review is the corpus of 81 selected scientific publications.
- Source: Publications identified through searches in Web of Science and ScienceDirect databases.
- Scale: The corpus consists of 81 peer-reviewed papers.
- Characteristics: These papers exclusively deal with machine learning and deep learning for predictive quality in manufacturing. They span the period from 2012 to 2021. The types include both journal articles (69%) and conference papers (31%).
- Domain: The publications cover a wide range of manufacturing processes and quality criteria, as detailed in the results section.
- Why chosen: This corpus was chosen to provide a comprehensive and up-to-date overview of the research landscape, capturing recent advancements in ML/DL and Industry 4.0. The stringent selection criteria ensure that only highly relevant and methodologically sound primary research papers are included.
5.2. Evaluation Metrics
The "evaluation metrics" for this systematic review are the categories defined in Table 1, which serve as criteria for extracting and synthesizing information from the selected publications. These categories are structured to address the three guiding questions (Q1, Q2, Q3) of the review.
For each publication, the following information was extracted and categorized:
- Use case: The main application scenario and purpose of predictive quality.
- Process: The specific manufacturing process addressed (e.g., laser welding, deep drawing).
- Category: The manufacturing process category according to DIN 8580 (e.g., cutting, forming).
- Criterion: The estimated quality criterion (e.g., product dimensions, OK/NOK quality).
- Data source: The origin of the process data (e.g., running production, simulation).
- Input variables: The data parameters used for model training (e.g., sensor data).
- Data modality: The data types of gathered data (e.g., time series, categorical data).
- Data amount: The number of observations used for model training.
- Learning task: The formulated learning task (classification or regression).
- Prime model: The primarily used (or best performing) ML/DL model.
- Baselines: ML/DL models used for comparison to the prime model.

These categories serve as qualitative and quantitative metrics to systematically analyze and compare the diverse approaches presented in the literature. For example, Data amount allows for quantitative analysis of data usage trends, while Prime model and Baselines enable an understanding of model prevalence and comparative studies.
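The extraction scheme above can be pictured as one record per publication. The following is a minimal sketch, assuming a hypothetical `PublicationRecord` class and a hypothetical `count_by` helper; neither comes from the paper, and the three entries are placeholders rather than studies from the reviewed corpus.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PublicationRecord:
    """One reviewed publication, described by the extraction categories."""
    citation: str
    use_case: str
    process: str
    category: str                 # DIN 8580 main group, e.g. "cutting"
    criterion: str
    data_source: str
    input_variables: List[str]
    data_modality: List[str]
    data_amount: Optional[int]    # None if the study does not report it
    learning_task: str            # "classification" or "regression"
    prime_model: str
    baselines: List[str] = field(default_factory=list)


def count_by(records, attribute):
    """Count how often each value of one category occurs across records."""
    return Counter(getattr(record, attribute) for record in records)


# Hypothetical placeholder entries, not taken from the reviewed corpus.
records = [
    PublicationRecord("Study A", "in-process estimation", "turning", "cutting",
                      "surface roughness", "experimental", ["sensor data"],
                      ["numerical"], 120, "regression", "MLP", ["SVM"]),
    PublicationRecord("Study B", "defect inspection", "casting", "primary shaping",
                      "casting defects", "benchmark", ["product measurements"],
                      ["image"], 5000, "classification", "CNN"),
    PublicationRecord("Study C", "parameter planning", "laser welding", "joining",
                      "tensile strength", "experimental", ["process parameters"],
                      ["numerical"], None, "regression", "MLP", ["Random forest"]),
]

task_counts = count_by(records, "learning_task")
model_counts = count_by(records, "prime_model")
print(task_counts, model_counts)
```

Keeping every category as an explicit field makes the later tallies (process types, data sources, prime models) simple group-by operations over the same records.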
5.3. Baselines
In the context of a systematic review, "baselines" don't refer to models being compared in experiments, but rather to existing literature reviews or surveys that cover similar ground. The paper differentiates its scope from these previous works to establish its unique contribution.
The main baselines (related surveys) against which this review implicitly compares its scope and timeliness are:
- Köksal et al. (2011): This review focused on data mining for quality improvement. The current paper differentiates by being more recent and focused on ML/DL, especially deep learning.
- Rostami et al. (2015): This review focused on SVM applications. The current paper differentiates by covering a broader range of ML/DL models and being more up-to-date.
- Weichert et al. (2019): This review focused on process optimization. The current paper differentiates by focusing specifically on product quality estimation.
- Broader ML/AI in Production Surveys: Other surveys on ML/AI in manufacturing (e.g., Shang & You, 2019; Fahle et al., 2020; Mayr et al., 2019; Sharp et al., 2018) or predictive maintenance (Dalzochio et al., 2020; Zonta et al., 2020) are mentioned but deemed to have a broader or different scope.

By highlighting these prior works, the authors establish that their review fills a specific gap: providing a current, comprehensive, and focused analysis of ML/DL for predictive quality in manufacturing.
6. Results & Analysis
This section presents the findings of the systematic review, structured according to the guiding questions.
6.1. Manufacturing Process Types and Quality Criteria
The review categorized manufacturing processes based on DIN 8580. Additionally, additive manufacturing, assembly processes, and multi-stage processes were added as categories.
The following figure (Figure 4 from the original paper) illustrates the number of publications for each manufacturing process type:
Figure 4: Bar chart of the number of publications for each manufacturing process type. Cutting processes have the most publications, followed by joining, with the other process types having fewer.
- Overall Distribution: As seen in Figure 4, cutting processes comprise the largest group (26 publications), followed by joining (14 publications). Primary shaping and forming each have 10 publications. Additive manufacturing has 8, assembly has 5, and coating has 4. Multi-stage processes also have 4 publications. Notably, there are no publications primarily addressing changing material properties.
6.1.1. Cutting Processes
- Dominance: This category has the most research.
- Common Criteria: Most research focuses on quality characteristics reflecting the product's shape, with surface roughness being the most prevalent. Other criteria include hole diameter, kerf waviness, geometric deviation, material removal rate, machinability, and groove geometry.
- Examples:
  - Turning: Surface roughness (e.g., Du et al., 2021; Elangovan et al., 2015) using multivariate regression or ANNs based on sensor data and machine parameters.
  - Drilling: Hole diameter (Neto et al., 2013) and surface roughness (Vrabel et al., 2016).
  - Laser cutting: Surface roughness (Tercan et al., 2016, 2017) and kerf waviness (Nguyen et al., 2020).

The following are the results from Table 3 of the original paper, showing considered cutting and joining processes and quality criteria:

| Category | Process | Quality criteria |
|---|---|---|
| Cutting | Turning (6) | Surface roughness (Acayaba & de Escalona, 2015; Du et al., 2021; Elangovan et al., 2015; Moreira et al., 2019; Tuar et al., 2017), machinability (Lutz et al., 2020) |
| | Drilling (6) | Diameter (Neto et al., 2013; Schorr et al., 2020a, 2020b), surface roughness (Vrabel et al., 2016), hole defects (Jiao et al., 2020), surface gap (Bustillo et al., 2018) |
| | Laser cutting (4) | Surface roughness (Tercan et al., 2016, 2017; Zhang & Lei, 2017), kerf waviness (Nguyen et al., 2020) |
| | Milling (3) | Surface roughness (Hossain & Ahmad, 2014; Serin et al., 2020a), geometric deviation (de Oliveira Leite et al., 2015) |
| | Honing (2) | Surface roughness (Gejji et al., 2020; Klein et al., 2020) |
| | C. M. polishing (1) | Material removal rate (Yu et al., 2019) |
| | Diamond wire cutting (1) | Surface roughness (Kayabasi et al., 2017) |
| | Grinding (1) | Surface roughness (Varma et al., 2017) |
| | Laser micro grooving (1) | Groove geometry (Zahrani et al., 2020) |
| | Laser machining (1) | Dimensions (McDonnell et al., 2021) |
| Joining | Laser welding (4) | Weld bead dimensions (Ai et al., 2016; Lei et al., 2019), tensile strength (Yu et al., 2016), quality types (Yu et al., 2020) |
| | Resistance spot welding (3) | Tensile shear strength (Hamidinejad et al., 2012), tensile shear load bearing (Martín et al., 2016), welding deformation (Li et al., 2020a) |
| | Ultrasonic welding (3) | Quality types (Goldman et al., 2021; Li et al., 2020b), tensile strength (Natesh et al., 2019) |
| | Gas metal arc welding (2) | Weld penetration (Gyasi et al., 2019), weld bead dimensions (Wang et al., 2021) |
| | Gluing (1) | Glue volume (Dimitriou et al., 2020) |
| | Welding (1) | Residual stress (Dhas & Kumanan, 2014) |
6.1.2. Joining Processes
- Focus: The second-largest category, mainly welding applications.
- Common Criteria: Tensile strength, weld bead dimensions, residual stress, quality types (classification).
- Examples:
  - Laser welding: ANNs trained on welding parameters or sensor data to predict tensile strength (Yu et al., 2016) or weld bead dimensions (Ai et al., 2016).
  - Spot welding and ultrasonic welding: Similar approaches for tensile shear strength or quality types.
  - Gluing: Glue volume estimation based on 3D laser topology scans (Dimitriou et al., 2020).
6.1.3. Primary Shaping Processes
- Focus: Producing a defined shape from shapeless material (10 publications).
- Common Criteria: Casting defects, product dimensions, product weight, warpage, product geometry, yield stress, yarn quality (count-strength-product), leveling action point.
- Examples:
  - Casting: CNNs on X-ray images (Ferguson et al., 2018) or MLPs on sensor data (Kim et al., 2018) for defect detection.
  - Injection molding: Product dimensions (Ke & Huang, 2020) or product weights (Ge et al., 2012) from machine parameters.
6.1.4. Forming Processes
- Focus: Transforming raw parts into different shapes (10 publications).
- Common Criteria: Surface defects, slab geometry, part defects, machine speed, process feasibility, shear deformation.
- Examples:
  - Metal rolling: In-line quality estimations such as surface defect detection using CNNs on camera images (Yun et al., 2020) or NOK quality prediction from ultrasonic measurements (Lieber et al., 2013).
  - Sheet metal forming: Part defect prediction using LSTMs on sensor data (Meyes et al., 2019) or simulated experiments (Dib et al., 2020).
6.1.5. Additive Manufacturing Processes
- Focus: 8 publications, used both in design and realization phases.
- Common Criteria: Geometric deviation, inherent strain, structural defects, single-track width, volume porosity, tensile strength, surface roughness.
- Examples:
  - Design phase: ANNs for predicting inherent strain (Li & Anand, 2020) or geometric deviations (Zhu et al., 2020) from process simulations.
  - Realization phase: Quality predictions based on optical measurements (Gaikwad et al., 2020) or machine sensors (Li et al., 2019).

The following are the results from Table 4 of the original paper, showing considered primary shaping, forming, and additive manufacturing processes and quality criteria:

| Process type | Process | Quality criteria |
|---|---|---|
| Primary shaping | Casting (3) | Casting defects (Ferguson et al., 2018; Kim et al., 2018; Lee et al., 2018) |
| | Injection molding (3) | Product dimensions (Ke & Huang, 2020), product weight (Ge et al., 2012), warpage (Alvarado-Iniesta et al., 2012) |
| | Plastics extrusion (2) | Product geometry (Garcia et al., 2019), yield stress (Mulrennan et al., 2018) |
| | Spinning (2) | Yarn strength (Nurwaha & Wang, 2012), sliver evenness (Abd-Ellatif, 2013) |
| Forming | Metal rolling (5) | Surface defects (Li et al., 2018; Lieber et al., 2013; Liu et al., 2021; Yun et al., 2020), slab geometry (Stähl et al., 2019) |
| | Sheet metal forming (3) | Part defects (Dib et al., 2020; Meyes et al., 2019), machine speed (Essien & Giannetti, 2020) |
| | Forging (1) | Process feasibility (Ciancio et al., 2015) |
| | Textile draping (1) | Shear deformation (Zimmerling et al., 2020) |
| Additive manuf. | Laser powder bed fusion (4) | Geometric deviation (Zhu et al., 2020), inherent strain (Li & Anand, 2020), structural defects (Bartlett et al., 2020), single-track width (Gaikwad et al., 2020) |
| | Direct metal deposition (1) | Volume porosity (Zhang et al., 2019a) |
| | Fused deposition modeling (2) | Tensile strength (Zhang et al., 2018, 2019b) |
| | PLA 3D printing (1) | Surface roughness (Li et al., 2019) |
6.1.6. Assembly Processes
- Focus: 5 publications, mainly on ML-based classification of successful/unsuccessful assembly tasks.
- Common Criteria: Operation success, component position.
- Examples: Detection of functioning products in manual assembly (Wagner et al., 2020), correct positioning in SMT assembly (Schmitt et al., 2020a), wire plug connection quality using acoustic signals (Sarivan et al., 2020), fastened screw detection using line camera images (Martinez et al., 2020).
6.1.7. Coating Processes
- Focus: 4 publications, applying an adhesive layer.
- Common Criteria: Defect detection (defect types), OK/NOK classification, paint structure.
- Examples: SVM-based defect detection in dispensing (Oh et al., 2019), CNNs on machine sensor data for electric wafer quality (Hsu & Liu, 2021).
6.1.8. Multi-stage Processes
- Focus: 4 publications, dealing with complex production lines.
- Common Criteria: Battery capacity, state of health, product dimensions, quality types, fabric defects.
- Challenge: Increased complexity and data sources.
- Examples: DL methods (e.g., LSTMs) for quality prediction in larger production lines based on multimodal sensor data (Liu et al., 2020b).

The following are the results from the table continued after Table 4 of the original paper, showing considered assembly, multi-stage, and coating processes and quality criteria:

| Process type | Process | Quality criteria |
|---|---|---|
| Assembly | Manual assembly (2) | Operation success (Wagner et al., 2020; Sarivan et al., 2020) |
| | Screw fastening (1) | Operation success (Martinez et al., 2020) |
| | SMT assembly (1) | Component position (Schmitt et al., 2020a) |
| | Snap-fit assembly (1) | Operation success (Doltsinis et al., 2020) |
| Multi-stage | Battery-cell manufacturing (1) | Battery capacity and state of health (Turetskyy et al., 2021) |
| | Metal forming and machining (1) | Product dimensions (Papananias et al., 2019) |
| | Production line (1) | Quality types (Liu et al., 2020b) |
| | Textile manufacturing (1) | Fabric defects (Jun et al., 2021) |
| Coating | Chemical vapor deposition (1) | Quality types (Hsu & Liu, 2021) |
| | Lacquering (1) | Defect types (Thomas et al., 2018) |
| | Painting (1) | Paint structure (Kebisek et al., 2020) |
| | Primer-sealer dispensing (1) | Defect types (Oh et al., 2019) |
6.2. Data Bases and Characteristics
6.2.1. Data Set Sources and Amount
The review identifies three main data sources: real data from manufacturing processes (further divided into experimental and running production), virtual data from simulations, and freely available data sets (benchmark/competition).
The following are the results from Table 6 of the original paper, showing main sources of process data to train machine learning models:
| Data source | Publications |
|---|---|
| Simulation | Alvarado-Iniesta et al. (2012), Ciancio et al. (2015), Dib et al. (2020), Li and Anand (2020), Tercan et al. (2016, 2017), Zhu et al. (2020), Zimmerling et al. (2020) |
| Benchmark/competition | Ferguson et al. (2018), Jun et al. (2021), Liu et al. (2020b, 2021), Yu et al. (2019) |
| Real data (running production) | Essien and Giannetti (2020), Goldman et al. (2021), Kebisek et al. (2020), Lee et al. (2018), Li et al. (2018), Meyes et al. (2019), Oh et al. (2019), Schmitt et al. (2020a), Stähl et al. (2019), Wagner et al. (2020), Yun et al. (2020) |
| Real data (experimental) | Acayaba and de Escalona (2015), Ai et al. (2016), Bartlett et al. (2020), Bustillo et al. (2018), Dhas and Kumanan (2014), Dimitriou et al. (2020), Doltsinis et al. (2020), Du et al. (2021), Elangovan et al. (2015), Gaikwad et al. (2020), Garcia et al. (2019), Gejji et al. (2020), Zahrani et al. (2020), Gyasi et al. (2019), Hamidinejad et al. (2012), Hossain and Ahmad (2014), Hsu and Liu (2021), Jiao et al. (2020), Kayabasi et al. (2017), Ke and Huang (2020), Kim et al. (2018), Klein et al. (2020), Li et al. (2019, 2020a, 2020b), Lutz et al. (2020), Mulrennan et al. (2018), Lei et al. (2019), de Oliveira Leite et al. (2015), Lieber et al. (2013), Martín et al. (2016), Martinez et al. (2020), McDonnell et al. (2021), Moreira et al. (2019), Natesh et al. (2019), Neto et al. (2013), Nguyen et al. (2020), Papananias et al. (2019), Sarivan et al. (2020), Schorr et al. (2020a, 2020b), Serin et al. (2020a), Thomas et al. (2018), Turetskyy et al. (2021), Varma et al. (2017), Vrabel et al. |
- Dominance of Real Data: The majority of publications (65%) use real data obtained experimentally, while 14% use real data from running production. Simulation data accounts for 10%, and benchmark data for 6%.
- Simulation Data: Used in 8 publications, often to demonstrate feasibility or generate fast ML models for process design. Data samples vary widely (e.g., 30 to >22,000, with an average of 9,864). Data augmentation techniques were sometimes used (e.g., Zhu et al., 2020).
- Benchmark Data: Identified in 5 publications, primarily for image-based defect classification (e.g., GRIMA X-Ray casting data, NEU-DET, Xuelang manufacturing AI challenge data set). Average size is 5,722 samples/images. Data augmentation was also used here.
- Real Data (Experimental): Most research (65%) conducts predefined experiments, varying process parameters under fixed conditions. Design of Experiments (e.g., full factorial, Taguchi) is commonly used. The number of parameters varied is typically small (2-8).
  - Data Amount (Experimental): The majority use around 100 samples (median 144). The average is about 5,600, but this is skewed by some publications using data augmentation or multiple measurements per experiment (Figure 5, blue bars).
- Real Data (Running Production): 11 publications use data from running manufacturing processes, collected over longer periods (e.g., months to years).
  - Data Amount (Running Production): These datasets are generally larger (average 73,984 samples; Figure 5, red bars), with the largest containing 525,600 samples.

The following figure (Figure 5 from the original paper) illustrates the distribution of publications according to the number of data samples used for model training and evaluation:
Figure 5: Distribution of the number of data samples used for model training and evaluation in the publications (logarithmic scale). Blue bars denote experimental data and red bars denote real data collected from running production; most studies fall between 100 and 1,000 samples.
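The gap between the median and the average sample count reported above is a typical effect of a few very large data sets. The sketch below illustrates it with Python's `statistics` module; the list `sample_counts` is a hypothetical set of values chosen for illustration, not the counts extracted in the review.

```python
from statistics import mean, median

# Hypothetical sample counts for ten experimental studies (illustration only).
# Eight studies are small; two are inflated, e.g. by data augmentation.
sample_counts = [30, 50, 80, 100, 120, 160, 200, 300, 4000, 40000]

average = mean(sample_counts)      # pulled upward by the two large studies
typical = median(sample_counts)    # stays close to what most studies use

print(f"mean={average}, median={typical}")
```

Because the median ignores how large the outliers are, it is the more informative summary when judging how much data a typical experimental study has available.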
6.2.2. Input Variables
Three major types of input variables are identified: process parameters, sensor data, and product measurements.
The following are the results from Table 7 of the original paper, showing types of input variables used for predictive quality models:
| Variable type | Publications |
|---|---|
| Process parameters (30) | Acayaba and de Escalona (2015), Ai et al. (2016), Alvarado-Iniesta et al. (2012), Bustillo et al. (2018), Ciancio et al. (2015), Dhas and Kumanan (2014), Dib et al. (2020), Ge et al. (2012), Gejji et al. (2020), Zahrani et al. (2020), Hamidinejad et al. (2012), Hossain and Ahmad (2014), Jiao et al. (2020), Kayabasi et al. (2017), Kebisek et al. (2020), Mulrennan et al. (2018), Lei et al. (2019), Li and Anand (2020), Martín et al. (2016), McDonnell et al. (2021), Natesh et al. (2019), Nguyen et al. (2020), Serin et al. (2020a), Tercan et al. (2016, 2017), Varma et al. (2017), Yu et al. (2020), Zhang and Lei (2017), Zhu et al. (2020), Zimmerling et al. (2020) |
| Sensor data (22) | Essien and Giannetti (2020), Doltsinis et al. (2020), Du et al. (2021), Garcia et al. (2019), Goldman et al. (2021), Gyasi et al. (2019), Hsu and Liu (2021), Kim et al. (2018), Lee et al. (2018), Li et al. (2019, 2020a, 2020b), Lieber et al. (2013), Meyes et al. (2019), Moreira et al. (2019), Neto et al. (2013), Nurwaha and Wang (2012), Papananias et al. (2019), Sarivan et al. (2020), Schorr et al. (2020a, 2020b), Turetskyy et al. (2021) |
| Sensor data + process parameters (9) | Elangovan et al. (2015), Ke and Huang (2020), Klein et al. (2020), Lutz et al. (2020), Thomas et al. (2018), Vrabel et al. (2016), Yu et al. (2016), Zhang et al. (2018, 2019b) |
| Product measurements (16) | Bartlett et al. (2020), Dimitriou et al. (2020), Ferguson et al. (2018), Gaikwad et al. (2020), Jun et al. (2021), de Oliveira Leite et al. (2015), Li et al. (2018), Liu et al. (2021), Martinez et al. (2020), Oh et al. (2019), Schmitt et al. (2020a), Stähl et al. (2019), Tuar et al. (2017), Wang et al. (2021), Yun et al. (2020), Zhang et al. (2019a) |
- Process Parameters (30 publications): Settings for production (e.g., feed rate, cutting speed, laser power, focal position, process times, temperatures). Used to predict quality under new parameter spaces. Some works also include design parameters (e.g., hatch patterns, product size).
- Sensor Data (22 publications): Real-time data from the process/machine (e.g., welding current, temperature, pressure, vibration, torque, force). Used for in-process quality estimation.
- Sensor Data + Process Parameters (9 publications): A combination of both, shown to significantly improve performance (Elangovan et al., 2015). Also includes material batches and tool types in some cases (Lutz et al., 2020).
- Product Measurements (16 publications): Data from the product itself during or after production (e.g., images from cameras, geometric measurements, thermal images, X-ray images, melt pool images, laser topology scans). Primarily used for automated defect detection. Image data is particularly common.

The following figure (Figure 6 from the original paper) shows the number of occurrences of the input variable types in the addressed manufacturing processes:
Figure 6: Chart of the number of occurrences of each input variable type per manufacturing process, grouped into cutting, joining, primary shaping, and forming. Only data from real experiments or production environments is included. Dot size indicates the number of publications; the horizontal axis shows the manufacturing process and the vertical axis the input data type.

Figure 6 illustrates the varying prevalence of input variable types across different manufacturing processes. For instance, turning and drilling widely use both parameters and sensor data, while laser cutting predominantly relies on process parameters. Metal rolling shows a strong emphasis on product measurements.
6.2.3. Data Modality
The review identifies four data modalities: categorical, time series, image, and numerical/continuous.
The following are the results from Table 8 of the original paper, showing occurring modalities of data sets used for training the ML and DL models:
| Data modality | Publications |
|---|---|
| Categorical/discrete | Lee et al. (2018), Liu et al. (2020b), Lutz et al. (2020), Thomas et al. (2018) |
| Time series | Essien and Giannetti (2020), Goldman et al. (2021), Gyasi et al. (2019), Hsu and Liu (2021), Meyes et al. (2019), Sarivan et al. (2020), Stähl et al. (2019), Zhang et al. (2018), Zhang et al. (2019b) |
| Image | Bartlett et al. (2020), Dimitriou et al. (2020), Ferguson et al. (2018), Jun et al. (2021), Li et al. (2018), Liu et al. (2021), Martinez et al. (2020), Oh et al. (2019), Wang et al. (2021), Yun et al. (2020), Zhang et al. (2019a) |
| Continuous/numerical | Abd-Ellatif (2013), Acayaba and de Escalona (2015), Ai et al. (2016), Alvarado-Iniesta et al. (2012), Bustillo et al. (2018), Ciancio et al. (2015), Dhas and Kumanan (2014), Dib et al. (2020), Doltsinis et al. (2020), Du et al. (2021), Elangovan et al. (2015), Gaikwad et al. (2020), Garcia et al. (2019), Ge et al. (2012), Gejji et al. (2020), Zahrani et al. (2020), Hamidinejad et al. (2012), Hossain and Ahmad (2014), Jiao et al. (2020), Kayabasi et al. (2017), Ke and Huang (2020), Kebisek et al. (2020), Kim et al. (2018), Klein et al. (2020), Mulrennan et al. (2018), Lee et al. (2018), Lei et al. (2019), de Oliveira Leite et al. (2015), Li et al. (2019, 2020a, 2020b), Li and Anand (2020), Lieber et al. (2013), Liu et al. (2020b), Lutz et al. (2020), Martín et al. (2016), McDonnell et al. (2021), Moreira et al. (2019), Natesh et al. (2019), Neto et al. (2013), Nguyen et al. (2020), Nurwaha and Wang (2012), Papananias et al. (2019), Schmitt et al. (2020a), Schorr et al. (2020a, 2020b), Serin et al. (2020a), Tercan et al. (2017), Tercan et al. (2016), Thomas et al. (2018), Turetskyy et al. (2021), Tuar et al. (2017), Varma et al. (2017), Vrabel et al. (2016), Yu et al. (2016, |
- Categorical Data: Least common, representing non-numeric entities like tool type or material batch.
- Time Series Data: Used in 9 publications, primarily derived from sensor data (8 publications) or product measurements (1 publication). Can be univariate or multivariate.
- Image Data: Common, especially 2D images from product measurements for defect detection. 3D point clouds are also used. Data augmentation (e.g., adding noise, cropping) is frequently applied to increase image datasets.
- Numerical/Continuous Data: The vast majority of publications use this type, representing scalar values from parameter settings or transformed sensor/measurement data. Feature extraction (e.g., mean, max, min) or expert-driven aggregation is common to convert time series data into scalar numerical quantities.
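The conversion of a time series into scalar features mentioned in the last item can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `extract_features` and the force readings are hypothetical, and only the three statistics named above (mean, max, min) are computed.

```python
from statistics import mean


def extract_features(signal):
    """Collapse one univariate time series into scalar summary features."""
    if not signal:
        raise ValueError("signal must contain at least one value")
    return {
        "mean": mean(signal),
        "max": max(signal),
        "min": min(signal),
    }


# Hypothetical cutting-force readings recorded during one process run.
force = [10.0, 12.0, 11.0, 15.0, 12.0]
features = extract_features(force)
print(features)
```

The resulting dictionary can be fed to a model that expects fixed-length numerical input, at the cost of discarding the temporal order of the readings.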
6.3. Machine Learning Methods
6.3.1. Learning Tasks
- Classification: 30 of 81 publications, for error detection or defect type classification.
- Regression: 51 of 81 publications, for numerical prediction of quality variables.
6.3.2. Model Comparison
- Single Model/Variants: 49% of publications (40 out of 81) evaluate only a single model or its variants.
- Multiple Models (Comparison): 51% of publications (41 out of 81) compare several models experimentally.
  - Prime Model: The focus model or the best-performing model.
  - Baseline Models: Other models used for comparison.

The following are the results from Table 9 of the original paper, showing an overview of addressed learning tasks and used prime models in all publications:

| Learning task | ML model | Publications |
|---|---|---|
| Classification | CNN | Ferguson et al. (2018), Goldman et al. (2021), Hsu and Liu (2021), Jun et al. (2021), Li et al. (2018), Liu et al. (2021), Martinez et al. (2020), Sarivan et al. (2020), Yun et al. (2020), Zhang et al. (2019a) |
| | Decision tree | Tercan et al. (2016, 2017) |
| | Ensemble model | Gejji et al. (2020), Kim et al. (2018), Thomas et al. (2018) |
| | K-NN | Lieber et al. (2013) |
| | MLP | Bustillo et al. (2018), Dib et al. (2020), Ke and Huang (2020), Kebisek et al. (2020), Lee et al. (2018), Wagner et al. (2020), Yu et al. (2020) |
| | Naive Bayes | Bartlett et al. (2020) |
| | Random forest | Zahrani et al. (2020) |
| | RNN | Liu et al. (2020b), Meyes et al. (2019) |
| | SVM | Doltsinis et al. (2020), Oh et al. (2019), Schmitt et al. (2020a) |
| Regression | ANFIS | Hossain and Ahmad (2014), Moreira et al. (2019), Varma et al. (2017), Zhang and Lei (2017) |
| | CNN | Dimitriou et al. (2020), Wang et al. (2021), Zhu et al. (2020), Zimmerling et al. (2020) |
| | Ensemble model | Li et al. (2019) |
| | Extra tree | Schorr et al. (2020a) |
| | EML | Nguyen et al. (2020) |
| | GA-BPNN | Ai et al. (2016) |
| | Linear regression | Elangovan et al. (2015) |
| | MLP | Abd-Ellatif (2013), Acayaba and de Escalona (2015), Ciancio et al. (2015), Du et al. (2021), Gyasi et al. (2019), Hamidinejad et al. (2012), Jiao et al. (2020), Kayabasi et al. (2017), Lei et al. (2019), de Oliveira Leite et al. (2015), Li et al. (2020a, 2020b), Li and Anand (2020), Lutz et al. (2020), McDonnell et al. (2021), Natesh et al. (2019), Neto et al. (2013), Nurwaha and Wang (2012), Papananias et al. (2019), Serin et al. (2020a), Turetskyy et al. (2021), Vrabel et al. (2016), Yu et al. (2016) |
| | NN-GA-PSO | Dhas and Kumanan (2014) |
| | Quadratic regression | Martín et al. (2016) |
| | Random forest | Klein et al. (2020), Mulrennan et al. (2018), Schorr et al. (2020b), Tuar et al. (2017), Yu et al. (2019) |
| | Relevance vector machine | Ge et al. (2012) |
| | RNN | Alvarado-Iniesta et al. (2012), Essien and Giannetti (2020), Stähl et al. (2019), Zhang et al. (2018), Zhang et al. (2019b) |
| | SeDANN | Gaikwad et al. (2020) |
| | SVM | Garcia et al. (2019) |
6.3.3. Prime Models
- Multilayer Perceptron (MLP): Most frequently used prime model (30 publications). Popular even in 2020-2021. Versatile for both classification (e.g., fault types, OK/NOK) and regression (e.g., surface roughness, tensile strength, dimensions). Often compared against SVM, Random Forest, Linear Regression, Decision Tree, K-NN, AdaBoost.
- Convolutional Neural Network (CNN): Second most frequent prime model (14 publications). Well-suited for pattern recognition in higher dimensional and spatial data, applied to 2D images, 3D point clouds, and time series data. CNN architectures are often compared with variations of CNNs or other deep learning models like AlexNet, VGG-16, ResNet-50, and R-CNN variants.
- Recurrent Neural Network (RNN): Used as a prime model in 7 publications, primarily for time-dependent or sequential data (e.g., time series sensor data). Focus is on LSTM network architectures for binary classification of defects or regression of quantities like material warpage or machine speed. Compared with SVM, Random Forest, XGBoost, polynomial regression, logistic regression, and ARIMA.

The following figure (Figure 7 from the original paper) illustrates the proportions of ML models (prime) used in publications in 2020 and 2021:
Figure 7: Chart of the share of each machine learning model type among prime models in publications from 2020 and 2021: MLP 38%, CNN 30%, other models 14%, random forest 8%, and SVM and RNN 5% each.

Figure 7 shows that in 2020 and 2021, MLP (38%) and CNN (30%) together account for 68% of prime models, highlighting their dominance in recent research.
6.3.4. Non-linear ML Models
This group includes SVM (for classification and regression), Relevance Vector Machine (RVM), Decision Trees, Quadratic Regression, K-NN, and Naive Bayes. These models are often compared among themselves and with MLPs, gradient boosted trees, or generalized additive models (GAM).
6.3.5. Ensembles
Ensemble methods combine multiple models. Random Forest is the most popular, used for both classification and regression. Extensive comparisons show ensembles can outperform single models (Gejji et al., 2020).
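The idea of combining several models can be shown with a simple majority vote. The sketch below is a deliberately simplified stand-in, not a Random Forest and not a method from any reviewed study: the three threshold rules, their limits, and the sample values are all hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most base models."""
    return Counter(labels).most_common(1)[0][0]


# Three hypothetical base models, each a single threshold rule (assumed limits).
def temperature_rule(sample):
    return "NOK" if sample["temperature"] > 220.0 else "OK"


def pressure_rule(sample):
    return "NOK" if sample["pressure"] < 4.0 else "OK"


def vibration_rule(sample):
    return "NOK" if sample["vibration"] > 0.8 else "OK"


BASE_MODELS = [temperature_rule, pressure_rule, vibration_rule]


def ensemble_predict(sample, models=BASE_MODELS):
    """Classify one sample by majority vote over all base models."""
    return majority_vote([model(sample) for model in models])


# Hypothetical process snapshot: temperature and vibration exceed their limits.
sample = {"temperature": 231.0, "pressure": 5.2, "vibration": 0.9}
prediction = ensemble_predict(sample)
print(prediction)
```

With an odd number of base models and two classes, the vote cannot tie; a Random Forest follows the same aggregation principle but trains many decision trees on resampled data instead of using fixed rules.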
6.3.6. Variants and Hybrid Models with Neural Networks
These include ANFIS (Adaptive Neuro-Fuzzy Inference System), ANN variants like SeDANN (Sequential Decision Analysis Neural Network) and Extreme Machine Learning (EML), and hybrid models combining neural networks with evolutionary computation methods like genetic algorithms or particle swarm optimization. These are frequently compared with regular MLPs.
The following figure (Figure 8 from the original paper) illustrates the occurrences of ML models as baselines in all publications:
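Figure 8: Chart of how often each machine learning model appears as a baseline across all publications. Support vector machines (SVM), multilayer perceptrons (MLP), and random forests are the most frequently used baseline models.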
Figure 8 reveals that SVM, MLP, and Random Forest are the most common baseline models used for comparison, indicating their established role in evaluating new predictive quality approaches.
7. Conclusion & Reflections
7.1. Conclusion Summary
This systematic review provided a comprehensive overview of 81 scientific publications from 2012 to 2021 concerning machine learning and deep learning based predictive quality in manufacturing. The analysis was structured around three guiding questions, categorizing publications by manufacturing processes and quality criteria, data bases and characteristics, and machine learning models.
Key findings include:
- Predictive quality is applied across diverse manufacturing processes, with cutting and joining being the most researched areas, while others like coating and changing material properties are underrepresented.
- The majority of studies use real manufacturing data, often generated experimentally with relatively small sample sizes, although data from running production tends to be larger. Process parameters, sensor data, and product measurements are common input variables.
- Numerical/continuous data and image data are the predominant data modalities.
- Multilayer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) are the most popular prime models, particularly in recent years, with CNNs being extensively used for image-based tasks. Traditional ML models like SVMs and Random Forests frequently serve as baselines.

The review highlights that ML and DL offer significant potential for quality assurance and inspection in manufacturing, but the field is heterogeneous and often isolated in its approaches and results.
7.2. Limitations & Future Work
The authors identified several limitations and suggested future research directions:
7.2.1. Limitations (Identified Gaps)
- Manufacturing Domain Imbalance: Significant differences in research activity across manufacturing process groups. Many established processes (e.g., riveting, gluing, soldering within joining) are hardly covered. This may stem from varying digitization levels and data availability.
- Lack of Process Integration: While promising, predictive quality approaches are rarely integrated into actual manufacturing processes. Training, evaluation, and model usage often occur offline. Discussions on implementation in real quality assurance processes are scarce, as is evaluation using quality-oriented metrics (e.g., reject rate reduction, yield rate).
- Limited Input Variables: Most approaches focus on a few input variables of the same type. Other crucial factors like product design or material properties are often neglected. Models are typically for a single product type, leaving the question of generalizability open.
- Data Scarcity and Representation: Many approaches are developed on small amounts of experimentally generated data, which may not be representative of running production. This limits the generalizability and industrial applicability of results.
- Lack of Benchmark Data & Reproducibility: A significant absence of freely available benchmark datasets and shared source code hinders comparability between approaches and reproducibility of research.
- Limited Deep Learning Model Exploration: Only CNNs and LSTM-based models are extensively investigated. Novel deep learning methods like Transformer networks and Graph Neural Networks are largely unexplored for predictive quality.
- Time Series Data Processing: While deep learning on image data is mature, time series data often undergoes feature extraction into scalar features rather than being directly processed by DL models capable of handling sequential data more effectively.
7.2.2. Future Research Directions
To address these limitations, the paper envisions the following research directions:
- Synthetic Data Generation: Developing and researching methods, particularly generative deep learning models, to create realistic synthetic training data in large quantities, especially for rare process variations and product defects. Expanding data augmentation for sensor and time series data.
- Benchmark Data Sets: Establishing freely available benchmark datasets for predictive quality tasks to enhance comparability, reproducibility, and further development of approaches.
- Novel Deep Learning Methods: Exploring the applicability of Transformer networks (for sequential and image data) and Graph Neural Networks (for graph data like CAD or simulation data) for predictive quality scenarios.
- Time Series Classification and Forecasting: Further research into deep learning model approaches specifically designed for time series classification and forecasting to directly leverage raw sensor data.
- Transfer Learning and Continual Learning: Investigating data-efficient and cost-effective models using transfer learning (adapting models from one domain to another) and continual learning (learning continuously from new data without forgetting old knowledge) to overcome challenges posed by continuous changes in manufacturing processes and process variants.
- Integration and Deployment: Researching strategies for evaluating predictive quality solutions in real quality assurance processes, developing automated feedback mechanisms, and quantifying impact using quality-oriented metrics (yield rate, reject reduction). Advancing MLOps (machine learning operationalization) strategies for continuous monitoring, integration, and delivery of ML models, and establishing certification processes for ML models in industrial manufacturing.
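The quality-oriented metrics named in the last item can be made concrete with a small calculation. The sketch below assumes common definitions (yield rate as the share of accepted parts, reject rate as the share of rejected parts, and reduction measured relative to the rate before deployment); the function names and the production figures are hypothetical and do not come from the paper.

```python
def yield_rate(accepted_parts, total_parts):
    """Share of produced parts that pass quality inspection."""
    if total_parts <= 0:
        raise ValueError("total_parts must be positive")
    return accepted_parts / total_parts


def reject_rate(rejected_parts, total_parts):
    """Share of produced parts that fail quality inspection."""
    if total_parts <= 0:
        raise ValueError("total_parts must be positive")
    return rejected_parts / total_parts


def relative_reject_reduction(rate_before, rate_after):
    """Relative drop in reject rate after introducing a predictive model."""
    if rate_before <= 0:
        raise ValueError("rate_before must be positive")
    return (rate_before - rate_after) / rate_before


# Hypothetical figures: 1,000 parts per period, before and after deployment.
before = reject_rate(rejected_parts=50, total_parts=1000)
after = reject_rate(rejected_parts=30, total_parts=1000)
reduction = relative_reject_reduction(before, after)

print(f"reject rate {before:.1%} -> {after:.1%}, reduction {reduction:.0%}")
print(f"yield rate after deployment: {yield_rate(970, 1000):.1%}")
```

Reporting such figures alongside accuracy or error measures ties a model's benefit to production outcomes rather than to prediction quality alone.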
7.3. Personal Insights & Critique
This systematic review provides a valuable, structured overview that clearly demarcates the field of predictive quality from related ML applications in manufacturing. Its strict adherence to inclusion/exclusion criteria and its detailed categorization make the findings robust and easy to comprehend, even for beginners.
Inspirations:
- Bridging the Gap: The emphasis on synthetic data generation, benchmark datasets, transfer learning, and continual learning highlights a crucial challenge in industrial ML: the perennial struggle with data scarcity and domain adaptation. These areas are vital for transitioning predictive quality from academic prototypes to robust industrial solutions.
- Unexplored DL Potential: The call to explore Transformer networks and Graph Neural Networks is exciting. Transformers, with their self-attention mechanisms, could revolutionize time series and even multimodal data fusion in manufacturing. Graph neural networks offer a powerful way to model complex relationships in CAD or process flow graphs, which is a rich but underutilized data source.
- Operationalization Focus: The explicit mention of MLOps and certification underscores the practical realities of industrial AI deployment. It's not enough to build accurate models; they must be reliable, maintainable, and trustworthy in safety-critical environments.
Potential Issues, Unverified Assumptions, or Areas for Improvement:
- Definition of "Predictive Quality": While the paper provides a clear definition of predictive quality, the distinction between "prediction of quality" and "fault diagnosis" (for the machine, not the product) or "anomaly detection" can be subtle in practice. Some boundary cases might be open to interpretation, potentially leading to slight variations in what is included or excluded in similar reviews by other researchers. The explicit exclusion of anomaly detection is a clear scope limitation, but its justification could be elaborated further, especially since some anomalies might eventually correlate with product defects.
- Granularity of Process Categories: While DIN 8580 is a standard, some categories like "Cutting" are still very broad and encompass highly diverse processes. More granular sub-categorization within these major groups could reveal finer trends or unique challenges specific to, say, milling versus laser cutting.
- Bias in Publication Selection: Despite the systematic search, reliance on Web of Science and ScienceDirect might inadvertently favor certain publication types or regions. Including Scopus or Google Scholar could broaden the initial pool, though it would also increase the workload significantly.
- Qualitative Depth: While the categorization is quantitative (number of papers), a deeper qualitative analysis of why certain ML models are chosen for specific data modalities, beyond "well-suited for images", could be beneficial. For example, the review could discuss why MLP is so versatile, or why LSTMs are preferred for time series over other RNN variants.
- Real-world Impact Metrics: The critique about the lack of quality-oriented metrics for process impact is valid and highlights a major gap. Future work should focus not only on model accuracy but also on quantifiable benefits like cost savings, waste reduction, and throughput increase to truly demonstrate industrial value.
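The quality-oriented metrics named in the review (yield rate, reject reduction) are simple to compute once production counts are available. The sketch below is one plausible formulation: the function names, the definition of reject reduction as a relative change against a baseline, and the production counts are all illustrative assumptions rather than definitions from the paper.

```python
def yield_rate(good_units, total_units):
    """Fraction of produced units that pass quality inspection."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return good_units / total_units


def reject_rate(rejected_units, total_units):
    """Fraction of produced units that are rejected."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return rejected_units / total_units


def reject_reduction(baseline_rate, new_rate):
    """Relative reduction of the reject rate compared with a baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


# Hypothetical production counts (illustrative values only).
baseline = reject_rate(rejected_units=60, total_units=1000)
with_model = reject_rate(rejected_units=30, total_units=1000)

# Halving the rejects corresponds to a relative reduction of about 0.5.
reduction = reject_reduction(baseline, with_model)
current_yield = yield_rate(good_units=970, total_units=1000)
```

Reporting such figures alongside model accuracy would tie a predictive quality model to its effect on the process, which is the gap the critique points to.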
Applicability to Other Domains:
The methodology and many of the identified challenges (e.g., data scarcity, need for benchmark data, integration into real systems, transfer learning) are highly transferable to other industrial AI applications beyond predictive quality, such as predictive maintenance, process optimization, or supply chain forecasting. The framework for systematically reviewing literature based on specific guiding questions, categorization, and identification of research gaps is a broadly applicable scientific practice.