Paper status: completed

Applications of machine learning techniques for enhancing nondestructive food quality and safety detection

This analysis is AI-generated and may not be fully accurate. Please refer to the original paper.

TL;DR Summary

This review highlights machine learning's role in enhancing nondestructive food quality and safety detection, contrasting traditional and deep learning techniques integrated with acoustic, vision, and electronic nose technologies. Deep learning shows superior potential for real-time applications.

Abstract

Critical Reviews in Food Science and Nutrition, ISSN 1040-8398 (Print), 1549-7852 (Online). Journal homepage: www.tandfonline.com/journals/bfsn20. Citation: Yuandong Lin, Ji Ma, Qijun Wang & Da-Wen Sun (2023). Applications of machine learning techniques for enhancing nondestructive food quality and safety detection. Critical Reviews in Food Science and Nutrition, 63(12), 1649-1669. https://doi.org/10.1080/10408398.2022.2131725. Published online: 12 Oct 2022.

Mind Map

In-depth Reading

English Analysis

1. Bibliographic Information

1.1. Title

Applications of machine learning techniques for enhancing nondestructive food quality and safety detection

1.2. Authors

Yuandong Lin, Ji Ma, Qijun Wang & Da-Wen Sun

1.3. Journal/Conference

Critical Reviews in Food Science and Nutrition. This journal is a highly reputable publication in the field of food science, known for publishing comprehensive review articles that critically evaluate advancements in various aspects of food research, including quality, safety, processing, and nutrition. Its reviews are often influential in shaping future research directions.

1.4. Publication Year

2023

1.5. Abstract

The paper addresses the growing global demand for high-quality food, highlighting the increasing interest in nondestructive and rapid detection technologies within the food industry. It acknowledges that data analysis from most nondestructive techniques is often complex, time-consuming, and requires specialized skills. Traditional chemometric or statistical methods face limitations due to noise, sample variability, and data complexity under diverse testing conditions. The review positions machine learning (ML) techniques as a powerful solution due to their capabilities in handling irrelevant information, extracting feature variables, and building calibration models, particularly in nondestructive technology and equipment intelligence.

The paper introduces and compares ML techniques, categorizing them into traditional machine learning (TML) and deep learning (DL). It then presents applications of several novel nondestructive technologies—acoustic analysis, machine vision (MV), electronic nose (e-nose), and spectral imaging—combined with advanced ML techniques for food quality assessment, including variety identification and classification, safety inspection, and processing control. Challenges and future prospects are also discussed. The review concludes that the integration of nondestructive testing and state-of-the-art ML techniques holds significant potential for monitoring food quality and safety, with different ML algorithms having distinct characteristics and applicability. Deep learning, characterized by its feature learning nature, is identified as one of the most promising and powerful techniques for real-time applications, warranting further extensive research and wider adoption in the food industry.

https://doi.org/10.1080/10408398.2022.2131725 (Officially published)

2. Executive Summary

2.1. Background & Motivation

The global population's increasing demand for high-quality food necessitates robust and efficient methods for ensuring food quality and safety. Traditional detection methods, such as gas chromatography (GC), high-performance liquid chromatography (HPLC), and polymerase chain reaction (PCR), are destructive, labor-intensive, expensive, and time-consuming, making them unsuitable for rapid, online, or industrial applications.

This limitation led to the development of nondestructive techniques like spectral imaging, acoustic vibration methods, machine vision (MV), and electronic nose (e-nose). While these technologies offer advantages in speed and non-invasiveness, they generate complex, high-dimensional, noisy, and often redundant datasets. Analyzing these datasets with conventional chemometric or statistical methods is challenging, particularly for nonlinear and large-scale data, requiring highly skilled operators and significant time. This complexity limits their real-time industrial application.

The core problem the paper aims to solve is the need for efficient and effective data analysis methods to unlock the full potential of nondestructive food quality and safety detection technologies. The paper's innovative idea and entry point is to leverage the capabilities of machine learning (ML) techniques, including traditional machine learning (TML) and deep learning (DL), to overcome the analytical challenges posed by nondestructive sensor data. ML can handle complex, nonlinear, and large datasets, automate feature extraction, and build robust predictive models, thereby enhancing the efficiency and applicability of these detection technologies.

2.2. Main Contributions / Findings

The paper makes several primary contributions:

  • Comprehensive Overview of ML Techniques: It provides a structured introduction and comparison of traditional machine learning (TML) and deep learning (DL) algorithms relevant to nondestructive food analysis, outlining their principles, advantages, and limitations.

  • Integration with Novel Nondestructive Technologies: It systematically summarizes the applications of advanced ML techniques when combined with various novel nondestructive technologies, specifically acoustic analysis, machine vision (MV), electronic nose (e-nose), and spectral imaging.

  • Diverse Application Scenarios: The review showcases the utility of these integrated approaches across key areas of the food industry, including food quality assessment (variety identification, classification), safety inspection (chemical and microbial contamination), and processing control.

  • Identification of Challenges and Future Directions: It critically discusses the existing challenges in applying ML to nondestructive food detection, such as the lack of labeled data, data standardization, and computational demands, while also proposing promising future research avenues like transfer learning, lifelong learning, and the development of lightweight DL models.

    The key conclusions and findings reached by the paper are:

  • Nondestructive testing technologies, when combined with state-of-the-art ML techniques, demonstrate significant potential for monitoring the quality and safety of food products.

  • Different ML algorithms possess unique characteristics and are suited for specific application scenarios.

  • Deep learning, owing to its powerful feature learning capabilities and ability to handle complex data, is identified as a particularly promising technique for real-time applications. However, its widespread adoption requires further research to address challenges related to data volume, computational resources, and model complexity.

  • The integration of ML simplifies data analysis, feature extraction, and model building, thereby enhancing the efficiency and effectiveness of nondestructive detection systems and moving towards device intellectualization in the food industry.

3. Prerequisite Knowledge & Related Work

3.1. Foundational Concepts

To fully understand this paper, a beginner should grasp several foundational concepts in food science, sensing technology, and artificial intelligence.

  • Nondestructive Technologies: These are methods for testing materials, components, or systems without causing damage to the item being tested. In food science, they are crucial for quality control without altering the food product.

    • Acoustic Analysis / Ultrasound: Utilizes sound waves (audible or ultrasonic) to probe material properties. Changes in sound propagation (velocity, attenuation, resonance frequencies) are correlated with physical characteristics (e.g., firmness, moisture content, internal defects).
    • Machine Vision (MV): Also known as computer vision system (CVS), it involves equipping computers with the ability to "see" and interpret images. In food, this often uses RGB cameras to capture visual data (color, shape, texture, size, surface defects) for automated inspection and grading.
    • Electronic Nose (E-nose): An array of gas sensors designed to mimic the human olfactory system. It detects and analyzes volatile organic compounds (VOCs) emitted by food, generating unique "odor fingerprints" that can be correlated with freshness, spoilage, or specific aromas.
    • Spectral Imaging (e.g., Hyperspectral Imaging - HSI): Combines traditional imaging with spectroscopy. Instead of just RGB (red, green, blue) bands, HSI collects light across a wide range of continuous electromagnetic spectrum bands for each pixel, creating a hypercube. This allows for simultaneous acquisition of spatial (image) and spectral (chemical/physical properties) data, enabling the identification and quantification of chemical compositions.
  • Chemometrics / Statistical Methods: These are mathematical and statistical techniques applied to chemical data. In food analysis, they are used to extract meaningful information from complex analytical measurements. Examples include Principal Component Analysis (PCA), Partial Least Squares (PLS), Linear Discriminant Analysis (LDA). The paper notes their limitations in handling nonlinear and high-dimensional datasets.

  • Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL):

    • Artificial Intelligence (AI): A broad field of computer science aimed at creating intelligent machines that can reason, learn, and act autonomously.
    • Machine Learning (ML): A subfield of AI where algorithms learn patterns from data and make predictions or decisions without being explicitly programmed. The machine "learns" from training data.
      • Traditional Machine Learning (TML): Refers to ML algorithms that typically require manual feature extraction (i.e., human expertise to define relevant characteristics in the data) and often perform well on smaller datasets.

      • Deep Learning (DL): A subfield of ML inspired by the structure and function of the human brain's neural networks. It uses deep neural networks with multiple layers (hence "deep") to automatically learn complex feature representations from raw data, often performing exceptionally well on large datasets. It minimizes the need for manual feature engineering. The relationship can be visualized as a hierarchy, with AI encompassing ML, and ML encompassing DL, as shown in the following figure (Figure 2 from the original paper).

        This figure is a schematic showing the nested relationship among artificial intelligence, machine learning, and deep learning, and their connection to nondestructive detection technologies, highlighting the central position of deep learning within machine learning and artificial intelligence.

        The above figure (Figure 2 from the original paper) illustrates the hierarchical relationship, showing Deep Learning as a subset of Machine Learning, which in turn is a subset of Artificial Intelligence.

  • Types of Machine Learning:

    • Supervised Learning: Algorithms learn from labeled data, where both the input and the desired output are provided. The goal is to learn a mapping from inputs to outputs for prediction or classification.
      • Support Vector Machine (SVM): A powerful algorithm for classification and regression. It finds an optimal hyperplane that best separates different classes in the feature space, maximizing the margin between them. It can handle nonlinear relationships using kernel functions (e.g., radial basis function (RBF)).

      • Logistic Regression (LR): A statistical model used for binary classification (predicting a probability of 0 or 1). It models the probability of a certain class or event existing.

      • K-Nearest Neighbor (KNN): A simple, non-parametric algorithm used for both classification and regression. It classifies a data point based on the majority class of its K nearest data points in the feature space.

      • Decision Tree (DT): A tree-like model where each internal node represents a test on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label or a predicted value. Random Forest (RF) and Gradient Boosting are ensemble methods built upon decision trees.

      • Naïve Bayes (NB): A probabilistic classifier based on Bayes' theorem with the "naïve" assumption of conditional independence between features.

      • Artificial Neural Networks (ANNs) / Multilayer Perceptrons (MLPs): Inspired by biological neural networks, ANNs consist of interconnected nodes (neurons) organized in layers (input, hidden, output). They learn by adjusting weights and biases through training (e.g., backpropagation) to map inputs to outputs, capable of modeling nonlinear systems.

        This figure is a schematic of three neural network structures: (a) the convolutional neural network pipeline, (b) the multilayer perceptron structure, and (c) the feature encoding and decoding flow of an autoencoder. The perceptron function is denoted Z = Wx + b.

        The above figure (Figure 4 from the original paper) illustrates different neural network architectures. Specifically, Figure 4(b) depicts a Multilayer Perceptron (MLP), also known as a typical Artificial Neural Network (ANN), with multiple hidden layers. The figure also shows the convolutional neural network structure (a) and the autoencoder process (c), which are deep learning architectures.

    • Unsupervised Learning: Algorithms learn from unlabeled data to discover hidden patterns, structures, or relationships within the data.
      • Dimensionality Reduction: Techniques to reduce the number of features (variables) in a dataset while retaining most of the important information.
        • Principal Component Analysis (PCA): A linear dimensionality reduction technique that transforms data into a new coordinate system, where the greatest variance by some projection comes to lie on the first principal component, the second greatest variance on the second principal component, and so on.
      • Clustering: Grouping similar data points into clusters without prior knowledge of their labels.
        • K-means Clustering: An iterative algorithm that partitions n observations into k clusters, where each observation belongs to the cluster with the nearest mean (centroid).
    • Semi-supervised Learning: Combines a small amount of labeled data with a large amount of unlabeled data during training. It is useful when obtaining labeled data is expensive or time-consuming.
  • Deep Learning Architectures:

    • Autoencoder (AE): A type of neural network used for unsupervised learning of efficient data encodings. It has an encoder that compresses the input into a latent-space representation and a decoder that reconstructs the input from this representation. It's often used for dimensionality reduction or feature learning.
    • Convolutional Neural Network (CNN): A specialized type of neural network particularly effective for processing grid-like data such as images. Key components include convolutional layers (which apply filters to detect features), pooling layers (which reduce spatial dimensions), and fully connected layers (for classification).
    • Restricted Boltzmann Machine (RBM): A generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. It consists of a visible layer and a hidden layer, with symmetric connections between them, but no connections within a layer. Deep Belief Networks (DBNs) are often built by stacking RBMs.

3.2. Previous Works

The paper acknowledges several relevant prior reviews but distinguishes its contribution by highlighting their specific focus:

  • Rehman et al. (2019): Focused on current applications of TML techniques in MV systems.

  • Saha and Manickavasagan (2021): Outlined applications of different machine learning techniques in hyperspectral image analysis.

  • Liu, Pu, and Sun (2021): Summarized applications of CNN models in food quality detection and discussed feature extraction methods based on 1D, 2D, and 3D models.

  • Sun, Zhang, and Mujumdar (2019): Reviewed AI technologies and their applications specifically in food drying.

  • D. Y. Wang et al. (2022): Reviewed AI-based methods for detection and prediction in sorting, drying, disinfecting, sterilizing, and freezing of berries.

    While these reviews cover important aspects, the current paper observes that they either concentrate on a single nondestructive technology or a single food process. This implies a gap in a systematic, comprehensive review that compares the performances of both TML and DL across various novel nondestructive technologies for enhancing efficiency and effectiveness in food quality and safety detection.

3.3. Technological Evolution

The evolution of technologies in this field has been driven by advancements in several interconnected areas:

  • Sensing Technology: Development of more sophisticated and affordable sensors for spectral imaging, acoustic analysis, machine vision, and e-noses has enabled richer and more diverse data collection from food products. This includes higher resolution cameras, more sensitive gas sensors, and more precise acoustic transducers.

  • Computer Science and Hardware: Significant progress in computational power (especially with Graphics Processing Units - GPUs), data storage, and efficient algorithms has made it feasible to process the massive amounts of information generated by nondestructive techniques. This includes the development of user-friendly software packages (e.g., Scikit-learn, TensorFlow, PyTorch).

  • Data Science and AI: The maturation of machine learning and deep learning algorithms has provided powerful tools for extracting meaningful patterns from complex, high-dimensional, and noisy datasets, moving beyond the limitations of traditional chemometric and statistical methods. The shift from manual feature engineering in TML to automatic feature learning in DL represents a significant leap.

    This paper's work fits into the current technological timeline by showcasing how the convergence of advanced sensing, powerful computing, and sophisticated AI algorithms is leading to the intellectualization of food detection equipment, enabling more accurate, rapid, and automated quality and safety assessment.

3.4. Differentiation Analysis

Compared to the main methods in related work, this paper's core differentiation and innovation lie in its comprehensive and comparative approach:

  • Holistic ML Coverage: Unlike reviews focusing solely on TML or DL, this paper introduces and contrasts both categories (TML and DL), providing a broader perspective on the available ML toolkit.

  • Multi-Technology Integration: While other reviews often target a single nondestructive technology (e.g., only MV or only HSI), this paper integrates acoustic analysis, machine vision, electronic nose, and spectral imaging within a single framework. This cross-technology perspective is crucial for understanding the diverse applications and challenges.

  • Systematic Comparison: By presenting Table 1 (Comparison of DL against TML) and discussing specific algorithm choices, the paper aims to provide insights for better selection of ML techniques for different nondestructive technologies and application scenarios, a level of comparative analysis that was less explicit in prior single-focus reviews.

  • Emphasis on "Device Intellectualization": The paper explicitly frames the combination of nondestructive techniques with advanced ML as a trend towards device intellectualization, going beyond mere automation to intelligent decision-making.

    In essence, this review fills a gap by offering a systematic, side-by-side analysis of various ML paradigms (TML and DL) applied across a spectrum of novel nondestructive food sensing technologies, providing a more integrated and comparative understanding of the field's current state and future potential.

4. Methodology

As a review paper, the "methodology" is not an experimental procedure but rather the structured approach taken by the authors to survey, categorize, and synthesize existing research. The paper's methodology involves a systematic organization of knowledge, an introduction to foundational concepts, a comparative analysis of techniques, and a summary of applications, followed by a discussion of challenges and future directions.

4.1. Principles

The core principle guiding this review is to provide a comprehensive and insightful overview of how machine learning (ML) techniques are being utilized to enhance nondestructive food quality and safety detection. It aims to clarify the landscape of ML algorithms (categorized into Traditional Machine Learning (TML) and Deep Learning (DL)) and demonstrate their synergistic application with various nondestructive technologies across different food industry needs. The theoretical basis is that ML can overcome the limitations of traditional data analysis by effectively processing complex, high-dimensional, and noisy data generated by advanced sensors, thereby enabling more efficient and accurate real-time food assessment.

4.2. Core Methodology In-depth (Layer by Layer)

The paper's methodology can be broken down into the following structural and conceptual layers:

4.2.1. Introduction to the Problem and Scope (Section 1)

The review begins by establishing the necessity for nondestructive and rapid detection technologies in the food industry due to the global demand for high-quality food and the shortcomings of traditional destructive methods. It then identifies the key challenge: the complexity, time-consuming nature, and skill requirement for analyzing data from nondestructive technologies, alongside the limitations of conventional chemometric or statistical methods when faced with noise, sample variability, and data complexity. Machine learning (ML) is introduced as a powerful solution for handling irrelevant information, extracting feature variables, and building calibration models, specifically highlighting its role in nondestructive technology and equipment intelligence. The scope of the review is then defined: an introduction and comparison of TML and DL, a summary of their applications with acoustic analysis, machine vision (MV), electronic nose (e-nose), and spectral imaging for food quality assessment, safety inspection, and processing control, concluding with a discussion of challenges and prospects.

4.2.2. Overview and Comparison of Machine Learning Techniques (Section 2)

This section forms the backbone of the review's conceptual methodology, systematically detailing ML approaches.

4.2.2.1. Traditional Machine Learning (TML) (Section 2.1)

The paper defines TML as algorithms that typically involve manual feature extraction on smaller sample sets, balancing validity with interpretability. It outlines the statistical foundations and categorizes TML into:

  • Supervised Learning (Section 2.1.1):

    • Support Vector Machine (SVM): Explained as a technique for classification and regression that constructs a maximal marginal hyperplane to separate data points (e.g., "red star" and "blue star" categories in Figure 3(a)). It addresses overfitting using kernel functions (linear, radial basis function (RBF), polynomial, sigmoid). Support Vector Regression (SVR) is mentioned as its regression counterpart.

      This figure is a schematic illustrating (a) support vector machine (SVM) classification, (b) k-nearest neighbor (k-NN) classification, (c) classification by a single decision tree, and (d) the ensemble of multiple decision trees in a random forest.

      The above figure (Figure 3(a) from the original paper) illustrates the core principle of a Support Vector Machine (SVM). It shows how an SVM identifies an optimal hyperplane (represented by the dashed line) that maximizes the separation margin between two classes of data points (labeled as "red stars" and "blue stars"). The points closest to the hyperplane are called support vectors.
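The kernel functions mentioned above let an SVM handle nonlinear relationships by measuring similarity in a transformed space without computing that space explicitly. A minimal sketch of the radial basis function (RBF) kernel; the gamma value and sample points are illustrative assumptions, not from the paper:

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - y||^2).

    Nearby points score close to 1; distant points decay toward 0,
    which lets a linear separator in kernel space act nonlinearly
    in the original feature space.
    """
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Identical points give the maximum similarity of 1.0.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # 1.0
# Similarity drops as the points move apart.
print(rbf_kernel([1.0, 2.0], [3.0, 4.0]) < 0.1)  # True
```

The gamma parameter controls how quickly similarity decays with distance, which is why its tuning matters when an SVM uses the RBF kernel.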

    • Logistic Regression (LR): Described as a method for binary classification (0 or 1) and estimating probabilities. It focuses on the odds (ratio of occurrence to non-occurrence probability). Multinomial Logistic Regression (MLR) is noted as an extension for more than two classes.

    • K-Nearest Neighbor (KNN): Explained as a classification method based on distance or similarity measurement. An unclassified point is assigned the class of the majority of its K nearest neighbors (illustrated in Figure 3(b)). The choice of K is crucial for accuracy.

      This figure is a schematic illustrating (a) support vector machine (SVM) classification, (b) k-nearest neighbor (k-NN) classification, (c) classification by a single decision tree, and (d) the ensemble of multiple decision trees in a random forest.

      The above figure (Figure 3(b) from the original paper) demonstrates the K-Nearest Neighbor (KNN) algorithm. It shows an unclassified data point (black star) and how its classification depends on the value of K. For K = 3, the black star is classified as "blue" due to two blue neighbors and one red neighbor. For K = 5, it is classified as "red" due to three red neighbors and two blue neighbors, illustrating the importance of the K parameter.
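The K-dependence illustrated above can be reproduced in a few lines. This is a minimal KNN sketch with hypothetical points arranged so that the majority vote flips between K = 3 and K = 5:

```python
from collections import Counter

def knn_classify(query, points, k):
    """Classify `query` by majority vote among its k nearest labeled points.
    `points` is a list of ((x, y), label) pairs."""
    by_distance = sorted(
        points,
        key=lambda p: (p[0][0] - query[0]) ** 2 + (p[0][1] - query[1]) ** 2,
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical points arranged so the decision flips with K, as in Figure 3(b).
points = [
    ((1.0, 0.0), "blue"),
    ((0.0, 1.2), "red"),
    ((1.5, 0.0), "blue"),
    ((0.0, 2.0), "red"),
    ((2.5, 0.0), "red"),
]
print(knn_classify((0.0, 0.0), points, k=3))  # blue (2 blue vs 1 red)
print(knn_classify((0.0, 0.0), points, k=5))  # red (3 red vs 2 blue)
```

The same query point receives different labels depending on K, which is exactly why the review calls the choice of K crucial for accuracy.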

    • Algorithms based on Decision Trees (DT) (Section 2.1.1.4): Described as tree-like structures where internal nodes are attribute judgments and leaf nodes are classification results or continuous values (Figure 3(d)). Random Forest (RF) builds multiple deep, potentially overfitted trees and combines their outputs. Gradient Boosting (e.g., XGBoost, CatBoost) builds shallow trees sequentially to reduce classification error.

      This figure is a schematic illustrating (a) support vector machine (SVM) classification, (b) k-nearest neighbor (k-NN) classification, (c) classification by a single decision tree, and (d) the ensemble of multiple decision trees in a random forest.

      The above figure (Figure 3(c) from the original paper) depicts a single Decision Tree, showing how it makes decisions based on attributes at each node to reach a classification at the leaf nodes. Figure 3(d) illustrates a Random Forest (RF), an ensemble method that combines multiple such decision trees to make a more robust prediction.
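The voting idea behind a Random Forest can be sketched with one-level trees (decision stumps). The features, thresholds, and class labels below are hypothetical, purely to show how individually weak trees combine into a more robust majority decision:

```python
from collections import Counter

def stump(feature_index, threshold, left, right):
    """A one-level decision tree: test a single attribute against a threshold."""
    return lambda x: left if x[feature_index] <= threshold else right

def forest_predict(trees, x):
    """Random-forest-style prediction: each tree votes, the majority wins."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Three illustrative stumps on hypothetical (firmness, sugar) features.
trees = [
    stump(0, 5.0, "unripe", "ripe"),   # firmness-based rule
    stump(1, 10.0, "unripe", "ripe"),  # sugar-based rule
    stump(0, 3.0, "unripe", "ripe"),   # a second firmness rule
]
print(forest_predict(trees, (6.0, 12.0)))  # ripe: all three trees agree
print(forest_predict(trees, (4.0, 8.0)))   # unripe: two of three trees vote unripe
```

A real Random Forest additionally trains each tree on a random subsample of data and features, but the aggregation step is the same majority vote shown here.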

    • Naïve Bayes (NB): A probabilistic classifier that predicts values based on Bayes' theorem and the assumption of independence between variables. It has Multinomial, Poisson, and Bernoulli models. Feature selection is critical due to the independence assumption.

    • Artificial Neural Networks (ANNs): Presented as biologically inspired models for prediction and classification. They consist of artificial neurons in layers (Figure 4(b)), using transfer functions (e.g., sigmoid, linear, hyperbolic tangent, logistic). Backpropagation is the typical training method to minimize the loss function. ANNs excel at nonlinear systems but require large datasets and careful hyperparameter tuning.

      This figure is a schematic of three neural network structures: (a) the convolutional neural network pipeline, (b) the multilayer perceptron structure, and (c) the feature encoding and decoding flow of an autoencoder. The perceptron function is denoted Z = Wx + b.

      The above figure (Figure 4(b) from the original paper) illustrates the typical structure of an Artificial Neural Network (ANN) or Multilayer Perceptron (MLP). It shows an input layer, multiple hidden layers, and an output layer, with neurons (nodes) in each layer connected to neurons in adjacent layers. The perceptron function Z = Wx + b represents the weighted sum of inputs plus bias before activation.
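The forward pass described above (weighted sum Z = Wx + b followed by a transfer function) can be sketched directly. The weights below are arbitrary placeholders; in practice backpropagation would learn them by minimizing the loss function:

```python
import math

def neuron(weights, bias, inputs):
    """One artificial neuron: Z = Wx + b, then a sigmoid transfer function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(weight_matrix, biases, inputs):
    """A fully connected layer is just one neuron per output unit."""
    return [neuron(w, b, inputs) for w, b in zip(weight_matrix, biases)]

# Hypothetical 2-input -> 2-hidden -> 1-output network with placeholder weights.
hidden = layer([[0.5, -0.2], [0.1, 0.4]], [0.0, -0.1], [1.0, 2.0])
output = layer([[1.0, -1.0]], [0.2], hidden)
print(len(hidden), len(output))  # 2 1
print(0.0 < output[0] < 1.0)     # sigmoid keeps outputs in (0, 1) -> True
```

Stacking such layers and swapping the transfer function (sigmoid, hyperbolic tangent, linear) yields the MLP architectures shown in Figure 4(b).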

  • Unsupervised Learning (Section 2.1.2):

    • Dimensionality Reduction: Techniques to simplify messy or high-dimensional datasets.
      • Principal Component Analysis (PCA): A linear projection method for data decomposition and visualization that maximizes variance in lower dimensions. It identifies dominant patterns and can reduce data by selecting components explaining a high cumulative variance (e.g., 90%).
      • Other methods mentioned: Minimum Noise Fraction (MNF) and Independent Component Analysis (ICA). These are noted for their linear assumption, which may not suit inherent nonlinear structures in nondestructive data.
    • Clustering: Groups unlabeled data items into similar groups based on mathematical similarity.
      • K-means Clustering: A common algorithm that partitions data into K clusters by iteratively assigning points to the nearest cluster centroid and updating centroids. The number of clusters K is a crucial parameter.
  • Semi-supervised Learning (Section 2.1.3): Addresses the challenge of limited labeled data by combining small labeled sets with large unlabeled data. It generally includes generative models, self-learning models, co-training models, transductive SVM (TSVM) learning models, and graph-based learning models. Graph-based learning is highlighted for its better classification accuracy but higher computational complexity.
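The K-means procedure described in Section 2.1.2 (assign each point to the nearest centroid, then move each centroid to the mean of its cluster) can be sketched in one dimension. The data values and initial centroids below are illustrative:

```python
def kmeans_1d(values, centroids, iterations=10):
    """Minimal 1-D K-means: assign each value to the nearest centroid,
    then move each centroid to the mean of its assigned values."""
    for _ in range(iterations):
        clusters = {c: [] for c in centroids}
        for v in values:
            nearest = min(centroids, key=lambda c: abs(c - v))
            clusters[nearest].append(v)
        # Empty clusters keep their old centroid.
        centroids = [sum(vs) / len(vs) if vs else c
                     for c, vs in clusters.items()]
    return sorted(centroids)

# Two visibly separated groups; the initial centroids are an arbitrary guess.
data = [1.0, 1.2, 0.8, 10.0, 10.4, 9.6]
print(kmeans_1d(data, centroids=[0.0, 5.0]))  # converges to ~[1.0, 10.0]
```

The centroids settle at the means of the two natural groups; with a poorly chosen K the same procedure would split or merge groups, which is why K is the crucial parameter.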

4.2.2.2. Deep Learning (DL) (Section 2.2)

DL is defined as representational learning using deep ANNs to extract features from raw data automatically for detection, classification, or regression.

  • Auto-encoder (AE): An unsupervised neural network using backpropagation. It acts as a feature extractor, mapping input to a feature vector (encoding) and reconstructing the input (decoding) (Figure 4(c)). Deep AEs stack layers for encoding and decoding. Various types exist (de-noising, sparse, variational, contractive). AE offers more flexibility than PCA by allowing linear and nonlinear representations.

    This figure is a schematic of three neural network structures: (a) the convolutional neural network pipeline, (b) the multilayer perceptron structure, and (c) the feature encoding and decoding flow of an autoencoder. The perceptron function is denoted Z = Wx + b.

    The above figure (Figure 4(c) from the original paper) illustrates the architecture of an Autoencoder (AE). It consists of an encoder that compresses the input data into a lower-dimensional latent-space representation and a decoder that attempts to reconstruct the original input from this compressed representation. The goal is to learn an efficient encoding of the input.
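The encoder-decoder shapes described above can be sketched structurally. This uses untrained random weights and illustrative layer sizes (6 inputs, 2 latent units); training would tune the weights so that decode(encode(x)) reconstructs x as closely as possible:

```python
import random

def dense(matrix, vector):
    """Multiply a weight matrix (one row per output unit) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# A linear autoencoder sketch: 6 inputs -> 2 latent units -> 6 outputs.
# Weights are random placeholders, not trained values.
random.seed(0)
encoder = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(2)]
decoder = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(6)]

x = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2]
latent = dense(encoder, x)              # compressed representation (bottleneck)
reconstruction = dense(decoder, latent)  # attempt to rebuild the input
print(len(latent), len(reconstruction))  # 2 6
```

The 2-unit bottleneck is what makes the AE a feature extractor: unlike PCA, adding nonlinear transfer functions between layers lets it learn nonlinear encodings.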

  • Convolutional Neural Network (CNN): An ML model that has flourished since 1998 and is highly successful in image applications. Its architecture includes convolutional layers (which extract feature representations through 2D convolution operations and nonlinear transfer functions), pooling layers (to reduce dimensions and parameters), and fully connected layers (for final prediction) (Figure 4(a)). CNNs require large datasets but can leverage transfer learning with pre-trained models (e.g., ResNet-34, VGG-16/19, AlexNet, MobileNetv2) to improve performance on smaller datasets.

    This figure is a schematic of three neural network structures: (a) the convolutional neural network pipeline, (b) the multilayer perceptron structure, and (c) the feature encoding and decoding flow of an autoencoder. The perceptron function is denoted Z = Wx + b.

    The above figure (Figure 4(a) from the original paper) presents the typical architecture of a Convolutional Neural Network (CNN). It shows the sequential arrangement of convolutional layers (which extract features using filters), pooling layers (which reduce the dimensionality of feature maps), and fully connected layers (which perform classification based on the learned features).
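The convolution and pooling operations described above can be worked by hand. The toy image and the vertical-edge filter below are illustrative; a real CNN learns its filters during training:

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (no padding): slide the filter over the image
    and take the element-wise product sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(feature_map, size=2):
    """Max pooling: keep the largest response in each size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

# A 5x5 toy image: bright left half, dark right half.
image = [[1, 1, 1, 0, 0]] * 5
edge_filter = [[1, 0, -1]] * 3  # responds to vertical brightness changes
features = conv2d(image, edge_filter)  # 3x3 map, strongest along the edge
pooled = max_pool(features)            # 1x1 map after 2x2 pooling
print(features[0])  # [0, 3, 3]: the filter fires where brightness changes
print(pooled)       # [[3]]
```

Convolution detects where the pattern occurs, and pooling keeps the strongest response while shrinking the map, which is how a CNN reduces dimensions and parameters layer by layer.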

  • Restricted Boltzmann Machine (RBM): An undirected graphical model representing hidden and visible layers with symmetric connections but no intra-layer connections. Deep Belief Networks (DBNs) are multi-layer architectures incorporating RBMs.

4.2.2.3. Comparison (Section 2.3)

The paper compares DL and TML (summarized in Table 1, presented in the Results section) across several dimensions: data dependency (DL needs big data; TML handles small or medium datasets), hardware dependency (DL relies on GPUs; TML runs on CPUs), feature engineering (DL learns features automatically; TML requires hand-crafted features), training time (DL longer; TML shorter), interpretability (DL is a black box; TML rests on an explicit mathematical foundation), and task handling (DL can complete multiple tasks simultaneously; TML proceeds step by step).

4.2.3. Nondestructive Analytical Techniques (Section 3)

This section describes the principles of the four main nondestructive technologies and how ML integrates with them.

  • Acoustic Techniques (Section 3.1): Based on extracting signal characteristics (propagation velocity, resonance frequencies, damping ratios) after sound contact or passage through food, relating to mechanical and physical properties. ML (PCA, NB, LR, RF, ANNs, SVM) is used to build food quality inspection models from these acoustic parameters.
  • Electronic Nose (E-nose) (Section 3.2): Mimics human olfaction to detect volatiles using a gas sensor array. ML is crucial for pattern recognition from the time-varying intensities of electric signals, performing feature extraction, dimensionality reduction (PCA, LDA), and classification/prediction (PLSR, SVM, RF, ANNs).
  • Machine Vision (MV) (Section 3.3): A computer vision system (CVS) using RGB cameras to capture color and texture information. ML (TML and DL) is applied to image processing and analysis to extract features (color, texture, shape, size, surface defects) and make automated decisions for quality inspection and identification.
  • Spectral Imaging (Section 3.4): Combines spectroscopy and imaging (e.g., HSI) to acquire spatial and spectral data (hypercube). ML plays an essential role in data analysis for quality determination, particularly in wavelength selection and feature extraction to reduce redundancy. Calibration models are built using TML (PLSR, SVM, RF, ANNs) and DL (CNN) to determine chemical compositions.
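As a concrete illustration of the spectral-analysis workflow described above, the following scikit-learn sketch chains SNV preprocessing, PCA compression, and an RBF-SVM classifier; the two-class synthetic "spectra" (peak position, baseline range, noise level) are assumptions made purely for illustration, not data from any reviewed study.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC

rng = np.random.default_rng(1)

def snv(X):
    """Standard normal variate: center and scale each spectrum individually."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# Two synthetic classes of "spectra" that differ by a small absorption peak,
# with random per-sample baselines mimicking scatter effects.
wavelengths = np.linspace(0, 1, 100)
peak = np.exp(-((wavelengths - 0.6) ** 2) / 0.002)
X, y = [], []
for label in (0, 1):
    for _ in range(80):
        baseline = rng.uniform(0.5, 1.5)
        X.append(baseline * (1 + 0.3 * label * peak) + 0.02 * rng.normal(size=100))
        y.append(label)
X, y = np.array(X), np.array(y)

# Pipeline: scatter correction -> spectral compression -> nonlinear classifier.
pipe = make_pipeline(FunctionTransformer(snv), PCA(n_components=5), SVC(kernel="rbf"))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)
pipe.fit(X_tr, y_tr)
print("test accuracy:", pipe.score(X_te, y_te))
```

The SNV step removes per-sample baseline and scatter effects so the classifier sees only spectral shape, which is why it precedes PCA in the pipeline.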

4.2.4. Recent Applications in the Food Industry (Section 4)

This section surveys practical applications, categorizing them by purpose and food type, and correlating them with specific nondestructive technologies and ML algorithms. This section uses Table 2 (presented in the Results section) to summarize findings.

  • Recognition and Classification (Section 4.1): Examples include food adulteration (e.g., sorghum, minced beef) and food classification (e.g., fresh/frozen pork, Chinese liquors, bulk raisins), showing how ML improves accuracy, sometimes through feature-level or decision-level fusion of multisensor data.
  • Quality Detection (Section 4.2):
    • Fruits: Ripeness classification, defect discrimination, prediction of soluble solids content (SSC) and firmness (e.g., apples, loquat, kiwifruit, blueberries) using HSI, MV, e-nose, and acoustic methods combined with TML (PLSR, SVM, RF, ANNs) and DL (CNN, AE).
    • Meat and Aquatic Products: Prediction of DHA/EPA, TVB-N (total volatile basic nitrogen), WHC (water-holding capacity), and pH (e.g., salmon, crucian carp, cod); bone residue detection (e.g., salmon); and texture detection (fish meat). These studies utilize HSI, MV, and ultrasound with TML (PLSR, KNN, SVM, BP-ANN) and DL (CNN).
    • Other Products: Quality assessment for nuts, tea, seeds, edible oil, ginger slices, mushroom using various combinations of nondestructive tech and ML.
  • Safety Detection (Section 4.3):
    • Chemical Contaminations: Detection of pesticide residues (e.g., apples, mulberry fruit) and heavy metals (e.g., lettuce) using e-nose, LIBS, HSI, with TML (PCA, LDA, SVM, PLSR) and DL (WT-SCAE-SVR).
    • Microbial Contaminations: Identification of fungal growth and pathogenic bacteria (e.g., peanuts, rice kernels, Agaricus bisporus) using HSI, e-nose, MV, with TML (KNN, RF, RBF-SVM, BP-ANN, PLSR, SVM, LVQ) and DL (CNN).

4.2.5. Challenges and Future Work (Section 5)

This section outlines the current limitations and proposes future research directions, serving as a forward-looking aspect of the methodology. It focuses on:

  • Lack of labeled data and the potential of transfer learning.
  • Need for benchmark and open datasets and Standard Operating Procedures (SOPs).
  • Addressing environmental effects and variability through lifelong learning (LL) and reinforcement learning.
  • Optimizing DL for large datasets and computational demands, suggesting lightweight models (e.g., SqueezeNet, ShuffleNet, MobileNet) for real-time, on-site detection.
  • Expanding DL applications to e-nose and acoustic analysis, including automatic drift compensation.

4.2.6. Conclusions (Section 6)

A summary reiterating the potential of ML with nondestructive techniques and emphasizing the promise of DL for real-time applications.

5. Experimental Setup

As a review paper, this article does not present its own experimental setup in the traditional sense (i.e., no specific datasets, metrics, or baselines were generated by the authors of this paper for new experiments). Instead, it synthesizes and reports the experimental setups and results from numerous other research papers. Therefore, this section will describe the types of datasets, evaluation metrics, and baselines commonly reported in the reviewed literature, as presented in the paper.

5.1. Datasets

The paper discusses a wide array of datasets based on the specific food products and nondestructive technologies used in the reviewed studies. These datasets vary significantly in their characteristics, scale, and domain.

  • Source and Characteristics:

    • Food Products: The datasets originated from diverse food items, including:
      • Grains: Sorghum, maize kernels, wheat kernels, rice kernels, barley, peanuts, seeds.
      • Meats & Aquatic Products: Minced beef, pork (fresh, frozen-thawed), salmon fillet, crucian carp fillets, cod, fish meat.
      • Fruits & Vegetables: Gooseberry, loquat, apples, kiwifruit, yellow peach, blueberries, mulberry fruit, lettuce, potatoes, dry black goji berries, Agaricus bisporus (mushroom).
      • Beverages & Other: Chinese liquors, bulk raisin, black tea, tea, olive oil.
    • Data Types based on Nondestructive Technologies:
      • Machine Vision (MV): Typically RGB images containing color and texture information. For example, RGB images of apples for defect detection, or images of bulk raisins for classification.
      • Hyperspectral Imaging (HSI): Hypercubes (3D data) containing both spatial and spectral information across hundreds of narrow wavelength bands. Examples include hyperspectral images of sorghum for adulteration, pork for quality, or blueberries for SSC and firmness.
      • Electronic Nose (E-nose): Time-varying intensities of electric signals from sensor arrays, forming vectors or odor fingerprints. Used for detecting mycotoxin contamination in maize, kiwifruit ripeness, or pesticide residues on apples.
      • Acoustic Analysis / Ultrasound: Response signals (e.g., acoustic emissions, ultrasonic velocity, attenuation coefficient, acoustic impedance) which can be analyzed in the time domain or frequency domain. Used for mealiness detection in apples or fish meat texture.
      • Laser-Induced Breakdown Spectroscopy (LIBS): Spectral data (intensity vs. wavelength) from plasma generated by a laser. Used for thiophanate-methyl residue detection on mulberry fruit.
  • Scale and Domain: The paper notes that imaging technologies like MV and HSI generate massive amounts of information. DL methods, in particular, are highlighted as data-hungry, performing well with large datasets, while TML can work with small or medium datasets. The domain is consistently food quality and safety detection.

  • Example Data Sample (Conceptual):

    • For Machine Vision, a data sample might be a JPEG image of an apple, from which features like average red color intensity, circularity of shape, or texture variance could be extracted.

    • For Hyperspectral Imaging, a data sample would be a hypercube for a specific pixel or region on a food item, providing a spectrum (light intensity across wavelengths) that reveals its chemical composition.

    • For E-nose, a data sample is a vector of sensor responses at a given time point, representing an odor fingerprint.

    • For Acoustic Analysis, a data sample could be a frequency spectrum of a vibration signal from a fruit, indicating its firmness.

      The choice of these datasets in the reviewed literature is driven by the specific food quality attribute (e.g., ripeness, defect, adulteration, chemical composition) or safety concern (e.g., pesticide, microbial contamination) being investigated, and the suitability of the chosen nondestructive technology to acquire relevant information.
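For instance, the hypercube structure of HSI data can be handled directly as a 3-D NumPy array; the dimensions below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative HSI hypercube: 64 x 64 spatial pixels, 150 spectral bands.
hypercube = rng.random((64, 64, 150))

# A data sample for one pixel is its spectrum across all bands.
pixel_spectrum = hypercube[10, 20, :]                  # shape: (150,)

# A region-of-interest sample: the mean spectrum over a 5 x 5 patch,
# a common way to reduce noise before building a calibration model.
roi = hypercube[10:15, 20:25, :]                       # shape: (5, 5, 150)
roi_mean_spectrum = roi.reshape(-1, 150).mean(axis=0)  # shape: (150,)

print(pixel_spectrum.shape, roi_mean_spectrum.shape)
```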

5.2. Evaluation Metrics

The paper, through its summary of applications in Table 2, refers to several common evaluation metrics used in classification and regression tasks within machine learning.

  1. Accuracy:

    • Conceptual Definition: Measures the proportion of correctly predicted instances (both true positives and true negatives) out of the total number of instances. It is a general measure of correctness for classification models.
    • Mathematical Formula: $ \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} = \frac{TP + TN}{TP + TN + FP + FN} $
    • Symbol Explanation:
      • TP: True Positives (correctly predicted positive instances).
      • TN: True Negatives (correctly predicted negative instances).
      • FP: False Positives (incorrectly predicted positive instances, type I error).
      • FN: False Negatives (incorrectly predicted negative instances, type II error).
  2. $R^2$ (Coefficient of Determination):

    • Conceptual Definition: A statistical measure that represents the proportion of the variance in the dependent variable that can be predicted from the independent variables. It indicates how well the model explains the variability of the response data around its mean. A higher $R^2$ value generally indicates a better fit.
    • Mathematical Formula: $ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} $
    • Symbol Explanation:
      • $y_i$: The actual value of the dependent variable for the $i$-th observation.
      • $\hat{y}_i$: The predicted value of the dependent variable for the $i$-th observation.
      • $\bar{y}$: The mean of the actual values of the dependent variable.
      • $n$: The total number of observations.
  3. RMSEP (Root Mean Squared Error of Prediction):

    • Conceptual Definition: A measure of the average magnitude of the errors between predicted values and actual values. It's the square root of the average of the squared differences between prediction and actual observation. It gives a relatively high weight to large errors. Lower RMSEP indicates better predictive performance.
    • Mathematical Formula: $ \text{RMSEP} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} $
    • Symbol Explanation:
      • $y_i$: The actual value for the $i$-th observation.
      • $\hat{y}_i$: The predicted value for the $i$-th observation.
      • $n$: The total number of observations.
  4. RPD (Ratio of Performance to Deviation):

    • Conceptual Definition: Used in spectroscopy, RPD is a measure of the predictive performance of a calibration model, indicating the ratio of the standard deviation of the reference values to the standard error of prediction. A higher RPD value (typically > 2.0) indicates a robust and reliable model for prediction.
    • Mathematical Formula: $ \text{RPD} = \frac{SD}{\text{RMSEP}} $
    • Symbol Explanation:
      • SD: Standard Deviation of the reference (actual) values.
      • RMSEP: Root Mean Squared Error of Prediction.
  5. Recall (Sensitivity or True Positive Rate):

    • Conceptual Definition: Measures the proportion of actual positive instances that were correctly identified by the model. It's important when the cost of false negatives is high (e.g., missing a defect).
    • Mathematical Formula: $ \text{Recall} = \frac{TP}{TP + FN} $
    • Symbol Explanation:
      • TP: True Positives.
      • FN: False Negatives.
  6. Specificity (True Negative Rate):

    • Conceptual Definition: Measures the proportion of actual negative instances that were correctly identified by the model. It's important when the cost of false positives is high (e.g., incorrectly identifying a good product as defective).
    • Mathematical Formula: $ \text{Specificity} = \frac{TN}{TN + FP} $
    • Symbol Explanation:
      • TN: True Negatives.
      • FP: False Positives.
  7. F1-score:

    • Conceptual Definition: The harmonic mean of Precision and Recall. It is a balanced metric that considers both false positives and false negatives, especially useful when class distribution is uneven.
      • Precision = $\frac{TP}{TP + FP}$
    • Mathematical Formula: $ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} $
    • Symbol Explanation:
      • Precision: Proportion of positive identifications that were actually correct.
      • Recall: Proportion of actual positives that were correctly identified.
  8. PR-AUC (Area Under Precision-Recall Curve):

    • Conceptual Definition: The area under the Precision-Recall Curve, which plots Precision on the y-axis and Recall on the x-axis for different thresholds. It is particularly useful for evaluating imbalanced datasets where the positive class is rare, as it focuses on the performance on the positive class. A higher PR-AUC indicates better performance.
    • Mathematical Formula: There is no single explicit formula for PR-AUC as it's calculated by integrating the area under the curve formed by various precision and recall points. It is typically approximated using numerical integration methods.
  9. Correlation Coefficient ($R$):

    • Conceptual Definition: A statistical measure that expresses the extent to which two variables are linearly related. It ranges from -1 to +1. A value of +1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship.
    • Mathematical Formula (Pearson correlation coefficient): $ R = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} $
    • Symbol Explanation:
      • $n$: Number of data points.
      • $x$: Values of the first variable (e.g., predicted values).
      • $y$: Values of the second variable (e.g., actual values).
      • $\sum xy$: Sum of the products of paired $x$ and $y$ values.
      • $\sum x$: Sum of $x$ values.
      • $\sum y$: Sum of $y$ values.
      • $\sum x^2$: Sum of squared $x$ values.
      • $\sum y^2$: Sum of squared $y$ values.
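The formulas above map directly to code. The sketch below implements the regression metrics ($R^2$, RMSEP, RPD) and the binary classification metrics (accuracy, recall, specificity, F1) in NumPy; labels are assumed to be coded 0/1, and the example numbers are invented for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """R^2, RMSEP, and RPD (RPD = sample SD of reference values / RMSEP)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rpd = y_true.std(ddof=1) / rmsep
    return r2, rmsep, rpd

def classification_metrics(y_true, y_pred):
    """Accuracy, recall, specificity, and F1 from a binary confusion matrix."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, specificity, f1

# Invented example values:
print(regression_metrics([3, 5, 7, 9], [2.8, 5.2, 7.1, 8.5]))
print(classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]))
```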

5.3. Baselines

The reviewed papers commonly compare the proposed ML methods against:

  • Traditional Chemometric/Statistical Methods: PCA, PLS, LDA, which are often linear and require more manual feature engineering. For instance, FDPCA was compared against PCA and DPCA.
  • Other TML Algorithms: Different TML algorithms are often benchmarked against each other (e.g., SVM vs. KNN vs. RF vs. ANNs vs. NB vs. LR) to find the best performing model for a specific task.
  • Deep Learning Models: CNNs are often compared against TML classifiers (e.g., KNN, RF, RBF-SVM, BP-ANN) to demonstrate the superiority of DL in feature extraction and accuracy for image-based tasks. Different CNN architectures (e.g., AlexNet, VGG-16/19, ResNet) are also compared.
  • Ablation Studies/Feature Engineering Variants: Comparisons are also made regarding the impact of feature selection methods (e.g., CARS, SPA, UVE, GA) or preprocessing methods on model performance, often implicitly serving as baselines for the full proposed method.
  • Human Inspection/Conventional Methods: Although not explicitly shown in tables, the underlying motivation for nondestructive ML techniques is to surpass the limitations of destructive, labor-intensive, or subjective human inspection methods.
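To make the benchmarking pattern concrete, here is a minimal scikit-learn sketch comparing several TML baselines by cross-validation; the dataset is synthetic and the model settings are illustrative defaults, not those of any reviewed study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for a table of extracted food-quality features.
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           random_state=0)

models = {
    "SVM (RBF)": SVC(kernel="rbf"),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Benchmark each model with 5-fold cross-validated accuracy.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```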

6. Results & Analysis

The paper synthesizes a vast array of results from various studies, demonstrating the effectiveness of combining machine learning (ML) techniques with nondestructive technologies for food quality and safety detection. The results generally indicate superior performance of ML-enhanced systems compared to traditional methods, with Deep Learning (DL) often outperforming Traditional Machine Learning (TML) in complex, data-rich scenarios, especially for vision-based tasks.

6.1. Core Results Analysis

The paper highlights how ML algorithms significantly improve the accuracy and robustness of nondestructive detection.

  • Food Adulteration, Recognition, and Classification:

    • Studies show high accuracy in detecting food adulteration (e.g., sorghum adulteration using HSI+PLS-DA with 91% accuracy; minced beef recognition using HSI+PLSR with $R^2 = 0.97$).
    • For classification tasks, the fusion of multiple sensor data (e.g., E-nose and MV for tea quality) can boost accuracy (from 78% to 92% for multi-sensor data with SVM).
    • DL models, particularly CNNs with transfer learning, showed enhanced performance in barley classification (from 88.15% to 92.23% accuracy), demonstrating their ability to extract relevant spatial and spectral information from hyperspectral images.
  • Food Quality Detection (Fruits, Meat, Aquatic, Other Products):

    • Fruits: ML models successfully predict ripeness, detect defects, and quantify chemical compositions (SSC, firmness, total phenolics/flavonoids/anthocyanins). For instance, PLSR models with HSI for blueberries achieved $R_p^2 = 0.75$ for SSC and $R_p^2 = 0.876$ for firmness. CNNs and deep AEs used as feature extractors with HSI for black goji berries achieved a high $R_p^2$ (0.897 for total anthocyanins). CNNs also showed high accuracy (92%) for defective apple detection using MV.
    • Meat and Aquatic Products: HSI combined with PLSR accurately predicted liquid loss in cod ($R^2 = 0.97$, RMSEP = 0.58%). DL models such as AlexNet and VGG achieved F1-scores of 0.75-0.87 for bone residue detection in salmon, though with some robustness concerns for industrial deployment.
    • General Performance: TML algorithms (SVM, BP-ANN, KNN) are powerful tools for quality classification and nutrient prediction. The paper notes that SVM is a good choice for nondestructive technologies due to its ability to handle nonlinear problems and avoid over-learning.
  • Food Safety Detection (Chemical and Microbial Contaminations):

    • Chemical Contaminations: E-nose with TML algorithms (PCA, LDA, SVM) could detect pesticide residues on apples with 94% overall accuracy. LIBS and HSI with PLSR detected thiophanate-methyl residue on mulberry fruit ($R = 0.921$). Advanced DL methods (e.g., WT-SCAE-SVR) demonstrated high $R^2$ values (0.9319 for Cd, 0.9418 for Pb) for heavy metal detection on lettuce.

    • Microbial Contaminations: HSI combined with CNN achieved high recognition rates (over 96% pixel-level, over 90% kernel-level) for aflatoxin detection in peanuts, outperforming traditional TML models. E-nose with BP-ANN accurately classified and predicted Aspergillus spp. contamination in rice (96.4% accuracy).

      Overall, the results consistently support the premise that integrating ML, particularly DL for complex imaging data, significantly enhances the capabilities of nondestructive technologies. However, the paper also implicitly highlights that the choice of ML algorithm (TML vs. DL) and specific variant depends heavily on the data characteristics, computational resources, and specific application requirements.

6.2. Data Presentation (Tables)

The following are the results from Table 1 and Table 2 of the original paper:

The following are the results from Table 1 of the original paper:

| Features | Deep learning (DL) | Traditional machine learning (TML) |
| --- | --- | --- |
| Data dependency | Requires big data for training | Works with small or medium datasets |
| Hardware dependency | Depends on graphics processing units (GPUs) and large storage to accelerate training | Runs on low-end processors such as central processing units (CPUs) |
| Feature engineering | Learns an end-to-end mapping from input to output automatically | Requires hand-crafted features, human expertise, and complicated task-specific design |
| Training time | Takes much time for optimization | Takes a relatively short time |
| Logical explanatory | A black-box method; the hyperparameters of the complex network are hard to interpret | Rests on a certain mathematical theoretical foundation |
| Multiple tasks | Completed simultaneously | Completed step by step |
| Others | Higher accuracy in vision-based tasks such as object detection, classification, and segmentation; premier flexibility and generalization ability | Higher accuracy in a simple task with a small dataset; applications in a specific domain |

The above table (Table 1 from the original paper) provides a comparison between Deep Learning (DL) and Traditional Machine Learning (TML) across several key features, highlighting their differences in data requirements, hardware needs, feature engineering processes, training time, interpretability, and task handling capabilities.

The following are the results from Table 2 of the original paper:

Applications of nondestructive qualitative analysis in food.
| Product | Purpose | Tools | Data processing | Feature selection | ML classifiers | Best results | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| **Food adulteration** | | | | | | | |
| Sorghum | Adulteration recognition | HSI | MSC | PCA | PLS-DA | Accuracy > 90% | Bai et al. (2020) |
| Minced beef | Adulteration recognition | HSI | MSC, SNV, 1st and 2nd derivatives, and SG | Stepwise regression | PLSR | R² = 0.97, RMSEP = 2.61%, RPD = 5.86 | Kamruzzaman, Makino, and Oshita (2016) |
| Pork | Adulteration recognition | E-nose | DWT with several mother wavelets; box plot; standard deviation, mean, kurtosis, skewness | PCA, time-domain features | SVM, ANNs, LDA, KNN, NB | SVM: Accuracy = 98.10% | Sarno et al. (2020) |
| **Food classification** | | | | | | | |
| Thawed pork | Fresh and frozen pork classification | HSI | PCA | HS, GLCM, GLGCM, SPA | PNN | Accuracy = 94.14% | Pu et al. (2015) |
| Meat product | Category classification | HSI | Graph-based post-processing | UVE | 3D-CNN, PLS-DA | 3D-CNN: Accuracy = 97.1% | Al-Sarayreh et al. (2020) |
| Chinese liquors | Category classification | E-nose | Data correction, average value, normalization | FDPCA, DPCA, PCA | SVM, KNN | Accuracy = 98.78% | X. H. Wu et al. (2019) |
| Bulk raisin | Category classification | MV | Image thresholding | GLCM, GLRM, LBP, PCA | SVM, LDA | Accuracy = 85.55% | Khojastehnazhand and Ramezani (2020) |
| **Fruit quality detection** | | | | | | | |
| Gooseberry | Ripeness classification | MV | PCA | Color feature extraction | ANNs, DT, SVM, KNN | Accuracy = 93.02% | Castro et al. (2019) |
| Loquat | Defect discrimination | HSI | SNV, SG, 1st and 2nd derivatives, gap segment derivative | 200 random models, Monte Carlo method | PLS-DA, RF, XGBoost | Accuracy > 95.9% | Munera et al. (2021) |
| Apples | Defect detection | MV | Otsu thresholding | CNN, RGB color, GLCM | CNN, SVM | Recall = 91%, Specificity = 93%, Accuracy = 92% | Fan et al. (2020) |
| Kiwifruit | Ripeness prediction | E-nose | Time-domain features | | PLSR, SVM, RF | Accuracy > 99.4%, R² > 0.9143 | Du et al. (2019) |
| Apple | Mealiness detection | Acoustic | STFT | CNN | AlexNet, VGG | Accuracy = 91.11% | Lashgari, Imanmehr, and Tavakoli (2020) |
| **Aquatic and meat products quality detection** | | | | | | | |
| Salmon fillet | DHA and EPA prediction | HSI | Image thresholding | GA, PNN | PLSR, MLR, PCR | DHA: R² = 0.829, RMSEP = 0.585; EPA: R² = 0.843, RMSEP = 0.195 | Cheng et al. (2019) |
| Crucian carp fillets | TVB-N, WHC, and pH prediction | HSI | Image thresholding | SPA, K-means | PLSR | TVB-N: R² = 0.79; WHC: R² = 0.71; pH: R² = 0.88 | Wang et al. (2019) |
| Cod | Liquid loss prediction | HSI | PCA, 1st derivative | GA | PLSR, KNN | R² = 0.88, RMSEP = 0.62% | Anderssen et al. (2020) |
| Salmon | Bone residues detection | MV | Data augmentation, JPEG lossy compression | Faster-RCNN | AlexNet, VGG | F1-score = 0.87, PR-AUC = 0.76 | Xie et al. (2021) |
| Fish meat | Fat content and texture detection | Ultrasound | Low-pass filter, pre-emphasis, STFT | SOM | RBF network | R² = 0.89 (p < 0.05) | Tokunaga et al. (2020) |
| **Other products quality detection** | | | | | | | |
| Nuts | | HSI | Normalization | PCA, CNN | CNN-SVM | Accuracy = | Han et al. (2021) |
| Black tea | Quality grades classification | MV, MNIR | SNV, PCA | Color and texture feature extraction | PCA-SVM | Accuracy = 94.29% | Li et al. (2021) |
| Tea | Quality identification | MV, E-nose | Color and texture feature extraction | | KNN, SVM, MLR | Accuracy = 100% | Xu, Wang, and Gu (2019) |
| Seed | Viability prediction | HSI | PCA, DWT | CNN | CNN, SVM | Accuracy > 90% | Ma, Tsuchikawa, and Inagaki (2020) |
| **Chemical contamination** | | | | | | | |
| Mulberry fruit | Thiophanate-methyl residue detection | LIBS, HSI | SNV, PCA | CARS | PLSR, PCA | Correlation coefficient = 0.921 | D. Wu et al. (2019) |
| Chlorella pyrenoidosa | Pesticide residues detection | HSI | WT, SG smoothing, and baseline correction | SPA | PLS-DA, LDA | Accuracy = 90% | Shao et al. (2016) |
| Apples | Pesticide residue detection | E-nose | Radar map and bar chart | Heat map, PCA, LDA | PCA, LDA, SVM | Accuracy = 99.37% | Tang et al. (2021) |
| **Microbial contamination** | | | | | | | |
| Agaricus bisporus | Fungal growth detection | HSI, NIR, MIR, E-nose | MSC, SNV, 1st and 2nd derivatives, normalization | PCA | PLSR, PLS-DA | R² = 0.670–0.821, Accuracy = 99% | Wang et al. (2020) |
| Peanuts | Mouldy identification | HSI | Image thresholding | | KNN, RF, RBF-SVM, BP-ANNs, CNN | Accuracy = 97.26% | Han and Gao (2019) |
| Rice kernels | Aspergillus spp. discrimination | E-nose | | PCA | PLSR, BP-ANN, SVM, LVQ | Accuracy = 96.4%, R² = 0.917 | Gu, Wang, and Wang (2019) |
| Bacteria | Species classification | MV | | PCA | SVM | Accuracy = 93.3% | Kim et al. (2021) |

The above table (Table 2 from the original paper) comprehensively lists various applications of nondestructive technologies combined with machine learning (ML) for food quality and safety detection. It details the product, purpose of detection, specific tools (nondestructive technologies), data processing techniques, feature selection methods, ML classifiers used, the best results achieved (often in terms of accuracy or $R^2$), and the corresponding references. This table serves as a concrete summary of the empirical evidence supporting the paper's claims.

6.3. Ablation Studies / Parameter Analysis

The review paper does not present new ablation studies or parameter analyses, but it references how various studies incorporate these concepts:

  • Feature Selection Methods: Many studies mentioned in Table 2 or the text utilize different feature selection or feature extraction methods (PCA, SPA, UVE, CARS, GA, FDPCA, DWT, CNN as feature extractor) to reduce data dimensionality and improve model performance. Comparing models with and without these steps, or with different feature sets, serves as an implicit form of ablation. For example, the use of Genetic Algorithms (GA) with PLSR for cod liquid loss prediction showed modest improvements and reduction of necessary components, indicating the value of feature selection.

  • Hyperparameter Optimization: For algorithms like KNN, the $K$ parameter needs optimization. For SVM, the regularization parameter ($C$) and the kernel parameter ($\gamma$) are optimized, especially for RBF kernels. Random Forest models optimize the number of decision trees and the number of features considered in each tree. While the paper doesn't detail specific optimization processes, it highlights their importance for achieving the reported accuracies.

  • Preprocessing Methods: The impact of data preprocessing methods (e.g., MSC, SNV, SG, first/second derivatives, normalization, baseline correction, image thresholding) on model performance is frequently investigated in the referenced studies, which can be seen as an analysis of component contribution to the overall system.

  • Comparison of ML Algorithms: The explicit comparisons between TML and DL algorithms, or between different TML algorithms (e.g., KNN, RF, RBF-SVM, BP-ANN vs. CNN for aflatoxin detection), effectively demonstrate the relative strengths of different modeling components for specific tasks and data types.

    These analyses, embedded within the cited works, confirm that individual components like specific feature engineering techniques, preprocessing steps, or the choice of ML architecture significantly influence the final predictive accuracy and robustness of the nondestructive food detection systems.
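The hyperparameter searches mentioned above (e.g., $C$ and $\gamma$ for an RBF SVM) are commonly run as a cross-validated grid search; the following scikit-learn sketch shows the pattern on synthetic data, with grid values and dataset chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for extracted spectral or image features.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))])
param_grid = {
    "svm__C": [0.1, 1, 10, 100],          # regularization strength
    "svm__gamma": [0.001, 0.01, 0.1, 1],  # RBF kernel width
}

# 5-fold cross-validated grid search over C and gamma.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_tr, y_tr)
print("best params:", search.best_params_)
print("held-out accuracy:", search.score(X_te, y_te))
```

Scaling inside the pipeline keeps the cross-validation honest: the scaler is refit on each training fold rather than on the full dataset.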

7. Conclusion & Reflections

7.1. Conclusion Summary

The paper effectively concludes that the integration of machine learning (ML) techniques with nondestructive inspection technologies offers substantial potential for advancing food quality and safety assessment. It systematically reviews traditional machine learning (TML) and deep learning (DL) algorithms, detailing their principles and diverse applications across acoustic analysis, machine vision, electronic nose, and spectral imaging. The review highlights that ML is crucial for efficiently extracting useful information from complex, high-dimensional data generated by these sensing technologies, overcoming limitations of conventional chemometric methods. While acknowledging the value of various ML algorithms in different contexts, the paper particularly emphasizes deep learning as a powerful and promising technique for real-time applications due to its inherent feature learning capabilities, although it notes the need for further research to enable its full and wide applications in the food industry.

7.2. Limitations & Future Work

The authors identify several critical challenges and propose future research directions:

  • Lack of Labeled Data: A significant barrier for ML applications, especially DL, is the scarcity of labeled data. Manually labeling optimal features from complex data like hyperspectral images is time-consuming and challenging.
    • Future Work: Focus on transfer learning (inductive, transductive, unsupervised) to leverage existing knowledge with minimal new labeled data, by re-weighting data, finding suitable feature representations, or building relational knowledge between domains.
  • Lack of Benchmark Datasets and Standardization: The absence of publicly available, standardized benchmark datasets across various nondestructive technologies hinders large-scale application and comparative analysis of different techniques.
    • Future Work: Establish Standard Operating Procedures (SOPs) for creating benchmark and open datasets for each nondestructive technology.
  • Environmental and Dynamic Variability: Factors like variety differences, instrument effects, and environmental conditions (e.g., lighting, humidity, temperature) dynamically alter food quality patterns, necessitating costly and time-consuming model recalibration.
    • Future Work: Explore lifelong learning (LL) with neural networks and reinforcement learning to build robust models that accumulate knowledge over time, akin to human intelligence. This involves integrating regularization, ensembling, rehearsal, and dual-memory methodologies to account for various influencing factors.
  • Computational Demands of Deep Learning: DL is data-hungry and requires significant training time and expensive GPUs and storage rooms, posing challenges for further development, especially for HSI and MV.
    • Future Work: Focus on shortening training times and simplifying DL model architectures. Develop lightweight and efficient models (e.g., SqueezeNet, ShuffleNet, MobileNet, GhostNet) that offer better accuracy with fewer parameters, suitable for on-site and real-time detection on portable devices.
  • Limited DL Application in E-nose and Acoustic Analysis: While MV and HSI extensively use DL, its application in e-nose and acoustic analysis is less developed.
    • Future Work: Expand DL techniques to e-nose and acoustic analysis, for example, as an automatic drift compensation method to address sensor drift caused by environmental conditions without manual rule-setting.

7.3. Personal Insights & Critique

This review serves as an excellent entry point for anyone interested in the intersection of machine learning and nondestructive food quality/safety detection. Its strength lies in its comprehensive structure, clearly differentiating TML from DL and mapping their applications across diverse sensing technologies and food products. The comparison table (Table 1) is particularly helpful for beginners to grasp the fundamental trade-offs between DL and TML.

One key insight drawn from this paper is the undeniable shift towards deep learning for vision-based tasks (MV, HSI), where its ability to automatically extract complex features from high-dimensional raw data offers significant performance advantages over traditional methods requiring manual feature engineering. This highlights the increasing importance of raw data quality and volume as direct inputs to DL models.

A potential area for improvement, though typical for a high-level review, is the relative lack of detail on how certain ML models are implemented within specific food contexts. For a truly beginner-friendly guide, elaborating on the typical data preprocessing pipelines, feature engineering steps for TML, or specific CNN architectures used for a given task would be beneficial. For instance, explaining the general workflow of a CNN for image classification in the context of apple defect detection with concrete examples of convolutional layers and pooling operations would deepen understanding.
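To illustrate the kind of concrete detail meant here, the following is a toy sketch of the two core CNN operations (a convolutional layer and a pooling layer) applied to a tiny grayscale patch with a simulated dark "defect" spot. The patch, kernel, and values are illustrative assumptions, not taken from the reviewed paper.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a single-channel image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2, halving each spatial dimension."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# Toy 6x6 grayscale patch: a dark "defect" spot on a bright background.
patch = [[1.0] * 6 for _ in range(6)]
for i in (2, 3):
    for j in (2, 3):
        patch[i][j] = 0.0

# A simple vertical-edge kernel responds to intensity transitions at the spot.
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
fmap = conv2d(patch, kernel)   # 4x4 feature map of edge responses
pooled = max_pool2x2(fmap)     # 2x2 summary keeping the strongest responses
```

In a real defect-detection CNN, many such kernels are learned from labeled images rather than hand-designed, and stacks of these convolution-pooling stages feed a classifier head.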

From a critical perspective, while the paper emphasizes DL's power, it also implicitly reveals its primary practical hurdle: the data-hungry nature and computational expense. The call for lightweight models and lifelong learning is critical, as real-time industrial applications demand not just accuracy but also efficiency and adaptability to changing conditions without constant, costly retraining. The black-box nature of DL also remains a concern for regulatory environments where interpretability and explainability are highly valued.

The paper's emphasis on standardized datasets and SOPs is a crucial point that often gets overlooked. Without these, comparing research, reproducing results, and deploying models at scale remain challenging. The concepts of transfer learning and lifelong learning are vital for overcoming the cold-start problem (lack of initial labeled data) and the catastrophic forgetting problem (loss of old knowledge when learning new tasks) in real-world, dynamic food processing environments.
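As a concrete sketch of the rehearsal idea mentioned for lifelong learning, the following keeps a reservoir-sampled memory of past examples and mixes them into each new-task batch so old knowledge keeps being revisited. The class name and the food-sample labels are hypothetical.

```python
import random

class RehearsalBuffer:
    """Reservoir-style memory of past examples, replayed alongside new tasks."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.seen = 0
        self.buffer = []
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Reservoir sampling: every example seen so far has an equal
            # chance of remaining in the fixed-size buffer.
            idx = self.rng.randrange(self.seen)
            if idx < self.capacity:
                self.buffer[idx] = example

    def replay_batch(self, new_batch, k):
        """Mix up to k stored old examples into a batch of new-task examples."""
        old = self.rng.sample(self.buffer, min(k, len(self.buffer)))
        return new_batch + old

# Examples from earlier tasks (e.g., earlier harvests or instrument setups).
buf = RehearsalBuffer(capacity=3)
for x in ["apple_t1", "pear_t1", "apple_t2", "pear_t2", "kiwi_t3"]:
    buf.add(x)
batch = buf.replay_batch(["mango_t4"], k=2)  # 1 new + 2 replayed examples
```

Regularization-, ensembling-, and dual-memory-based lifelong learning methods attack the same forgetting problem without storing raw examples, which matters when food-image datasets are large or proprietary.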

Overall, this review provides a robust framework for understanding the current state of the art and clearly charts a path for future research, particularly in making DL more accessible, efficient, and interpretable for the complex and dynamic domain of food quality and safety detection.
