
A Comprehensive Survey of Multi‑Level Thresholding Segmentation Methods for Image Processing

Published:03/27/2024
This analysis is AI-generated and may not be fully accurate. Please refer to the original paper.

TL;DR Summary

The paper reviews multi-level thresholding methods in image processing, focusing on capturing image complexity through multi-range intensity partitioning. It discusses metaheuristic algorithms for optimizing threshold values and outlines advantages, limitations, and future resear

Abstract

In image processing, multi-level thresholding is a sophisticated technique used to delineate regions of interest in images by identifying intensity levels that differentiate different structures or objects. Multi-range intensity partitioning captures the complexity and variability of an image. The aim of metaheuristic algorithms is to find threshold values that maximize intra-class differences and minimize inter-class differences. Various approaches and algorithms are reviewed and their advantages, limitations, and challenges are discussed in this paper. In addition, the review identifies future research areas such as handling complex images and inhomogeneous data, determining thresholding levels automatically, and addressing algorithm interpretation. The comprehensive review provides insights for future advancements in multilevel thresholding techniques that can be used by researchers in the field of image processing.

1. Bibliographic Information

1.1. Title

A Comprehensive Survey of Multi-Level Thresholding Segmentation Methods for Image Processing

1.2. Authors

  • Mohammad Amiriebrahimabadi

  • Zhina Rouhi

  • Najme Mansouri

    The affiliations are not explicitly stated for all authors, but Mohammad Amiriebrahimabadi and Najme Mansouri are listed with a "1" superscript, suggesting a common primary affiliation.

1.3. Journal/Conference

The paper was published online by Springer on March 27, 2024. Although the specific journal is not explicitly named in the provided text, Springer is a highly reputable publisher for scientific and technical content, particularly in fields like image processing, computer science, and engineering. Publication with Springer suggests a peer-reviewed process and contributes to the paper's credibility and visibility within the academic community.

1.4. Publication Year

2024

1.5. Abstract

This paper presents a comprehensive survey of multi-level thresholding segmentation methods used in image processing. The technique is crucial for delineating regions of interest by identifying intensity levels that distinguish different structures and objects, effectively capturing image complexity through multi-range intensity partitioning. The survey reviews various approaches, particularly those employing metaheuristic algorithms, which aim to find optimal threshold values by maximizing intra-class differences and minimizing inter-class differences. The authors discuss the advantages, limitations, and challenges of these methods. Furthermore, the review identifies critical future research areas, including handling complex and inhomogeneous image data, automating the determination of thresholding levels, and improving algorithm interpretability. This extensive review provides valuable insights for researchers, guiding future advancements in multi-level thresholding techniques within the image processing domain.

/files/papers/692b228e4114e99a4cde874e/paper.pdf (This is a relative path; the actual link would be a full URL if available). The paper was published online on March 27, 2024, indicating it is an officially published work.

2. Executive Summary

2.1. Background & Motivation

The core problem the paper addresses is image segmentation, specifically multi-level thresholding. Image segmentation is the process of dividing an image into meaningful, semantically coherent regions or objects, which is foundational for numerous applications such as object recognition, medical imaging, autonomous systems, virtual/augmented reality, video surveillance, and environmental monitoring.

While thresholding is a simple and direct method for segmentation, Bi-Level Thresholding (BLT) is often insufficient for images with more than two classes or complex intensity distributions. This limitation motivates the need for Multi-Level Thresholding (MLT), which divides an image into multiple intensity levels, offering more precise segmentation by accommodating intensity variations and capturing finer details.

A significant challenge in MLT is determining the optimal threshold values. Traditional methods can be computationally expensive, especially as the number of thresholds increases. This is where metaheuristic algorithms (MAs) become crucial. They offer stochastic search capabilities to find optimal solutions efficiently, reducing computational effort and avoiding local optima. The paper is motivated by the need for a comprehensive resource that reviews these MLT techniques, analyzes their strengths and weaknesses, and identifies future research directions to enhance image segmentation in various real-world applications.

2.2. Main Contributions / Findings

The paper makes several significant contributions by providing a comprehensive survey of multi-level thresholding image segmentation methods:

  • Comprehensive Review: It offers an extensive review of multi-level thresholding image segmentation techniques, consolidating knowledge in this specific field.

  • Detailed Taxonomy: The paper provides a detailed taxonomy of various thresholding approaches, including prominent methods like Otsu, Kapur, Tsallis, Fuzzy entropy, Minimum Cross Entropy (MCE), and Renyi's entropy.

  • Classified Applications and Case Studies: It classifies thresholding applications and provides case studies across diverse fields such as medical imaging, remote sensing, plant pathology, and industrial quality control.

  • Comparative Analysis: The survey compares different datasets used in reviewed studies and describes existing image segmentation techniques, highlighting their advantages and disadvantages.

  • Technical Details: It presents simulation environments, programming languages, and discusses evaluation metrics used for proposed algorithms in detail.

  • Identification of Research Gaps: Crucially, it identifies research gaps and challenges for multi-level thresholding in image segmentation, outlining areas for future research.

    The key findings revolve around the demonstrated effectiveness of multi-level thresholding for flexible and adaptive segmentation, its ability to handle complex intensity distributions, and the role of metaheuristic algorithms in optimizing threshold selection to reduce computational costs. The survey concludes by emphasizing the need to address identified challenges to further advance these techniques.

3. Prerequisite Knowledge & Related Work

3.1. Foundational Concepts

To fully understand this paper, a reader needs to grasp several fundamental concepts in image processing and optimization:

  • Image Segmentation:

    • Conceptual Definition: Image segmentation is a fundamental task in computer vision and image processing that involves partitioning a digital image into multiple segments (sets of pixels, also known as image objects). The goal is to simplify and/or change the representation of an image into something more meaningful and easier to analyze, that is, into semantically coherent regions or objects.
    • Purpose: The output of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image. Each pixel in a region is similar with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristics.
    • Granularity Levels: The paper mentions image segmentation can occur at coarse, medium, and fine levels of granularity.
    • Classification (Figure 1): Image segmentation techniques are broadly classified into:
      • Thresholding: Based on pixel intensity values.

      • Clustering: Grouping pixels based on similarity.

      • Edge-based: Detecting boundaries between regions.

      • Region-based: Growing regions from seed points.

      • Artificial Neural Networks (ANNs) / Deep Learning: Learning complex patterns for segmentation.

        The following figure (Figure 1 from the original paper) illustrates how segmentation techniques can be classified:

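        Fig. 1 Classification of segmentation techniques. The image is a schematic showing the classification of image segmentation techniques, including thresholding, clustering, edge-based, and region-based methods, and indicating how these methods relate to one another.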

  • Thresholding:

    • Conceptual Definition: Thresholding is a simple and direct method of image segmentation that separates the foreground of an image from its background based on pixel intensity values. It involves converting a grayscale image into a binary image, where pixels are assigned to one of two classes based on whether their intensity is above or below a certain threshold.
    • Bi-Level Thresholding (BLT): This is the simplest form, where a single threshold value $t$ is used to divide an image into two classes: foreground and background. Pixels with intensity values greater than $t$ belong to one class, and those less than or equal to $t$ belong to the other.
    • Multi-Level Thresholding (MLT): This technique extends BLT by using multiple threshold values ($t_1, t_2, \ldots, t_m$) to divide an image into $m+1$ distinct classes or regions. This allows for more precise segmentation, especially for images with complex intensity distributions or multiple objects of interest. For example, three thresholds would divide an image into four classes.
    • Types of Thresholding:
      • Global Thresholding: Uses a single threshold value for the entire image.
      • Local Thresholding: Selects individual thresholds for different regions of the image, adapting to local intensity variations.
      • Parametric Thresholding: Assumes a statistical model (e.g., Gaussian distribution) for the image's intensity values and estimates parameters to determine thresholds. Can be time-consuming.
      • Non-Parametric Thresholding: Uses statistical criteria (e.g., entropy, variance) directly from the image histogram to determine threshold values, without assuming a specific model.
  • Meta-Heuristic Algorithms (MAs):

    • Conceptual Definition: Meta-heuristic algorithms are high-level problem-solving procedures designed to find optimal or near-optimal solutions to complex optimization problems, especially those where exact methods are computationally infeasible. They often employ stochastic (randomized) search strategies and are capable of avoiding local optima (suboptimal solutions that are better than their immediate neighbors but not the absolute best) by exploring the search space broadly.
    • Characteristics:
      • Derivative-free: They do not require gradient information, making them suitable for non-differentiable or discontinuous objective functions.
      • Simplicity: Often easy to understand and implement.
      • Flexibility: Can be adapted to various problem types.
      • Exploration vs. Exploitation: They balance exploration (searching diverse regions of the solution space) and exploitation (intensifying the search around promising solutions).
    • Examples (Figure 7): The paper categorizes MAs into:
      • Evolution-Based (EB): Mimic natural evolution (e.g., Genetic Algorithm (GA), Differential Evolution (DE)).

      • Swarm-Based (SB): Mimic social behavior of animal swarms (e.g., Particle Swarm Optimization (PSO), Whale Optimization Algorithm (WOA), Ant Colony Optimization (ACO), Cuckoo Search Algorithm (CSA), Firefly Algorithm (FA), Slime Mould Algorithm (SMA), Gray Wolf Optimization (GWO)).

      • Physics/Chemistry-Based (PCB): Inspired by physical or chemical phenomena (e.g., Multi-Verse Optimizer (MVO), Sine Cosine Algorithm (SCA)).

      • Human-Based (HB): Inspired by human behavior.

      • Others.

        The following figure (Figure 7 from the original paper) illustrates the classification of meta-heuristic approaches:

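        Fig. 7 Classification of meta-heuristic approaches. The image is a schematic showing the classification of different types of meta-heuristic algorithms, including evolution-based, physics-based, swarm-based, and human-based algorithms. These algorithms are widely applied in multi-level thresholding segmentation methods and can help improve image processing results.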

  • Objective Functions (or Fitness Functions):

    • Conceptual Definition: In optimization problems, an objective function (also known as a fitness function in meta-heuristics) is a mathematical function that a meta-heuristic algorithm aims to either maximize or minimize. For multi-level thresholding, the objective function quantifies the "goodness" of a set of chosen threshold values.
    • Role in MLT: The threshold values are the decision variables that the meta-heuristic algorithm adjusts to optimize the objective function. For example, Otsu's method uses an objective function that maximizes the variance between classes, while Kapur's entropy maximizes the sum of entropies of the segmented regions.
  • Entropy:

    • Conceptual Definition: In information theory, entropy is a measure of the uncertainty or randomness associated with a random variable. In image processing, it can be used to quantify the information content or uniformity of different regions within an image. Higher entropy often indicates more information or less predictability.
    • Types in MLT: The paper discusses Kapur's Entropy, Tsallis Entropy (a generalization of Shannon's Entropy), Fuzzy Entropy, and Renyi's Entropy, all adapted as objective functions for MLT.
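
To make the BLT and MLT definitions above concrete, here is a minimal sketch of how a set of thresholds maps pixels to class indices. It assumes NumPy and a tiny hand-made image; the helper name `apply_thresholds` is hypothetical and is not taken from the paper.

```python
import numpy as np

def apply_thresholds(image, thresholds):
    """Map each pixel to a class index 0..m, given m thresholds.

    Hypothetical helper for illustration only.
    """
    t = np.sort(np.asarray(thresholds))
    # right=True places a pixel equal to a threshold in the lower class,
    # matching the "less than or equal to t" convention of the BLT definition.
    return np.digitize(image, t, right=True)

# Illustrative 2x3 grayscale image (made-up values).
image = np.array([[10, 60, 120],
                  [200, 60, 255]], dtype=np.uint8)

bi_level = apply_thresholds(image, [100])              # 1 threshold  -> 2 classes
multi_level = apply_thresholds(image, [50, 100, 210])  # 3 thresholds -> 4 classes
```

With three thresholds the label array takes values in {0, 1, 2, 3}, which is the "three thresholds give four classes" case mentioned above.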

3.2. Previous Works

The paper dedicates Section 2.5 to reviewing related survey papers, emphasizing that while many surveys exist in computer vision and image processing, none specifically and comprehensively cover multi-level thresholding image segmentation to the same extent as the proposed work. The key prior surveys mentioned and their focus are:

  • Nakane et al. [55] (2020): Focused on the application of Evolutionary Algorithms (EAs) and Swarm Algorithms (SAs) (specifically Genetic Algorithms (GA), Differential Evolution (DE), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO)) to computer vision problems.

  • Zhang et al. [56] (2022): Reviewed image analysis methods for microorganism counting, from classical image processing to deep learning.

  • Agrawal and Choudhary [57] (2023): Systematically surveyed segmentation and classification methods for chest radiography, including Generative Adversarial Networks (GANs) for lung segmentation and disease detection.

  • Mittal et al. [58] (2022): Investigated clustering-based image segmentation techniques, categorizing them into hierarchical and partitional methods, with a focus on histogram-based, K-means-based, and meta-heuristics-based partitional clustering.

  • Punn and Agarwal [59] (2022): Described the U-Net framework and its variants for biomedical image segmentation, particularly in the context of SARS-CoV-2 and COVID-19.

  • Loyola-Gonzalez et al. [60] (2020): Reviewed Contrast Pattern-based classification (CP) and its challenges, including exhaustive-search-based algorithms and decision tree-based approaches.

  • Iqbal et al. [61] (2022): Provided a comprehensive study on the application of GANs to medical image segmentation, discussing various models, metrics, loss functions, and datasets.

  • Ramadan et al. [62] (2020): Surveyed Interactive Image Segmentation (IIS) methods, also known as foreground/background separation.

  • Liu et al. [63] (2020): Reviewed deep learning-based achievements in object detection, covering frameworks, features, proposals, context modeling, and evaluation.

  • Rai et al. [64] (2022): Focused on Nature-Inspired Optimization Algorithms (NIOA) for multilevel thresholding problems, highlighting challenges in developing multi-thresholding models. This survey is the closest in topic to the current paper.

  • Borji et al. [65] (2019): Reviewed advances in salient object detection, including its relation to generic scene segmentation and object proposal generation.

  • Aljuaid and Anwar [66] (2022): Surveyed supervised learning techniques for medical image processing, covering methods like Convolutional Neural Networks (CNNs), region-based CNNs, Fully Convolutional Networks (FCNs), and U-Net architectures.

  • Sasmal and Dhal [67] (2023): Compared superpixel images and clustering techniques for image segmentation, discussing superpixel generation and partitional clustering.

  • Abualigah et al. [68] (2023): Investigated multilevel threshold image segmentation-based metaheuristic optimization methods, outlining definitions, procedures, optimization methods, and performance analysis using benchmark images. This is also a very close survey, but the current paper claims to have a "higher presentation quality" and a more specific, comprehensive focus.

  • Bagwari et al. [69] (2023): Provided a comprehensive analysis of satellite image segmentation techniques, including deep learning approaches.

    The following are the results from Table 4 of the original paper, summarizing related surveys:

    Article | Year | Study area | Main focus | Author | Title | Keywords
    Proposed survey | 2023 | Multi-level thresholding segmentation in image processing | Review the most useful and effective methods and innovations presented in the past few years for multi-level thresholding segmentation in image processing | Amiriebrahimabadi et al | A comprehensive survey of multi-level thresholding segmentation methods for image processing | Thresholding, Segmentation, Deep learning, Survey
    [55] | 2020 | Survey of EAs and SAs adopted to solving computer vision problems | Summary of GA, DE, PSO and ACO swarm algorithms performance in computer vision field | Nakane et al | Application of evolutionary and swarm optimization in computer vision: a literature survey | Evolutionary algorithms, Swarm algorithms, Computer vision, Literature survey
    [56] | 2022 | Microorganism counting | The process of identifying future trends in microorganism counting and offering systematic guidelines for the deployment of comprehensive microorganism counting systems is being developed | Zhang et al | A comprehensive review of image analysis methods for microorganism counting: from classical image processing to deep learning approaches | Microorganism counting, Digital image processing, Microscopic images, Image analysis, Image segmentation
    [57] | 2023 | Chest radiography | The study explores the segmentation of lungs and the identification and categorization of lung diseases using datasets that are publicly accessible | Agrawal and Choudhary | Segmentation and classification on chest radiography: a systematic survey | Deep convolutional neural network, Computer vision, Lung segmentation, Multiclass classification, Nodule, TB, COVID-19, Pneumothorax detection, GAN
    [58] | 2021 | Image segmentation clustering methods | In the field of image segmentation, examining and comparing clustering methods related performance parameters | Mittal et al | A comprehensive survey of image segmentation: clustering methods, performance parameters, and benchmark datasets | Image segmentation, Clustering methods, Performance parameters, Benchmark datasets
    [59] | 2022 | Biomedical | Through the categorization of U-Net variants into inter-modality and intra-modality, they are able to deepen their comprehension of the problems and potential solutions related to U-Net | Punn and Agarwal | Modality specific U-Net variants for biomedical image segmentation: a survey | Biomedical image segmentation, Deep learning, U-Net
    [60] | 2020 | Contrast Patterns | A study of supervised classification based on CP and its applications | Loyola-Gonzalez | A Review of Supervised Classification based on Contrast Patterns: Applications, Trends, and Challenges | Supervised classification, Contrast patterns, Review, Taxonomy
    [61] | 2022 | Biomedical | This is a summary of various models based on GANs, performance indicators, loss functions, datasets, methods of augmentation, implementations of research papers, and source code used in the field of medical image segmentation | Iqbal et al | Generative adversarial networks and its applications in the biomedical image segmentation: a comprehensive survey | Generative adversarial network, GANs applications, GANs in medical image segmentation
    [62] | 2020 | Interactive-based image segmentation methods | Object extraction based on user interaction, often called foreground/background separation in IIS | Ramadan et al | A survey of recent interactive image segmentation methods | Interactive image segmentation, user interaction, label propagation, deep learning, superpixels
    [63] | 2019 | Generic object detection based on deep learning methods | Purpose of mentioned is to provide a comprehensive survey of the recent developments in deep learning in this field | Liu et al | Deep Learning for Generic Object Detection: A Survey | Object detection, Deep learning, Convolutional neural networks, Object recognition
    [64] | 2022 | Multi-thresholding image segmentation | Review of the challenges encountered when developing image multi-threshold models based on NIOA in recent years (2019-2021) | Rai et al | Nature-inspired optimization algorithms and their significance in multi-thresholding image segmentation: an inclusive review | Multilevel Thresholding, Nature-Inspired Optimization Algorithms (NIOA), Exponential, Nonlinear, Combinatorial, Nondeterministic, Image segmentation
    [65] | 2019 | Segmenting salient objects | This is an extensive review of the latest advancements in the detection of salient objects, positioning this domain in relation to other closely associated fields such as general scene segmentation, generation of object proposals, and saliency in prediction of fixation | Borji et al | Salient object detection: A survey | Salient object detection, salient object detection, survey
    [66] | 2022 | Medical Images | Medical image processing tasks were supervised using learning methods and metrics | Aljuaid and Anwar | Survey of Supervised Learning for Medical Image Processing | Deep learning, Convolutional neural network (CNN), Fast R-CNN, Faster R-CNN, FCN, Mask R-CNN, Medical image learning, U-Net
    [67] | 2023 | Superpixel image clustering based | Determine the effectiveness of integrating superpixel and partitional | Sasmal and Dhal | A survey on the utilization of Superpixel image for clustering-based image | Superpixel, Clustering, Image segmentation, Optimization, Leaf images, Oral histopathology
    [68] | 2022 | Multilevel thresholding segmentation using meta-heuristic | Presents the application of Multilevel thresholding as the whale optimization algorithm, particle swarm optimization, and others, to multilevel thresholding for image segmentation | | Multilevel thresholding optimization algorithms: comparative analysis, open challenges and new trends | Multilevel threshold- optimization algorithms, Real-world problems, Optimization problems, Survey
    [69] | 2023 | Satellite Images | It aims to address the challenges and limitations in satellite image processing and segmentation methodologies | Bagwari et al | A Comprehensive Review on Segmentation Techniques for Satellite Images |

The following are the results from Table 5 of the original paper, summarizing comparisons made in reviewed articles:

Article Highest order evaluation metric Analysis Comparative Taxonomy Presenting a comparative chart setup environment details Datasets analysis Reviewed Articles Motivation or open Issues
Proposed review 79
[55] X X X X 109 X
[56] X V X X 144
[57] V X X 85 X
[58] X X X X
[59] X X X 57
[60] X X X 105 X
[61] X X 138 X
[62] X X X 150 X
[63] X 300
[64] X X 65
[65] X X X 228
[66] X 36 X
[67] X 34 X
[68] X 80
[69] X 56

3.3. Technological Evolution

The evolution of image segmentation techniques has progressed from simple thresholding methods to more sophisticated approaches. Initially, Bi-Level Thresholding (BLT) provided a quick way to separate foreground from background. However, as images became more complex with varying lighting conditions and multiple objects, Multi-Level Thresholding (MLT) emerged to offer finer-grained segmentation by dividing images into several intensity ranges.

The main challenge for MLT has been finding the optimal set of thresholds, which becomes a computationally intensive search problem as the number of thresholds increases. This led to the integration of optimization algorithms. Deterministic optimization methods were often slow or prone to local optima. The next significant step was the adoption of meta-heuristic algorithms (MAs). These MAs, inspired by natural phenomena (e.g., evolutionary algorithms, swarm intelligence), offered efficient stochastic search capabilities to explore vast solution spaces and find near-optimal thresholds, significantly reducing computation time and improving segmentation quality.

More recently, the field has seen advancements in:

  • Hybrid Approaches: Combining different meta-heuristics or integrating meta-heuristics with objective functions (like Otsu or Kapur entropy) to leverage their respective strengths.

  • Deep Learning Integration: While this survey focuses on thresholding and meta-heuristics, other surveys mentioned (e.g., [57], [59], [63], [66]) highlight the rise of deep learning models (like CNNs and U-Nets) for complex segmentation tasks, particularly in medical imaging.

  • Addressing Specific Challenges: Research has also focused on improving robustness to noise, handling inhomogeneous data, and developing adaptive or automatic threshold determination methods.

    This paper's work fits within this timeline by consolidating the advancements in MLT primarily driven by meta-heuristic optimization over recent years (2017-2023), providing a foundation for future research in these areas.

3.4. Differentiation Analysis

Based on the provided Related Works section (Section 2.5) and Table 5, the current survey differentiates itself from prior studies through its specific focus and comprehensive analytical scope:

  • Specific Focus: Unlike many previous surveys that cover broader topics like evolutionary/swarm optimization in computer vision [55], microorganism counting [56], chest radiography [57], clustering methods [58], U-Net variants [59], contrast patterns [60], GANs in medical imaging [61], interactive image segmentation [62], generic object detection [63], supervised learning for medical images [66], superpixel clustering [67], or satellite image segmentation [69], this paper exclusively concentrates on Multi-Level Thresholding Segmentation for image processing. While Rai et al. [64] and Abualigah et al. [68] also cover multi-thresholding, this survey claims to offer a "higher presentation quality" and a more specific, comprehensive analysis.

  • Comprehensive Analytical Depth: The paper claims to be more comprehensive by providing:

    • Detailed Taxonomy of Thresholding Approaches: It specifically delves into various objective functions like Otsu, Kapur, Tsallis, Fuzzy, MCE, and Renyi entropies, which is a core component missing or less detailed in other surveys.

    • Setup Environment Details: Table 5 indicates that the proposed review provides setup environment details, whereas many other surveys (e.g., [55], [56], [57], [58], [59], [60], [61], [62], [64], [65]) are marked with an X in that column. This detail is valuable for reproducibility.

    • Datasets Analysis: The survey provides an in-depth analysis of datasets used across the reviewed studies, which is critical for understanding applicability and limitations.

    • Identification of Research Gaps and Open Issues: The paper explicitly identifies research gaps and challenges, offering insights for future work, a feature also present in some other surveys (e.g., [56], [59], [63], [64], [65], [68], [69]), but here specifically tailored to multi-level thresholding.

      In essence, while the general area of image segmentation and meta-heuristics has been surveyed, this paper's differentiation lies in its dedicated, in-depth, and multi-faceted analysis of multi-level thresholding segmentation, presenting detailed comparisons and outlining specific challenges and opportunities for this particular technique.

4. Methodology

4.1. Principles

The core principle of Multi-Level Thresholding Segmentation is to divide an image into multiple regions based on different intensity levels, rather than just two (foreground and background). This is achieved by identifying $m$ optimal threshold values ($t_1, t_2, \ldots, t_m$) that partition the image's grayscale histogram into $m+1$ distinct classes or segments. Each pixel is then assigned to a class based on which intensity range its value falls into.

The process typically involves:

  1. Image Preprocessing: Often includes converting to grayscale if a color image, and potentially noise reduction.

  2. Histogram Calculation: Generating the intensity histogram of the image, which shows the frequency of each pixel intensity level.

  3. Objective Function Definition: Choosing a mathematical objective function (e.g., Otsu's method, Kapur's entropy) that quantifies the "goodness" of a set of threshold values. This function typically aims to maximize the separability between classes (e.g., maximize inter-class variance) or to minimize a dissimilarity measure between the original and the thresholded image (e.g., minimize cross-entropy).

  4. Optimization: Employing an optimization algorithm (often meta-heuristics) to search for the $m$ threshold values that maximize or minimize the chosen objective function. These threshold values act as decision variables for the optimization algorithm.

  5. Image Segmentation: Applying the optimal threshold values to the image to classify each pixel into its corresponding segment.

    The following figure (Figure 8 from the original paper) provides a complete overview of the framework of thresholding image segmentation:

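    Fig. 8 Framework of thresholding image segmentation. The image is a schematic showing the multi-level thresholding segmentation framework, including the steps of inputting the original image, defining the number of thresholds, converting to grayscale, and outputting the segmented image. The framework highlights the search for the best thresholds using a meta-heuristic and an objective function (such as Kapur or Otsu).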
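
The five steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the image is synthetic, the optimizer is a plain random search standing in for a meta-heuristic (it is not any specific algorithm from the survey), and the objective is the commonly used sum-of-class-entropies form of Kapur's criterion, which is not spelled out in the text above. The function names are hypothetical.

```python
import numpy as np

def kapur_objective(prob, thresholds):
    """Sum of class entropies (assumed standard form of Kapur's criterion)."""
    # Class k covers gray levels edges[k] .. edges[k+1]-1 (0-indexed levels).
    edges = [0, *[t + 1 for t in thresholds], len(prob)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = prob[lo:hi].sum()
        if w <= 0:
            return -np.inf  # an empty class makes this candidate invalid
        p = prob[lo:hi] / w
        p = p[p > 0]
        total += -(p * np.log(p)).sum()
    return total

def random_search(prob, m, iters=2000, seed=0):
    """Stand-in optimizer: sample sorted threshold sets, keep the best one."""
    rng = np.random.default_rng(seed)
    best_t, best_f = None, -np.inf
    for _ in range(iters):
        t = np.sort(rng.choice(len(prob) - 1, size=m, replace=False))
        f = kapur_objective(prob, t)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f

# Step 1 (preprocessing): a synthetic grayscale image with three populations.
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(50, 8, 3000),
                         rng.normal(128, 8, 3000),
                         rng.normal(210, 8, 3000)])
image = np.clip(pixels, 0, 255).astype(np.uint8).reshape(90, 100)

# Step 2 (histogram): normalized gray-level probabilities.
hist = np.bincount(image.ravel(), minlength=256)
prob = hist / hist.sum()

# Steps 3-4 (objective + optimization): search for m = 2 thresholds.
thresholds, score = random_search(prob, m=2)

# Step 5 (segmentation): assign each pixel to one of m + 1 = 3 classes.
labels = np.digitize(image, thresholds, right=True)
```

A real meta-heuristic would replace `random_search` with a guided population update that balances exploration and exploitation; the histogram, objective, and final labeling steps would stay the same.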

4.2. Core Methodology In-depth (Layer by Layer)

The paper elaborates on several specific thresholding approaches that serve as objective functions for multi-level thresholding. These methods use pixel intensity values to segment an image into different regions.

4.2.1. Otsu's Method

Otsu's method, developed in 1979 [47], is an automated threshold selection method for image segmentation that works by maximizing the between-class variance (or inter-class variance) of the pixel intensities. This variance measures the separability between the different classes created by the threshold(s). A higher between-class variance indicates better separation between the segmented regions.

Summary of Otsu's method for bi-level thresholding [48]: Assume an image has $N$ pixels and can be represented in $L$ gray levels ranging from 1 to $L$. Let $f_i$ be the number of pixels at gray level $i$. Then the total number of pixels is $N = f_1 + f_2 + \ldots + f_L$. The probability of gray level $i$ occurring is defined by the following equation:

$$P_i = \frac{f_i}{N}, \quad P_i \ge 0, \quad \sum_{i=1}^L P_i = 1$$

where $P_i$ is the probability of a pixel having gray level $i$, $f_i$ is the number of pixels with gray level $i$, and $N$ is the total number of pixels in the image.

For bi-level thresholding, if we choose a threshold $t$, the pixels are divided into two classes, $C_0 = \{1, \ldots, t\}$ and $C_1 = \{t+1, \ldots, L\}$. The cumulative probabilities (also known as class probabilities) are computed as follows:

$$\omega_0 = \sum_{i=1}^t P_i, \quad \omega_1 = \sum_{i=t+1}^L P_i$$

where $\omega_0$ is the probability of the first class (pixels with intensity from 1 to $t$), and $\omega_1$ is the probability of the second class (pixels with intensity from $t+1$ to $L$).

The means (average intensity values) of these two classes are calculated using the following equation:
$$ \mu_0 = \sum_{i=1}^t \frac{i P_i}{\omega_0}, \quad \mu_1 = \sum_{i=t+1}^L \frac{i P_i}{\omega_1} $$
where $\mu_0$ is the mean intensity of the first class, and $\mu_1$ is the mean intensity of the second class.

The mean level of the entire image (global mean) is given by:
$$ \mu_T = \sum_{i=1}^L i P_i $$
where $\mu_T$ is the total mean intensity of the image.

The between-class variance for bi-level thresholding is represented by the objective function $f(t)$:
$$ f(t) = \sigma_0 + \sigma_1 $$
where $\sigma_0 = \omega_0 (\mu_0 - \mu_T)^2$ and $\sigma_1 = \omega_1 (\mu_1 - \mu_T)^2$. These terms are the contributions of each class to the between-class variance, and the goal is to maximize their sum.

The process of determining the optimal threshold $t^*$ involves maximizing the inter-class variance:
$$ t^* = \arg \max_{1 \leq t < L} f(t) $$
where $t^*$ is the optimal single threshold value. The upper bound is strict because $t = L$ would leave the second class $C_1$ empty.

Extension to Multi-Level Thresholding: For multi-level thresholding with $m$ thresholds ($t_1, t_2, \ldots, t_m$), the image can be divided into $m+1$ classes. The extended between-class variance is calculated as shown in the following equation:
$$ f(t) = \sum_{i=0}^m \sigma_i $$
where $f(t)$ is the objective function to be maximized.

The sigma terms for each class $i$ are determined using the following equation:
$$ \sigma_0 = \omega_0 (\mu_0 - \mu_T)^2, \quad \sigma_1 = \omega_1 (\mu_1 - \mu_T)^2, \quad \ldots, \quad \sigma_{M-1} = \omega_{M-1} (\mu_{M-1} - \mu_T)^2 $$
where $\sigma_i$ is the variance contribution of class $i$, $\omega_i$ is the probability of class $i$, and $\mu_i$ is the mean intensity of class $i$. Note that the paper uses $M-1$ for the last class index, which implies $M$ classes and therefore $m = M-1$ thresholds.

The mean levels for each of the $M$ classes are calculated by the following equation:
$$ \mu_0 = \sum_{i=1}^{t_1} \frac{i P_i}{\omega_0}, \quad \mu_1 = \sum_{i=t_1+1}^{t_2} \frac{i P_i}{\omega_1}, \quad \ldots, \quad \mu_{M-1} = \sum_{i=t_{M-1}+1}^L \frac{i P_i}{\omega_{M-1}} $$
where $P_i$ are the pixel probabilities, and $\omega_j$ are the cumulative probabilities for each class $j$.

The optimal thresholds are those that maximize the variance between classes:
$$ t^* = \arg \max_{\boldsymbol{\mu}} \left( \sum_{i=0}^{M-1} \sigma_i \right) $$
where $t^*$ is the vector of optimal thresholds and $\boldsymbol{\mu} = (t_1, t_2, \ldots, t_{M-1})$ denotes the candidate threshold vector (not to be confused with the class means $\mu_i$).
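The between-class variance above can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from the paper: the function names `otsu_objective` and `best_single_threshold` are hypothetical, and the histogram is assumed to be a list in which `hist[i - 1]` holds $f_i$ for gray levels $1, \ldots, L$.

```python
def otsu_objective(hist, thresholds):
    """Between-class variance for gray levels 1..L split by the given thresholds.

    Assumption: hist[i - 1] is the pixel count f_i of gray level i.
    """
    total = sum(hist)
    num_levels = len(hist)
    probs = [f / total for f in hist]                       # P_i
    mu_total = sum((i + 1) * p for i, p in enumerate(probs))
    # Class k covers gray levels edges[k] + 1 .. edges[k + 1].
    edges = [0] + sorted(thresholds) + [num_levels]
    score = 0.0
    for lo, hi in zip(edges, edges[1:]):
        omega = sum(probs[lo:hi])
        if omega == 0:
            continue                                        # empty class adds nothing
        mu = sum((i + 1) * probs[i] for i in range(lo, hi)) / omega
        score += omega * (mu - mu_total) ** 2
    return score


def best_single_threshold(hist):
    """Exhaustive bi-level search over t = 1 .. L - 1."""
    return max(range(1, len(hist)), key=lambda t: otsu_objective(hist, [t]))
```

Exhaustive search is cheap for one threshold, but the number of candidate threshold vectors grows combinatorially with the number of thresholds, which is why the multi-level case is usually handed to an optimizer.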

4.2.2. Kapur's Entropy

Kapur's entropy method [49] is an entropy-based thresholding technique. Drawing on information theory, it determines optimal thresholds by maximizing the sum of the entropies of the segmented regions. The idea is that a good segmentation yields classes whose intensity distributions are each as spread out (that is, as informative) as possible rather than concentrated on a few gray levels.

For $n$ thresholds ($t_1, t_2, \ldots, t_n$), which divide the image into $n+1$ classes, the objective function $f(t_1, t_2, \ldots, t_n)$ is defined as the sum of the entropies of these classes:
$$ f(t_1, t_2, \ldots, t_n) = H_0 + H_1 + \ldots + H_n $$
where $H_0, H_1, \ldots, H_n$ are the entropies of the $n+1$ classes.

The individual entropies $H_k$ and their corresponding cumulative probabilities $\omega_k$ for a grayscale image with $L$ distinct intensity levels (0 to $L-1$) are calculated as follows:
$$ \begin{array}{ll} \omega_0 = \sum_{i=0}^{t_1-1} P_i, & H_0 = - \sum_{i=0}^{t_1-1} \frac{P_i}{\omega_0} \ln \frac{P_i}{\omega_0} \\ \omega_1 = \sum_{i=t_1}^{t_2-1} P_i, & H_1 = - \sum_{i=t_1}^{t_2-1} \frac{P_i}{\omega_1} \ln \frac{P_i}{\omega_1} \\ \omega_2 = \sum_{i=t_2}^{t_3-1} P_i, & H_2 = - \sum_{i=t_2}^{t_3-1} \frac{P_i}{\omega_2} \ln \frac{P_i}{\omega_2} \\ \quad \vdots & \\ \omega_n = \sum_{i=t_n}^{L-1} P_i, & H_n = - \sum_{i=t_n}^{L-1} \frac{P_i}{\omega_n} \ln \frac{P_i}{\omega_n} \end{array} $$
Here, $P_i$ denotes the probability of a pixel having an intensity value $i$. Each $\omega_k$ is the cumulative probability of pixels within class $k$, and $H_k$ is the Shannon entropy of the within-class distribution $P_i / \omega_k$.

This objective function determines the optimal thresholds $t^*$ as follows:
$$ t^* = \arg \max_{0 \leq t \leq L-1} f(t_1, t_2, \ldots, t_n) $$
where $t^*$ is the vector of optimal thresholds. For color images, the segmentation can be performed independently for the $R$, $G$, and $B$ channels.
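As a rough illustration of the sum-of-entropies objective, the sketch below evaluates it for a histogram stored as a plain list. The name `kapur_objective` is hypothetical, and gray levels are assumed to run from 0 to $L-1$ so that class $k$ covers levels $t_k, \ldots, t_{k+1}-1$, matching the summation limits above.

```python
import math


def kapur_objective(hist, thresholds):
    """Sum of class entropies H_0 + ... + H_n for gray levels 0..L-1.

    Assumption: hist[i] is the pixel count of gray level i, and each
    threshold is the first gray level of the next class.
    """
    total = sum(hist)
    probs = [f / total for f in hist]
    edges = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for lo, hi in zip(edges, edges[1:]):
        omega = sum(probs[lo:hi])
        if omega == 0:
            continue                      # empty class: entropy taken as 0
        for p in probs[lo:hi]:
            if p > 0:                     # 0 * ln(0) is treated as 0
                ratio = p / omega
                score -= ratio * math.log(ratio)
    return score
```

For a flat four-level histogram split in the middle, each class holds two equally likely levels, so each contributes $\ln 2$ to the total.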

4.2.3. Tsallis Entropy

Tsallis's entropy is a generalization of Shannon's entropy, often used in multi-fractal theory and as an extension of Boltzmann-Gibbs-Shannon (BGS) statistics [50, 51]. It introduces a parameter $q$, known as the entropic index or degree of non-extensivity.

The general expression for Tsallis entropy $S_q$ for a system with $k$ possibilities is:
$$ S_q = \frac{1 - \sum_{i=1}^k P_i^q}{q - 1} $$
where $P_i$ is the probability of possibility $i$, and $q$ is the entropic index.

For bi-level thresholding, Tsallis entropy is described as follows [50, 51]:
$$ J(t) = \arg \max \left[ S_q^A(t) + S_q^B(t) + (1-q) \, S_q^A(t) \cdot S_q^B(t) \right] $$
where $t$ is the single threshold value, and $S_q^A(t)$ and $S_q^B(t)$ are the Tsallis entropies for the two classes (foreground A and background B). The term $(1-q) \, S_q^A(t) \cdot S_q^B(t)$ accounts for the non-extensivity.

Here, $P^A$ and $P^B$ are the cumulative probabilities for classes A and B, respectively:
$$ P^A = \sum_{i=0}^{t-1} P_i, \quad S_q^A(t) = \frac{1 - \sum_{i=0}^{t-1} \left(\frac{P_i}{P^A}\right)^q}{q - 1} $$
$$ P^B = \sum_{i=t}^{L-1} P_i, \quad S_q^B(t) = \frac{1 - \sum_{i=t}^{L-1} \left(\frac{P_i}{P^B}\right)^q}{q - 1} $$
where $P_i$ is the probability of pixel intensity $i$, $L$ is the number of gray levels, and the fractions $\frac{P_i}{P^A}$ and $\frac{P_i}{P^B}$ are the conditional probabilities within each class. This method maximizes the information measures between objects and backgrounds.

A multilevel thresholding method based on the Tsallis entropy criterion with $m$ thresholds is described below:
$$ J(t) = \arg \max \left[ S_q^A(t) + S_q^B(t) + S_q^C(t) + \ldots + S_q^m(t) + (1-q) \, S_q^A(t) \cdot S_q^B(t) \cdot S_q^C(t) \cdot \ldots \cdot S_q^m(t) \right] $$
where $t$ represents the vector of thresholds $(t_1, t_2, \ldots, t_m)$, and $S_q^A(t), S_q^B(t), \ldots, S_q^m(t)$ are the Tsallis entropies for the $m+1$ classes. The product term $(1-q) \, S_q^A(t) \cdot \ldots \cdot S_q^m(t)$ extends the non-extensivity property to multiple classes.

The cumulative probabilities $P^A, P^B, P^C, \ldots, P^m$ for the respective classes are formulated as follows:
$$ P^A = \sum_{i=0}^{t_1-1} P_i, \quad S_q^A(t) = \frac{1 - \sum_{i=0}^{t_1-1} \left(\frac{P_i}{P^A}\right)^q}{q - 1} $$
$$ P^B = \sum_{i=t_1}^{t_2-1} P_i, \quad S_q^B(t) = \frac{1 - \sum_{i=t_1}^{t_2-1} \left(\frac{P_i}{P^B}\right)^q}{q - 1} $$
$$ P^C = \sum_{i=t_2}^{t_3-1} P_i, \quad S_q^C(t) = \frac{1 - \sum_{i=t_2}^{t_3-1} \left(\frac{P_i}{P^C}\right)^q}{q - 1} $$
$$ P^m = \sum_{i=t_m}^{L-1} P_i, \quad S_q^m(t) = \frac{1 - \sum_{i=t_m}^{L-1} \left(\frac{P_i}{P^m}\right)^q}{q - 1} $$
where $P_i$ is the probability of pixel intensity $i$, and $L$ is the number of gray levels (so the highest gray level is $L-1$).
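The multi-level Tsallis criterion can be sketched as follows. The function name `tsallis_objective` and the default `q=0.8` are illustrative assumptions, not values taken from the paper; the only hard requirement is $q \neq 1$, since the formula divides by $q - 1$.

```python
def tsallis_objective(hist, thresholds, q=0.8):
    """Sum of class Tsallis entropies plus the (1 - q) * product term.

    Assumptions: hist[i] is the pixel count of gray level i (0..L-1),
    each threshold starts the next class, and q != 1.
    """
    if q == 1:
        raise ValueError("q must differ from 1")
    total = sum(hist)
    probs = [f / total for f in hist]
    edges = [0] + sorted(thresholds) + [len(hist)]
    entropies = []
    for lo, hi in zip(edges, edges[1:]):
        mass = sum(probs[lo:hi])                 # P^A, P^B, ...
        if mass == 0:
            entropies.append(0.0)                # empty class
            continue
        power_sum = sum((p / mass) ** q for p in probs[lo:hi] if p > 0)
        entropies.append((1 - power_sum) / (q - 1))
    product = 1.0
    for value in entropies:
        product *= value
    return sum(entropies) + (1 - q) * product
```

With a flat four-level histogram, one threshold in the middle, and $q = 2$, each class entropy is $0.5$, so the objective is $0.5 + 0.5 + (1 - 2) \cdot 0.25 = 0.75$.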

4.2.4. Fuzzy Entropy

Fuzzy entropy approaches use membership functions instead of sharp thresholding to define the degree to which a pixel belongs to a certain class (e.g., foreground or background). These membership functions are considered indicators of foreground and background strength. The method aims to maximize the total entropy based on these fuzzy membership grades [52].

For bi-level thresholding with a single threshold $th$, the objective function $f(th)$ is defined as:
$$ f(th) = \max \left( \left| H_1(th) + H_2(th) \right| \right) $$
where $H_1(th)$ and $H_2(th)$ are the entropies of the two classes (foreground and background).

The entropies of classes $H_1$ and $H_2$ are calculated as follows:
$$ H_1(th) = - \sum_{i=1}^{th} \frac{Ph_i \cdot \mu_1(th)}{w_1} \ln \left(\frac{Ph_i \cdot \mu_1(th)}{w_1}\right) $$
$$ H_2(th) = - \sum_{i=th+1}^{L} \frac{Ph_i \cdot \mu_2(th)}{w_2} \ln \left(\frac{Ph_i \cdot \mu_2(th)}{w_2}\right) $$
where $Ph_i$ is the histogram probability of gray level $i$, $\mu_1(th)$ and $\mu_2(th)$ are the membership functions for the first and second classes at threshold $th$, respectively, and $w_1$ and $w_2$ are distribution functions.

Each distribution function is defined as follows:
$$ w_1 = \sum_{i=1}^L Ph_i \cdot \mu_1(i), \quad w_2 = \sum_{i=1}^L Ph_i \cdot \mu_2(i) $$
where $w_1$ and $w_2$ represent the fuzzy sum of probabilities weighted by their membership degrees for each class over all gray levels $i$.

The calculation of the membership functions $\mu_1(th)$ and $\mu_2(th)$ is performed as follows, typically using triangular or trapezoidal fuzzy functions:
$$ \mu_1(th) = \left\{ \begin{array}{ll} 1 & th \leq a_1 \\ \frac{th - b_1}{a_1 - b_1} & a_1 \leq th \leq b_1 \\ 0 & th > b_1 \end{array} \right. \qquad \mu_2(th) = \left\{ \begin{array}{ll} 0 & th \leq a_1 \\ \frac{th - a_1}{b_1 - a_1} & a_1 \leq th \leq b_1 \\ 1 & th > b_1 \end{array} \right. $$
Here, $a_1$ and $b_1$ are fuzzy parameters that define the shape of the membership functions. The threshold $th$ is often the midpoint of $a_1$ and $b_1$:
$$ th = \frac{a_1 + b_1}{2} $$
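The piecewise membership functions and the distribution functions $w_1$, $w_2$ can be sketched as below. The names `memberships` and `distribution_weights` are hypothetical, and the sketch assumes gray levels $1, \ldots, L$ and fuzzy parameters with $a \le b$.

```python
def memberships(level, a, b):
    """Return (mu_1, mu_2) for one gray level, given fuzzy parameters a <= b."""
    if level <= a:
        return 1.0, 0.0
    if level > b:
        return 0.0, 1.0
    # Linear ramp between a and b; here a < level <= b, so a != b.
    mu1 = (level - b) / (a - b)
    return mu1, 1.0 - mu1


def distribution_weights(probs, a, b):
    """w_1 and w_2: histogram probabilities weighted by each membership.

    Assumption: probs[i - 1] is Ph_i for gray levels i = 1..L.
    """
    w1 = w2 = 0.0
    for level, p in enumerate(probs, start=1):
        mu1, mu2 = memberships(level, a, b)
        w1 += p * mu1
        w2 += p * mu2
    return w1, w2
```

Because the two ramps are complementary, $\mu_1 + \mu_2 = 1$ at every gray level, so $w_1 + w_2 = 1$ for a normalized histogram. At the midpoint threshold $th = (a + b)/2$ both memberships equal 0.5.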

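The fuzzy entropy objective function for multi-level image segmentation with $K$ classes is defined as follows:
$$ f(th) = \max \left( \sum_{i=1}^K H_i(th) \right) $$
where $th$ represents the vector of thresholds. The paper writes this vector as $(th_1, th_2, \ldots, th_K)$, although separating $K$ classes requires only $nt = K - 1$ thresholds, which matches the index $th_{nt}$ used for the last class.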

The entropies of the classes are computed similarly for multi-level segmentation. For the first two classes:
$$ H_1(th_1) = - \sum_{i=1}^{th_1} \frac{Ph_i \cdot \mu_1(th_1)}{w_1} \ln \left(\frac{Ph_i \cdot \mu_1(th_1)}{w_1}\right) $$
$$ H_2(th_2) = - \sum_{i=th_1+1}^{th_2} \frac{Ph_i \cdot \mu_2(th_2)}{w_2} \ln \left(\frac{Ph_i \cdot \mu_2(th_2)}{w_2}\right) $$
Note: The expression that follows in the original paper (part of Equation 23) is corrupted in the source PDF, showing only repeated and incomplete terms. Its intended meaning for the membership functions in the multi-level case cannot be recovered, so it is not reproduced here.

The last class entropy $H_K(th_{nt})$ is:
$$ H_K(th_{nt}) = - \sum_{i=th_{nt}+1}^{L} \frac{Ph_i \cdot \mu_K(th_{nt})}{w_K} \ln \left(\frac{Ph_i \cdot \mu_K(th_{nt})}{w_K}\right) $$
Each class $k$ has the following distribution function:
$$ w_k = \sum_{i=1}^L Ph_i \cdot \mu_k(i) $$
The paper leaves the definition of the multi-level membership functions blank; it is presumably an extension of the bi-level case, but it is not written out in full. The threshold values are computed by the following equation:
$$ th_1 = \frac{a_1 + b_1}{2}, \quad th_2 = \frac{a_2 + b_2}{2}, \quad \ldots, \quad th_{nt} = \frac{a_{nt} + b_{nt}}{2} $$
where $th_k$ is the $k$-th threshold, and $a_k, b_k$ are the fuzzy parameters associated with the $k$-th threshold.

4.2.5. Minimum Cross Entropy (MCE)

The Minimum Cross Entropy (MCE) method [53] works by reducing the cross entropy between the original image and its segmented counterpart. Cross entropy is a measure of the difference between two probability distributions. In this context, it quantifies how much information is lost when the original image's pixel distribution is approximated by the segmented image's class-based distributions. The goal is to minimize a specific objective function.

The objective function $M(th_1, th_2, \ldots, th_n)$ for multi-level thresholding with $n$ thresholds is given by:
$$ M(th_1, th_2, \ldots, th_n) = M_G + M_0 + M_1 + \ldots + M_n $$
where $M_G$ is a constant term, and $M_0, M_1, \ldots, M_n$ represent the cross entropies of the distinct classes.

The individual terms are defined as:
$$ M_G = \sum_{j=0}^{L-1} j \, p_j \log (j) $$
where $p_j$ is the probability of gray level $j$, and $L$ is the total number of gray levels. This term is constant and does not depend on the thresholds.

For each class $k$, the cross entropy $M_k$ is calculated based on its mean $\mu_k$ and cumulative probability $\omega_k$:
$$ M_0 = - \sum_{j=0}^{th_1-1} j \, p_j \log (\mu_0), \quad \mu_0 = \sum_{j=0}^{th_1-1} \frac{j \, p_j}{\omega_0}, \quad \omega_0 = \sum_{j=0}^{th_1-1} p_j $$
$$ M_1 = - \sum_{j=th_1}^{th_2-1} j \, p_j \log (\mu_1), \quad \mu_1 = \sum_{j=th_1}^{th_2-1} \frac{j \, p_j}{\omega_1}, \quad \omega_1 = \sum_{j=th_1}^{th_2-1} p_j $$
$$ M_n = - \sum_{j=th_n}^{L-1} j \, p_j \log (\mu_n), \quad \mu_n = \sum_{j=th_n}^{L-1} \frac{j \, p_j}{\omega_n}, \quad \omega_n = \sum_{j=th_n}^{L-1} p_j $$
Here, $th_k$ are the threshold values, $p_j$ is the probability of gray level $j$, $\mu_k$ is the mean intensity of class $k$, and $\omega_k$ is the cumulative probability of class $k$.

Since $M_G$ is a constant, the objective function to be minimized for MCE can be expressed as:
$$ \eta(th_1, th_2, \ldots, th_n) = M_0 + M_1 + \ldots + M_n $$
Substituting the expressions for $M_k$, with every summation limit matching the class definitions above:
$$ \eta(th_1, th_2, \ldots, th_n) = - \sum_{j=0}^{th_1-1} j \, p_j \log \left( \frac{\sum_{j=0}^{th_1-1} j \, p_j}{\sum_{j=0}^{th_1-1} p_j} \right) - \sum_{j=th_1}^{th_2-1} j \, p_j \log \left( \frac{\sum_{j=th_1}^{th_2-1} j \, p_j}{\sum_{j=th_1}^{th_2-1} p_j} \right) - \ldots - \sum_{j=th_n}^{L-1} j \, p_j \log \left( \frac{\sum_{j=th_n}^{L-1} j \, p_j}{\sum_{j=th_n}^{L-1} p_j} \right) $$
To simplify, let $m^0(a,b) = \sum_{j=a}^{b} p_j$ and $m^1(a,b) = \sum_{j=a}^{b} j \, p_j$. Then the objective function can be written more compactly:
$$ \eta(th_1, th_2, \ldots, th_n) = - m^1(0, th_1-1) \log \left( \frac{m^1(0, th_1-1)}{m^0(0, th_1-1)} \right) - m^1(th_1, th_2-1) \log \left( \frac{m^1(th_1, th_2-1)}{m^0(th_1, th_2-1)} \right) - \ldots - m^1(th_n, L-1) \log \left( \frac{m^1(th_n, L-1)}{m^0(th_n, L-1)} \right) $$
The objective function of MCE is minimized to determine the optimal threshold values. The paper, however, uses arg max in the following equation:
$$ f_{MCE}(th_1, th_2, \ldots, th_n) = \arg \max \{ \eta(th_1, th_2, \ldots, th_n) \} $$
Because cross entropy is to be minimized, this is consistent only if the maximized quantity is read as the negative of the cross entropy, or if an alternative formulation is intended.
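The compact form of $\eta$ maps directly onto the zeroth and first moments of each class. The sketch below is illustrative: `mce_objective` is a hypothetical name, gray levels are assumed to run from 0 to $L-1$, and classes whose first moment is zero are skipped because their term is $0 \cdot \log(\cdot)$.

```python
import math


def mce_objective(hist, thresholds):
    """eta = -sum_k m1_k * log(m1_k / m0_k); smaller is better.

    Assumption: hist[j] is the pixel count of gray level j (0..L-1) and
    each threshold is the first gray level of the next class.
    """
    total = sum(hist)
    probs = [f / total for f in hist]
    edges = [0] + sorted(thresholds) + [len(hist)]
    eta = 0.0
    for lo, hi in zip(edges, edges[1:]):
        m0 = sum(probs[lo:hi])                              # m^0(lo, hi - 1)
        m1 = sum(j * probs[j] for j in range(lo, hi))       # m^1(lo, hi - 1)
        if m0 > 0 and m1 > 0:
            eta -= m1 * math.log(m1 / m0)
    return eta
```

A bi-level threshold is then chosen with `min(range(1, len(hist)), key=lambda t: mce_objective(hist, [t]))`; note the `min`, in line with the minimization reading discussed above.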

4.2.6. Renyi's Entropy

Renyi's entropy is a generalized form of Shannon's entropy that introduces an adjustable parameter $\alpha$ [54]. It provides a more versatile way to measure information content. Notably, in the limit $\alpha \to 1$, Renyi's entropy reduces to Shannon's entropy (the formula itself is undefined at exactly $\alpha = 1$ because of the factor $\frac{1}{1-\alpha}$).

Let $I$ be a grayscale image to be segmented, with $L$ gray levels and probability distribution $\{ p(1), p(2), \ldots, p(L) \}$. For bi-level thresholding with a single threshold $t$, the image is divided into a target class $C_0 = \{ 0, 1, \ldots, t \}$ and a background class $C_1 = \{ t+1, t+2, \ldots, L-1 \}$.

The probabilities of occurrence for $C_0$ and $C_1$ are $w_0(t)$ and $w_1(t)$, respectively, such that $w_0(t) + w_1(t) = 1$. Their definitions are:
$$ w_0(t) = \sum_{i=1}^t p(i) $$
$$ w_1(t) = \sum_{i=t+1}^{L-1} p(i) $$
where $p(i)$ is the probability of gray level $i$. The second sum starts at $t+1$ so that the two classes do not overlap, in line with the definition of $C_1$.

The definitions of Renyi's entropy for the background and target of an image are stated as follows:
$$ Rt_\alpha^0(t) = \frac{1}{1-\alpha} \ln \sum_{i=1}^t \left( \frac{p(i)}{w_0(t)} \right)^\alpha $$
$$ Rt_\alpha^1(t) = \frac{1}{1-\alpha} \ln \sum_{i=t+1}^{L-1} \left( \frac{p(i)}{w_1(t)} \right)^\alpha $$
where $Rt_\alpha^0(t)$ and $Rt_\alpha^1(t)$ are the Renyi's entropies for classes $C_0$ and $C_1$, respectively.

The total Renyi's entropy for bi-level thresholding is:
$$ T(t) = Rt_\alpha^0(t) + Rt_\alpha^1(t) $$
To determine the optimal threshold $t^*$, the objective function $T(t)$ must be maximized:
$$ t^* = \arg \max_t \, T(t) $$

Extension to Multi-Level Thresholding: For multi-level thresholding with $N$ thresholds $\{ t_1, t_2, t_3, \ldots, t_N \}$, the histogram is divided into $N+1$ regions. The gray probability of the first region is:
$$ w_1(t) = \sum_{i=1}^{t_1} p(i) $$
The gray probabilities of the intermediate regions and of the last region are calculated as follows (generalizing for $n$ from 1 to $N$):
$$ w_n(t) = \sum_{i=t_{n-1}+1}^{t_n} p(i) $$
$$ w_N(t) = \sum_{i=t_{N-1}+1}^{L-1} p(i) $$
The Renyi's entropy for each class $n$ is calculated by the following equation:
$$ Rt_\alpha^n(t) = \frac{1}{1-\alpha} \ln \sum_{i=t_{n-1}+1}^{t_n} \left( \frac{p(i)}{w_n(t)} \right)^\alpha $$
where $n$ is a number between 1 and $N$, and $w_n(t)$ is the cumulative probability for class $n$.

The total Renyi's entropy for multi-level thresholding is calculated using the following equation:
$$ T([t_1, t_2, t_3, \ldots, t_N]) = \sum_{n=1}^N Rt_\alpha^n(t_n) $$
The selected optimal thresholds $[t_1, t_2, t_3, \ldots, t_N]$ should meet the following condition:
$$ [t_1, t_2, t_3, \ldots, t_N] = \arg \max \left( T([t_1, t_2, t_3, \ldots, t_N]) \right) $$
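A hedged sketch of the summed Renyi criterion follows. The name `renyi_objective` and the default `alpha=0.5` are illustrative assumptions. Because the index ranges in the formulas above are not fully consistent, the sketch fixes one convention: gray levels run from 0 to $L-1$, and each threshold $t$ is the last gray level of its class, so $N$ thresholds yield $N+1$ classes.

```python
import math


def renyi_objective(hist, thresholds, alpha=0.5):
    """Sum of per-class Renyi entropies; alpha must differ from 1.

    Assumption: hist[i] is the pixel count of gray level i (0..L-1) and a
    threshold t closes its class, i.e. the class contains levels <= t.
    """
    if alpha == 1:
        raise ValueError("alpha must differ from 1")
    total = sum(hist)
    probs = [f / total for f in hist]
    edges = [0] + [t + 1 for t in sorted(thresholds)] + [len(hist)]
    score = 0.0
    for lo, hi in zip(edges, edges[1:]):
        mass = sum(probs[lo:hi])                 # w_n(t)
        if mass == 0:
            continue                             # empty class skipped
        power_sum = sum((p / mass) ** alpha for p in probs[lo:hi] if p > 0)
        score += math.log(power_sum) / (1 - alpha)
    return score
```

For a flat four-level histogram with `thresholds=[1]` and `alpha=2`, each class has power sum $0.5$, giving $\ln(0.5)/(1-2) = \ln 2$ per class.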

4.2.7. Meta-Heuristic Algorithms (MAs) in MLT

Once an objective function (like Otsu, Kapur, Tsallis entropy, etc.) is chosen, meta-heuristic algorithms are employed to find the optimal set of threshold values that either maximize or minimize this objective function. The general framework (as depicted in Figure 8) involves:

  1. Input Original Image: The raw image to be segmented.

  2. Define Threshold Number: The user or an adaptive mechanism specifies mm, the number of desired thresholds.

  3. Grayscale Transformation: If the input is a color image, it's converted to grayscale (or channels are processed independently).

  4. Histogram Creation: The grayscale histogram is generated, showing pixel intensity distributions.

  5. Meta-Heuristic Optimization: An MA (e.g., PSO, WOA, GA, DE, etc.) is initialized with a population of candidate threshold sets.

  6. Objective Function Evaluation: For each candidate set of thresholds, the chosen objective function (e.g., Kapur's entropy or Otsu's method) is calculated. The output of this function determines the fitness of the candidate solution.

  7. Iterative Update: The MA iteratively updates its population of candidate threshold sets based on their fitness values, using its specific rules (e.g., particle movement in PSO, crossover/mutation in GA). This process continues until a stopping criterion (e.g., maximum iterations, convergence) is met.

  8. Optimal Thresholds: The best set of thresholds found by the MA is selected.

  9. Segmented Image Output: The image is then segmented using these optimal thresholds.

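    The use of MAs is crucial because MLT involves searching a high-dimensional, non-linear, and possibly multimodal search space, especially for high numbers of thresholds. Exhaustive search becomes impractical in that setting, whereas meta-heuristics can find good, often near-optimal, thresholds at a much lower computational cost, although they do not guarantee a global optimum.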
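The steps above can be sketched as a generic population-based search. This is deliberately not PSO, GA, DE, or any specific algorithm from the survey: it is a simple perturb-and-keep-the-better loop standing in for them, and the names `between_class_variance` and `search_thresholds`, the population size, the iteration count, and the perturbation range are all illustrative assumptions.

```python
import random


def between_class_variance(hist, thresholds):
    """Otsu-style fitness; hist[i - 1] is the count of gray level i (1..L)."""
    total = sum(hist)
    probs = [f / total for f in hist]
    mu_total = sum((i + 1) * p for i, p in enumerate(probs))
    edges = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for lo, hi in zip(edges, edges[1:]):
        omega = sum(probs[lo:hi])
        if omega > 0:
            mu = sum((i + 1) * probs[i] for i in range(lo, hi)) / omega
            score += omega * (mu - mu_total) ** 2
    return score


def search_thresholds(hist, m, objective, pop_size=20, iters=100, seed=0):
    """Maximize objective(hist, thresholds) over m thresholds (m <= L - 1)."""
    rng = random.Random(seed)
    top = len(hist) - 1                          # thresholds live in 1..L-1

    def fitness(candidate):
        return objective(hist, candidate)

    # Step 5: initialize a population of candidate threshold sets.
    population = [sorted(rng.sample(range(1, top + 1), m)) for _ in range(pop_size)]
    best = max(population, key=fitness)          # step 6: evaluate fitness
    for _ in range(iters):                       # step 7: iterative update
        updated = []
        for cand in population:
            # Nudge every threshold by a small random step, clamped to range.
            child = sorted(min(top, max(1, t + rng.randint(-3, 3))) for t in cand)
            updated.append(max(cand, child, key=fitness))
        population = updated
        best = max(population + [best], key=fitness)
    return best                                  # step 8: best thresholds found
```

Swapping `between_class_variance` for an entropy-based fitness changes the objective without touching the search loop, which mirrors how the reviewed methods pair one objective function with one optimizer.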

5. Experimental Setup

The paper is a survey, so it reviews the experimental setups of the articles it analyzed rather than presenting its own experimental setup. This section summarizes the common practices and findings regarding datasets, evaluation metrics, and baselines used in the reviewed literature.

5.1. Datasets

The reviewed papers utilized a wide variety of datasets, reflecting the diverse applications of multi-level thresholding segmentation. This includes both small-scale and large-scale datasets, and images from different domains and modalities.

The following are the results from Table 9 of the original paper, detailing the datasets used:

Dataset | Data Type | Samples
COVID-19 | CT images | 163
TCIA | MRI, CT and digital histopathology | 4
Biomedical images | Digital images | 5000
Insulator infrared images | Real insulator infrared images | 500/201
DCE-MRI | MRIs (2D) | 30
Berkeley segmentation dataset | Ground truth images | 500/300
Weighted brain magnetic resonance images | MRIs | 2
Plant canopy image & Satellite images | Phenotype image & remote sensing data | 2/8
Stomach CT images | CT | 4
Pleiades satellite imagery | Multi-spectral images | 2
CheX aka CheXpert, OpenI, Google, PC aka PadChest, NIH aka Chest X-ray14, MIMIC-CXR | COVID-19 CT images | 13
SCI image (taken from Orange image diagnostic centre) | MRIs | 500
Landsat Imagery Courtesy of NASA Goddard Space Flight Center and U.S. Geological Survey; 41004, 176035, 225017, 241004, 385028, 388016, 2092, 14037, 55067, 169012 | Natural images | 10
DMR-IR | Thermography images | 10
ABIDE (Autism Brain Imaging Data Exchange, International Neuroimaging Data-sharing) | T2-weighted MRI axial brain images | 12
Eyes, Liver, Head and Tongue | Medical images | 6
USC-SIPI | Grayscale images (uint8) | 5
BT10 and BRATS 2019 | T1-weighted contrast-enhanced (T1c) images & FLAIR brain images | 10
Kodim | Color images (JPEG) | 3
Plant leaf disease | Tomato leaf images | 5512
Zigong dinosaur lantern festival | Color images | 4
Kaggle brain MRI | MRIs | 98/155 (Normal class images & Tumor images)
Random samples from earthobservatory.nasa.gov | Satellite images | 10
NASA landsat image | Color images (JPG) | 6
Digital Database for Screening Mammography (DDSM) | DICOM | 2500
Real-time DICOM CT images of the abdomen | DICOM | 7
Plant stomata images | Color images | 2
CASIA v3 Interval, MMU1, and UBIRIS | Digital images | 4195
Dental radiographs | Digital images (X-Ray) | 12
MIAS | DICOM | 322
Histopathological image | Digital images | 10
Skin cancer images | Digital images | 10
Art Explosion | Grayscale images | 8

Characteristics and Choices:

  • Diversity: The datasets span medical imaging (e.g., COVID-19 CT images, MRI brain images, DCE-MRI, DDSM mammograms, dental radiographs, histopathological images), remote sensing (e.g., satellite images, Pleiades satellite imagery), natural images (e.g., Berkeley segmentation dataset, USC-SIPI), plant pathology (e.g., tomato leaf images, plant stomata images), and industrial applications (e.g., insulator infrared images). This variety demonstrates the broad applicability of MLT techniques.
  • Scale: Sample sizes range from very small (e.g., 2 plant stomata images) to very large (e.g., 5512 tomato leaf images, 5000 biomedical images, 2500 DDSM mammograms). This indicates that MLT methods are adapted for various data scales, although small datasets can limit generalizability.
  • Domain Specificity: The extensive use of medical imaging datasets highlights the critical role of MLT in healthcare, where precise segmentation is crucial for diagnosis and treatment planning. Datasets like COVID-19 CT images emphasize the need for robust segmentation methods that are invariant to contrast and lighting variations in real-world clinical scenarios.
  • Challenges: The diverse characteristics of these datasets (e.g., different image sizes, textures, content, presence of noise or inhomogeneities) underscore the challenges in developing MLT methods that are both accurate and robust across various imaging conditions.

5.2. Evaluation Metrics

The reviewed papers employ a range of evaluation metrics to quantitatively assess the performance and quality of multi-level thresholding segmentation methods.

The following figure (Figure 10 from the original paper) presents a comprehensive analysis of the prevailing evaluation metrics:

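Fig. 10 Classification of evaluation metrics; Fig. 11 Summarized advantages over reviewed papers. The image is a bar chart showing the classification of evaluation metrics for multi-level thresholding segmentation methods. It marks the score of each metric (such as PSNR, SSIM, CPU time, and FSIM), reflecting how prominent each metric is in the reviewed studies. The height of each bar represents the value for that metric. PSNR and SSIM have the highest scores, 59 and 45 respectively, showing their dominant role in image processing evaluation. Other metrics, such as MSE, accuracy, and stability, score lower.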

Here's a breakdown of the most frequently used metrics:

5.2.1. Mean Square Error (MSE)

  • Conceptual Definition: Mean Square Error (MSE) is a common metric in image processing used to quantify the average squared difference between the pixel values of an original (reference) image and a segmented (test) image. It measures the quality of an estimator. A lower MSE value indicates better similarity between the two images and thus better segmentation accuracy.
  • Mathematical Formula: For two images, $f(x, y)$ (original image) and $g(x, y)$ (test image), with dimensions $M \times N$: $$ MSE(f, g) = \frac{1}{M \cdot N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (f_{ij} - g_{ij})^2 $$
  • Symbol Explanation:
    • $M$: Number of rows in the image.
    • $N$: Number of columns in the image.
    • $f_{ij}$: Pixel intensity value at row $i$ and column $j$ in the original image.
    • $g_{ij}$: Pixel intensity value at row $i$ and column $j$ in the test (segmented) image.
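The MSE formula translates directly into code. The sketch below assumes images are given as equally sized lists of rows; the name `mse` is illustrative.

```python
def mse(f, g):
    """Mean squared error between two equally sized images (lists of rows)."""
    rows, cols = len(f), len(f[0])
    squared = sum(
        (f[i][j] - g[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    return squared / (rows * cols)
```

For example, two $2 \times 2$ images that differ by exactly 1 at every pixel have an MSE of 1.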

5.2.2. Peak Signal to Noise Ratio (PSNR)

  • Conceptual Definition: Peak Signal to Noise Ratio (PSNR) is a metric often used to measure the quality of reconstruction of lossy compression codecs or to compare segmented images against a ground truth. It defines the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. A higher PSNR value (expressed in decibels, dB) indicates a higher quality image, meaning less noise or distortion relative to the maximum possible signal.
  • Mathematical Formula: PSNR is expressed in decibels and is calculated based on the MSE and the maximum possible pixel value, which depends on the bit depth ($BD$). For an 8-bit image, the maximum pixel value is $2^8 - 1 = 255$. $$ PSNR = 10 \log_{10} \left( \frac{(2^{BD} - 1)^2}{MSE} \right) $$
  • Symbol Explanation:
    • $BD$: The bit depth of the image (e.g., 8 for an 8-bit grayscale image). The term $(2^{BD} - 1)$ represents the maximum possible pixel value (e.g., 255 for 8-bit images).
    • MSE: The Mean Square Error between the original and segmented images.
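A minimal sketch of PSNR built on the MSE definition is shown below. The function names are illustrative, and returning infinity for identical images is a convention chosen here because the formula divides by an MSE of zero in that case.

```python
import math


def mse(f, g):
    """Mean squared error between two equally sized images (lists of rows)."""
    count = len(f) * len(f[0])
    return sum((a - b) ** 2 for rf, rg in zip(f, g) for a, b in zip(rf, rg)) / count


def psnr(f, g, bit_depth=8):
    """Peak signal-to-noise ratio in dB for images with the given bit depth."""
    error = mse(f, g)
    if error == 0:
        return math.inf                  # identical images: no noise at all
    peak = 2 ** bit_depth - 1            # 255 for 8-bit images
    return 10 * math.log10(peak ** 2 / error)
```

At the other extreme, an 8-bit image compared with its maximally different counterpart (all 255 against all 0) has an MSE of $255^2$ and therefore a PSNR of 0 dB.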

5.2.3. Structural Similarity Index Method (SSIM)

  • Conceptual Definition: Structural Similarity Index Method (SSIM) is a perceptually-based metric that quantifies the similarity between two images. Unlike MSE or PSNR, SSIM attempts to model the human visual system (HVS) by considering three key factors: luminance (brightness), contrast, and structure. A higher SSIM value indicates greater similarity between the images, with 1 representing identical images. Values typically lie between 0 and 1, although the index can become negative when the two images are negatively correlated.
  • Mathematical Formula: SSIM is defined as a product of three comparison functions: luminance comparison $l(f, g)$, contrast comparison $c(f, g)$, and structure comparison $s(f, g)$. $$ SSIM(f, g) = l(f, g) \, c(f, g) \, s(f, g) $$ The individual comparison functions are: $$ \left\{ \begin{array}{l} \displaystyle l(f, g) = \frac{2 \mu_f \mu_g + C_1}{\mu_f^2 + \mu_g^2 + C_1} \\ \displaystyle c(f, g) = \frac{2 \sigma_f \sigma_g + C_2}{\sigma_f^2 + \sigma_g^2 + C_2} \\ \displaystyle s(f, g) = \frac{\sigma_{fg} + C_3}{\sigma_f \sigma_g + C_3} \end{array} \right. $$
  • Symbol Explanation:
    • $f$: Original image.
    • $g$: Segmented (test) image.
    • $\mu_f$: Mean (average) intensity of image $f$.
    • $\mu_g$: Mean (average) intensity of image $g$.
    • $\sigma_f$: Standard deviation (contrast) of image $f$.
    • $\sigma_g$: Standard deviation (contrast) of image $g$.
    • $\sigma_{fg}$: Covariance between images $f$ and $g$.
    • $C_1, C_2, C_3$: Small positive constants to prevent division by zero and provide stability (e.g., $C_1 = (K_1 \cdot L)^2$, $C_2 = (K_2 \cdot L)^2$, $C_3 = C_2/2$, where $K_1 = 0.01$, $K_2 = 0.03$, and $L$ is the dynamic range of pixel values, like 255 for 8-bit images).
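The three comparison functions can be sketched on whole-image statistics, as below. This is a simplification: practical SSIM implementations usually compute the index over local windows and average the results, and the sketch uses population (divide by $n$) statistics. The name `ssim_global` is illustrative.

```python
import math


def ssim_global(f, g, dynamic_range=255, k1=0.01, k2=0.03):
    """Single-window SSIM over whole images given as lists of rows.

    Assumption: global statistics stand in for the usual local windows.
    """
    x = [v for row in f for v in row]
    y = [v for row in g for v in row]
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    c3 = c2 / 2
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure
```

Comparing an image with itself gives all three terms equal to 1, while comparing it with its intensity-inverted version makes the covariance negative and pulls the structure term, and hence the index, below the value for identical images.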

5.2.4. Feature Similarity Index Method (FSIM)

  • Conceptual Definition: Feature Similarity Index Method (FSIM) assesses how similar two images are by comparing their distinctive features. It correlates well with human perception of image quality. FSIM is primarily based on two low-level features of the human visual system: Phase Congruency (PC) and Gradient Magnitude (GM). PC is effective at detecting image features (like edges and corners) regardless of lighting conditions or contrast, as it emphasizes features in the frequency domain. GM quantifies the rate of intensity change, representing image gradients. By combining these, FSIM provides a comprehensive evaluation of image similarity that considers a wide range of visual attributes and structural features.
  • Mathematical Formula: The paper describes FSIM conceptually but does not provide its explicit formula. As per instructions, I will provide the standardized formula from authoritative sources. FSIM is typically calculated as: FSIM(X,Y)=xΩPCm(x)SL(x)xΩPCm(x) FSIM(X, Y) = \frac{\sum_{x \in \Omega} PC_m(x) \cdot S_L(x)}{\sum_{x \in \Omega} PC_m(x)} where PCm(x)=max(PCX(x),PCY(x))PC_m(x) = \max(PC_X(x), PC_Y(x)) is the maximum of the Phase Congruency maps of the two images at location xx, and SL(x)S_L(x) is the local similarity map which combines Phase Congruency and Gradient Magnitude similarity. The local similarity SL(x)S_L(x) is often calculated as: SL(x)=SPC(x)SGM(x) S_L(x) = S_{PC}(x) \cdot S_{GM}(x) And the component similarities are: SPC(x)=2PCX(x)PCY(x)+T1PCX2(x)+PCY2(x)+T1 S_{PC}(x) = \frac{2 PC_X(x) PC_Y(x) + T_1}{PC_X^2(x) + PC_Y^2(x) + T_1} SGM(x)=2GMX(x)GMY(x)+T2GMX2(x)+GMY2(x)+T2 S_{GM}(x) = \frac{2 GM_X(x) GM_Y(x) + T_2}{GM_X^2(x) + GM_Y^2(x) + T_2}
  • Symbol Explanation:
    • $X, Y$: The two images being compared (original and segmented).
    • $\Omega$: The spatial domain (all pixel locations) of the images.
    • $x$: A specific pixel location.
    • $PC_X(x), PC_Y(x)$: Phase Congruency values at location $x$ for images $X$ and $Y$, respectively.
    • $GM_X(x), GM_Y(x)$: Gradient Magnitude values at location $x$ for images $X$ and $Y$, respectively.
    • $T_1, T_2$: Small positive constants to ensure stability.
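
The pooling step of the formula above can be sketched in a few lines. This is only an illustration under stated assumptions: the Phase Congruency and Gradient Magnitude maps are taken as already computed and passed in as flat lists (computing Phase Congruency itself requires a filter bank and is out of scope here), the function names are hypothetical, and the defaults `t1=0.85` and `t2=160.0` are values often quoted for 8-bit images rather than values taken from the surveyed paper.

```python
def similarity(a, b, t):
    """Generic similarity term (2ab + T) / (a^2 + b^2 + T)."""
    return (2.0 * a * b + t) / (a * a + b * b + t)


def fsim_from_maps(pc_x, pc_y, gm_x, gm_y, t1=0.85, t2=160.0):
    """Pool per-pixel similarities into one FSIM score.

    pc_x, pc_y: Phase Congruency maps of images X and Y (flat lists).
    gm_x, gm_y: Gradient Magnitude maps of images X and Y (flat lists).
    t1, t2: stabilizing constants (assumed defaults, see lead-in).
    """
    numerator = 0.0
    denominator = 0.0
    for px, py, gx, gy in zip(pc_x, pc_y, gm_x, gm_y):
        s_l = similarity(px, py, t1) * similarity(gx, gy, t2)  # S_L = S_PC * S_GM
        pc_m = max(px, py)                                     # weight PC_m(x)
        numerator += pc_m * s_l
        denominator += pc_m
    if denominator == 0.0:
        raise ValueError("Phase Congruency maps are all zero; FSIM is undefined.")
    return numerator / denominator


# Toy maps with three pixels each (made-up numbers for illustration only).
pc_a, gm_a = [0.5, 0.2, 0.8], [40.0, 10.0, 90.0]
pc_b, gm_b = [0.1, 0.2, 0.7], [35.0, 12.0, 60.0]
print(fsim_from_maps(pc_a, pc_a, gm_a, gm_a))  # identical inputs
print(fsim_from_maps(pc_a, pc_b, gm_a, gm_b))  # differing inputs
```

Identical maps give a score of 1, and any mismatch in either feature pulls the score below 1, with pixels of high Phase Congruency contributing the most weight.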

5.2.5. Wilcoxon Test

  • Conceptual Definition: The Wilcoxon signed-rank test is a non-parametric statistical test used to compare two related samples, matched samples, or repeated measurements on a single sample. It is used when the assumption of normality (required for parametric tests like the paired Student's t-test) cannot be met or is not desired. Rather than comparing means directly, the test ranks the absolute paired differences and assesses whether those differences are distributed symmetrically around zero, that is, whether the median difference between the two sets of observations is zero.
  • Mathematical Formula: The Wilcoxon signed-rank test doesn't have a single simple formula like image quality metrics. It involves several steps:
    1. Calculate the differences between paired observations.
    2. Take the absolute values of these differences.
    3. Rank the absolute differences from smallest to largest.
    4. Assign the original signs to the ranks.
    5. Sum the positive ranks ($W_+$) and the negative ranks ($W_-$).
    6. The test statistic is usually the smaller of $W_+$ and $W_-$, or a standardized Z-score derived from it.
  • Symbol Explanation:
    • $d_i$: Difference between paired observations for the $i$-th pair.
    • $|d_i|$: Absolute difference.
    • $R_i$: Rank of the $i$-th absolute difference.
    • $W_+$: Sum of positive ranks.
    • $W_-$: Sum of negative ranks.
    • The exact calculation involves critical values or p-values to determine statistical significance.
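
The six steps above can be sketched as follows. This is a minimal illustration, not a full statistical test: it computes only $W_+$, $W_-$, and their minimum, and it assumes the common conventions of discarding zero differences and giving tied absolute differences their average rank. The function name and the sample scores are hypothetical.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, W) for paired samples x and y."""
    # Steps 1-2: paired differences; zero differences are discarded (assumed convention).
    diffs = [a - b for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]

    # Step 3: rank the absolute differences, averaging the ranks of ties.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Steps 4-5: sum the ranks separately by the sign of the difference.
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

    # Step 6: the usual test statistic is the smaller of the two sums.
    return w_plus, w_minus, min(w_plus, w_minus)


# Made-up integer scores of two algorithms on five images.
algo_a = [30, 28, 35, 31, 29]
algo_b = [27, 29, 30, 31, 25]
print(wilcoxon_signed_rank(algo_a, algo_b))  # (9.0, 1.0, 1.0)
```

Here the differences are 3, -1, 5, 0, and 4; the zero is dropped, the remaining absolute values receive ranks 2, 1, 4, and 3, so $W_+ = 9$ and $W_- = 1$. Turning the statistic into a p-value requires critical-value tables or a normal approximation; in practice a library routine such as `scipy.stats.wilcoxon` would typically be used for that step.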

5.3. Baselines

The reviewed papers extensively compare their proposed multi-level thresholding methods against a wide array of existing meta-heuristic algorithms and traditional thresholding techniques. Table 6 in the paper provides a detailed list of comparisons made.

Common baselines include:

  • Other Meta-Heuristic Algorithms: Many papers compare their new or improved meta-heuristic with established MAs. The most frequently compared algorithms include Particle Swarm Optimization (PSO), Whale Optimization Algorithm (WOA), Differential Evolution (DE), Genetic Algorithm (GA), Sine Cosine Algorithm (SCA), Gray Wolf Optimization (GWO), Bat Algorithm (BA), Artificial Bee Colony (ABC), and Moth Flame Optimization (MFO). These are representative because they are widely recognized and applied in optimization problems, including image segmentation.

  • Variants and Hybrid Algorithms: Comparisons often extend to variants of common MAs (e.g., QPSO - Quantum-behaved PSO, DPSO - Darwinian PSO, FODPSO - Fractional-Order Darwinian PSO) and hybrid algorithms (e.g., HHO-DE, MPAMFO, ABCSCA), which combine elements from multiple MAs to enhance performance.

  • Classic Thresholding Methods: Even when meta-heuristics are involved, the core objective functions are often compared (e.g., Otsu's method vs. Kapur's entropy vs. Tsallis entropy). Sometimes, a proposed MLT method is compared against Otsu's bi-level or Gaussian Otsu methods to show the benefits of multi-level and optimization.

  • Other Segmentation Techniques: In some specialized applications (e.g., medical imaging), comparisons might also be made against non-thresholding segmentation methods or classifiers, such as FCM (Fuzzy C-Means), BF (Bacterial Foraging), CNNs or SVMs when the MLT is part of a larger classification pipeline.

    The choice of baselines aims to demonstrate the superiority or specific advantages (e.g., accuracy, speed, stability) of the proposed MLT technique in relation to both foundational and state-of-the-art optimization and segmentation approaches.
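
To make the most common baseline concrete, the following is a minimal sketch of a plain Particle Swarm Optimization loop that searches for thresholds maximizing Otsu's between-class variance on a gray-level histogram. It is a generic textbook-style PSO, not the method of any reviewed paper; the function names and the parameter values (`w`, `c1`, `c2`, swarm size, iteration count, seed) are illustrative assumptions.

```python
import random


def between_class_variance(hist, thresholds):
    """Otsu objective: sum over classes of w_k * (mu_k - mu_total)^2."""
    total = float(sum(hist))
    probs = [h / total for h in hist]
    mu_total = sum(i * p for i, p in enumerate(probs))
    edges = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for start, stop in zip(edges[:-1], edges[1:]):
        weight = sum(probs[start:stop])
        if weight > 0.0:  # empty classes contribute nothing
            mean = sum(i * probs[i] for i in range(start, stop)) / weight
            variance += weight * (mean - mu_total) ** 2
    return variance


def pso_thresholds(hist, n_thresholds, n_particles=20, n_iters=50,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for thresholds with a basic PSO (assumed parameter values)."""
    rng = random.Random(seed)
    lo, hi = 1.0, float(len(hist) - 1)

    def fitness(position):
        return between_class_variance(hist, [int(round(p)) for p in position])

    pos = [[rng.uniform(lo, hi) for _ in range(n_thresholds)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_thresholds for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_thresholds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep each threshold inside the valid gray-level range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return sorted(int(round(p)) for p in gbest), gbest_fit


# Synthetic bimodal histogram: equal mass at gray levels 50 and 200.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
print(pso_thresholds(hist, n_thresholds=1))
```

On this synthetic histogram any threshold between the two peaks is optimal, so the sketch mainly shows the structure of the search. The improved and hybrid variants listed above generally change how positions are updated or initialized, while the objective evaluation stays the same.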

6. Results & Analysis

6.1. Core Results Analysis

The paper, being a survey, synthesizes findings from 79 reviewed articles (2017-2023) to provide a comprehensive overview of multi-level thresholding image segmentation. The analysis covers general characteristics, evaluation metrics, objective functions, advantages, disadvantages, and execution times.

General Characteristics (Table 6 & Figure 9):

  • Computational Environment: Over 55% of the reviewed articles utilized a computing environment with 8 GB RAM or higher, indicating that MLT algorithms, especially those involving meta-heuristics and high threshold levels, often require substantial computational resources.

  • Programming Language: MATLAB is overwhelmingly the most frequently used programming language (as depicted in Figure 9), suggesting its dominance in image processing research, likely due to its extensive toolboxes and ease of prototyping. Python and Java are used to a lesser extent.

  • Meta-heuristic Popularity: PSO and its hybrid versions were compared 53 times, WOA 30 times, and DE 29 times, making them the most common meta-heuristic baselines or components in proposed methods.

  • Case Studies: Approximately 27% of research focused on medical datasets, underscoring the high impact of MLT in medical imaging. Satellite imaging and underwater imaging ranked second and third, respectively, highlighting other crucial application areas.

    The following are the results from Table 6 of the original paper, summarizing general characteristics:

    Paper Environment Proposed Method Compared with Programming language Case Study
    [86] 2.20 GHz Pentium IVPC,4G RAM SCQPSO SunCQPSO, CCQPSO, QPSO MATLAB Medical
    [87] MFA FA, BFA, LFA MATLAB
    [88] Intel(R) core i7 PC,2.93 GHz CPU, 2 GB RAM CDPSO CCS, CHS, CPSO, CDE MATLAB Satellite Images
    [89] Intel Core i5, 2.5 GHz processor GA, QGA, DE, ARKFCM MATLAB
    [90] Windows7-64bit Intel Core2Duo 1.66 GHz processor, 2 GB memory WOA, MFO SCA, HS, SSO, FASSO, FA MATLAB
    [91] Intel core -i7 CPU @ 3.40 GHz GWO PSO, BFO MATLAB
    [92] Geometrical features
    [93] FA MATLAB
    [94] windows 7 3.2 GHz CPU, 4G RAM FWA PSO, ABC, AFSA, BSO MATLAB
    [95] Intel(R) Core i3, 2.93 GHz CPU, 2 GB RAM SCS MCS MATLAB
    [96] Windows 7 Intel (R) Core i3, 2.20 GHz, 4.00 GB RAM MOPSO PSO MATLAB
    [97] 2.80 GHz Intel(R) core i3 processor, 2 GB RAM 2DNLMeKGSA 2DNLMGSA, 2DNLMDE, 2DNLMABC, 2DNLM-cKGSA MATLAB
    [98] Windows 7, 2.4 GHz CPU, 1 GB RAM VMD PSO, FCM, BF, MPS, CPS MATLAB
    [99] ABC DE, PSO, QPSO, ABC, gABC, IAB, QABC, OLABC, FGABC
    [100] a core of 2.7 GHz Intel Core i5 on a MacBook Pro, 8 GB RAM GA Python & MATLAB
    [101] 2.67 GHz Intel core-i5 PC SADFO HSO, BO, FFO MATLAB digital images
    [102] Intel Core i5-2400 Duo 3.10 GHz processor, 4 GB RAM AWDO RGA, GA, Nelder—Mead simplex, PSO, BF, ABF, WDO MATLAB Medical
    [103] DSA based Otsu's, PSO based, Otsu's, Original images
    [104] Intel Core i7-5820 K processor 3.30 GHz, 64 GB RAM QPSO PSO, DPSO, FODPSO Medical
    [105] GA Otsu & Gaussian Otsu's methods Python
    [106] FODPSO PSO, DPSO Medical
    [107] Windows 7 64-bit Intel Core2Duo 1.66 GHz processor, 2 GB memory LFMVO GWO, PSO, WOA, MVO MATLAB
    [108] sor, 2 GB memory SS SSO, MFO, PSO, GWO, SHO, WOA, MVO, ABC, FA, HS, ABCSCA, FAABC, FASSO, ABC, SCA MATLAB
    [109] Windows 7-64bit Intel Core2Duo 1.66 GHz processor, 2 GB memory KnEA NSGA-III, RVEA, LMEA, IMMOEA MATLAB
    [110] Windows 7 Intel core 3, 2.5 GHz, 4 GB RAM ALDE hjDE, SDE, BDE MATLAB Medical
    [111] MS Windows 7 64bit, Intel Core i3 3.2 GHz processor MOMVO MOEAD, MOEADR, MOPSO MATLAB
    [112] MCET-HHO Medical
    [113] Windows 64-bit Intel i7 2.6 GHz processor,16 GB memory WOA, ALO SSO, FA, FASSO MATLAB
    [114] Intel Corei3 processor, 4 GB RAM, 64-bit operating system FODPSO PSO, DPSO MATLAB Medical
    [115] Windows 10 64-bit Pentium(R) Dual core T4500 @ 2.30 GHz,2 GB memory HHO-DE HHO, DE, SCA, BA, HSO, PSO, DA MATLAB
    [116] Windows 7 64-bit AMD A10-8700P processor, 8 GB RAM SAMFO-TH MVO, WOA, FPA, SCA, ACO, PSO, ABC, MFO MATLAB
    [117] MGOA GOA, WOA, FPA, PSO, BA
    [118] Intel core 2 Duo Processor 3 GHz, 2 GB RAM EMA BFO, PSO, GA, FA, HBMO, TLBO MATLAB
    [119] IEPO EPO, WOA, MVO, PSO, BA, 3DOtsu, FCP, 3DPCNN MATLAB
    [120] EKH KH I, KH II, KHIV, MFA, MGOA, BA, WCA MATLAB
    [121] HMS GA, PSO, DE, FF,BA, GSA, TLB
    [122] MPAMFO MPA, HHO, CS, GWO, GOA, SSO, PSO, MFO Medical
    [123] ABCSCA ABC, SCA, FASSO, SSA, WOA, GWO, SSO, FASSO, WOAPSO
    [124] Windows7, 3.4 GHz Intel Core i-7 CPU, 16 GB RAM BFA ABC, MFO, GWO, WOA MATLAB
    [125] Amazon AWS server, 20 GB RAM Different K folds (0, 10, 20, 30, 40, 50) and classifiers (KNN, SVM, DT, NN) Java Medical
    [126] Windows 10 PC with a Core i5 CPU, 4 GB RAM DI-TLBO TLBO, LETLBO, ITLBO, BSA, I-TLBO, GWO Python
    [127] Windows7-64bit Intel Core 2 Duo 1.66 GHz processor 2 GB memory HHB, HHUB SCA, ABC, SSO, FASSO, FAABC, ABCSCA MATLAB
    [128] Windows10 i7-8750H 2.2 GHz processor, 8 GB memory SCA BA, FPA, PSO, WOA Underwater segmentation
    [129] CLAHE Different classifiers (i.e., KNN, SVM, DT, RF, CART) Medical
    [130] WOA CSA, GOA Satellite images
    [131] Windows 7 AMD A8-7410 APU with AMD Radeon R5 Graphics @2.20 GHz, LCBMO MABC, CSA, GOA, CS, EO, MPA, IDSA, TLBO, WOA-TH, BDE MATLAB
    [132] A commercially available computer workstation, Synapse 3D version 3.5 Medical
    [133] BMPA NSGA-III, MOPSO, MOMVO, MOEA-DD, MPA, NPSO, GSOBFO, KnEAE, FCM, PCNN, DLA, SSD, OAD-BSP, AFD Electrical Engineering
    [134] MPA-OBL LSHADE_SPACMA-OBL, CMA_ES_OBL, DE-OBL, HHO-OBL, SCA-OBL, SSA-OBL, MPA
    [135] Windows 7 (64-bit) Intel (R) Core i3-8130U 2.20 GHz, 4 GB RAM SPBO PSO, DA, SMA, MVO, GOA, HMRF, IMRF, CMRF MATLAB Medical
    [136] Windows 10 PC, Core i7 CPU, 8 GB RAM DASMA MAs, SSA, DE, IWOA, CLPSO, IGWO, CS, MFO MATLAB Medical
    [137] Windows 7 Pentium(R) Dual core T4500 @ 2.30 GHz, 2 GB RAM KHO BF, PSO, GA, MFO MATLAB
    [138] LSHADE JADE, SHADE, DE
    [139] Windows 10 64bit, Intel Core i7, 8 GB RAM IChOA GWO, MFO, WOA, SCA, SSA, EO, ChOA MATLAB Medical
    [140] 8th generation intel processor core i5-8250u, with a clock speed 1.60 GHz,8 GB internal memory HSA HAS_O, HAS_K, PSO, EMO_O, EMO_K MATLAB
    [141] Windows 8.1- 64bit Intel CoreI5 processor, 8 GB memory BWO GWO, MFO, WOA, SCA, SSA, EO, MATLAB
    [142] Intel(R) Core i7 PC 2.93 GHz CPU, 2 GB RAM ACS CS MATLAB Satellite images
    [143] Windows 10 Intel (R) Core i5-7200 CPU @ 2.50 GHz-2.71 GHz, 8 GB RAM OB-L-EO EO, SCA, HS, WOA, MFO, SSO, FASSO, FA MATLAB
    [144] Windows 7 Intel Core i5-3230M CPU @ 2.6 GHz, 12 GB RAM CLWOA WOA, SOS, ABC, QPSO, BAT MATLAB
    [145] Windows 7 Intel Core i5-3230 M CPU@2.6 GHz, 12 GB RAM HAQPSO PSO, QPSO,AQPSO MATLAB
    [146] i7-8750H 2.2 GHz processor, 8 GB RAM WOA BA, FPA, MFO, MSA, PSO, WWO MATLAB Underwater segmentation
    [147] Windows Server 2008 R2, Intel (R) Xeon(R) CPU E5-2660 v3 (2.60 GHz), 16 GB RAM ASMA CMFO, IWOA, OBLGWO, RCBA, ALCPSO, CWOA, GWO, SMA, WOA, DE, BA, ABC, SSA, CS, BLPSO, SCA, IGWO, PSO, LGCMFO, SCADE, CEBA MATLAB Medical
    [148] Intel core i7-3770 @ 3.40 GHz CPU, 8 GB RAM CSA WDO, BFO, BDE, ABC MATLAB
    [149] Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60 GHz, 16 GB RAM MDE DE, SMA, MVO, CS, HHO, CGPSO, IGWO, CLPSO, mSCA, MGSMA, CESCA, CMFO, CWOA MATLAB Medical
    [150] Intel(R) Xeon(R) Gold 6230R CPU @ 2.10 GHz, 64.0 GB RAM RAV-WOA ALCPSO, MFO, GWO, WOA GA, PSO, GWO, SSA, WOA, LSSA MATLAB
    [151] Windows Server 2008R2, Intel(R) Xeon(R) CPUE5-2660v3 @2.60 GHz, 16 GB RAM CCMVO MVO, DE, SCA, HHO, CBA, SCADE, IGWO, ACWOA, ASCA_PSO, WOA, SCA, HHO, BLPSO, IGWO, IWOA MATLAB Medical
    [152] Intel(R) Core i5-6500 CPU@ 3.20 GHz, 8 GB RAM Watermarking approaches
    [153] Intel(R) i3 processor 1.8 GHz clock speed, 4 GB RAM CS IDE, MMFO, Modified Bat, cuckoo search MATLAB Medical
    [154] Intel(R) Core i7-4700MQ CPU 2.40 GHz, 32 GB RAM HWOA SCA, WOA, MSSA, IMPA, CS, CSMC, EO MATLAB
    [155] BMO GWO, EMO, HSO
    [156] Intel i5-9300H CPU @2.4 GHz, 8 GB RAM Fusion based technique + BAT algorithm PSO, FF,BBO, BAT, FPA, GWO MATLAB Medical
    [157] HBA, CBOA
    [158] HPSO MASI-ENG-MSA
    [159] OSAFEM-PLDD HCF_QSVM, ACNN, CNN_LVQ, HCF_SVM, VGG_16 CNN, INCEPTION V3 Plant Leaf Disease Diagn
    [160] Ryzen 5 processor,8 GB RAM TsNMRO SHADE, OB-L-EO, SOGW, IWOA, DADE, JADE, NMRA, TSO, MA-ES, GWO, HCSO, BPSO, HIWOA, MBA, IFAGA, PDO MATLAB
    [161] Windows 10, AMD Ryzen 9 3900 12-Core 3.09 GHz, RAM 32 GB CRWOA MATLAB
    [162] Intel(R) Xeon(R) CPU E5-2620 v4@2.10 GHz, 16 GB RAM HLDDE DE, HHO, SCA, mSCA, IGWO, IWOA, SCADE, BWOA, CEBA, BA, ACOR, TSA, MVO, MFO, GBO, HBO, PO MATLAB Medical
    [163] Windows 7 Enterprise 64-bit PC, Intel Core i7-4510U, 2.6 GHz CPU, 8.00 GB RAM AVOA FFA, SMO, WOA, MPA MATLAB
    [164] FOL-AOA AOA, RSA, ChOA, SCA, TLBO

Objective Functions (Table 7):

  • Otsu's method (between-class variance, listed as "Otsu's entropy" in the paper's table) and Kapur's entropy are by far the most widely used objective functions, appearing in 44 and 34 papers respectively. This indicates their proven effectiveness and popularity in MLT.

  • Tsallis Entropy, Cross Entropy, Renyi's Entropy, Fuzzy Entropy, and Shannon's Entropy are used less frequently, suggesting potential areas for further exploration.

    The following are the results from Table 7 of the original paper, detailing objective functions:

    | Objective Function | Papers |
    |---|---|
    | Kapur's entropy | [87, 9, 101, 102, 107, 109, 111, 115, 116, 119–121, 126, 128, 130, 133, 134, 137–139, 141, 147, 19, 15, 153, 155–157, 159, 161, 162] |
    | Otsu's entropy | [8, 87, 90–93, 95, 98–100, 102–111, 113–116, 120, 121, 123–127, 129, 134, 137–143, 150, 152, 155, 156, 161] |
    | Tsallis entropy | [88, 89, 92, 109, 117, 120, 143, 153, 156] |
    | Shannon's entropy | [135, 153] |
    | Renyi's entropy | [89, 97, 109, 137, 145] |
    | Cross entropy | [87, 88, 94, 112, 118, 148, 164] |
    | Fuzzy entropy | [109, 122, 123] |
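
To illustrate how the two dominant objective functions score a candidate set of thresholds, here is a small sketch that evaluates Otsu's between-class variance and Kapur's entropy on a gray-level histogram. It is an illustrative implementation under stated assumptions, not code from the paper: the function names are hypothetical, and a threshold $t$ is assumed to place gray levels below $t$ in the lower class.

```python
import math


def class_ranges(n_levels, thresholds):
    """Split gray levels [0, n_levels) into classes at the sorted thresholds."""
    edges = [0] + sorted(thresholds) + [n_levels]
    return list(zip(edges[:-1], edges[1:]))


def otsu_between_class_variance(hist, thresholds):
    """Sum over classes of w_k * (mu_k - mu_total)^2 (to be maximized)."""
    total = float(sum(hist))
    probs = [h / total for h in hist]
    mu_total = sum(i * p for i, p in enumerate(probs))
    score = 0.0
    for start, stop in class_ranges(len(hist), thresholds):
        weight = sum(probs[start:stop])
        if weight > 0.0:
            mean = sum(i * probs[i] for i in range(start, stop)) / weight
            score += weight * (mean - mu_total) ** 2
    return score


def kapur_entropy(hist, thresholds):
    """Sum of the Shannon entropies of the per-class distributions (to be maximized)."""
    total = float(sum(hist))
    probs = [h / total for h in hist]
    score = 0.0
    for start, stop in class_ranges(len(hist), thresholds):
        weight = sum(probs[start:stop])
        if weight > 0.0:
            for i in range(start, stop):
                if probs[i] > 0.0:
                    q = probs[i] / weight  # probability within the class
                    score -= q * math.log(q)
    return score


# Toy 8-level histogram with two well-separated groups of gray levels.
hist = [10, 10, 0, 0, 0, 0, 10, 10]
print(otsu_between_class_variance(hist, [4]))  # 9.0
print(otsu_between_class_variance(hist, [1]))  # about 4.08, a worse split
print(kapur_entropy(hist, [4]))                # 2 * ln(2), about 1.386
```

In a meta-heuristic MLT method, one of these functions serves as the fitness that the optimizer tries to maximize over the threshold vector; the cost of each evaluation grows with the number of gray levels, which is one reason execution time rises with the threshold count.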

Advantages & Disadvantages (Table 8 & Figures 11, 12):

  • Advantages (Figure 11): The key strengths of MLT methods include better performance in terms of metrics (e.g., PSNR, SSIM), better objective function scores, high convergence speed, stable segmentation, optimal thresholding, balanced exploration & exploitation (for meta-heuristics), enhanced search capacity, and better segmentation quality. Color-Image-Thresholding and Gray-Scale-Thresholding capabilities are also noted.

  • Disadvantages (Figure 12): Common limitations include static thresholding settings (lack of automatic threshold level determination), inadequate evaluation (limited comparisons), time complexity (especially for high threshold levels), limited data points (small datasets), local optima trap (for meta-heuristics), noise sensitivity, inefficient advanced segmentation (for complex images), feature constraints, multi-objective limitation, and limited to grayscale testing.

    The following figure (Figure 11 from the original paper) summarizes the advantages over reviewed papers:

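The following figure (Figure 12 from the original paper) summarizes the disadvantages over reviewed papers:

Fig. 12 Summarized disadvantages over reviewed papers. The diagram summarizes the strengths and weaknesses of multi-level thresholding segmentation methods; the listed disadvantages include inefficient advanced segmentation, inadequate comparison, unknown time complexity, and noise sensitivity, and are intended to guide future research directions.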

Execution Time Analysis (Table 10):

  • Execution times varied significantly, from milliseconds to over 100 seconds, depending on the algorithm, number of thresholds, image complexity, and computational environment.

  • Paper [91] noted that Kapur thresholding (34.2 ms for min threshold 2) was slightly more computationally expensive than Otsu thresholding (28.3 ms for min threshold 2) for their specific images and setup, though this varies across studies.

    The following are the results from Table 10 of the original paper, detailing execution time achieved by reviewed papers:

    Paper (Min- Max) Threshold Average CPU Time (Min-Max) Reference Dataset
    [86] Min Max NR NR Four stomach CT images
    [87] Min 4 7 NR Ten standard test color images
    [88] Max Min 3 MCE: 5.4487 and Tsallis entropy: 15.4447 (seconds) Ten different chaotic maps
    Max 5 MCE: 7.995 and Tsallis entropy: 17.8756 (seconds)
    [89] Min Max 2 5 ≤ 5(s) Twenty images from
    [90] Min 2 WOA: 3.74 and MFO: 3.57 (seconds) Eight grayscale images from BSD
    Max 5 WOA: 4.78 and MFO: 5.60 (seconds)
    [91] Min Max 2 5 Kapur: 34.2 and Otsu: 28.3 (milliseconds) Five images from USC-SIPI database and Three images from BSD500
    Kapur: 147.3 and Otsu: 106.8 (milliseconds)
    [92] Min NR NR NR
    [93] Max Min NR NR Two sample images
    Max
    [94] Min 2 1.06 (seconds) Three sample images from BSD dataset
    [95] Max Min 4 1.14 in (seconds) Two sample images (Lena, Port)
    Max 2 7.779 (NR)
    [96] 7 11.992 (NR) Four sample images
    [97] Min 7 6261 (seconds) sample images from BSD300
    Max 10 6527.5 (seconds)
    [98] Min NR NR Twenty sample images
    Max
    [99] Min 2 < 1.23 (seconds) Two sample images from BSD300
    [100] Max 4 NR Three sample images
    Min NR
    Max
    [101] Min 2 NR Six images from BSD and Six medical images of eyes, liver, head and tongue
    Max 5
    [102] Min 2 Otsu: 3.6812 and Kapur: 4.2291 (seconds) T2-weighted MRI brain images
    Max 5 Otsu: 6.1672 and Kapur: 7.7805 (seconds)
    [103] Min 2 NR CASIA v3 interval and UBIRIS and MMU1
    Max 3
    [104] Min 6 3.16 (seconds) Twelve dental radiographs images
    Max
    [105] Min NR NR Set of coins, Cameraman, Circles with different colors an
    Max Soil sample
    [106] Min 5.1941 (NR) Three Mammogram images
    2
    Max 8 8.5295 (NR)
    [107] Min 2 0.1779 (seconds) Four sample images from BSD500
    Max 5 0.4053 (seconds)
    [108] Min 2 0.37 (seconds) Six images from BSD300
    Max 30 0.57 (seconds)
    [109] Min 2 Around 250 (seconds) Six grayscale images from BSD
    Max 20
    [110] Min 4 1.3090 (seconds) MRI brain images from ABIDE
    Max 6 1.3339 (seconds)
    [111] Min 2 32.66 (NR) Eleven grayscale images from BSD
    Max 20 105.30 (NR)
    [112] Min 4 CC: 2.035 and MLO: 12.129 (seconds) Digital Database for Screening Mammography (DDSM)
    Max 12 CC: 6.338 and MLO: 97.847 (seconds) 2,500 studies
    [113] Min 2 NR Eight grayscale images from BSD
    Max 5
    [114] Min 2 8.5238 (seconds) DICOM CT images
    Max 5
    [115] Min 4 NR Five images from Berkeley (BSD) and five satellite images
    Max 12
    [116] Min 4 Otsu: 1.1693 and Kapur: 1.5409 (seconds) Six color images taken from USC-SIPI and Berkeley segmentation dataset (BSDS500) and Four satellite images
    Max 10 Otsu: 1.3529 and Kapur: 2.5405 (seconds)
    [117] Min 4 NR Eight color test images from BSD300 and plant stomata images
    Max 12
    [118] Min 2 8.6888 (seconds) Ten images from BSD and 2 weighted brain magnetic resonance images
    Max 5 10.960 (seconds)
    [119] Min 4 2.237 (NR) BSD dataset, Satellite images and plant canopy images
    Max 12
    [120] Min 3 NR Ten images from Berkeley (BSD)
    Max 6
    [121] Min 2 NR Twelve Berkeley images (BSD) and 256 grey levels
    Max 5
    [122] Min 6 NR Ten images and CheX aka, OpenI, Google, PC aka Pad- Chest, NIH aka Chest X-ray14, and MIMIC-CXR
    Max 25
    [123] Min 2 1.9 (seconds) Eight images from the Art Explosion database and eleven images from BSD
    Max 5 4.0 (seconds)
    [124] Min 2 NR Eight grayscale images from USC-SIPI
    Max 5 NR
    [125] Min NR SCI image database taken from Orange image diagnostic centre
    Max
    [126] Min 4 Kapur: 0.273 and Otsu: 0.248 (seconds) Eight sample images from BSD dataset
    Max 5 Kapur: 0.3 and Otsu: 0.264 (seconds)
    [127] Min 6 0.253 (seconds) Twelve sample images from BSD
    Max 30 0.402 (seconds)
    [128] Min 2 2.53.5 (seconds) Six original test images
    Max 6 3.0661 (seconds)
    [129] Min NR NR Kaggle brain MRI dataset
    Max
    [130] Min 8 NR Ten satellite images from www.earthobservatory.nasa.gov
    Max 10
    [131] Min 4 2.941 (seconds) Two sets of twelve color images are selected from BSD an NASA landsat image
    Max 16 4.209 (seconds)
    [132] Min NR NR COVID-19 CT images
    Max
    [133] Min 4 36.83 (seconds) 201 insulator infrared images
    Max 20 47.01 (seconds)
    [134] Min 2 Otsu: 0.6385 and Kapur: 1.0423 (NR) Samples from BSD dataset
    Max 5 Otsu: 1.6707 and Kapur: 2.4884 (NR)
    [135] Min NR 2.4959 (seconds) 300 Sagittal T2-Weighted DCE-MRI
    Max 2D slices
    [136] Min 5 NR BSD and medical images of COPD
    Max 8
    [137] Min 2 Kapur: 2.0507 and Otsu: 2.0408 (seconds) Six images from BSD300
    Max 5 Kapur: 2.1022 and Otsu 2.1005 (seconds)
    [138] Min 2 NR Six sample images
    Max 5
    [139] Min 2 NR Thermography images (DMR-IR)
    Max 5
    [140] Min 2 9.440 (seconds) Eight sample images
    Max 5 18.494 (seconds)
    [141] Min 2 NR Ten sample images
    Max 5
    [142] Min 5 Otsu: 7.1835 and Tsallis: 5.6704 (NR) Three satellite images
    Max 11 Otsu: 10.8379 and Tsallis: 8.1308 (NR)
    [143] Min 2 NR Eight grayscale images from BSD
    Max 5
    [144] Min 2 NR Two images from BSD300 and four color images from Zigong dinosaur lantern festival
    Max 5
    [145] Min 2 NR Three grayscale images from USC-SIPI and a sport grayscale image
    Max 5 2.3824 (seconds) Fourteen test images selected from the experimental pool
    [146] Min Max 4 37.540 (seconds) of Harbin Engineering University
    [147] Min 8 NR BSDS500
    Max 2
    [148] 20 Twenty complex Background crop images
    Min 2 9.2229 (seconds) BIDC images
    Max 16 11.0381 (seconds)
    [149] Min 2 NR
    Max 20 4 grayscale images
    [150] Min 2 0.1427 (seconds)
    Max 10 0.5333 (seconds) COVID-19 dataset
    [151] Min 2 NR
    Max 20
    [152] Min NR NR TCIA dataset
    [153] Max 5000 biomedical images and 250 standard test images
    Min 3 6.1698 (seconds)
    Max 9 12.1628 (seconds)
    [154] Min 3 NR Ten images from Berkeley dataset
    Max 70
    [155] Min 2 NR Three sample images
    Max 5
    [156] Min NR NR Ten images from brain tumor datasets
    [157] Min 4 NR Nine standard benchmark images
    Max 7
    [158] Min Max 3 10 80.64 (NR) Sample images from Kodim and Berkeley datasets
    84.35 (NR)
    [159] Min NR NR Plant leaf disease dataset
    Max
    [160] Min 3 NR Ten benchmark images with diverse features and complexities
    Max 7
    [161] Min 2 Otsu: 0.2 and Kapur: 0.3 (seconds) Berkeley segmentation dataset
    Max 8 Otsu: 0.7 and Kapur: 0.8 (seconds)
    [162] Min 2 NR Seven 512×512 pixels IDC images obtained by hematoxylin -eosin staining
    Max 6
    [163] Min 5 6.0653 (NR) Landsat Imagery Courtesy of NASA Goddard Space Flight
    Max 11 8.9399 (NR) Center and the U.S. Geological Survey dataset
    [164] Min 4 NR Six grayscale images from BSD
    Max 16

6.2. Data Presentation (Tables)

The paper uses numerous tables to summarize information from the reviewed articles.

The following are the results from Table 1 of the original paper, providing symbols and related definitions:

Symbol Definition
ABC Artificial Bee Colony
ABCSCA Artificial Bee Colony with Sine-Cosine Algorithm
ABF Adaptive Bacterial Foraging
ACO Ant Colony Optimization
ACS Adaptive Cuckoo Search
ALDE Adaptive Differential Evolution with Levy Distribution
ALO Antlion Optimization Algorithm
AS Association Strategy
ASMA An Improved Slime Mould Algorithm
AVOA African Vultures Optimization Algorithm
AWDO Adaptive Wind Driven Optimization
BA Bat Algorithms
BGS Boltzmann-Gibbs
BLT Bi-Level Thresholding
BSD Berkeley Segmentation Dataset
ChOA Chimp Optimization Algorithm
CMRF Conventional Markov Random Field
CNN Convolutional Neural Network
CP Contrast Pattern-based classification
CPS Cross-Point Search
CRWOA Whale Optimization Algorithm with Combined mutation and Removing similarity
CSA Cuckoo Search Algorithm
CSO Chicken Swarm Optimization
DC-MRI Dynamic Contrast-Enhanced Magnetic Resonance Imaging
DE Differential Evolution
DM Diffusion Mechanism
DS Differential Search
EAs Evolutionary Algorithms
EM Expectation Maximization
EMA Exchange Market Algorithm
EO Equilibrium Optimizer
EPO Emperor Penguin Optimization
FASSO Fuzzy Adaptive Swallow Swarm Optimization
FCNs Fully Convolutional Networks
FF Fuzzy Filtering
FODPSO Fractional-Order Darwinian Particle Swarm Optimization
FPA Flower Pollination Algorithm
FSIM Feature Similarity Index Method
GA Genetic Algorithm
GAC Geodesic Active Contour
GANs Generative Adversarial Networks
GGOs Ground-Glass Opacities
GM Gradient Magnitude
GOA Grasshopper Optimization Algorithm
GWO Gray Wolf Optimization
HB Human-Based
HHO Harris Hawks Optimization
HMRF Hidden Markov Random Field
HVS Human Visual System
IIS Interactive Image Segmentation
IChOA Lévy Flight Chimp Optimizer
JPEG Joint Photographic Experts Group
KnEA Knee Evolutionary Algorithm
LF Levy Flight
MAs Meta-Heuristic Algorithms
MCE Minimum Cross Entropy
MFA Moth Flame Algorithm
MLTIS Multi-Level Image Segmentation Algorithm
MPA Marine Predator Algorithm
MPS Minimum Point Search
MSE Mean Square Error
MSSIM Mean Structural Similarity Index Method
MVO Multi-Verse Optimizer
OBL Opposition-based Learning
OSAFEM-PLDD Optimal Segmentation with Alexnet Based Feature Extraction for Plant Leaf Disease Diagnosis
PFs Pareto Fronts
PRI Probabilistic Rand Index
PSNR Peak Signal to Noise Ratio
QRG Quantum Rotation Gate
SA Simulated Annealing
SADFO Self-Adaptive Dragonfly Optimization
SCI Spinal Cord Injury
SHADE Adaptive Differential Evolution based on Success History
SS Swarm Selection
SSO Social Spider Optimization
VMD Variational Mode Decomposition

The following are the results from Table 2 of the original paper, detailing research questions:

| No | Research Question | Motivation | Address |
|---|---|---|---|
| 1 | What are the advantages of multi-level thresholding over other thresholding methods? | Various advantages of multi-level thresholding are discussed in the response to this inquiry | Section 1 |
| 2 | In the current study, how were the data collected? | In order to evaluate the reliability and validity of the research findings, it is essential to gain insight into the research process and the methods used to collect data | Section 1.3 |
| 3 | Which thresholding approaches are most commonly used and how are they implemented? | Researchers can gain a better understanding of segmentation objective functions by answering this question | Section 2.4 and 4.3 |
| 4 | Which areas of image segmentation have been the focus of previous surveys? | Identify recurring topics in image segmentation, such as segmentation algorithms, applications, challenges, and evaluation methods, to understand the main research areas | Section 2.5 and 3 |
| 5 | How can meta-heuristics assist in thresholding image segmentation? | Meta-heuristic algorithms play an important role in multi-level thresholding segmentation using the presented framework | Section 2.6 |
| 6 | How is thresholding segmentation commonly used? | Many aspects of multi-level thresholding segmentation are better understood by using this method | Section 2.7 |
| 7 | Which operating environment and programming language was used to implement each method? | The purpose of this question is to gain an understanding of how each study is implemented and what programming language is used | Section 4.1 |
| 8 | How can the efficiency of multi-level thresholding image segmentation be measured, and what metrics are used for this purpose? | Researchers evaluate segmentation techniques for thresholding based on PSNR, SSIM, FSIM, CPU time, among others | Section 4.2 |
| 9 | Are there any advantages or disadvantages to different thresholding segmentation approaches? | It is evident that each method has advantages and limitations | Section 4.4 |
| 10 | Which datasets are used most often over the proposed methods? | To evaluate and validate a technique, researchers use benchmark datasets in a particular field or domain | Section 4.5 |
| 11 | Which algorithm performed best based on execution time? | To better understand the power of the algorithm, the performance of the methods is reported based on the execution time compared to a threshold level | Section 4.6 |
| 12 | What are the current challenges in thresholding segmentation? | This question will lead to future research conducted by researchers | Section 5 |

The following are the results from Table 3 of the original paper, summarizing online dataset sources:

| Publication Name | URL | Articles Received |
|---|---|---|
| Science Direct | https://www.sciencedirect.com/ | 43 |
| IEEE Xplore | http://ieeexplore.ieee.org/ | 34 |
| Springer | http://www.springer.com/ | 2 |
| Total Articles | | 79 |

The following are the results from Table 8 of the original paper, detailing positive aspects and limitations:

Paper Positive Aspects Limitations
[86] High convergence speed Unknown Time Complexity Limited Data Points (4 medical gray-scale images)
[87] Improved exploiting contextual information High convergence speed Better performance in terms of metrics Better Objective Function Scores Limited Data Points (10 images) Unknown Time Complexity Static thresholding level Time complexity
[88] High convergence speed Better segmentation quality Static thresholding level Limited Data Points
[89] Tested on images with different type of noise effects (Gaussian, Salt & Pepper, Rician, Shadow, Reflection) Optimal Thresholding High accuracy Inadequate comparison Limited Data Points (20 images) Static thresholding level Time complexity
[90] Balanced Exploration & Exploitation Better performance in terms of metrics Time complexity Static thresholding level Limited to Grayscale Testing Noise Sensitivity
[91] Simple Implementation Optimized time complexity Better performance in terms of metrics Better segmentation quality Feature Constraints Unknown Time Complexity Inadequate Comparison Static thresholding level Time complexity
[92] Stable Segmentation Static thresholding level
[93] Optimized time complexity Static thresholding level
[94] Better performance in terms of metrics High convergence speed Better Objective Function Scores Inadequate Comparison Static thresholding level Time complexity
[95] Better Objective Function Scores Inadequate Comparison Static thresholding level Time complexity
[96] Rich dataset (300 images) Better performance in terms of metrics Time complexity Static thresholding level
[97] Optimized time complexity High efficiency on grayscale and color images Limited Data Points (20 images) Inadequate comparison Static thresholding level Only one objective function is used to show the algorithm
[98] High accuracy High convergence speed Balanced Exploration & Exploitation performance Unknown Time Complexity Limited Data Points (31 images)
[99] Better performance in terms of metrics Limited Data Points (3 images) Inadequate comparison Static thresholding level
[100] Stable Segmentation Better performance in terms of metrics Better Objective Function Scores Unknown Time Complexity Static thresholding level Inadequate Comparison Static thresholding level Parameter Sensitivity Time complexity Unknown Time Complexity Static thresholding level
[101] Optimal Thresholding Limited Data Points (4 images)
[102] Better performance in terms of metrics Thresholding levels are set statically
[103] High accuracy Parameter Sensitivity Static thresholding level Local Optima Trap Time complexity Static thresholding level Limited to Grayscale Testing
[104] Better segmentation quality Static thresholding level Limited Data Points (23 images) Unable to dealing with low quality images Static thresholding level Limited Data Points (8 images) Static thresholding level Noise Sensitivity Inefficient Advanced Segmentation (such as medical images containing intensity inhomogeneity) Time complexity Time complexity Parameter Sensitivity Inadequate Evaluation Unknown Time Complexity Static thresholding level Inadequate Evaluation Static thresholding level Limited to Grayscale Testing Static thresholding level Limited Data Points (6 images) Inadequate comparison Limited Data Points (6 images) Static thresholding level Unknown Time Complexity Noise Sensitivity Limited Data Points
[105] Optimal Thresholding
[106] Better performance in terms of metrics
[107] Stable Segmentation Better segmentation quality
[108] Better performance in terms of metrics
[109] Better performance in terms of metrics Superiority in Pareto front approximation
[110] Better performance in terms of metrics Better Objective Function Scores
[111] Optimized time complexity Better performance in terms of metrics
[112] Ability to adjust region of interest of tumor automatically
[113] Better segmentation quality
[114] High accuracy Optimized time complexity Better performance in terms of metrics
[115] Stable Segmentation
[116] High accuracy High convergence speed Stable Segmentation
[117] Better performance in terms of metrics
[118] Increased search capacity Optimized time complexity
[119] High efficiency on grayscale and color images Optimized time complexity Higher accuracy
[120] High convergence speed Better Objective Function Scores
[121] Better Objective Function Scores Best functionality in case of high-dimensionality
[122] Better performance in terms of metrics Increased search capacity
[123] Better performance in terms of metrics Sufficient Compared Method
[124] High convergence speed Better performance in terms of metrics
[125] Higher accuracy, TP rate and precision Better performance in terms of metrics Rich dataset (over 500 images)
[126] Stable Segmentation High accuracy
[127] Optimized time complexity Better performance in terms of metrics
[128] Better segmentation quality High convergence speed High accuracy
[129] High accuracy
[130] Better performance in terms of metrics
[131] High accuracy High convergence speed
[132] Reliable Disease Assessment Strong association with the clinical seriousness of the illness Missing Longitudinal Data
[133] Better performance in terms of metrics Strong robustness High fault diagnosis accuracy Time complexity Inadequate comparison
[134] Simple Implementation Better segmentation quality Balanced Exploration & Exploitation High convergence speed High accuracy Inefficient Advanced Segmentation Time complexity Noise Sensitivity
[135] High convergence speed Optimized time complexity Optimal Thresholding Increased search capacity The optimal number of thresholds is indicated statically
[136] Better performance in terms of metrics Avoids Local optima Limited Data Points (9 for first stage and 6 for medical stage) Unknown Time Complexity
[137] Optimized time complexity Better Objective Function Scores Limited Data Points (6 images) Inadequate comparison
[138] High accuracy Better Objective Function Scores Limited Data Points (6 images) Static thresholding level Unknown Time Complexity
[139] High accuracy High convergence speed Better segmentation quality Balanced Exploration & Exploitation Optimal Thresholding High efficiency on grayscale and color images Avoids Local optima Simple Implementation Sufficient Compared Method Noise Sensitivity Limited Data Points
[140] Better performance in terms of metrics Limited to Grayscale Testing Static thresholding level
[141] Better Objective Function Scores Inadequate Evaluation Static thresholding level Limited to Grayscale Testing
[142] Optimized time complexity Sufficient Compared Method Better performance in terms of metrics Static thresholding level Inadequate Evaluation
[143] Better performance in terms of metrics Inadequate Evaluation Static thresholding level
[144] High accuracy Stable Segmentation Optimization ability Time complexity Static thresholding level
[145] Increased search capacity Anti-noise performance Static thresholding level Unknown Time Complexity
[146] High accuracy Better performance in terms of metrics High convergence speed Limited to Grayscale Testing Static thresholding level
[147] Better segmentation quality Prevents Overfitting Time complexity Limited Data Points (8 images)
[148] Efficient algorithm for MLTS Optimized time complexity High accuracy Better segmentation quality Robustness for image illumination and complex backgrounds Inadequate comparison Limited Data Points (20 images)
[149] High convergence speed Avoids Local optima Better segmentation quality Time complexity Local Optima Trap
[150] High efficiency on grayscale and color images High accuracy High convergence speed Multi-Objective Limitation Limited Data Points (8 images) Time complexity
[151] Increased search capacity Stagnation mitigation Avoids Local optima Time complexity Limited Data Points (10 images)
[152] Strong robustness Limited Data Points (4 images)
[153] Rich dataset Optimal Thresholding Static thresholding level
[154] Better performance in terms of metrics Static thresholding level
[155] Optimization ability Better segmentation quality Inadequate Comparison No Real-World Testing
[156] Optimized time complexity Better segmentation quality Inadequate Comparison No Real-World Testing
[157] Better Objective Function Scores Better segmentation quality Static thresholding level Inadequate comparison
[158] Optimized time complexity Better segmentation quality Inadequate comparison Static thresholding level
[159] High accuracy Unknown Time Complexity Inadequate comparison
[160] Better performance in terms of metrics High convergence speed Local Optima Trap Inadequate comparison
[161] Balanced Exploration & Exploitation Population diversity and balance Improving the possibility of more excellent individuals within the population Limited to Grayscale Testing Limited Data Points (10 images)
[162] Better performance in terms of metrics High convergence speed Prevents Overfitting Time complexity Incomplete Threshold Analysis
[163] Balanced Exploration & Exploitation Static thresholding level Time complexity
[164] Better performance in terms of metrics Time complexity Static thresholding level

6.3. Ablation Studies / Parameter Analysis

As a survey paper, this article does not present its own ablation studies or detailed parameter analysis. Instead, it synthesizes the findings from the reviewed literature regarding common strengths and weaknesses. The recurring limitations highlighted in Table 8 and Figure 12, such as parameter sensitivity and the prevalence of static thresholding levels, indirectly reflect the need for more robust ablation studies and parameter analysis in individual research papers to develop adaptive and automatically tuned MLT algorithms. The identification of "Manual Parameter Tuning" and "Determination of Optimal Number of Clusters" as open issues in Section 5 further underscores that these aspects are critical areas for future research rather than well-addressed areas in the current literature.

7. Conclusion & Reflections

7.1. Conclusion Summary

This comprehensive review provides a thorough understanding of multi-level thresholding segmentation, from its foundational concepts to its advanced applications and current limitations. It systematically covers the background of image segmentation, the main multi-level thresholding objective functions (Otsu's method, Kapur's entropy, Tsallis entropy, fuzzy entropy, minimum cross-entropy, and Rényi entropy), and the crucial role of meta-heuristic algorithms in optimizing threshold selection.
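To make the idea of an objective function concrete, the following is a minimal sketch of Otsu's between-class variance for several thresholds, maximized by brute force. It is not code from the survey or from any reviewed paper: the function names, the convention that class k covers the grey levels from one threshold up to (but not including) the next, and the 12-level toy histogram are all assumptions made for illustration.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Otsu's criterion: sum over classes of w_k * (mu_k - mu_T) ** 2."""
    total = sum(hist)
    mean_total = sum(i * h for i, h in enumerate(hist)) / total
    # Assumed convention: class k covers grey levels [edges[k], edges[k + 1]).
    edges = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mass = sum(hist[lo:hi])
        if mass == 0:
            continue  # an empty class contributes nothing
        mean = sum(i * hist[i] for i in range(lo, hi)) / mass
        variance += (mass / total) * (mean - mean_total) ** 2
    return variance


def exhaustive_otsu(hist, n_thresholds):
    """Try every threshold combination and keep the best-scoring one."""
    candidates = combinations(range(1, len(hist)), n_thresholds)
    return max(candidates, key=lambda t: between_class_variance(hist, t))


# Made-up 12-level histogram with three separated modes (illustration only).
toy_hist = [5, 9, 5, 0, 0, 6, 10, 6, 0, 0, 4, 8]
best = exhaustive_otsu(toy_hist, 2)
print(best, between_class_variance(toy_hist, best))
```

On this toy histogram the selected thresholds fall in the empty gaps between the modes. Brute force is only practical here because the histogram has 12 levels and two thresholds.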

The review highlights that multi-level thresholding offers significant advantages over simpler thresholding methods, including enhanced flexibility, adaptability to complex intensity distributions, and the ability to capture finer image details. Meta-heuristic algorithms are shown to be instrumental in reducing computational costs and improving the accuracy of threshold determination. By analyzing datasets, evaluation metrics, programming languages, and the advantages/disadvantages of different techniques, the paper offers valuable insights into the current state of the field.
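The computational argument can be illustrated with a rough sketch. Choosing 5 thresholds among 255 candidate grey levels already gives about 8.6 billion combinations for an exhaustive search, whereas a stochastic search evaluates only a fixed budget of candidates. The code below is a plain random hill climber used as a stand-in; it is not any specific meta-heuristic from the surveyed papers, and the synthetic histogram, step size, iteration budget, and function names are assumptions. Like the methods reviewed, it may stop at a local optimum.

```python
import random
from math import comb


def otsu_score(hist, thresholds):
    """Between-class variance for the classes induced by the thresholds."""
    total = sum(hist)
    mean_total = sum(i * h for i, h in enumerate(hist)) / total
    edges = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mass = sum(hist[lo:hi])
        if mass:
            mean = sum(i * hist[i] for i in range(lo, hi)) / mass
            score += (mass / total) * (mean - mean_total) ** 2
    return score


def stochastic_search(hist, n_thresholds, iterations=2000, seed=0):
    """Random hill climbing: perturb one threshold, keep non-worsening moves."""
    rng = random.Random(seed)
    best = sorted(rng.sample(range(1, len(hist)), n_thresholds))
    best_score = otsu_score(hist, best)
    for _ in range(iterations):
        trial = list(best)
        j = rng.randrange(n_thresholds)
        # Small random step, clamped to the valid grey-level range.
        trial[j] = min(len(hist) - 1, max(1, trial[j] + rng.randint(-8, 8)))
        if len(set(trial)) < n_thresholds:
            continue  # skip candidates with duplicate thresholds
        score = otsu_score(hist, trial)
        if score >= best_score:  # accepting ties lets the search cross plateaus
            best, best_score = sorted(trial), score
    return best, best_score


# Synthetic 256-level histogram with three triangular modes (made-up data).
hist = [0] * 256
for center in (50, 128, 200):
    for offset in range(-10, 11):
        hist[center + offset] += 11 - abs(offset)

print("exhaustive candidates for 5 thresholds:", comb(255, 5))
thresholds, score = stochastic_search(hist, 2)
print(thresholds, score)
```

The fixed seed makes the run repeatable. In a real comparison the number of objective evaluations, rather than wall-clock time alone, is a fair cost measure across algorithms.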

7.2. Limitations & Future Work

The paper explicitly identifies several open issues and challenges that represent limitations in current multi-level thresholding research and suggest promising directions for future work:

  • Manual Parameter Tuning: A significant limitation is the reliance on manual tuning of optimization algorithm parameters. Future research should focus on automated methods for parameter tuning to enhance objectivity and efficiency.
  • Automatic Determination of Threshold Levels: The static setting of threshold levels is a common drawback. Future work should develop adaptive thresholding methods and statistical measures that can automatically determine the optimal number of thresholds.
  • Limited Testing with Diverse Datasets: Many algorithms are tested on specific or small datasets, limiting their generalizability. Future studies should employ more diverse and extensive datasets to ensure robustness and reliability.
  • Hybridization with Meta-heuristic Algorithms: Further hybridization of meta-heuristic algorithms (e.g., with opposition-based learning) is encouraged to improve segmentation results by leveraging the strengths of different techniques.
  • Comprehensive Performance Comparison: There is a need for more comprehensive comparisons against a wider range of state-of-the-art algorithms to accurately assess performance.
  • Sensitivity to Noise and Efficiency: Current algorithms can be sensitive to noise and inefficient for complex image segmentation tasks. Future research should integrate additional image features or advanced processing techniques to mitigate noise and improve efficiency.
  • Computational Complexity: Optimization is needed to reduce the computational complexity and processing time of algorithms, especially for real-time applications.
  • Extension to Other Image Processing Problems: Exploring the application of existing MLT algorithms to other image processing problems like image registration, denoising, and quality enhancement could expand their utility.
  • Determination of Optimal Number of Clusters: Automatically setting the optimal number of clusters is challenging, particularly for biomedical images with varying modalities and anatomical regions.
  • Handling Inhomogeneities and Complex Images: Developing robust techniques for handling inhomogeneous and complex images is crucial to ensure accurate segmentation.
  • Multi-Objective Optimization: Further application of multi-objective optimization can improve performance by simultaneously considering multiple objectives (e.g., Otsu's criterion, fuzzy entropy, or minimum cross-entropy).
  • Color Image Segmentation and Texture Analysis: Algorithms need to be enhanced to effectively handle color images and incorporate texture properties, which are vital for distinguishing regions.
  • Longitudinal Assessment: In medical imaging, there is a lack of longitudinal assessment of disease progression, which limits the evaluation of long-term effectiveness.
  • Interpretability of Results: Developing techniques to provide insights and explanations into how image processing algorithms make decisions is crucial for understanding and trusting their outputs, especially in critical applications like medical diagnosis.
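The hybridization item in the list above mentions opposition-based learning (OBL). Its core step is simple: for a candidate x in an interval [low, high], the opposite point is low + high - x, and the better of the two is kept. The sketch below applies this to the initial population of threshold vectors. It is an illustration under assumptions, not the procedure of any specific reviewed paper: the function names, the population size, and the toy objective (closeness to a made-up target) are hypothetical.

```python
import random


def opposite(thresholds, low, high):
    """OBL step: reflect each threshold about the midpoint of [low, high]."""
    return sorted(low + high - t for t in thresholds)


def obl_initial_population(objective, pop_size, n_thresholds, low, high, seed=0):
    """Draw random candidates and keep each one or its opposite, whichever scores higher."""
    rng = random.Random(seed)
    population = []
    for _ in range(pop_size):
        candidate = sorted(rng.sample(range(low, high + 1), n_thresholds))
        mirrored = opposite(candidate, low, high)
        population.append(max(candidate, mirrored, key=objective))
    return population


def toy_objective(thresholds):
    """Hypothetical objective: negative distance to a made-up target (200, 230)."""
    return -sum(abs(a - b) for a, b in zip(thresholds, (200, 230)))


population = obl_initial_population(toy_objective, pop_size=5, n_thresholds=2, low=1, high=255)
print(population)
```

In practice the toy objective would be replaced by a thresholding criterion such as between-class variance or an entropy measure, and the resulting population would seed the chosen meta-heuristic.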

7.3. Personal Insights & Critique

This survey is a valuable resource for anyone entering or working within the field of multi-level thresholding image segmentation. Its strength lies in its focused scope and the systematic organization of a large body of recent literature (2017-2023). The detailed breakdown of objective functions and their underlying mathematical principles, along with the comprehensive lists of datasets, evaluation metrics, and meta-heuristic algorithms, provides a solid foundation for understanding the domain. The explicit identification of open issues serves as a clear roadmap for aspiring researchers.

However, a critical perspective might note a few points:

  • Reproducibility Challenge: While the survey highlights the importance of setup environment details (Table 6), many reviewed papers still lack this information (NR in the table). This indicates a broader issue in the field regarding reproducibility, which the survey effectively points out without being able to solve.

  • MATLAB Dominance: The heavy reliance on MATLAB (Figure 9) in the reviewed literature suggests a potential bias or limitation. While MATLAB is excellent for prototyping, Python with its open-source libraries (e.g., OpenCV, scikit-image, TensorFlow, PyTorch) has become increasingly prevalent for deployment and large-scale applications due to its flexibility and community support. Future research might shift towards Python-based implementations.

  • Corrupted Fuzzy Entropy Formula: Equation 23 (fuzzy entropy) is severely corrupted in the original paper, which is a notable flaw. It illustrates the risk of relying solely on published formulas and the need for careful proofreading and clarity in academic writing.

  • Meta-heuristic vs. Deep Learning Balance: The survey primarily focuses on meta-heuristic optimization for thresholding. While it acknowledges deep learning in the Related Works section, it doesn't deeply integrate how MLT might combine with deep learning approaches, beyond using MLT as a preprocessing step. Future research might explore hybrid models that blend the strengths of meta-heuristics for optimal threshold selection with deep learning for feature extraction or semantic understanding in a more integrated manner.

  • Practical Applicability: The recurring limitations like time complexity and static thresholding settings indicate that while many methods achieve high performance on metrics, their practical deployment in real-time or dynamic scenarios remains challenging. The call for automatic determination of threshold levels and handling complex images is crucial for translating research into real-world impact.

Overall, this paper provides a valuable, structured overview that condenses a fragmented body of knowledge, making it significantly easier for newcomers to grasp the complexities and current frontiers of multi-level thresholding segmentation. Its direct critique of existing literature, especially on limitations and future work, is a strong contribution to guiding the field forward.
