A Comprehensive Survey of Multi‑Level Thresholding Segmentation Methods for Image Processing
TL;DR Summary
The paper reviews multi-level thresholding methods in image processing, focusing on capturing image complexity through multi-range intensity partitioning. It discusses metaheuristic algorithms for optimizing threshold values and outlines advantages, limitations, and future resear
Abstract
In image processing, multi-level thresholding is a sophisticated technique used to delineate regions of interest in images by identifying intensity levels that differentiate different structures or objects. Multi-range intensity partitioning captures the complexity and variability of an image. The aim of metaheuristic algorithms is to find threshold values that maximize intra-class differences and minimize inter-class differences. Various approaches and algorithms are reviewed and their advantages, limitations, and challenges are discussed in this paper. In addition, the review identifies future research areas such as handling complex images and inhomogeneous data, determining thresholding levels automatically, and addressing algorithm interpretation. The comprehensive review provides insights for future advancements in multilevel thresholding techniques that can be used by researchers in the field of image processing.
1. Bibliographic Information
1.1. Title
A Comprehensive Survey of Multi-Level Thresholding Segmentation Methods for Image Processing
1.2. Authors
- Mohammad Amiriebrahimabadi
- Zhina Rouhil
- Najme Mansouri
The affiliations are not explicitly stated for all authors, but Mohammad Amiriebrahimabadi and Najme Mansouri are listed with a "1" superscript, suggesting a common primary affiliation.
1.3. Journal/Conference
The paper was published online by Springer on March 27, 2024. Although the specific journal is not explicitly named in the provided text, Springer is a highly reputable publisher for scientific and technical content, particularly in fields like image processing, computer science, and engineering. Publication with Springer suggests a peer-reviewed process and contributes to the paper's credibility and visibility within the academic community.
1.4. Publication Year
2024
1.5. Abstract
This paper presents a comprehensive survey of multi-level thresholding segmentation methods used in image processing. The technique is crucial for delineating regions of interest by identifying intensity levels that distinguish different structures and objects, effectively capturing image complexity through multi-range intensity partitioning. The survey reviews various approaches, particularly those employing metaheuristic algorithms, which aim to find optimal threshold values by maximizing intra-class differences and minimizing inter-class differences. The authors discuss the advantages, limitations, and challenges of these methods. Furthermore, the review identifies critical future research areas, including handling complex and inhomogeneous image data, automating the determination of thresholding levels, and improving algorithm interpretability. This extensive review provides valuable insights for researchers, guiding future advancements in multi-level thresholding techniques within the image processing domain.
1.6. Original Source Link
/files/papers/692b228e4114e99a4cde874e/paper.pdf (This is a relative path; the actual link would be a full URL if available).
The paper was published online on March 27, 2024, indicating it is an officially published work.
2. Executive Summary
2.1. Background & Motivation
The core problem the paper addresses is image segmentation, with a specific focus on multi-level thresholding. Image segmentation is the process of dividing an image into meaningful, semantically coherent regions or objects. It is foundational for numerous applications such as object recognition, medical imaging, autonomous systems, virtual/augmented reality, video surveillance, and environmental monitoring.
While thresholding is a simple and direct method for segmentation, Bi-Level Thresholding (BLT) is often insufficient for images with more than two classes or complex intensity distributions. This limitation motivates the need for Multi-Level Thresholding (MLT), which divides an image into multiple intensity levels, offering more precise segmentation by accommodating intensity variations and capturing finer details.
A significant challenge in MLT is determining the optimal threshold values. Traditional methods can be computationally expensive, especially as the number of thresholds increases. This is where metaheuristic algorithms (MAs) become crucial. They use stochastic search to find near-optimal solutions efficiently, reducing computational effort and lowering the risk of becoming trapped in local optima. The paper is motivated by the need for a comprehensive resource that reviews these MLT techniques, analyzes their strengths and weaknesses, and identifies future research directions to enhance image segmentation in various real-world applications.
2.2. Main Contributions / Findings
The paper makes several significant contributions by providing a comprehensive survey of multi-level thresholding image segmentation methods:
- Comprehensive Review: It offers an extensive review of multi-level thresholding image segmentation techniques, consolidating knowledge in this specific field.
- Detailed Taxonomy: The paper provides a detailed taxonomy of various thresholding approaches, including prominent methods like Otsu, Kapur, Tsallis, Fuzzy entropy, Minimum Cross Entropy (MCE), and Renyi's entropy.
- Classified Applications and Case Studies: It classifies thresholding applications and provides case studies across diverse fields such as medical imaging, remote sensing, plant pathology, and industrial quality control.
- Comparative Analysis: The survey compares different datasets used in reviewed studies and describes existing image segmentation techniques, highlighting their advantages and disadvantages.
- Technical Details: It presents simulation environments and programming languages, and discusses in detail the evaluation metrics used for proposed algorithms.
- Identification of Research Gaps: Crucially, it identifies research gaps and challenges for multi-level thresholding in image segmentation, outlining areas for future research.

The key findings revolve around the demonstrated effectiveness of multi-level thresholding for flexible and adaptive segmentation, its ability to handle complex intensity distributions, and the role of metaheuristic algorithms in optimizing threshold selection to reduce computational costs. The survey concludes by emphasizing the need to address identified challenges to further advance these techniques.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To fully understand this paper, a reader needs to grasp several fundamental concepts in image processing and optimization:
- Image Segmentation:
  - Conceptual Definition: Image segmentation is a fundamental task in computer vision and image processing that partitions a digital image into multiple segments (sets of pixels, also known as image objects). The goal is to simplify or change the representation of an image into something more meaningful and easier to analyze by dividing it into meaningful, semantically coherent regions or objects.
  - Purpose: The output of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image. Pixels within a region are similar with respect to some characteristic or computed property, such as color, intensity, or texture, while adjacent regions differ significantly with respect to the same characteristics.
  - Granularity Levels: The paper mentions that image segmentation can occur at coarse, medium, and fine levels of granularity.
  - Classification (Figure 1): Image segmentation techniques are broadly classified into:
    - Thresholding: Based on pixel intensity values.
    - Clustering: Grouping pixels based on similarity.
    - Edge-based: Detecting boundaries between regions.
    - Region-based: Growing regions from seed points.
    - Artificial Neural Networks (ANNs)/Deep Learning: Learning complex patterns for segmentation.

The following figure (Figure 1 from the original paper) illustrates how segmentation techniques can be classified:
The image is a schematic diagram showing the classification of image segmentation techniques, including thresholding, clustering, edge-based methods, and region-based methods, and indicating the relationships among these methods.
- Thresholding:
  - Conceptual Definition: Thresholding is a simple and direct method of image segmentation that separates regions of an image based on pixel intensity values. In its basic form it converts a grayscale image into a binary image, assigning each pixel to one of two classes depending on whether its intensity is above or below a certain threshold.
  - Bi-Level Thresholding (BLT): This is the simplest form, where a single threshold value $T$ is used to divide an image into two classes: foreground and background. Pixels with intensity values greater than $T$ belong to one class, and those less than or equal to $T$ belong to the other.
  - Multi-Level Thresholding (MLT): This technique extends BLT by using multiple threshold values ($T_1, T_2, \dots, T_k$) to divide an image into $k+1$ distinct classes or regions. This allows for more precise segmentation, especially for images with complex intensity distributions or multiple objects of interest. For example, three thresholds divide an image into four classes.
  - Types of Thresholding:
    - Global Thresholding: Uses a single threshold value for the entire image.
    - Local Thresholding: Selects individual thresholds for different regions of the image, adapting to local intensity variations.
    - Parametric Thresholding: Assumes a statistical model (e.g., Gaussian distribution) for the image's intensity values and estimates its parameters to determine thresholds. The parameter estimation can be time-consuming.
    - Non-Parametric Thresholding: Uses statistical criteria (e.g., entropy, variance) computed directly from the image histogram to determine threshold values, without assuming a specific model.
- Meta-Heuristic Algorithms (MAs):
  - Conceptual Definition: Meta-heuristic algorithms are high-level problem-solving procedures designed to find optimal or near-optimal solutions to complex optimization problems, especially those where exact methods are computationally infeasible. They often employ stochastic (randomized) search strategies and are designed to escape local optima (suboptimal solutions that are better than their immediate neighbors but not the absolute best) by exploring the search space broadly.
  - Characteristics:
    - Derivative-free: They do not require gradient information, making them suitable for non-differentiable or discontinuous objective functions.
    - Simplicity: Often easy to understand and implement.
    - Flexibility: Can be adapted to various problem types.
    - Exploration vs. Exploitation: They balance exploration (searching diverse regions of the solution space) and exploitation (intensifying the search around promising solutions).
  - Examples (Figure 7): The paper categorizes MAs into:
    - Evolution-Based (EB): Mimic natural evolution (e.g., Genetic Algorithm (GA), Differential Evolution (DE)).
    - Swarm-Based (SB): Mimic social behavior of animal swarms (e.g., Particle Swarm Optimization (PSO), Whale Optimization Algorithm (WOA), Ant Colony Optimization (ACO), Cuckoo Search Algorithm (CSA), Firefly Algorithm (FA), Slime Mould Algorithm (SMA), Gray Wolf Optimization (GWO)).
    - Physics/Chemistry-Based (PCB): Inspired by physical or chemical phenomena (e.g., Multi-Verse Optimizer (MVO), Sine Cosine Algorithm (SCA)).
    - Human-Based (HB): Inspired by human behavior.
    - Others.

The following figure (Figure 7 from the original paper) illustrates the classification of meta-heuristic approaches:
The image is a schematic diagram showing the classification of different types of metaheuristic algorithms, including evolution-based, physics-based, swarm-based, and human-based algorithms. These algorithms are widely used in multi-level thresholding segmentation methods and can help improve image processing results.
- Objective Functions (or Fitness Functions):
  - Conceptual Definition: In optimization problems, an objective function (also known as a fitness function in meta-heuristics) is a mathematical function that a meta-heuristic algorithm aims to either maximize or minimize. For multi-level thresholding, the objective function quantifies the "goodness" of a set of chosen threshold values.
  - Role in MLT: The threshold values are the decision variables that the meta-heuristic algorithm adjusts to optimize the objective function. For example, Otsu's method uses an objective function that maximizes the variance between classes, while Kapur's entropy maximizes the sum of entropies of the segmented regions.
- Entropy:
  - Conceptual Definition: In information theory, entropy is a measure of the uncertainty or randomness associated with a random variable. In image processing, it can be used to quantify the information content or uniformity of different regions within an image. Higher entropy often indicates more information or less predictability.
  - Types in MLT: The paper discusses Kapur's Entropy, Tsallis Entropy (a generalization of Shannon's Entropy), Fuzzy Entropy, and Renyi's Entropy, all adapted as objective functions for MLT.
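As a concrete illustration of the thresholding definitions above, the following minimal sketch assigns each pixel to one of k+1 classes given k thresholds. It assumes an 8-bit grayscale image stored as a NumPy array; the function name `apply_thresholds` and the sample values are illustrative and do not come from the paper.

```python
import numpy as np

def apply_thresholds(gray, thresholds):
    """Label each pixel with the index of the intensity class it falls into.

    With k thresholds the labels run from 0 to k. A pixel whose intensity is
    less than or equal to a threshold stays in the lower class, matching the
    bi-level convention (greater than T versus less than or equal to T).
    """
    t = np.sort(np.asarray(thresholds))
    # right=True places a value equal to a threshold in the lower bin.
    return np.digitize(gray, t, right=True)

# Tiny synthetic "image" (values chosen only for illustration).
gray = np.array([[10, 60, 120],
                 [128, 200, 250]], dtype=np.uint8)
labels = apply_thresholds(gray, [64, 128, 192])
print(labels)
```

Three thresholds produce four possible labels (0 to 3). In this sample no pixel falls between 129 and 192, so label 2 does not appear in the output.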
3.2. Previous Works
The paper dedicates Section 2.5 to reviewing related survey papers, emphasizing that while many surveys exist in computer vision and image processing, none specifically and comprehensively cover multi-level thresholding image segmentation to the same extent as the proposed work. The key prior surveys mentioned and their focus are:
- Nakane et al. [55] (2020): Focused on the application of Evolutionary Algorithms (EAs) and Swarm Algorithms (SAs) (specifically Genetic Algorithms (GA), Differential Evolution (DE), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO)) to computer vision problems.
- Zhang et al. [56] (2022): Reviewed image analysis methods for microorganism counting, from classical image processing to deep learning.
- Agrawal and Choudhary [57] (2023): Systematically surveyed segmentation and classification methods for chest radiography, including Generative Adversarial Networks (GANs) for lung segmentation and disease detection.
- Mittal et al. [58] (2022): Investigated clustering-based image segmentation techniques, categorizing them into hierarchical and partitional methods, with a focus on histogram-based, K-means-based, and meta-heuristics-based partitional clustering.
- Punn and Agarwal [59] (2022): Described the U-Net framework and its variants for biomedical image segmentation, particularly in the context of SARS-CoV-2 and COVID-19.
- Loyola-Gonzalez et al. [60] (2020): Reviewed Contrast Pattern-based classification (CP) and its challenges, including exhaustive-search-based algorithms and decision tree-based approaches.
- Iqbal et al. [61] (2022): Provided a comprehensive study on the application of GANs to medical image segmentation, discussing various models, metrics, loss functions, and datasets.
- Ramadan et al. [62] (2020): Surveyed Interactive Image Segmentation (IIS) methods, also known as foreground/background separation.
- Liu et al. [63] (2020): Reviewed deep learning-based achievements in object detection, covering frameworks, features, proposals, context modeling, and evaluation.
- Rai et al. [64] (2022): Focused on Nature-Inspired Optimization Algorithms (NIOA) for multilevel thresholding problems, highlighting challenges in developing multi-thresholding models. This survey is the closest in topic to the current paper.
- Borji et al. [65] (2019): Reviewed advances in salient object detection, including its relation to generic scene segmentation and object proposal generation.
- Aljuaid and Anwar [66] (2022): Surveyed supervised learning techniques for medical image processing, covering methods like Convolutional Neural Networks (CNNs), region-based CNNs, Fully Convolutional Networks (FCNs), and U-Net architectures.
- Sasmal and Dhal [67] (2023): Compared superpixel images and clustering techniques for image segmentation, discussing superpixel generation and partitional clustering.
- Aliabugah et al. [68] (2023): Investigated multilevel threshold image segmentation-based metaheuristic optimization methods, outlining definitions, procedures, optimization methods, and performance analysis using benchmark images. This is also a very close survey, but the current paper claims to have a "higher presentation quality" and a more specific, comprehensive focus.
- Bagwari et al. [69] (2023): Provided a comprehensive analysis of satellite image segmentation techniques, including deep learning approaches.

The following are the results from Table 4 of the original paper, summarizing related surveys:
| Article | Year | Study area | Main focus | Author | Title | Keywords |
|---|---|---|---|---|---|---|
| Proposed survey | 2023 | Multi-level thresholding segmentation in image processing | Review the most useful and effective methods and innovations presented in the past few years for multi-level thresholding segmentation in image processing | Amiriebrahimabadi et al | A comprehensive survey of multi-level thresholding segmentation methods for image processing | Thresholding, Segmentation, Deep learning, Survey |
| [55] | 2020 | Survey of EAs and SAs adopted to solving computer vision problems | Summary of GA, DE, PSO and ACO swarm algorithms performance in computer vision field | Nakane et al | Application of evolutionary and swarm optimization in computer vision: a literature survey | Evolutionary algorithms, Swarm algorithms, Computer vision, Literature survey |
| [56] | 2022 | Microorganism counting | The process of identifying future trends in microorganism counting and offering systematic guidelines for the deployment of comprehensive microorganism counting systems is being developed | Zhang et al | A comprehensive review of image analysis methods for microorganism counting: from classical image processing to deep learning approaches | Microorganism counting, Digital image processing, Microscopic images, Image analysis, Image segmentation |
| [57] | 2023 | Chest radiography | The study explores the segmentation of lungs and the identification and categorization of lung diseases using datasets that are publicly accessible | Agrawal and Choudhary | Segmentation and classification on chest radiography: a systematic survey | Deep convolutional neural network, Computer vision, Lung segmentation, Multiclass classification, Nodule, TB, COVID-19, Pneumothorax detection, GAN |
| [58] | 2021 | Image segmentation clustering methods | In the field of image segmentation, examining and comparing clustering methods related performance parameters | Mittal et al | A comprehensive survey of image segmentation: clustering methods, performance parameters, and benchmark datasets | Image segmentation, Clustering methods, Performance parameters, Benchmark datasets |
| [59] | 2022 | Biomedical | Through the categorization of U-Net variants into inter-modality and intra-modality, they are able to deepen their comprehension of the problems and potential solutions related to U-Net | Punn and Agarwal | Modality specific U-Net variants for biomedical image segmentation: a survey | Biomedical image segmentation, Deep learning, U-Net |
| [60] | 2020 | Contrast Patterns | A study of supervised classification based on CP and its applications | Loyola-Gonzalez | A Review of Supervised Classification based on Contrast Patterns: Applications, Trends, and Challenges | Supervised classification, Contrast patterns, Review, Taxonomy |
| [61] | 2022 | Biomedical | This is a summary of various models based on GANs, performance indicators, loss functions, datasets, methods of augmentation, implementations of research papers, and source code used in the field of medical image segmentation | Iqbal et al | Generative adversarial networks and its applications in the biomedical image segmentation: a comprehensive survey | Generative adversarial network, GANs applications, GANs in medical image segmentation |
| [62] | 2020 | Interactive-based image segmentation methods | Object extraction based on user interaction, often called foreground/background separation in IIS | Ramadan et al | A survey of recent interactive image segmentation methods | Interactive image segmentation, user interaction, label propagation, deep learning, superpixels |
| [63] | 2019 | Generic object detection based on deep learning methods | Purpose of mentioned is to provide a comprehensive survey of the recent developments in deep learning in this field | Liu et al | Deep Learning for Generic Object Detection: A Survey | Object detection, Deep learning, Convolutional neural networks, Object recognition |
| [64] | 2022 | Multi-thresholding image segmentation | Review of the challenges encountered when developing image multi-threshold models based on NIOA in recent years (2019-2021) | Rai et al | Nature-inspired optimization algorithms and their significance in multi-thresholding image segmentation: an inclusive review | Multilevel Thresholding, Nature-Inspired Optimization Algorithms (NIOA), Exponential, Nonlinear, Combinatorial, Nondeterministic, Image segmentation |
| [65] | 2019 | Segmenting salient objects | This is an extensive review of the latest advancements in the detection of salient objects, positioning this domain in relation to other closely associated fields such as general scene segmentation, generation of object proposals, and saliency in prediction of fixation | Borji et al | Salient object detection: A survey | Salient object detection, survey |
| [66] | 2022 | Medical Images | Medical image processing tasks were supervised using learning methods and metrics | Aljuaid and Anwar | Survey of Supervised Learning for Medical Image Processing | Deep learning, Convolutional neural network (CNN), Fast R-CNN, Faster R-CNN, FCN, Mas R-CNN, Medical image learning, U-Net |
| [67] | 2023 | Superpixel image clustering based | Determine the effectiveness of integrating superpixel and partitional | Sasmal and Dhal | A survey on the utilization of Superpixel image for clustering-based image | Superpixel, Clustering, Image segmentation, Optimization, Leaf images, Oral histopathology |
| [68] | 2022 | Multilevel thresholding segmentation using meta-heuristic | Presents the application of Multilevel thresholding as the whale optimization algorithm, particle swarm optimization, and others, to multilevel thresholding for image segmentation | | Multilevel thresholding optimization algorithms: comparative analysis, open challenges and new trends | Multilevel threshold- optimization algorithms, Real-world problems, Optimization problems, Survey |
| [69] | 2023 | Satellite Images | It aims to address the challenges and limitations in satellite image processing and segmentation methodologies | Bagwari et al | A Comprehensive Review on Segmentation Techniques for Satellite Images | |
The following are the results from Table 5 of the original paper, summarizing comparisons made in reviewed articles:
| Article | Highest-order evaluation metric analysis | Comparative taxonomy | Comparative chart | Setup environment details | Datasets analysis | Reviewed articles | Motivation or open issues |
|---|---|---|---|---|---|---|---|
| Proposed review | ✓ | ✓ | ✓ | ✓ | ✓ | 79 | ✓ |
| [55] | X | X | ✓ | X | X | 109 | X |
| [56] | X | ✓ | ✓ | X | X | 144 | ✓ |
| [57] | ✓ | ✓ | ✓ | X | X | 85 | X |
| [58] | X | ✓ | ✓ | X | X | X | |
| [59] | X | X | ✓ | ✓ | X | 57 | ✓ |
| [60] | X | ✓ | ✓ | X | X | 105 | X |
| [61] | ✓ | ✓ | ✓ | X | X | 138 | X |
| [62] | X | ✓ | ✓ | X | X | 150 | X |
| [63] | ✓ | ✓ | ✓ | ✓ | X | 300 | ✓ |
| [64] | ✓ | ✓ | ✓ | X | X | 65 | ✓ |
| [65] | X | ✓ | ✓ | X | X | 228 | ✓ |
| [66] | ✓ | ✓ | ✓ | ✓ | X | 36 | X |
| [67] | ✓ | ✓ | ✓ | ✓ | X | 34 | X |
| [68] | ✓ | ✓ | ✓ | ✓ | X | 80 | ✓ |
| [69] | ✓ | ✓ | ✓ | ✓ | X | 56 | ✓ |
3.3. Technological Evolution
The evolution of image segmentation techniques has progressed from simple thresholding methods to more sophisticated approaches. Initially, Bi-Level Thresholding (BLT) provided a quick way to separate foreground from background. However, as images became more complex with varying lighting conditions and multiple objects, Multi-Level Thresholding (MLT) emerged to offer finer-grained segmentation by dividing images into several intensity ranges.
The main challenge for MLT has been finding the optimal set of thresholds, which becomes a computationally intensive search problem as the number of thresholds increases. This led to the integration of optimization algorithms. Deterministic optimization methods were often slow or prone to local optima. The next significant step was the adoption of meta-heuristic algorithms (MAs). These MAs, inspired by natural phenomena (e.g., evolutionary algorithms, swarm intelligence), offered efficient stochastic search capabilities to explore vast solution spaces and find near-optimal thresholds, significantly reducing computation time and improving segmentation quality.
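To make the growth in search cost concrete, the short sketch below counts how many distinct threshold sets an exhaustive search would have to evaluate. It assumes an 8-bit image with 256 gray levels and unordered, distinct thresholds; the helper name `count_threshold_sets` is illustrative and is not taken from the paper.

```python
from math import comb

def count_threshold_sets(levels: int, k: int) -> int:
    """Number of ways to place k distinct thresholds among `levels` gray levels."""
    # Each threshold sits in one of the (levels - 1) gaps between adjacent levels.
    return comb(levels - 1, k)

for k in range(1, 6):
    print(f"k={k}: {count_threshold_sets(256, k):,} candidate threshold sets")
```

Under these assumptions the count rises from 255 for one threshold to 2,731,135 for three and to roughly 8.6 billion for five, which is why exhaustive evaluation quickly becomes impractical.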
More recently, the field has seen advancements in:
- Hybrid Approaches: Combining different meta-heuristics, or integrating meta-heuristics with objective functions (like Otsu or Kapur entropy), to leverage their respective strengths.
- Deep Learning Integration: While this survey focuses on thresholding and meta-heuristics, other surveys mentioned (e.g., [57], [59], [63], [66]) highlight the rise of deep learning models (like CNNs and U-Nets) for complex segmentation tasks, particularly in medical imaging.
- Addressing Specific Challenges: Research has also focused on improving robustness to noise, handling inhomogeneous data, and developing adaptive or automatic threshold determination methods.

This paper's work fits within this timeline by consolidating the advancements in MLT primarily driven by meta-heuristic optimization over recent years (2017-2023), providing a foundation for future research in these areas.
3.4. Differentiation Analysis
Based on the provided Related Works section (Section 2.5) and Table 5, the current survey differentiates itself from prior studies through its specific focus and comprehensive analytical scope:
- Specific Focus: Unlike many previous surveys that cover broader topics like evolutionary/swarm optimization in computer vision [55], microorganism counting [56], chest radiography [57], clustering methods [58], U-Net variants [59], contrast patterns [60], GANs in medical imaging [61], interactive image segmentation [62], generic object detection [63], supervised learning for medical images [66], superpixel clustering [67], or satellite image segmentation [69], this paper concentrates exclusively on Multi-Level Thresholding Segmentation for image processing. While Rai et al. [64] and Aliabugah et al. [68] also cover multi-thresholding, this survey claims to offer a "higher presentation quality" and a more specific, comprehensive analysis.
- Comprehensive Analytical Depth: The paper claims to be more comprehensive by providing:
  - Detailed Taxonomy of Thresholding Approaches: It specifically delves into various objective functions such as Otsu, Kapur, Tsallis, Fuzzy entropy, MCE, and Renyi's entropy, a core component that is missing or less detailed in other surveys.
  - Setup Environment Details: Table 5 indicates that the proposed review provides setup environment details (✓), whereas many other surveys (e.g., [55], [56], [57], [58], [60], [61], [62], [64], [65]) are marked with X. This detail is valuable for reproducibility.
  - Datasets Analysis: The survey provides an in-depth analysis of datasets used across the reviewed studies, which is critical for understanding applicability and limitations. Table 5 marks every other listed survey with X in this column.
  - Identification of Research Gaps and Open Issues: The paper explicitly identifies research gaps and challenges, offering insights for future work. This feature is also present in some other surveys (e.g., [56], [59], [63], [64], [65], [68], [69]), but here it is specifically tailored to multi-level thresholding.

In essence, while the general area of image segmentation and meta-heuristics has been surveyed, this paper's differentiation lies in its dedicated, in-depth, and multi-faceted analysis of multi-level thresholding segmentation, presenting detailed comparisons and outlining specific challenges and opportunities for this particular technique.
4. Methodology
4.1. Principles
The core principle of Multi-Level Thresholding Segmentation is to divide an image into multiple regions based on different intensity levels, rather than just two (foreground and background). This is achieved by identifying optimal threshold values ($T_1, T_2, \dots, T_k$) that partition the image's grayscale histogram into $k+1$ distinct classes or segments. Each pixel is then assigned to a class according to the intensity range in which its value falls.
The process typically involves:
- Image Preprocessing: Often includes converting a color image to grayscale, and potentially noise reduction.
- Histogram Calculation: Generating the intensity histogram of the image, which shows the frequency of each pixel intensity level.
- Objective Function Definition: Choosing a mathematical objective function (e.g., Otsu's method, Kapur's entropy) that quantifies the "goodness" of a set of threshold values. This function typically either maximizes a measure of separability between classes (e.g., inter-class variance) or minimizes a measure of discrepancy between the original and the thresholded image (e.g., cross-entropy).
- Optimization: Employing an optimization algorithm (often a meta-heuristic) to search for the threshold values that maximize or minimize the chosen objective function. These threshold values act as decision variables for the optimization algorithm.
- Image Segmentation: Applying the optimal threshold values to the image to classify each pixel into its corresponding segment.
The following figure (Figure 8 from the original paper) provides a complete overview of the framework of thresholding image segmentation:
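The image is a schematic diagram showing the framework of multi-level thresholding segmentation, including steps such as inputting the original image, defining the number of thresholds, converting to grayscale, and outputting the segmented image. The framework highlights the process of finding the optimal thresholds using a metaheuristic and an objective function (such as Kapur or Otsu).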
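The steps of this framework can be sketched end to end in a few lines. The example below is a minimal illustration under stated assumptions, not the paper's implementation: it uses a synthetic three-mode image, an Otsu-style between-class variance as the objective, and plain random search as a deliberately simple stand-in for a metaheuristic. All function names are hypothetical.

```python
import numpy as np

def histogram_probs(gray, levels=256):
    """Histogram step: normalized intensity histogram p_i."""
    counts = np.bincount(gray.ravel(), minlength=levels)
    return counts / counts.sum()

def between_class_variance(p, thresholds):
    """Objective step: Otsu-style score for sorted thresholds (larger is better)."""
    levels = np.arange(len(p))
    # Class j covers the levels after threshold j-1 up to and including threshold j.
    edges = [0, *[t + 1 for t in thresholds], len(p)]
    mu_total = np.sum(levels * p)
    score = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()
        if w > 0:
            mu = np.sum(levels[lo:hi] * p[lo:hi]) / w
            score += w * (mu - mu_total) ** 2
    return score

def random_search(p, k, iters=2000, seed=0):
    """Optimization step: random search, standing in for a metaheuristic."""
    rng = np.random.default_rng(seed)
    best_t, best_f = None, -np.inf
    for _ in range(iters):
        t = np.sort(rng.choice(len(p) - 1, size=k, replace=False))
        f = between_class_variance(p, t)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f

# Synthetic grayscale image with three intensity modes (illustrative data only).
rng = np.random.default_rng(1)
gray = np.concatenate([
    rng.normal(50, 8, 4000), rng.normal(128, 8, 4000), rng.normal(210, 8, 4000)
]).clip(0, 255).astype(np.uint8).reshape(100, 120)

p = histogram_probs(gray)
t_best, f_best = random_search(p, k=2)
labels = np.digitize(gray, t_best, right=True)  # segmentation step
print("thresholds:", t_best, "classes:", np.unique(labels))
```

Swapping `random_search` for PSO, GWO, or another metaheuristic changes only the optimization step. The histogram, objective, and final labeling stay the same, which is the modularity the framework figure emphasizes.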
4.2. Core Methodology In-depth (Layer by Layer)
The paper elaborates on several specific thresholding approaches that serve as objective functions for multi-level thresholding. These methods use pixel intensity values to segment an image into different regions.
4.2.1. Otsu's Method
Otsu's method, developed in 1979 [47], is an automated threshold selection method for image segmentation that works by maximizing the between-class variance (or inter-class variance) of the pixel intensities. This variance measures the separability between the different classes created by the threshold(s). A higher between-class variance indicates better separation between the segmented regions.
Summary of Otsu's method for bi-level thresholding [48]: Let's assume an image has pixels and can be represented in gray levels ranging from 1 to . Let be the number of pixels at gray level . Then, the total number of pixels is . The probability of gray level occurrence is defined by the following equation: where is the probability of a pixel having gray level , is the number of pixels with gray level , and is the total number of pixels in the image.
For bi-level thresholding, if we choose a threshold , the pixels are divided into two classes, and . The cumulative probabilities (also known as class probabilities) are computed as follows:
where is the probability of the first class (pixels with intensity from 1 to ), and is the probability of the second class (pixels with intensity from to ).
The means (average intensity values) of these two classes are calculated using the following equation:
where is the mean intensity of the first class, and is the mean intensity of the second class.
The mean level of the entire image (global mean) is given by:
where is the total mean intensity of the image.
The between-class variance for bi-level thresholding is represented by the objective function f(t):
where \sigma_0 = \omega_0 (\mu_0 - \mu_T)^2 and \sigma_1 = \omega_1 (\mu_1 - \mu_T)^2. These terms represent the variance contributions from each class. The goal is to maximize this sum.
The process of determining the optimal threshold involves maximizing the inter-class variance:
where is the optimal single threshold value.
Extension to Multi-Level Thresholding:
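For multi-level thresholding with $M-1$ thresholds $(t_1, t_2, \dots, t_{M-1})$, the image is divided into $M$ classes $C_0, C_1, \dots, C_{M-1}$. The extended between-class variance is calculated as shown in the following equation:
$$f(\mathbf{t}) = \sum_{k=0}^{M-1} \sigma_k$$
where $f(\mathbf{t})$ is the objective function to be maximized.
The sigma terms for each class are determined using the following equation:
$$\sigma_k = \omega_k (\mu_k - \mu_T)^2, \qquad k = 0, 1, \dots, M-1$$
where $\sigma_k$ is the variance contribution for class $k$, $\omega_k$ is the probability of class $k$, and $\mu_k$ is the mean intensity of class $k$. The paper uses $M-1$ as the last class index, which implies $M$ classes and therefore $M-1$ thresholds.
The mean levels for each of the classes are calculated by the following equation:
$$\mu_k = \sum_{i=t_k+1}^{t_{k+1}} \frac{i\,p_i}{\omega_k}, \qquad \omega_k = \sum_{i=t_k+1}^{t_{k+1}} p_i, \qquad t_0 = 0,\; t_M = L$$
where $p_i$ are the pixel probabilities, and $\omega_k$ are the cumulative probabilities for each class $k$.
It is necessary to maximize the variance between classes in order to determine the optimal thresholds:
$$\mathbf{t}^* = \arg\max_{\mathbf{t}} f(\mathbf{t})$$
where $\mathbf{t}^*$ represents the vector of optimal thresholds $(t_1^*, t_2^*, \dots, t_{M-1}^*)$.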
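The multi-level objective can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: gray levels are 0-based, the helper names `between_class_variance` and `exhaustive_multi_otsu` are hypothetical, and the brute-force search is practical only for small histograms and few thresholds (which is the motivation for the meta-heuristics discussed later).

```python
from itertools import combinations


def between_class_variance(p, thresholds):
    """Sum over classes of omega_k * (mu_k - mu_T)^2.

    p[i] is the probability of gray level i (0-based, an assumption).
    Class k covers the levels t_k+1 .. t_{k+1}, using sentinels -1 and L-1.
    """
    mu_T = sum(i * pi for i, pi in enumerate(p))
    bounds = [-1] + sorted(thresholds) + [len(p) - 1]
    f = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        levels = range(lo + 1, hi + 1)
        w = sum(p[i] for i in levels)
        if w <= 0.0:
            continue  # an empty class contributes nothing
        mu = sum(i * p[i] for i in levels) / w
        f += w * (mu - mu_T) ** 2
    return f


def exhaustive_multi_otsu(p, n_thresholds):
    """Try every sorted threshold tuple and return the best one."""
    candidates = combinations(range(len(p) - 1), n_thresholds)
    return max(candidates, key=lambda ts: between_class_variance(p, ts))


# Toy histogram with three separated modes (made-up data).
hist = [5, 10, 5, 0, 5, 10, 5, 0, 5, 10, 5]
p = [h / sum(hist) for h in hist]
best = exhaustive_multi_otsu(p, 2)
print(best, between_class_variance(p, best))
```

The number of candidate tuples grows combinatorially with the number of thresholds, so the exhaustive search shown here stops being practical quickly for 256-level images.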
4.2.2. Kapur's Entropy
Kapur's entropy method [49] is an entropy-based thresholding technique rooted in information theory. It determines optimal thresholds by maximizing the sum of the entropies of the segmented regions. The idea is that a good segmentation yields regions that are as "informative" as possible, meaning the intensity distribution within each region is spread out rather than concentrated.
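For $m$ thresholds $(t_1, t_2, \dots, t_m)$, which divide the image into $m+1$ classes, the objective function is defined as the sum of the entropies of these classes:
$$f(t_1, t_2, \dots, t_m) = H_0 + H_1 + \dots + H_m$$
where $H_0, H_1, \dots, H_m$ are the entropies of the $m+1$ classes.
The individual entropies and their corresponding cumulative probabilities for a grayscale image with $L$ distinct intensity levels (0 to $L-1$) are calculated as follows:
$$H_0 = -\sum_{i=0}^{t_1-1} \frac{p_i}{\omega_0} \ln\frac{p_i}{\omega_0}, \qquad \omega_0 = \sum_{i=0}^{t_1-1} p_i$$
$$H_k = -\sum_{i=t_k}^{t_{k+1}-1} \frac{p_i}{\omega_k} \ln\frac{p_i}{\omega_k}, \qquad \omega_k = \sum_{i=t_k}^{t_{k+1}-1} p_i, \qquad k = 1, \dots, m-1$$
$$H_m = -\sum_{i=t_m}^{L-1} \frac{p_i}{\omega_m} \ln\frac{p_i}{\omega_m}, \qquad \omega_m = \sum_{i=t_m}^{L-1} p_i$$
Here, $p_i$ denotes the probability of a pixel having an intensity value $i$. Each $\omega_k$ is the cumulative probability of pixels within class $k$, and $H_k$ is the Shannon entropy of class $k$, computed from the probabilities normalized by $\omega_k$.
This objective function determines the optimal thresholds as follows:
$$\mathbf{t}^* = \arg\max f(t_1, t_2, \dots, t_m)$$
where $\mathbf{t}^*$ is the vector of optimal thresholds. For color images, the segmentation can be performed independently for the $R$, $G$, and $B$ channels.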
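A minimal Python sketch of Kapur's objective follows, assuming 0-based gray levels and classes that span the half-open index ranges between consecutive thresholds. The function name `kapur_entropy` is hypothetical, and the code is an illustration rather than the implementation used in any reviewed paper.

```python
import math


def kapur_entropy(p, thresholds):
    """Sum of class entropies for the given thresholds.

    p[i] is the probability of gray level i. Class k spans the levels
    t_k .. t_{k+1}-1, with 0 and L used as the outer boundaries.
    """
    bounds = [0] + sorted(thresholds) + [len(p)]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = sum(p[lo:hi])              # cumulative probability of the class
        if w <= 0.0:
            continue                   # skip empty classes
        for pi in p[lo:hi]:
            if pi > 0.0:               # 0 * ln(0) is treated as 0
                q = pi / w             # probability normalized within the class
                total -= q * math.log(q)
    return total


# A uniform 8-level histogram split in the middle gives two uniform
# classes of four levels each, so each class entropy equals ln(4).
p_uniform = [1 / 8] * 8
print(kapur_entropy(p_uniform, [4]))
```

Any optimizer that proposes threshold vectors can call this function as its fitness; larger values are better.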
4.2.3. Tsallis Entropy
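Tsallis's entropy is a generalization of Shannon's entropy, often used in multi-fractal theory and Boltzmann-Gibbs-Shannon (BGS) statistics [50, 51]. It introduces a parameter $q$, known as the entropic index or degree of non-extensivity.
The general expression for Tsallis entropy for a system with $k$ possibilities is:
$$S_q = \frac{1 - \sum_{i=1}^{k} (p_i)^q}{q - 1}$$
where $p_i$ is the probability of possibility $i$, and $q$ is the entropic index.
For bi-level thresholding, Tsallis entropy is described as follows [50, 51]:
$$f(t) = S_q^A(t) + S_q^B(t) + (1 - q)\, S_q^A(t)\, S_q^B(t), \qquad t^* = \arg\max_t f(t)$$
where $t$ is the single threshold value, and $S_q^A(t)$ and $S_q^B(t)$ are the Tsallis entropies for the two classes (foreground A and background B). The term $(1 - q)\, S_q^A(t)\, S_q^B(t)$ accounts for the non-extensivity.
Here, $P^A$ and $P^B$ are the cumulative probabilities for classes A and B, respectively:
$$S_q^A(t) = \frac{1 - \sum_{i=0}^{t-1} \left(p_i / P^A\right)^q}{q - 1}, \qquad P^A = \sum_{i=0}^{t-1} p_i$$
$$S_q^B(t) = \frac{1 - \sum_{i=t}^{L-1} \left(p_i / P^B\right)^q}{q - 1}, \qquad P^B = \sum_{i=t}^{L-1} p_i$$
where $p_i$ is the probability of pixel intensity $i$, $L$ is the number of gray levels, and the fractions $p_i / P^A$ and $p_i / P^B$ are the conditional probabilities within each class. This method maximizes the information measures between objects and backgrounds.
A multilevel thresholding method based on the Tsallis entropy criterion with $m$ thresholds is described below:
$$f(T) = S_q^1 + S_q^2 + \dots + S_q^{m+1} + (1 - q)\, S_q^1\, S_q^2 \cdots S_q^{m+1}, \qquad T^* = \arg\max_T f(T)$$
where $T$ represents the vector of thresholds $(t_1, t_2, \dots, t_m)$, and $S_q^1, \dots, S_q^{m+1}$ are the Tsallis entropies for the $m+1$ classes. The product term extends the non-extensivity property to multiple classes.
The cumulative probabilities for the respective classes are formulated as follows:
$$S_q^k = \frac{1 - \sum_{i=t_{k-1}}^{t_k-1} \left(p_i / P^k\right)^q}{q - 1}, \qquad P^k = \sum_{i=t_{k-1}}^{t_k-1} p_i, \qquad t_0 = 0,\; t_{m+1} = L$$
where $p_i$ is the probability of pixel intensity $i$, and $L-1$ is the maximum gray level.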
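The Tsallis criterion can be sketched in Python as the sum of the class entropies plus the product term weighted by (1 - q). The function name `tsallis_objective`, the default entropic index, and the 0-based half-open class ranges are assumptions made for this illustration.

```python
def tsallis_objective(p, thresholds, q=0.8):
    """Sum of class Tsallis entropies plus (1 - q) times their product.

    p[i] is the probability of gray level i (0-based). Class k spans the
    levels t_{k-1} .. t_k - 1, with 0 and L as the outer boundaries.
    """
    if q <= 0 or q == 1:
        raise ValueError("q must be positive and different from 1")
    bounds = [0] + sorted(thresholds) + [len(p)]
    entropies = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = sum(p[lo:hi])                      # class probability
        if w <= 0.0:
            entropies.append(0.0)              # empty class: no information
            continue
        s = sum((pi / w) ** q for pi in p[lo:hi] if pi > 0.0)
        entropies.append((1.0 - s) / (q - 1.0))
    product = 1.0
    for s_k in entropies:
        product *= s_k
    return sum(entropies) + (1.0 - q) * product


# Uniform 8-level histogram split in the middle, evaluated with q = 2.
print(tsallis_objective([1 / 8] * 8, [4], q=2.0))
```

With q = 2 and two uniform four-level classes, each class entropy is 1 - 4 * (1/4)^2 = 0.75, which makes the example easy to check by hand.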
4.2.4. Fuzzy Entropy
Fuzzy entropy approaches use membership functions instead of sharp thresholding to define the degree to which a pixel belongs to a certain class (e.g., foreground or background). These membership functions are considered indicators of foreground and background strength. The method aims to maximize the total entropy based on these fuzzy membership grades [52].
For bi-level thresholding with a single threshold $th$, the objective function $f(th)$ is defined as:
$$f(th) = H_1 + H_2$$
where $H_1$ and $H_2$ are the entropies of the two classes (foreground and background).
The entropies of classes $H_1$ and $H_2$ are calculated as follows:
$$H_1 = -\sum_{i=0}^{L-1} \frac{p_i\, \mu_1(i)}{P_1} \ln\frac{p_i\, \mu_1(i)}{P_1}, \qquad H_2 = -\sum_{i=0}^{L-1} \frac{p_i\, \mu_2(i)}{P_2} \ln\frac{p_i\, \mu_2(i)}{P_2}$$
where $p_i$ is the histogram probability of gray level $i$, $\mu_1(i)$ and $\mu_2(i)$ are the membership functions for the first and second classes at threshold $th$, respectively, and $P_1$ and $P_2$ are distribution functions.
Each distribution function is defined as follows:
$$P_1 = \sum_{i=0}^{L-1} p_i\, \mu_1(i), \qquad P_2 = \sum_{i=0}^{L-1} p_i\, \mu_2(i)$$
where $P_1$ and $P_2$ represent the fuzzy sum of probabilities weighted by their membership degrees for each class over all gray levels $i$.
The calculation of the membership functions $\mu_1$ and $\mu_2$ is performed as follows, typically using triangular or trapezoidal fuzzy functions:
$$\mu_1(l) = \begin{cases} 1, & l \le a_1 \\ \dfrac{l - c_1}{a_1 - c_1}, & a_1 < l \le c_1 \\ 0, & l > c_1 \end{cases} \qquad \mu_2(l) = \begin{cases} 0, & l \le a_1 \\ \dfrac{l - a_1}{c_1 - a_1}, & a_1 < l \le c_1 \\ 1, & l > c_1 \end{cases}$$
Here, $a_1$ and $c_1$ are fuzzy parameters that define the shape of the membership functions. The threshold $th$ is often the midpoint of $a_1$ and $c_1$:
$$th = \frac{a_1 + c_1}{2}$$
The fuzzy entropy objective function for multi-level image segmentation with $n$ classes is defined as follows:
$$f(\mathbf{th}) = \sum_{k=1}^{n} H_k$$
where $\mathbf{th}$ represents the vector of thresholds $(th_1, th_2, \dots, th_{n-1})$.
The entropies of classes are computed similarly for multi-level segmentation, with each class entropy taking the same form as in the bi-level case:
$$H_k = -\sum_{i=0}^{L-1} \frac{p_i\, \mu_k(i)}{P_k} \ln\frac{p_i\, \mu_k(i)}{P_k}$$
Note: The formula for the subsequent entropies (part of Equation 23 in the original paper) appears corrupted in the source PDF, showing repetitive terms and incomplete expressions, so its intended form for the membership functions in a multi-level context is unclear.
The last class entropy is $H_n$, obtained from the expression above with $k = n$.
Each class has the following cumulative distribution function:
$$P_k = \sum_{i=0}^{L-1} p_i\, \mu_k(i)$$
The paper leaves the definition of the multi-level membership functions blank; they would be an extension of the bi-level case but are not written out in full.
The threshold values are computed by the following equation:
$$th_k = \frac{a_k + c_k}{2}, \qquad k = 1, \dots, n-1$$
where $th_k$ is the $k$-th threshold, and $a_k$ and $c_k$ are the fuzzy parameters for the $k$-th class.
4.2.5. Minimum Cross Entropy (MCE)
The Minimum Cross Entropy (MCE) method [53] works by reducing the cross entropy between the original image and its segmented counterpart. Cross entropy is a measure of the difference between two probability distributions. In this context, it quantifies how much information is lost when the original image's pixel distribution is approximated by the segmented image's class-based distributions. The goal is to minimize a specific objective function.
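The objective function for multi-level thresholding with $m$ thresholds $(t_1, \dots, t_m)$ is given by:
$$f(t_1, \dots, t_m) = H_c - \sum_{k=1}^{m+1} H_k$$
where $H_c$ is a constant term, and $H_1, \dots, H_{m+1}$ represent the cross-entropy terms of the $m+1$ distinct classes.
The individual terms are defined as:
$$H_c = \sum_{i=1}^{L} i\, p_i \ln i$$
where $p_i$ is the probability of gray level $i$, and $L$ is the total number of gray levels. This term is constant and does not depend on the thresholds.
For each class $k$, the cross entropy is calculated based on its mean $\mu_k$ and cumulative probability $\omega_k$:
$$H_k = \sum_{i=t_{k-1}}^{t_k-1} i\, p_i \ln \mu_k, \qquad \mu_k = \frac{1}{\omega_k} \sum_{i=t_{k-1}}^{t_k-1} i\, p_i, \qquad \omega_k = \sum_{i=t_{k-1}}^{t_k-1} p_i$$
Here, $t_k$ are the threshold values (with $t_0 = 1$ and $t_{m+1} = L+1$), $p_i$ is the probability of gray level $i$, $\mu_k$ is the mean intensity of class $k$, and $\omega_k$ is the cumulative probability of class $k$.
Since $H_c$ is a constant, the objective function to be minimized for MCE can be expressed as:
$$\eta(t_1, \dots, t_m) = -\sum_{k=1}^{m+1} H_k$$
Substituting the expressions for $H_k$:
$$\eta(t_1, \dots, t_m) = -\sum_{k=1}^{m+1} \left( \sum_{i=t_{k-1}}^{t_k-1} i\, p_i \right) \ln\left( \frac{\sum_{i=t_{k-1}}^{t_k-1} i\, p_i}{\sum_{i=t_{k-1}}^{t_k-1} p_i} \right)$$
To simplify, let $m_k^0 = \sum_{i=t_{k-1}}^{t_k-1} p_i$ and $m_k^1 = \sum_{i=t_{k-1}}^{t_k-1} i\, p_i$. Then the objective function can be written more compactly:
$$\eta(t_1, \dots, t_m) = -\sum_{k=1}^{m+1} m_k^1 \ln\left( \frac{m_k^1}{m_k^0} \right)$$
The objective function of MCE is minimized to determine optimal threshold values. The paper nevertheless writes the selection rule with arg max, which is consistent only if the maximized quantity is the negative of the objective:
$$\mathbf{t}^* = \arg\max \{-\eta(t_1, \dots, t_m)\} = \arg\min\, \eta(t_1, \dots, t_m)$$
In other words, minimizing the cross entropy is the same as maximizing the negative cross entropy.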
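The sketch below shows one common way to evaluate the minimum cross entropy criterion in Python: a threshold-independent constant minus a per-class term built from the zeroth and first moments of each class. Gray levels run from 1 to L so that the logarithm is defined; the function name `cross_entropy` and the toy distribution are assumptions for illustration only.

```python
import math


def cross_entropy(p, thresholds):
    """Cross entropy between the histogram and its class-mean approximation.

    p[i - 1] is the probability of gray level i, for i = 1..L.
    Class k spans the levels t_{k-1} .. t_k - 1 with outer boundaries
    1 and L + 1. Lower values indicate a better segmentation.
    """
    L = len(p)
    const = sum(i * p[i - 1] * math.log(i) for i in range(1, L + 1))
    bounds = [1] + sorted(thresholds) + [L + 1]
    class_sum = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        m0 = sum(p[i - 1] for i in range(lo, hi))       # zeroth moment
        m1 = sum(i * p[i - 1] for i in range(lo, hi))   # first moment
        if m0 > 0.0:
            class_sum += m1 * math.log(m1 / m0)         # m1 / m0 is the class mean
    return const - class_sum


# Image containing only gray levels 2 and 6 in equal proportion.
p = [0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
print(cross_entropy(p, [4]), cross_entropy(p, [7]))
```

A threshold that separates the two populated gray levels reproduces the image exactly from its class means, so the cross entropy drops to zero; a threshold that lumps both levels into one class gives a positive value.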
4.2.6. Renyi's Entropy
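Renyi's entropy is a generalized form of Shannon's entropy that introduces an adjustable parameter $\alpha$ [54]. It provides a more versatile way to measure information content. Notably, as $\alpha$ approaches 1, Renyi's entropy converges to Shannon's entropy.
Let $I$ be a grayscale image to be segmented, with $L$ gray levels $\{0, 1, \dots, L-1\}$ and probability distribution $p(0), p(1), \dots, p(L-1)$. For bi-level thresholding with a single threshold $t$, the image is divided into a target class $C_0$ and a background class $C_1$.
The probabilities of occurrence for $C_0$ and $C_1$ are $\omega_0$ and $\omega_1$, respectively, such that $\omega_0 + \omega_1 = 1$. Their definitions are:
$$\omega_0 = \sum_{i=0}^{t} p(i), \qquad \omega_1 = \sum_{i=t+1}^{L-1} p(i)$$
where $p(i)$ is the probability of gray level $i$.
The definitions of Renyi's entropy for the background and target of an image are stated as follows:
$$H_\alpha^{C_0}(t) = \frac{1}{1-\alpha} \ln \sum_{i=0}^{t} \left( \frac{p(i)}{\omega_0} \right)^{\alpha}, \qquad H_\alpha^{C_1}(t) = \frac{1}{1-\alpha} \ln \sum_{i=t+1}^{L-1} \left( \frac{p(i)}{\omega_1} \right)^{\alpha}$$
where $H_\alpha^{C_0}(t)$ and $H_\alpha^{C_1}(t)$ are the Renyi's entropies for classes $C_0$ and $C_1$, respectively.
The total Renyi's entropy for bi-level thresholding is:
$$T(t) = H_\alpha^{C_0}(t) + H_\alpha^{C_1}(t)$$
To determine the optimal threshold $t^*$, the objective function $T(t)$ must be maximized:
$$t^* = \arg\max_t T(t)$$
Extension to Multi-Level Thresholding:
For multi-level thresholding with $n$ thresholds $(t_1, t_2, \dots, t_n)$, the histogram is divided into $n+1$ regions. The gray probabilities for these regions are:
$$\omega_0 = \sum_{i=0}^{t_1} p(i)$$
The gray probabilities of the intermediate regions and of the last region are calculated as follows (generalizing for $k$ from 1 to $n-1$):
$$\omega_k = \sum_{i=t_k+1}^{t_{k+1}} p(i), \qquad \omega_n = \sum_{i=t_n+1}^{L-1} p(i)$$
The Renyi's entropy for each class is calculated by the following equation:
$$H_\alpha^{C_k} = \frac{1}{1-\alpha} \ln \sum_{i \in C_k} \left( \frac{p(i)}{\omega_k} \right)^{\alpha}$$
where $k$ indexes the classes from 0 to $n$, and $\omega_k$ is the cumulative probability for class $k$.
The total Renyi's entropy for multi-level thresholding is calculated using the following equation:
$$T(t_1, \dots, t_n) = \sum_{k=0}^{n} H_\alpha^{C_k}$$
The selected optimal thresholds should meet the following condition:
$$(t_1^*, \dots, t_n^*) = \arg\max T(t_1, \dots, t_n)$$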
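A short Python sketch of the multi-level Renyi criterion is given below. It assumes 0-based gray levels, classes that include their upper threshold, and a hypothetical function name `renyi_objective`; it is meant as an illustration of the sum of per-class Renyi entropies, not as a reference implementation.

```python
import math


def renyi_objective(p, thresholds, alpha=0.5):
    """Sum of per-class Renyi entropies of order alpha.

    p[i] is the probability of gray level i (0-based). Class 0 covers
    levels 0..t_1, class k covers t_k+1..t_{k+1}, and the last class
    ends at level L-1.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    bounds = [-1] + sorted(thresholds) + [len(p) - 1]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        cls = p[lo + 1:hi + 1]
        w = sum(cls)                               # class probability
        if w <= 0.0:
            continue                               # skip empty classes
        s = sum((pi / w) ** alpha for pi in cls if pi > 0.0)
        total += math.log(s) / (1.0 - alpha)
    return total


# For a uniform class of n levels the Renyi entropy is ln(n) for any
# valid alpha, which gives a simple sanity check.
print(renyi_objective([1 / 8] * 8, [3], alpha=2.0))
```

The uniform-histogram check is useful when experimenting with different values of alpha, because the expected result does not depend on the chosen order.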
4.2.7. Meta-Heuristic Algorithms (MAs) in MLT
Once an objective function (like Otsu, Kapur, Tsallis entropy, etc.) is chosen, meta-heuristic algorithms are employed to find the optimal set of threshold values that either maximize or minimize this objective function. The general framework (as depicted in Figure 8) involves:
- Input Original Image: The raw image to be segmented.
- Define Threshold Number: The user or an adaptive mechanism specifies the number of desired thresholds.
- Grayscale Transformation: If the input is a color image, it is converted to grayscale (or its channels are processed independently).
- Histogram Creation: The grayscale histogram is generated, showing the pixel intensity distribution.
- Meta-Heuristic Optimization: An MA (e.g., PSO, WOA, GA, DE, etc.) is initialized with a population of candidate threshold sets.
- Objective Function Evaluation: For each candidate set of thresholds, the chosen objective function (e.g., Kapur's entropy or Otsu's method) is calculated. The output of this function determines the fitness of the candidate solution.
- Iterative Update: The MA iteratively updates its population of candidate threshold sets based on their fitness values, using its specific rules (e.g., particle movement in PSO, crossover/mutation in GA). This process continues until a stopping criterion (e.g., maximum iterations, convergence) is met.
- Optimal Thresholds: The best set of thresholds found by the MA is selected.
- Segmented Image Output: The image is then segmented using these optimal thresholds.
The use of MAs is crucial because MLT often involves searching a high-dimensional, non-linear, and possibly multimodal search space (especially for high numbers of thresholds). Exhaustive search becomes impractical there, whereas meta-heuristics can locate near-optimal thresholds efficiently, although without a guarantee of reaching the global optimum.
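The framework above can be sketched with a bare-bones particle swarm optimizer that maximizes Otsu's between-class variance. Everything specific in this example is an assumption chosen for illustration: the function names, the inertia and acceleration constants, the population size, and the toy histogram. It is not the configuration of any reviewed paper.

```python
import random


def between_class_variance(p, thresholds):
    """Otsu fitness: sum over classes of omega_k * (mu_k - mu_T)^2."""
    mu_T = sum(i * pi for i, pi in enumerate(p))
    bounds = [-1] + sorted(thresholds) + [len(p) - 1]
    f = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        levels = range(lo + 1, hi + 1)
        w = sum(p[i] for i in levels)
        if w > 0.0:
            mu = sum(i * p[i] for i in levels) / w
            f += w * (mu - mu_T) ** 2
    return f


def pso_thresholds(p, k, n_particles=20, n_iter=60, seed=0):
    """Search for k thresholds with a minimal PSO (illustrative settings)."""
    rng = random.Random(seed)
    hi = len(p) - 2  # largest valid 0-based threshold

    def decode(x):   # continuous position -> sorted integer thresholds
        return sorted(min(hi, max(0, round(v))) for v in x)

    def fitness(x):
        return between_class_variance(p, decode(x))

    pos = [[rng.uniform(0, hi) for _ in range(k)] for _ in range(n_particles)]
    vel = [[0.0] * k for _ in range(n_particles)]
    pbest = [x[:] for x in pos]
    pbest_fit = [fitness(x) for x in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration (assumed values)
    for _ in range(n_iter):
        for j in range(n_particles):
            for d in range(k):
                r1, r2 = rng.random(), rng.random()
                vel[j][d] = (w * vel[j][d]
                             + c1 * r1 * (pbest[j][d] - pos[j][d])
                             + c2 * r2 * (gbest[d] - pos[j][d]))
                pos[j][d] = min(hi, max(0.0, pos[j][d] + vel[j][d]))
            fit = fitness(pos[j])
            if fit > pbest_fit[j]:          # update personal best
                pbest[j], pbest_fit[j] = pos[j][:], fit
                if fit > gbest_fit:         # update global best
                    gbest, gbest_fit = pos[j][:], fit
    return decode(gbest), gbest_fit


hist = [5, 10, 5, 0, 5, 10, 5, 0, 5, 10, 5]   # made-up three-mode histogram
p = [h / sum(hist) for h in hist]
print(pso_thresholds(p, 2))
```

Swapping the fitness function for Kapur's, Tsallis, or another criterion changes the objective without touching the search loop, which is why the reviewed papers can pair one optimizer with several objective functions.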
5. Experimental Setup
The paper is a survey, so it reviews the experimental setups of the articles it analyzed rather than presenting its own experimental setup. This section summarizes the common practices and findings regarding datasets, evaluation metrics, and baselines used in the reviewed literature.
5.1. Datasets
The reviewed papers utilized a wide variety of datasets, reflecting the diverse applications of multi-level thresholding segmentation. This includes both small-scale and large-scale datasets, and images from different domains and modalities.
The following are the results from Table 9 of the original paper, detailing the datasets used:
| Dataset | Data Type | Samples |
|---|---|---|
| COVID-19 | CT images | 163 |
| TCIA | MRI, CT and digital histopathology | 4 |
| Biomedical images | Digital images | 5000 |
| Insulator infrared images | Real insulator infrared images | 500/201 |
| DCE-MRI | MRIs (2D) | 30 |
| Berkeley segmentation dataset | Ground truth images | 500/300 |
| Weighted brain magnetic resonance images | MRIs | 2 |
| Plant canopy image & Satellite images | Phenotype image & remote sensing data | 2/8 |
| Stomach CT images | CT | 4 |
| Pleiades satellite imagery | Multi-spectral images | 2 |
| CheX aka CheXpert, OpenI, Google, PC aka PadChest, NIH aka Chest X-ray14, MIMIC-CXR | COVID-19 CT images | 13 |
| SCI image (Taken from Orange image diagnostic centre) | MRIs | 500 |
| Landsat Imagery Courtesy of NASA Goddard Space Flight Center and U.S. Geological Survey 41,004,176,035, 225,017, 241,004, 385,028, 388,016, 2092, 14,037, 55,067, 169,012 | Natural images | 10 |
| DMR-IR | Thermography images | 10 |
| ABIDE (Autism Brain Imaging Data Exchange, International Neuroimaging Data-sharing) | T2-weighted MRI axial brain images | 12 |
| Eyes, Liver, Head and Tongue | Medical images | 6 |
| USC-SIPI | Grayscale images (uint8) | 5 |
| BT10 and BRATS 2019 | T1-weighted contrast-enhanced (T1c) images & FLAIR brain images | 10 |
| Kodim | Color images (JPEG) | 3 |
| Plant leaf disease | Tomato leaf images | 5512 |
| Zigong dinosaur lantern festival | Color images | 4 |
| Kaggle brain MRI (Normal class images & Tumor images) | MRIs | 98/155 |
| Random samples from earthobservatory.nasa.gov | Satellite images | 10 |
| NASA landsat image | Color images (JPG) | 6 |
| Digital Database for Screening Mammography (DDSM) | DICOM | 2500 |
| Real-time DICOM CT images of the abdomen | DICOM | 7 |
| Plant stomata images | Color images | 2 |
| CASIA v3 Interval, MMU1, and UBIRIS | Digital images | 4195 |
| Dental radiographs | Digital images (X-Ray) | 12 |
| MIAS | DICOM | 322 |
| Histopathological image | Digital images | 10 |
| Skin cancer images | Digital images | 10 |
| Art Explosion | Grayscale images | 8 |
Characteristics and Choices:
- Diversity: The datasets span medical imaging (e.g., COVID-19 CT images, MRI brain images, DCE-MRI, DDSM mammograms, dental radiographs, histopathological images), remote sensing (e.g., satellite images, Pleiades satellite imagery), natural images (e.g., Berkeley segmentation dataset, USC-SIPI), plant pathology (e.g., tomato leaf images, plant stomata images), and industrial applications (e.g., insulator infrared images). This variety demonstrates the broad applicability of MLT techniques.
- Scale: Sample sizes range from very small (e.g., 2 plant stomata images) to very large (e.g., 5512 tomato leaf images, 5000 biomedical images, 2500 DDSM mammograms). This indicates that MLT methods are adapted for various data scales, although small datasets can limit generalizability.
- Domain Specificity: The extensive use of medical imaging datasets highlights the critical role of MLT in healthcare, where precise segmentation is crucial for diagnosis and treatment planning. Datasets like COVID-19 CT images emphasize the need for robust segmentation methods that are invariant to contrast and lighting variations in real-world clinical scenarios.
- Challenges: The diverse characteristics of these datasets (e.g., different image sizes, textures, content, presence of noise or inhomogeneities) underscore the challenges in developing MLT methods that are both accurate and robust across various imaging conditions.
5.2. Evaluation Metrics
The reviewed papers employ a range of evaluation metrics to quantitatively assess the performance and quality of multi-level thresholding segmentation methods.
The following figure (Figure 10 from the original paper) presents a comprehensive analysis of the prevailing evaluation metrics:
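The figure is a bar chart showing how often each evaluation metric is used in the reviewed multi-level thresholding studies. It lists scores for several metrics (such as PSNR, SSIM, CPU time, and FSIM), reflecting the importance of each metric in the research. Bar height represents the value for each metric; PSNR and SSIM score highest, at 59 and 45 respectively, showing their prominent role in image processing. Other metrics, such as MSE, accuracy, and stability, score lower.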
Here's a breakdown of the most frequently used metrics:
5.2.1. Mean Square Error (MSE)
- Conceptual Definition: Mean Square Error (MSE) is a common metric in image processing used to quantify the average squared difference between the pixel values of an original (reference) image and a segmented (test) image. It measures the quality of an estimator. A lower MSE value indicates better similarity between the two images and thus better segmentation accuracy.
- Mathematical Formula: For two images, $f(x, y)$ (original image) and $g(x, y)$ (test image), with dimensions $M \times N$:
$$MSE = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ f(x, y) - g(x, y) \right]^2$$
- Symbol Explanation:
  - $M$: Number of rows in the image.
  - $N$: Number of columns in the image.
  - $f(x, y)$: Pixel intensity value at row $x$ and column $y$ in the original image.
  - $g(x, y)$: Pixel intensity value at row $x$ and column $y$ in the test (segmented) image.
5.2.2. Peak Signal to Noise Ratio (PSNR)
- Conceptual Definition: Peak Signal to Noise Ratio (PSNR) is a metric often used to measure the quality of reconstruction of lossy compression codecs or to compare segmented images against a ground truth. It defines the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. A higher PSNR value (expressed in decibels, dB) indicates a higher quality image, meaning less noise or distortion relative to the maximum possible signal.
- Mathematical Formula: PSNR is expressed in decibels and is calculated based on the MSE and the maximum possible pixel value, which follows from the bit depth (BD). For an 8-bit image, the maximum pixel value is $2^8 - 1 = 255$.
$$PSNR = 10 \log_{10} \left( \frac{(2^{BD} - 1)^2}{MSE} \right)$$
- Symbol Explanation:
  - $BD$: The bit depth of the image (e.g., 8 for an 8-bit grayscale image). The term $2^{BD} - 1$ represents the maximum possible pixel value (e.g., 255 for 8-bit images).
  - $MSE$: The Mean Square Error between the original and segmented images.
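The two metrics above are easy to compute directly. The following Python sketch uses plain nested lists as images; the function names `mse` and `psnr` and the tiny 2x2 example are assumptions made for illustration.

```python
import math


def mse(f, g):
    """Mean squared pixel difference between two equally sized images."""
    rows, cols = len(f), len(f[0])
    squared = sum((f[x][y] - g[x][y]) ** 2
                  for x in range(rows) for y in range(cols))
    return squared / (rows * cols)


def psnr(f, g, bit_depth=8):
    """Peak signal-to-noise ratio in decibels."""
    err = mse(f, g)
    if err == 0:
        return math.inf                 # identical images: no noise at all
    peak = 2 ** bit_depth - 1           # 255 for 8-bit images
    return 10 * math.log10(peak ** 2 / err)


original = [[0, 255], [10, 20]]
segmented = [[0, 250], [10, 25]]
print(mse(original, segmented), psnr(original, segmented))
```

Returning infinity for identical images avoids a division by zero; some implementations cap the value instead, so results for perfect matches may differ between tools.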
5.2.3. Structural Similarity Index Method (SSIM)
- Conceptual Definition: Structural Similarity Index Method (SSIM) is a perceptually-based metric that quantifies the similarity between two images. Unlike MSE or PSNR, SSIM attempts to model the human visual system (HVS) by considering three key factors: luminance (brightness), contrast, and structure. A higher SSIM value indicates greater similarity between the images; values are typically reported in the range 0 to 1, with 1 representing identical images.
- Mathematical Formula: SSIM is defined as a product of three comparison functions: luminance comparison $l(f, g)$, contrast comparison $c(f, g)$, and structure comparison $s(f, g)$.
$$SSIM(f, g) = l(f, g) \cdot c(f, g) \cdot s(f, g)$$
The individual comparison functions are:
$$l(f, g) = \frac{2 \mu_f \mu_g + C_1}{\mu_f^2 + \mu_g^2 + C_1}, \qquad c(f, g) = \frac{2 \sigma_f \sigma_g + C_2}{\sigma_f^2 + \sigma_g^2 + C_2}, \qquad s(f, g) = \frac{\sigma_{fg} + C_3}{\sigma_f \sigma_g + C_3}$$
- Symbol Explanation:
  - $f$: Original image.
  - $g$: Segmented (test) image.
  - $\mu_f$: Mean (average) intensity of image $f$.
  - $\mu_g$: Mean (average) intensity of image $g$.
  - $\sigma_f$: Standard deviation (contrast) of image $f$.
  - $\sigma_g$: Standard deviation (contrast) of image $g$.
  - $\sigma_{fg}$: Covariance between images $f$ and $g$.
  - $C_1, C_2, C_3$: Small positive constants to prevent division by zero and provide stability (e.g., $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, $C_3 = C_2 / 2$, where $K_1 = 0.01$, $K_2 = 0.03$, and $L$ is the dynamic range of pixel values, like 255 for 8-bit images).
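The Python sketch below evaluates SSIM once over the whole image, which is a simplification: practical implementations usually compute the index in local windows and average the results. With the third constant set to half of the second, the product of the three comparison terms collapses to the two-factor expression used in the code. The function name `global_ssim` and the sample images are assumptions.

```python
def global_ssim(f, g, dynamic_range=255, k1=0.01, k2=0.03):
    """Single-window SSIM over whole images (simplified illustration)."""
    x = [v for row in f for v in row]
    y = [v for row in g for v in row]
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased sample variances and covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    # Luminance term times the merged contrast-structure term.
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [[10, 20], [30, 40]]
segmented = [[12, 18], [33, 39]]
print(global_ssim(original, original), global_ssim(original, segmented))
```

Because a thresholded image has only a few distinct intensity values, window-based SSIM can differ noticeably from this global variant, so the choice should be stated when reporting results.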
5.2.4. Feature Similarity Index Method (FSIM)
- Conceptual Definition: Feature Similarity Index Method (FSIM) assesses how similar two images are by comparing their distinctive features. It correlates well with human perception of image quality. FSIM is primarily based on two low-level features of the human visual system: Phase Congruency (PC) and Gradient Magnitude (GM). PC is effective at detecting image features (like edges and corners) regardless of lighting conditions or contrast, as it emphasizes features in the frequency domain. GM quantifies the rate of intensity change, representing image gradients. By combining these, FSIM provides a comprehensive evaluation of image similarity that considers a wide range of visual attributes and structural features.
- Mathematical Formula: The paper describes FSIM conceptually but does not provide its explicit formula. FSIM is typically calculated as:
$$FSIM = \frac{\sum_{x \in \Omega} S_L(x)\, PC_m(x)}{\sum_{x \in \Omega} PC_m(x)}$$
where $PC_m(x) = \max\left(PC_X(x), PC_Y(x)\right)$ is the maximum of the Phase Congruency maps of the two images at location $x$, and $S_L(x)$ is the local similarity map which combines Phase Congruency and Gradient Magnitude similarity. The local similarity is often calculated as:
$$S_L(x) = S_{PC}(x) \cdot S_G(x)$$
And the component similarities are:
$$S_{PC}(x) = \frac{2\, PC_X(x)\, PC_Y(x) + T_1}{PC_X^2(x) + PC_Y^2(x) + T_1}, \qquad S_G(x) = \frac{2\, G_X(x)\, G_Y(x) + T_2}{G_X^2(x) + G_Y^2(x) + T_2}$$
- Symbol Explanation:
  - $X, Y$: The two images being compared (original and segmented).
  - $\Omega$: The spatial domain (all pixel locations) of the images.
  - $x$: A specific pixel location.
  - $PC_X(x), PC_Y(x)$: Phase Congruency value at location $x$ for images $X$ and $Y$ respectively.
  - $G_X(x), G_Y(x)$: Gradient Magnitude value at location $x$ for images $X$ and $Y$ respectively.
  - $T_1, T_2$: Small positive constants to ensure stability.
5.2.5. Wilcoxon Test
- Conceptual Definition: The Wilcoxon signed-rank test is a non-parametric statistical test used to compare two related samples, matched samples, or repeated measurements on a single sample. It is used when the assumption of normality (required for parametric tests like Student's t-test) cannot be met or is not desired. The test works on the ranks of the paired differences rather than on the means directly, and assesses whether those differences are distributed symmetrically around zero. In the reviewed literature it is used to check whether the results of two algorithms differ significantly.
- Mathematical Formula: The Wilcoxon signed-rank test doesn't have a single simple formula like image quality metrics. It involves several steps:
  - Calculate the differences $d_i$ between paired observations.
  - Take the absolute values $|d_i|$ of these differences.
  - Rank the absolute differences from smallest to largest.
  - Assign the original signs to the ranks.
  - Sum the positive ranks ($W^+$) and negative ranks ($W^-$).
  - The test statistic is usually the smaller of $W^+$ and $W^-$, or a standardized Z-score derived from it.
- Symbol Explanation:
  - $d_i$: Difference between paired observations for the $i$-th pair.
  - $|d_i|$: Absolute difference.
  - $R_i$: Rank of the $i$-th absolute difference.
  - $W^+$: Sum of positive ranks.
  - $W^-$: Sum of negative ranks.
  - The exact calculation involves critical values or p-values to determine statistical significance.
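The ranking steps listed above can be sketched in pure Python as follows. The function name `wilcoxon_rank_sums`, the convention of dropping zero differences, the use of average ranks for ties, and the sample scores are assumptions for this illustration; the sketch stops at the rank sums and does not compute a p-value.

```python
def wilcoxon_rank_sums(a, b):
    """Return (W_plus, W_minus, W) for paired samples a and b."""
    # Steps 1-2: paired differences, discarding exact zeros.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    # Step 3: rank absolute differences, averaging ranks over ties.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1          # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Steps 4-5: split the ranks by the sign of the original difference.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Step 6: the statistic is the smaller of the two sums.
    return w_plus, w_minus, min(w_plus, w_minus)


# Hypothetical PSNR scores of two algorithms on five images.
algo_a = [30, 32, 35, 31, 40]
algo_b = [28, 33, 31, 31, 34]
print(wilcoxon_rank_sums(algo_a, algo_b))
```

To turn the statistic into a significance decision, it is compared against tabulated critical values or converted to a p-value, typically with a statistics package.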
5.3. Baselines
The reviewed papers extensively compare their proposed multi-level thresholding methods against a wide array of existing meta-heuristic algorithms and traditional thresholding techniques. Table 6 in the paper provides a detailed list of comparisons made.
Common baselines include:
- Other Meta-Heuristic Algorithms: Many papers compare their new or improved meta-heuristic with established MAs. The most frequently compared algorithms include Particle Swarm Optimization (PSO), Whale Optimization Algorithm (WOA), Differential Evolution (DE), Genetic Algorithm (GA), Sine Cosine Algorithm (SCA), Gray Wolf Optimization (GWO), Bat Algorithm (BA), Artificial Bee Colony (ABC), and Moth Flame Optimization (MFO). These are representative because they are widely recognized and applied in optimization problems, including image segmentation.
- Variants and Hybrid Algorithms: Comparisons often extend to variants of common MAs (e.g., QPSO - Quantum-behaved PSO, DPSO - Darwinian PSO, FODPSO - Fractional-Order Darwinian PSO) and hybrid algorithms (e.g., HHO-DE, MPAMFO, ABCSCA), which combine elements from multiple MAs to enhance performance.
- Classic Thresholding Methods: Even when meta-heuristics are involved, the core objective functions are often compared (e.g., Otsu's method vs. Kapur's entropy vs. Tsallis entropy). Sometimes, a proposed MLT method is compared against Otsu's bi-level or Gaussian Otsu methods to show the benefits of multi-level thresholding and optimization.
- Other Segmentation Techniques: In some specialized applications (e.g., medical imaging), comparisons might also be made against non-thresholding segmentation methods or classifiers, such as FCM (Fuzzy C-Means), BF (Bacterial Foraging), CNNs or SVMs when the MLT is part of a larger classification pipeline.
The choice of baselines aims to demonstrate the superiority or specific advantages (e.g., accuracy, speed, stability) of the proposed MLT technique in relation to both foundational and state-of-the-art optimization and segmentation approaches.
6. Results & Analysis
6.1. Core Results Analysis
The paper, being a survey, synthesizes findings from 79 reviewed articles (2017-2023) to provide a comprehensive overview of multi-level thresholding image segmentation. The analysis covers general characteristics, evaluation metrics, objective functions, advantages, disadvantages, and execution times.
General Characteristics (Table 6 & Figure 9):
- Computational Environment: Over 55% of the reviewed articles utilized a computing environment with 8 GB RAM or higher, indicating that MLT algorithms, especially those involving meta-heuristics and high threshold levels, often require substantial computational resources.
- Programming Language: MATLAB is overwhelmingly the most frequently used programming language (as depicted in Figure 9), suggesting its dominance in image processing research, likely due to its extensive toolboxes and ease of prototyping. Python and Java are used to a lesser extent.
- Meta-heuristic Popularity: PSO and its hybrid versions were compared 53 times, WOA 30 times, and DE 29 times, making them the most common meta-heuristic baselines or components in proposed methods.
- Case Studies: Approximately 27% of research focused on medical datasets, underscoring the high impact of MLT in medical imaging. Satellite imaging and underwater imaging ranked second and third, respectively, highlighting other crucial application areas.
The following are the results from Table 6 of the original paper, summarizing general characteristics:
Paper Environment Proposed Method Compared with Programming language Case Study [86] 2.20 GHz Pentium IVPC,4G RAM SCQPSO SunCQPSO, CCQPSO, QPSO MATLAB Medical [87] MFA FA, BFA, LFA MATLAB [88] Intel(R) core i7 PC,2.93 GHz CPU, 2 GB RAM CDPSO CCS, CHS, CPSO, CDE MATLAB Satellite Images [89] Intel Core i5, 2.5 GHz processor − GA, QGA, DE, ARKFCM MATLAB [90] Windows7-64bit Intel Core2Duo 1.66 GHz processor, 2 GB memory WOA, MFO SCA, HS, SSO, FASSO, FA MATLAB [91] Intel core -i7 CPU @ 3.40 GHz GWO PSO, BFO MATLAB [92] − − Geometrical features [93] FA MATLAB [94] windows 7 3.2 GHz CPU, 4G RAM FWA PSO, ABC, AFSA, BSO MATLAB [95] Intel(R) Core i3, 2.93 GHz CPU, 2 GB RAM SCS MCS MATLAB [96] Windows 7 Intel (R) Core i3, 2.20 GHz, 4.00 GB RAM MOPSO PSO MATLAB [97] 2.80 GHz Intel(R) core i3 processor, 2 GB RAM 2DNLMeKGSA 2DNLMGSA, 2DNLMDE, 2DNLMABC, 2DNLM-cKGSA MATLAB [98] Windows 7, 2.4 GHz CPU, 1 GB RAM VMD PSO, FCM, BF, MPS, CPS MATLAB [99] ABC DE, PSO, QPSO, ABC, gABC, IAB, QABC, OLABC, FGABC [100] a core of 2.7 GHz Intel Core i5 on a MacBook Pro, 8 GB RAM GA Python & MATLAB [101] 2.67 GHz Intel core-i5 PC SADFO HSO, BO, FFO MATLAB digital images [102] Intel Core i5-2400 Duo 3.10 GHz processor, 4 GB RAM AWDO RGA, GA, Nelder—Mead simplex, PSO, BF, ABF, WDO MATLAB Medical [103] − DSA based Otsu's, PSO based, Otsu's, Original images [104] Intel Core i7-5820 K processor 3.30 GHz, 64 GB RAM QPSO PSO, DPSO, FODPSO Medical [105] GA Otsu & Gaussian Otsu's methods Python [106] FODPSO PSO, DPSO Medical [107] Windows764bit Intel Core2Duo 1.66 GHz processor, 2 GB memory LFMVO GWO, PSO, WOA, MVO MATLAB [108] sor, 2 GB memory SS SSO, MFO, PSO, GWO, SHO, WOA, MVO, ABC, FA, HS, ABCSCA, FAABC, FASSO, ABC, SCA MATLAB [109] Windows 7-64bit Intel Core2Duo 1.66 GHz processor, 2 GB memory KnEA NSGA-III, RVEA, LMEA, IMMOEA MATLAB [110] Windows 7 Intel core 3, 2.5 GHz, 4 GB RAM ALDE hjDE, SDE, BDE MATLAB Medical [111] MS Windows 7 64bit, Intel Core i3 3.2 GHz processor MOMVO MOEAD, MOEADR, MOPSO 
MATLAB [112] MCET-HHO Medical [113] Windows 64-bit Intel i7 2.6 GHz processor,16 GB memory WOA, ALO SSO, FA, FASSO MATLAB [114] Intel Corei3 processor, 4 GB RAM, 64-bit operating system FODPSO PSO, DPSO MATLAB Medical [115] Windows 10 64-bit Pentium(R) Dual core T4500 @ 2.30 GHz,2 GB memory HHO-DE HHO, DE, SCA, BA, HSO, PSO, DA MATLAB [116] Windows 7 64-bit AMD A10-8700P processor, 8 GB RAM SAMFO-TH MVO, WOA, FPA, SCA, ACO, PSO, ABC, MFO MATLAB [117] MGOA GOA, WOA, FPA, PSO, BA [118] Intel core 2 Duo Processor 3 GHz, 2 GB RAM EMA BFO, PSO, GA, FA, HBMO, TLBO MATLAB [119] IEPO EPO, WOA, MVO, PSO, BA, 3DOtsu, FCP, 3DPCNN MATLAB [120] EKH KH I, KH II, KHIV, MFA, MGOA, BA, WCA MATLAB [121] HMS GA, PSO, DE, FF,BA, GSA, TLB [122] MPAMFO MPA, HHO, CS, GWO, GOA, SSO, PSO, MFO Medical [123] ABCSCA ABC, SCA, FASSO, SSA, WOA, GWO, SSO, FASSO, WOAPSO [124] Windows7, 3.4 GHz Intel Core i-7 CPU, 16 GB RAM BFA ABC, MFO, GWO, WOA MATLAB [125] Amazon AWS server, 20 GB RAM − Different K folds (0, 10, 20, 30, 40, 50) and classifiers (KNN, SVM, DT, NN) Java Medical [126] Windows 10 PC with a Core i5 CPU, 4 GB RAM DI-TLBO TLBO, LETLBO, ITLBO, BSA, I-TLBO, GWO Python [127] Windows7-64bit Intel Core 2 Duo 1.66 GHz processor 2 GB memory HHB, HHUB SCA, ABC, SSO, FASSO, FAABC, ABCSCA MATLAB [128] Windows10 i7-8750H 2.2 GHz processor, 8 GB memory SCA BA, FPA, PSO, WOA Underwater segmentation [129] CLAHE Different classifiers (i.e., KNN, SVM, DT, RF, CART) Medical [130] WOA CSA, GOA Satellite images [131] Windows 7 AMD A8-7410 APU with AMD Radeon R5 Graphics @2.20 GHz, LCBMO MABC, CSA, GOA, CS, EO, MPA, IDSA, TLBO, WOA-TH, BDE MATLAB [132] A commercially available computer workstation, Synapse 3D version 3.5 − Medical [133] BMPA NSGA-III, MOPSO, MOMVO, MOEA-DD, MPA, NPSO, GSOBFO, KnEAE, FCM, PCNN, DLA, SSD, OAD-BSP, AFD Electrical Engineering [134] MPA-OBL LSHADE_SPACMA-OBL, CMA_ES_OBL, DE-OBL, HHO-OBL, SCA-OBL, SSA-OBL, MPA [135] Windows 7 (64-bit) Intel (R) Core i3-8130U 2.20 GHz, 4 GB RAM 
SPBO PSO, DA, SMA, MVO, GOA, HMRF, IMRF, CMRF MATLAB Medical [136] Windows 10 PC, Core i7 CPU, 8 GB RAM DASMA MAs, SSA, DE, IWOA, CLPSO, IGWO, CS, MFO MATLAB Medical [137] Windows 7 Pentium(R) Dual core T4500 @ 2.30 GHz, 2 GB RAM KHO BF, PSO, GA, MFO MATLAB [138] LSHADE JADE, SHADE, DE [139] Windows 10 64bit, Intel Core i7, 8 GB RAM IChOA GWO, MFO, WOA, SCA, SSA, EO, ChOA MATLAB Medical [140] 8th generation intel processor core i5-8250u, with a clock speed 1.60 GHz,8 GB internal memory HSA HAS_O, HAS_K, PSO, EMO_O, EMO_K MATLAB [141] Windows 8.1- 64bit Intel CoreI5 processor, 8 GB memory BWO GWO, MFO, WOA, SCA, SSA, EO, MATLAB [142] Intel(R) Core i7 PC 2.93 GHz CPU, 2 GB RAM ACS CS MATLAB Satellite images [143] Windows 10 Intel (R) Core i5-7200 CPU @ 2.50 GHz-2.71 GHz, 8 GB RAM OB-L-EO EO, SCA, HS, WOA, MFO, SSO, FASSO, FA MATLAB [144] Windows 7 Intel Core i5 3230MCPU@26 GHz, 12 GB RAM CLWOA WOA, SOS, ABC, QPSO, BAT MATLAB [145] Windows 7 Intel Core i5-3230 M CPU@2.6 GHz, 12 GB RAM HAQPSO PSO, QPSO,AQPSO MATLAB [146] i7-8750H 2.2 GHzprocessor, 8 GB RAM WOA BA, FPA, MFO, MSA, PSO, WWO MATLAB Underwater segmentation [147] Windows Server 2008 R2, Intel (R) Xeon(R) CPU E5-2660 v3 (2.60 GHz), 16 GB RAM ASMA CMFO, IWOA, OBLGWO, RCBA, ALCPSO, CWOA, GWO, SMA, WOA, DE, BA, ABC, SSA, CS, BLPSO, SCA, IGWO, PSO, LGCMFO, SCADE, CEBA MATLAB Medical [148] Intel core i7-3770 @ 3.40 GHz CPU, 8 GB RAM CSA WDO, BFO, BDE, ABC MATLAB [149] Intel(R) Xeon (R) CPU E5-2660 v3 @ 2.60 GH, 16 GB RAM MDE DE, SMA, MVO, CS, HHO, CGPSO, IGWO, CLPSO, mSCA, MGSMA, CESCA, CMFO, CWOA MATLAB Medical [150] Intel(R) Xeon(R) Gold 6230R CPU @ 2.10 GHz, 64.0 GB RAM RAV-WOA ALCPSO, MFO, GWO, WOA GA, PSO, GWO, SSA, WOA, LSSA MATLAB [151] Windows Server 2008R2, Intel(R) Xeon(R) CPUE5-2660v3 @2.60 GHz, 16 GB RAM CCMVO MVO, DE, SCA, HHO, CBA, SCADE, IGWO, ACWOA, ASCA_PSO, WOA, SCA, HHO, BLPSO, IGWO, IWOA MATLAB Medical [152] Intel(R) Core i5-6500 CPU@ 3.20 GHz, 8 GB RAM − Watermarking approaches [153] Intel(R) 
i3 processor 1.8 GHz clock speed, 4 GBRAM CS IDE, MMFO, Modified Bat, cuckoo search MATLAB Medical [154] Intel(R) Core i7-4700MQ CPU 2.40 GHz, 32 GB RAM HWOA SCA, WOA, MSSA, IMPA, CS, CSMC, EO MATLAB [155] BMO GWO, EMO, HSO [156] Intel i5-9300H CPU @2.4 GHz, 8 GB RAM Fusion based technique + BAT algorithm PSO, FF,BBO, BAT, FPA, GWO MATLAB Medical [157] HBA, CBOA [158] HPSO MASI-ENG-MSA [159] OSAFEM-PLDD HCF_QSVM, ACNN, CNN_LVQ, HCF_SVM, VGG_16 CNN, INCEPTION V3 Plant Leaf Disease Diagn [160] Ryzen 5 processor,8 GB RAM TsNMRO SHADE, OB-L-EO, SOGW, IWOA, DADE, JADE, NMRA, TSO, MA-ES, GWO, HCSO, BPSO, HIWOA, MBA, IFAGA, PDO MATLAB [161] Windows10, AMDRyzen9390012-Core 3.09 GHz, RAM 32 GB CRWOA MATLAB [162] Intel(R) Xeon(R) CPU E5-2620 v4@2.10 GHz, 16 GB RAM HLDDE DE, HHO, SCA, mSCA, IGWO, IWOA, SCADE, BWOA, CEBA, BA, ACOR, TSA, MVO, MFO, GBO, HBO, PO MATLAB Medical [163] Windows 7 Enterprise 64-bit PC, Intel Core i7-4510U, 2.6 GHz CPU, 8.00 GB RAM AVOA FFA, SMO, WOA, MPA MATLAB [164] FOL-AOA AOA, RSA, ChOA, SCA, TLBO −
Objective Functions (Table 7):
-
Otsu's Entropy(orbetween-class variance) andKapur's Entropyare by far the most widely usedobjective functions, appearing in 44 and 34 papers respectively. This indicates their proven effectiveness and popularity inMLT. -
Tsallis Entropy,Cross Entropy,Renyi's Entropy,Fuzzy Entropy, andShannon's Entropyare used less frequently, suggesting potential areas for further exploration.The following are the results from Table 7 of the original paper, detailing objective functions:
| Objective Functions | Papers |
|---|---|
| Kapur's entropy | [87, 9, 101, 102, 107, 109, 111, 115, 116, 119–121, 126, 128, 130, 133, 134, 137–139, 141, 147, 19, 15, 153, 155–157, 159, 161, 162] |
| Otsu's entropy | [8, 87, 90–93, 95, 98–100, 102–111, 113–116, 120, 121, 123–127, 129, 134, 137–143, 150, 152, 155, 156, 161] |
| Tsallis entropy | [88, 89, 92, 109, 117, 120, 143, 153, 156] |
| Shannon's entropy | [135, 153] |
| Renyi's entropy | [89, 97, 109, 137, 145] |
| Cross entropy | [87, 88, 94, 112, 118, 148, 164] |
| Fuzzy entropy | [109, 122, 123] |
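To make the two dominant objective functions concrete, the following minimal sketch evaluates Otsu's between-class variance and Kapur's entropy for a given set of thresholds on a gray-level histogram. It uses the standard textbook definitions; the function names, the convention that a threshold `t` starts a new class at level `t`, and the toy histogram are illustrative assumptions rather than anything specified in the survey.

```python
import math

def class_ranges(thresholds, levels):
    """Return [lo, hi) gray-level ranges; a threshold t starts a new class at level t."""
    edges = [0] + sorted(thresholds) + [levels]
    return list(zip(edges[:-1], edges[1:]))

def otsu_between_class_variance(hist, thresholds):
    """Otsu objective (to maximize): sum over classes of w_k * (mu_k - mu_total)^2."""
    total = sum(hist)
    p = [h / total for h in hist]
    mu_total = sum(i * pi for i, pi in enumerate(p))
    score = 0.0
    for lo, hi in class_ranges(thresholds, len(hist)):
        w = sum(p[lo:hi])                      # class probability
        if w > 0:                              # empty classes contribute nothing
            mu = sum(i * p[i] for i in range(lo, hi)) / w
            score += w * (mu - mu_total) ** 2
    return score

def kapur_entropy(hist, thresholds):
    """Kapur objective (to maximize): sum of Shannon entropies of the per-class distributions."""
    total = sum(hist)
    p = [h / total for h in hist]
    entropy = 0.0
    for lo, hi in class_ranges(thresholds, len(hist)):
        w = sum(p[lo:hi])
        if w > 0:
            probs = (p[i] / w for i in range(lo, hi) if p[i] > 0)
            entropy -= sum(q * math.log(q) for q in probs)
    return entropy

# Hypothetical 6-level histogram with two well-separated modes.
hist = [5, 5, 0, 0, 5, 5]
print(otsu_between_class_variance(hist, [3]))  # threshold between the modes
print(otsu_between_class_variance(hist, [1]))  # threshold cutting through a mode
print(kapur_entropy(hist, [3]))
```

Both functions take the same inputs, so an optimizer can swap one objective for the other without any other change.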
Advantages & Disadvantages (Table 8 & Figures 11, 12):
- Advantages (Figure 11): The key strengths of MLT methods include better performance in terms of metrics (e.g., PSNR, SSIM), better objective function scores, high convergence speed, stable segmentation, optimal thresholding, balanced exploration & exploitation (for meta-heuristics), enhanced search capacity, and better segmentation quality. Color-Image-Thresholding and Gray-Scale-Thresholding capabilities are also noted.
- Disadvantages (Figure 12): Common limitations include static thresholding settings (lack of automatic threshold level determination), inadequate evaluation (limited comparisons), time complexity (especially for high threshold levels), limited data points (small datasets), local optima trap (for meta-heuristics), noise sensitivity, inefficient advanced segmentation (for complex images), feature constraints, multi-objective limitation, and limited to grayscale testing.

The following figure (Figure 11 from the original paper) summarizes the advantages over reviewed papers:
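This image is a schematic that summarizes the advantages and disadvantages of multi-level thresholding segmentation methods. The disadvantages include inefficient advanced segmentation, inadequate comparison, unknown time complexity, and noise sensitivity, among others, and are intended to guide future research directions.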
The following figure (Figure 12 from the original paper) summarizes the disadvantages over reviewed papers:

Execution Time Analysis (Table 10):
- Execution times varied significantly, from milliseconds to over 100 seconds, depending on the algorithm, number of thresholds, image complexity, and computational environment.
- Paper [91] noted that Kapur thresholding (34.2 ms at the minimum threshold level of 2) was slightly more computationally expensive than Otsu thresholding (28.3 ms at the same level) for their specific images and setup, though this varies across studies.

The following are the results from Table 10 of the original paper, detailing execution time achieved by reviewed papers:
Paper (Min- Max) Threshold Average CPU Time (Min-Max) Reference Dataset [86] Min Max NR NR Four stomach CT images [87] Min 4 7 NR Ten standard test color images [88] Max Min 3 MCE: 5.4487 and Tsallis entropy: 15.4447 (seconds) Ten different chaotic maps Max 5 MCE: 7.995 and Tsallis entropy: 17.8756 (seconds) [89] Min Max 2 5 ≤ 5(s) Twenty images from [90] Min 2 WOA: 3.74 and MFO: 3.57 (seconds) Eight grayscale images from BSD Max 5 WOA: 4.78 and MFO: 5.60 (seconds) [91] Min Max 2 5 Kapur: 34.2 and Otsu: 28.3 (milliseconds) Five images from USC-SIPI database and Three images from BSD500 Kapur: 147.3 and Otsu: 106.8 (milliseconds) [92] Min NR NR NR [93] Max Min NR NR Two sample images Max [94] Min 2 1.06 (seconds) Three sample images from BSD dataset [95] Max Min 4 1.14 in (seconds) Two sample images (Lena, Port) Max 2 7.779 (NR) [96] 7 11.992 (NR) Four sample images [97] Min 7 6261 (seconds) sample images from BSD300 Max 10 6527.5 (seconds) [98] Min NR NR Twenty sample images Max [99] Min 2 < 1.23 (seconds) Two sample images from BSD300 [100] Max 4 NR Three sample images Min NR Max [101] Min 2 NR Six images from BSD and Six medical images of eyes, liver, head and tongue Max 5 [102] Min 2 Otsu: 3.6812 and Kapur: 4.2291 (seconds) T2-weighted MRI brain images Max 5 Otsu: 6.1672 and Kapur: 7.7805 (seconds) [103] Min 2 NR CASIA v3 interval and UBIRIS and MMU1 Max 3 [104] Min 6 3.16 (seconds) Twelve dental radiographs images Max [105] Min NR NR Set of coins, Cameraman, Circles with different colors an Max Soil sample [106] Min 5.1941 (NR) Three Mammogram images 2 Max 8 8.5295 (NR) [107] Min 2 0.1779 (seconds) Four sample images from BSD500 Max 5 0.4053 (seconds) [108] Min 2 0.37 (seconds) Six images from BSD300 Max 30 0.57 (seconds) [109] Min 2 Around 250 (seconds) Six grayscale images from BSD Max 20 [110] Min 4 1.3090 (seconds) MRI brain images from ABIDE Max 6 1.3339 (seconds) [111] Min 2 32.66 (NR) Eleven grayscale images from BSD Max 20 105.30 (NR) [112] Min 4 CC: 
2.035 and MLO: 12.129 (seconds) Digital Database for Screening Mammography (DDSM) Max 12 CC: 6.338 and MLO: 97.847 (seconds) 2,500 studies [113] Min 2 NR Eight grayscale images from BSD Max 5 [114] Min 2 8.5238 (seconds) DICOM CT images Max 5 [115] Min 4 NR Five images from Berkeley (BSD) and five satellite images Max 12 [116] Min 4 Otsu: 1.1693 and Kapur: 1.5409 (seconds) Six color images taken from USC-SIPI and Berkeley segmentation dataset (BSDS500) and Four satellite images Max 10 Otsu: 1.3529 and Kapur: 2.5405 (seconds) [117] Min 4 NR Eight color test images from BSD300 and plant stomata images Max 12 [118] Min 2 8.6888 (seconds) Ten images from BSD and 2 weighted brain magnetic resonance images Max 5 10.960 (seconds) [119] Min 4 2.237 (NR) BSD dataset, Satellite images and plant canopy images Max 12 [120] Min 3 NR Ten images from Berkeley (BSD) Max 6 [121] Min 2 NR Twelve Berkeley images (BSD) and 256 grey levels Max 5 [122] Min 6 NR Ten images and CheX aka, OpenI, Google, PC aka Pad- Chest, NIH aka Chest X-ray14, and MIMIC-CXR Max 25 [123] Min 2 1.9 (seconds) Eight images from the Art Explosion database and eleven images from BSD Max 5 4.0 (seconds) [124] Min 2 NR Eight grayscale images from USC-SIPI Max 5 NR [125] Min NR SCI image database taken from Orange image diagnostic centre Max [126] Min 4 Kapur: 0.273 and Otsu: 0.248 (seconds) Eight sample images from BSD dataset Max 5 Kapur: 0.3 and Otsu: 0.264 (seconds) [127] Min 6 0.253 (seconds) Twelve sample images from BSD Max 30 0.402 (seconds) [128] Min 2 2.53.5 (seconds) Six original test images Max 6 3.0661 (seconds) [129] Min NR NR Kaggle brain MRI dataset Max [130] Min 8 NR Ten satellite images from www.earthobservatory.nasa.gov Max 10 [131] Min 4 2.941 (seconds) Two sets of twelve color images are selected from BSD an NASA landsat image Max 16 4.209 (seconds) [132] Min NR NR COVID-19 CT images Max [133] Min 4 36.83 (seconds) 201 insulator infrared images Max 20 47.01 (seconds) [134] Min 2 Otsu: 0.6385 
and Kapur: 1.0423 (NR) Samples from BSD dataset Max 5 Otsu: 1.6707 and Kapur: 2.4884 (NR) [135] Min NR 2.4959 (seconds) 300 Sagittal T2-Weighted DCE-MRI Max 2D slices [136] Min 5 NR BSD and medical images of COPD Max 8 [137] Min 2 Kapur: 2.0507 and Otsu: 2.0408 (seconds) Six images from BSD300 Max 5 Kapur: 2.1022 and Otsu 2.1005 (seconds) [138] Min 2 NR Six sample images Max 5 [139] Min 2 NR Thermography images (DMR-IR) Max 5 [140] Min 2 9.440 (seconds) Eight sample images Max 5 18.494 (seconds) [141] Min 2 NR Ten sample images Max 5 [142] Min 5 Otsu: 7.1835 and Tsallis: 5.6704 (NR) Three satellite images Max 11 Otsu: 10.8379 and Tsallis: 8.1308 (NR) [143] Min 2 NR Eight grayscale images from BSD Max 5 [144] Min 2 NR Two images from BSD300 and four color images from Zigong dinosaur lantern festival Max 5 [145] Min 2 NR Three grayscale images from USC-SIPI and a sport grayscale image Max 5 2.3824 (seconds) Fourteen test images selected from the experimental pool [146] Min Max 4 37.540 (seconds) of Harbin Engineering University [147] Min 8 NR BSDS500 Max 2 [148] 20 Twenty complex Background crop images Min 2 9.2229 (seconds) BIDC images Max 16 11.0381 (seconds) [149] Min 2 NR Max 20 4 grayscale images [150] Min 2 0.1427 (seconds) Max 10 0.5333 (seconds) COVID-19 dataset [151] Min 2 NR Max 20 [152] Min NR NR TCIA dataset [153] Max 5000 biomedical images and 250 standard test images Min 3 6.1698 (seconds) Max 9 12.1628 (seconds) [154] Min 3 NR Ten images from Berkeley dataset Max 70 [155] Min 2 NR Three sample images Max 5 [156] Min NR NR Ten images from brain tumor datasets [157] Min 4 NR Nine standard benchmark images Max 7 [158] Min Max 3 10 80.64 (NR) Sample images from Kodim and Berkeley datasets 84.35 (NR) [159] Min NR NR Plant leaf disease dataset Max [160] Min 3 NR Ten benchmark images with diverse features and complexities Max 7 [161] Min 2 Otsu: 0.2 and Kapur: 0.3 (seconds) Berkeley segmentation dataset Max 8 Otsu: 0.7 and Kapur: 0.8 (seconds) [162] Min 2 NR 
Seven 512×512 pixels IDC images obtained by hematoxylin -eosin staining Max 6 [163] Min 5 6.0653 (NR) Landsat Imagery Courtesy of NASA Goddard Space Flight Max 11 8.9399 (NR) Center and the U.S. Geological Survey dataset [164] Min 4 NR Six grayscale images from BSD Max 16
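The dependence of execution time on the number of thresholds can be illustrated with an exhaustive search, whose number of candidate threshold sets grows combinatorially. The sketch below is only an illustration under assumed conditions (a 16-level toy histogram, Otsu's between-class variance as the objective, hypothetical function names); it does not reproduce any timing from Table 10, and measured times will differ by machine.

```python
import time
from itertools import combinations

def between_class_variance(p, thresholds):
    """Otsu objective for a probability vector p and sorted thresholds."""
    edges = [0] + list(thresholds) + [len(p)]
    mu_total = sum(i * pi for i, pi in enumerate(p))
    score = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = sum(p[lo:hi])
        if w > 0:
            mu = sum(i * p[i] for i in range(lo, hi)) / w
            score += w * (mu - mu_total) ** 2
    return score

def exhaustive_search(hist, k):
    """Evaluate every sorted k-tuple of thresholds; return (best, evaluations, seconds)."""
    total = sum(hist)
    p = [h / total for h in hist]
    start = time.perf_counter()
    best, best_score, evaluations = None, -1.0, 0
    for candidate in combinations(range(1, len(hist)), k):
        score = between_class_variance(p, candidate)
        evaluations += 1
        if score > best_score:
            best, best_score = candidate, score
    return best, evaluations, time.perf_counter() - start

# Hypothetical bimodal histogram with 16 gray levels.
hist = [0] * 16
for level in (2, 3, 12, 13):
    hist[level] = 10

for k in (1, 2, 3):
    best, evaluations, seconds = exhaustive_search(hist, k)
    print(f"k={k}: {evaluations} candidates, {seconds:.4f} s, best={best}")
```

With 256 gray levels the same loop would need roughly 2.7 million evaluations for three thresholds, which is the cost that meta-heuristic search is meant to avoid.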
6.2. Data Presentation (Tables)
The paper uses numerous tables to summarize information from the reviewed articles.
The following are the results from Table 1 of the original paper, providing symbols and related definitions:
| Symbol | Definition |
|---|---|
| ABC | Artificial Bee Colony |
| ABCSCA | Artificial Bee Colony with Sine-Cosine Algorithm |
| ABF | Adaptive Bacterial Foraging |
| ACO | Ant Colony Optimization |
| ACS | Adaptive Cuckoo Search |
| ALDE | Adaptive Differential Evolution with Levy Distribution |
| ALO | Antlion Optimization Algorithm |
| AS | Association Strategy |
| ASMA | An Improved Slime Mould Algorithm |
| AVOA | African Vultures Optimization Algorithm |
| AWDO | Adaptive Wind Driven Optimization |
| BA | Bat Algorithms |
| BGS | Boltzmann-Gibbs |
| BLT | Bi-Level Thresholding |
| BSD | Berkeley Segmentation Dataset |
| ChOA | Chimp Optimization Algorithm |
| CMRF | Conventional Markov Random Field |
| CNN | Convolutional Neural Network |
| CP | Contrast Pattern-based classification |
| CPS | Cross-Point Search |
| CRWOA | Whale Optimization Algorithm with Combined mutation and Removing similarity |
| CSA | Cuckoo Search Algorithm |
| CSO | Chicken Swarm Optimization |
| DCE-MRI | Dynamic Contrast-Enhanced Magnetic Resonance Imaging |
| DE | Differential Evolution |
| DM | Diffusion Mechanism |
| DS | Differential Search |
| EAs | Evolutionary Algorithms |
| EM | Expectation Maximization |
| EMA | Exchange Market Algorithm |
| EO | Equilibrium Optimizer |
| EPO | Emperor Penguin Optimization |
| FASSO | Fuzzy Adaptive Swallow Swarm Optimization |
| FCNs | Fully Convolutional Networks |
| FF | Fuzzy Filtering |
| FODPSO | Fractional-Order Darwinian Particle Swarm Optimization |
| FPA | Flower Pollination Algorithm |
| FSIM | Feature Similarity Index Method |
| GA | Genetic Algorithm |
| GAC | Geodesic Active Contour |
| GANs | Generative Adversarial Networks |
| GGOs | Ground-Glass Opacities |
| GM | Gradient Magnitude |
| GOA | Grasshopper Optimization Algorithm |
| GWO | Gray Wolf Optimization |
| HB | Human-Based |
| HHO | Harris Hawks Optimization |
| HMRF | Hidden Markov Random Field |
| HVS | Human Visual System |
| IIS | Interactive Image Segmentation |
| IChOA | Lévy Flight Chimp Optimizer |
| JPEG | Joint Photographic Experts Group |
| KnEA | Knee Evolutionary Algorithm |
| LF | Levy Flight |
| MAs | Meta-Heuristic Algorithms |
| MCE | Minimum Cross Entropy |
| MFA | Moth Flame Algorithm |
| MLTIS | Multi-Level Image Segmentation Algorithm |
| MPA | Marine Predator Algorithm |
| MPS | Minimum Point Search |
| MSE | Mean Square Error |
| MSSIM | Mean Structural Similarity Index Method |
| MVO | Multi-Verse Optimizer |
| OBL | Opposition-based Learning |
| OSAFEM-PLDD | Optimal Segmentation with Alexnet Based Feature Extraction for Plant Leaf Disease Diagnosis |
| PFs | Pareto Fronts |
| PRI | Probabilistic Rand Index |
| PSNR | Peak Signal to Noise Ratio |
| QRG | Quantum Rotation Gate |
| SA | Simulated Annealing |
| SADFO | Self-Adaptive Dragonfly Optimization |
| SCI | Spinal Cord Injury |
| SHADE | Adaptive Differential Evolution based on Success History |
| SS | Swarm Selection |
| SSO | Social Spider Optimization |
| VMD | Variational Mode Decomposition |
The following are the results from Table 2 of the original paper, detailing research questions:
| No | Research Question | Motivation | Address |
|---|---|---|---|
| 1 | What are the advantages of multi-level thresholding over other thresholding methods? | Various advantages of multi-level thresholding are discussed in the response to this inquiry | Section 1 |
| 2 | In the current study, how were the data collected? | In order to evaluate the reliability and validity of the research findings, it is essential to gain insight into the research process and the methods used to collect data | Section 1.3 |
| 3 | Which thresholding approaches are most commonly used and how are they implemented? | Researchers can gain a better understanding of segmentation objective functions by answering this question | Section 2.4 and 4.3 |
| 4 | Which areas of image segmentation have been the focus of previous surveys? | Identify recurring topics in image segmentation, such as segmentation algorithms, applications, challenges, and evaluation methods, to understand the main research areas | Section 2.5 and 3 |
| 5 | How can meta-heuristics assist in thresholding image segmentation? | Meta-heuristic algorithms play an important role in multi-level thresholding segmentation using the presented framework | Section 2.6 |
| 6 | How is thresholding segmentation commonly used? | Many aspects of multi-level thresholding segmentation are better understood by using this method | Section 2.7 |
| 7 | Which operating environment and programming language was used to implement each method? | The purpose of this question is to gain an understanding of how each study is implemented and what programming language is used | Section 4.1 |
| 8 | How can the efficiency of multi-level thresholding image segmentation be measured, and what metrics are used for this purpose? | Researchers evaluate segmentation techniques for thresholding based on PSNR, SSIM, FSIM, CPU time, among others | Section 4.2 |
| 9 | Are there any advantages or disadvantages to different thresholding segmentation approaches? | It is evident that each method has advantages and limitations | Section 4.4 |
| 10 | Which datasets are used most often over the proposed methods? | To evaluate and validate a technique, researchers use benchmark datasets in a particular field or domain | Section 4.5 |
| 11 | Which algorithm performed best based on execution time? | To better understand the power of the algorithm, the performance of the methods is reported based on the execution time compared to a threshold level | Section 4.6 |
| 12 | What are the current challenges in thresholding segmentation? | This question will lead to future research conducted by researchers | Section 5 |
The following are the results from Table 3 of the original paper, summarizing online dataset sources:
| Publication Name | URL | Articles Received |
|---|---|---|
| Science Direct | https://www.sciencedirect.com/ | 43 |
| IEEE Xplore | http://ieeexplore.ieee.org/ | 34 |
| Springer | http://www.springer.com/ | 2 |
| Total Articles | | 79 |
The following are the results from Table 8 of the original paper, detailing positive aspects and limitations:
| Paper | Positive Aspects | Limitations |
|---|---|---|
| [86] | High convergence speed | Unknown Time Complexity Limited Data Points (4 medical gray-scale images) |
| [87] | Improved exploiting contextual information High convergence speed Better performance in terms of metrics Better Objective Function Scores | Limited Data Points (10 images) Unknown Time Complexity Static thresholding level Time complexity |
| [88] | High convergence speed Better segmentation quality | Static thresholding level Limited Data Points |
| [89] | Tested on images with different type of noise effects (Gaussian, Salt & Pepper, Rician, Shadow, Reflection) Optimal Thresholding High accuracy | Inadequate comparison Limited Data Points (20 images) Static thresholding level Time complexity |
| [90] | Balanced Exploration & Exploitation Better performance in terms of metrics | Time complexity Static thresholding level Limited to Grayscale Testing Noise Sensitivity |
| [91] | Simple Implementation Optimized time complexity Better performance in terms of metrics Better segmentation quality | Feature Constraints Unknown Time Complexity Inadequate Comparison Static thresholding level Time complexity |
| [92] | Stable Segmentation | Static thresholding level |
| [93] | Optimized time complexity | Static thresholding level |
| [94] | Better performance in terms of metrics High convergence speed Better Objective Function Scores | Inadequate Comparison Static thresholding level Time complexity |
| [95] | Better Objective Function Scores | Inadequate Comparison Static thresholding level Time complexity |
| [96] | Rich dataset (300 images) Better performance in terms of metrics | Time complexity Static thresholding level |
| [97] | Optimized time complexity High efficiency on grayscale and color images | Limited Data Points (20 images) Inadequate comparison Static thresholding level Only one objective function is used to show the algorithm performance |
| [98] | High accuracy High convergence speed Balanced Exploration & Exploitation | Unknown Time Complexity Limited Data Points (31 images) |
| [99] | Better performance in terms of metrics | Limited Data Points (3 images) Inadequate comparison Static thresholding level |
| [100] | Stable Segmentation Better performance in terms of metrics Better Objective Function Scores | Unknown Time Complexity Static thresholding level Inadequate Comparison Static thresholding level Parameter Sensitivity Time complexity Unknown Time Complexity Static thresholding level |
| [101] | Optimal Thresholding | Limited Data Points (4 images) |
| [102] | Better performance in terms of metrics | Thresholding levels are set statically |
| [103] | High accuracy | Parameter Sensitivity Static thresholding level Local Optima Trap Time complexity Static thresholding level Limited to Grayscale Testing |
| [104] | Better segmentation quality | Static thresholding level Limited Data Points (23 images) Unable to dealing with low quality images Static thresholding level Limited Data Points (8 images) Static thresholding level Noise Sensitivity Inefficient Advanced Segmentation (such as medical images containing intensity inhomogeneity) Time complexity Time complexity Parameter Sensitivity Inadequate Evaluation Unknown Time Complexity Static thresholding level Inadequate Evaluation Static thresholding level Limited to Grayscale Testing Static thresholding level Limited Data Points (6 images) Inadequate comparison Limited Data Points (6 images) Static thresholding level Unknown Time Complexity Noise Sensitivity Limited Data Points |
| [105] | Optimal Thresholding | |
| [106] | Better performance in terms of metrics | |
| [107] | Stable Segmentation Better segmentation quality | |
| [108] | Better performance in terms of metrics | |
| [109] | Better performance in terms of metrics Superiority in Pareto front approximation | |
| [110] | Better performance in terms of metrics Better Objective Function Scores | |
| [111] | Optimized time complexity Better performance in terms of metrics | |
| [112] | Ability to adjust region of interest of tumor automatically | |
| [113] | Better segmentation quality | |
| [114] | High accuracy Optimized time complexity Better performance in terms of metrics | |
| [115] | Stable Segmentation | |
| [116] | High accuracy High convergence speed Stable Segmentation | |
| [117] | Better performance in terms of metrics | |
| [118] | Increased search capacity Optimized time complexity | |
| [119] | High efficiency on grayscale and color images Optimized time complexity Higher accuracy | |
| [120] | High convergence speed Better Objective Function Scores | |
| [121] | Better Objective Function Scores Best functionality in case of high-dimensionality | |
| [122] | Better performance in terms of metrics Increased search capacity | |
| [123] | Better performance in terms of metrics Sufficient Compared Method | |
| [124] | High convergence speed Better performance in terms of metrics | |
| [125] | Higher accuracy, TP rate and precision Better performance in terms of metrics Rich dataset (over 500 images) | |
| [126] | Stable Segmentation High accuracy | |
| [127] | Optimized time complexity Better performance in terms of metrics | |
| [128] | Better segmentation quality High convergence speed High accuracy | |
| [129] | High accuracy | |
| [130] | Better performance in terms of metrics | |
| [131] | High accuracy High convergence speed | |
| [132] | Reliable Disease Assessment Strong association with the clinical seriousness of the illness | Missing Longitudinal Data |
| [133] | Better performance in terms of metrics Strong robustness High fault diagnosis accuracy | Time complexity Inadequate comparison |
| [134] | Simple Implementation Better segmentation quality Balanced Exploration & Exploitation High convergence speed High accuracy | Inefficient Advanced Segmentation Time complexity Noise Sensitivity |
| [135] | High convergence speed Optimized time complexity Optimal Thresholding Increased search capacity | The optimal number of thresholds is indicated statically |
| [136] | Better performance in terms of metrics Avoids Local optima | Limited Data Points (9 for first stage and 6 for medical stage) Unknown Time Complexity |
| [137] | Optimized time complexity Better Objective Function Scores | Limited Data Points (6 images) Inadequate comparison |
| [138] | High accuracy Better Objective Function Scores | Limited Data Points (6 images) Static thresholding level Unknown Time Complexity |
| [139] | High accuracy High convergence speed Better segmentation quality Balanced Exploration & Exploitation Optimal Thresholding High efficiency on grayscale and color images Avoids Local optima Simple Implementation Sufficient Compared Method | Noise Sensitivity Limited Data Points |
| [140] | Better performance in terms of metrics | Limited to Grayscale Testing Static thresholding level |
| [141] | Better Objective Function Scores | Inadequate Evaluation Static thresholding level Limited to Grayscale Testing |
| [142] | Optimized time complexity Sufficient Compared Method Better performance in terms of metrics | Static thresholding level Inadequate Evaluation |
| [143] | Better performance in terms of metrics | Inadequate Evaluation Static thresholding level |
| [144] | High accuracy Stable Segmentation Optimization ability | Time complexity Static thresholding level |
| [145] | Increased search capacity Anti-noise performance | Static thresholding level Unknown Time Complexity |
| [146] | High accuracy Better performance in terms of metrics High convergence speed | Limited to Grayscale Testing Static thresholding level |
| [147] | Better segmentation quality Prevents Overfitting | Time complexity Limited Data Points (8 images) |
| [148] | Efficient algorithm for MLTS Optimized time complexity High accuracy Better segmentation quality Robustness for image illumination and complex backgrounds | Inadequate comparison Limited Data Points (20 images) |
| [149] | High convergence speed Avoids Local optima Better segmentation quality | Time complexity Local Optima Trap |
| [150] | High efficiency on grayscale and color images High accuracy High convergence speed | Multi-Objective Limitation Limited Data Points (8 images) Time complexity |
| [151] | Increased search capacity Stagnation mitigation Avoids Local optima | Time complexity Limited Data Points (10 images) |
| [152] | Strong robustness | Limited Data Points (4 images) |
| [153] | Rich dataset Optimal Thresholding | Static thresholding level |
| [154] | Better performance in terms of metrics | Static thresholding level |
| [155] | Optimization ability Better segmentation quality | Inadequate Comparison No Real-World Testing |
| [156] | Optimized time complexity Better segmentation quality | Inadequate Comparison No Real-World Testing |
| [157] | Better Objective Function Scores Better segmentation quality | Static thresholding level Inadequate comparison |
| [158] | Optimized time complexity Better segmentation quality | Inadequate comparison Static thresholding level |
| [159] | High accuracy | Unknown Time Complexity Inadequate comparison |
| [160] | Better performance in terms of metrics High convergence speed | Local Optima Trap Inadequate comparison |
| [161] | Balanced Exploration & Exploitation Population diversity and balance Improving the possibility of more excellent individuals within the population | Limited to Grayscale Testing Limited Data Points (10 images) |
| [162] | Better performance in terms of metrics High convergence speed Prevents Overfitting | Time complexity Incomplete Threshold Analysis |
| [163] | Balanced Exploration & Exploitation | Static thresholding level Time complexity |
| [164] | Better performance in terms of metrics | Time complexity Static thresholding level |
6.3. Ablation Studies / Parameter Analysis
As a survey paper, this article does not present its own ablation studies or detailed parameter analysis. Instead, it synthesizes the findings from the reviewed literature regarding common strengths and weaknesses. The recurring limitations highlighted in Table 8 and Figure 12, such as parameter sensitivity and the prevalence of static thresholding levels, indirectly reflect the need for more robust ablation studies and parameter analysis in individual research papers to develop adaptive and automatically tuned MLT algorithms. The identification of "Manual Parameter Tuning" and "Determination of Optimal Number of Clusters" as open issues in Section 5 further underscores that these aspects are critical areas for future research rather than well-addressed areas in the current literature.
7. Conclusion & Reflections
7.1. Conclusion Summary
This comprehensive review provides a thorough understanding of multi-level thresholding segmentation, from its foundational concepts to its advanced applications and current limitations. It systematically covers the background of image segmentation, various multi-level thresholding approaches (like Otsu, Kapur, Tsallis entropy, Fuzzy entropy, Minimum Cross Entropy, and Renyi's entropy), and the crucial role of meta-heuristic algorithms in optimizing threshold selection.
The review highlights that multi-level thresholding offers significant advantages over simpler thresholding methods, including enhanced flexibility, adaptability to complex intensity distributions, and the ability to capture finer image details. Meta-heuristic algorithms are shown to be instrumental in reducing computational costs and improving the accuracy of threshold determination. By analyzing datasets, evaluation metrics, programming languages, and the advantages/disadvantages of different techniques, the paper offers valuable insights into the current state of the field.
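As a rough illustration of how a meta-heuristic replaces exhaustive search, the sketch below uses a basic differential evolution loop (rand/1/bin) to maximize Otsu's between-class variance. Differential evolution is chosen here only as a simple stand-in; it is not a reimplementation of any specific algorithm from the reviewed papers, and the population size, scale factor `f`, crossover rate `cr`, the toy histogram, and all function names are assumptions.

```python
import random

def otsu_score(p, thresholds):
    """Between-class variance for a probability vector p and sorted integer thresholds."""
    edges = [0] + list(thresholds) + [len(p)]
    mu_total = sum(i * pi for i, pi in enumerate(p))
    score = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = sum(p[lo:hi])
        if w > 0:  # duplicate thresholds yield empty classes, which are skipped
            mu = sum(i * p[i] for i in range(lo, hi)) / w
            score += w * (mu - mu_total) ** 2
    return score

def differential_evolution(hist, k, pop_size=20, generations=60, f=0.5, cr=0.9, seed=0):
    """Search for k thresholds; returns (sorted thresholds, best score)."""
    rng = random.Random(seed)
    levels = len(hist)
    total = sum(hist)
    p = [h / total for h in hist]

    def decode(vector):
        # Map a real-valued vector to sorted integer thresholds in [1, levels - 1].
        return sorted(min(levels - 1, max(1, round(x))) for x in vector)

    pop = [[rng.uniform(1, levels - 1) for _ in range(k)] for _ in range(pop_size)]
    fit = [otsu_score(p, decode(ind)) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(k)  # at least one dimension comes from the mutant
            trial = [
                pop[a][d] + f * (pop[b][d] - pop[c][d])
                if (rng.random() < cr or d == forced) else pop[i][d]
                for d in range(k)
            ]
            trial = [min(levels - 1, max(1, x)) for x in trial]
            score = otsu_score(p, decode(trial))
            if score >= fit[i]:  # greedy selection keeps the better vector
                pop[i], fit[i] = trial, score
    best = max(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]

# Hypothetical 64-level histogram with three separated modes.
hist = [0] * 64
for start in (8, 30, 52):
    for level in range(start, start + 4):
        hist[level] = 10

thresholds, score = differential_evolution(hist, k=2)
print(thresholds, round(score, 3))
```

This run evaluates the objective about 1,200 times regardless of the number of gray levels, whereas exhaustive search over two thresholds on 256 levels would need more than 32,000 evaluations.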
7.2. Limitations & Future Work
The paper explicitly identifies several open issues and challenges that represent limitations in current multi-level thresholding research and suggest promising directions for future work:
- Manual Parameter Tuning: A significant limitation is the reliance on manual tuning of optimization algorithm parameters. Future research should focus on automated methods for parameter tuning to enhance objectivity and efficiency.
- Automatic Determination of Threshold Levels: The static setting of threshold levels is a common drawback. Future work should develop adaptive thresholding methods and statistical measures that can automatically determine the optimal number of thresholds.
- Limited Testing with Diverse Datasets: Many algorithms are tested on specific or small datasets, limiting their generalizability. Future studies should employ more diverse and extensive datasets to ensure robustness and reliability.
- Hybridization with Meta-heuristic Algorithms: Further hybridization of meta-heuristic algorithms (e.g., with opposition-based learning) is encouraged to improve segmentation results by leveraging the strengths of different techniques.
- Comprehensive Performance Comparison: There is a need for more comprehensive comparisons against a wider range of state-of-the-art algorithms to accurately assess performance.
- Sensitivity to Noise and Efficiency: Current algorithms can be sensitive to noise and inefficient for complex image segmentation tasks. Future research should integrate additional image features or advanced processing techniques to mitigate noise and improve efficiency.
- Computational Complexity: Optimization is needed to reduce the computational complexity and processing time of algorithms, especially for real-time applications.
- Extension to Other Image Processing Problems: Exploring the application of existing MLT algorithms to other image processing problems like image registration, denoising, and quality enhancement could expand their utility.
- Determination of Optimal Number of Clusters: Automatically setting the optimal cluster size is challenging, particularly for biomedical images with varying modalities and anatomical regions.
- Handling Inhomogeneities and Complex Images: Developing robust techniques for handling inhomogeneous and complex images is crucial to ensure accurate segmentation.
- Multi-Objective Optimization: Further application of multi-objective optimization can improve performance by simultaneously considering multiple objectives (e.g., Otsu's fuzzy entropy or minimum cross-entropy).
- Color Image Segmentation and Texture Analysis: Algorithms need to be enhanced to effectively handle color images and incorporate texture properties, which are vital for distinguishing regions.
- Longitudinal Assessment: In medical imaging, there is a lack of longitudinal assessment of disease progression, which limits the evaluation of long-term effectiveness.
- Interpretability of Results: Developing techniques to provide insights and explanations into how image processing algorithms make decisions is crucial for understanding and trusting their outputs, especially in critical applications like medical diagnosis.
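One possible way to approach the automatic determination of threshold levels is sketched below: the number of thresholds is increased until the share of total variance explained by the segmentation stops improving by more than a tolerance. This stopping rule, the `min_gain` value, the exhaustive inner search, and the function names are all assumptions made for illustration; the survey calls for such methods but does not prescribe this one.

```python
from itertools import combinations

def explained_ratio(p, thresholds):
    """Between-class variance divided by total variance, a value in [0, 1]."""
    mu_total = sum(i * pi for i, pi in enumerate(p))
    var_total = sum(pi * (i - mu_total) ** 2 for i, pi in enumerate(p))
    edges = [0] + list(thresholds) + [len(p)]
    between = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = sum(p[lo:hi])
        if w > 0:
            mu = sum(i * p[i] for i in range(lo, hi)) / w
            between += w * (mu - mu_total) ** 2
    return between / var_total if var_total > 0 else 0.0

def choose_threshold_count(hist, max_k=4, min_gain=0.02):
    """Grow k while the explained-variance ratio improves by at least min_gain."""
    total = sum(hist)
    p = [h / total for h in hist]
    chosen, previous = [], 0.0
    for k in range(1, max_k + 1):
        best = max(combinations(range(1, len(hist)), k),
                   key=lambda cand: explained_ratio(p, cand))
        ratio = explained_ratio(p, best)
        if ratio - previous < min_gain:
            break  # an extra threshold no longer adds meaningful separation
        chosen, previous = list(best), ratio
    return chosen, previous

# Hypothetical 24-level histogram with three modes.
hist = [0] * 24
for level in (2, 3, 11, 12, 20, 21):
    hist[level] = 10

thresholds, ratio = choose_threshold_count(hist)
print(len(thresholds), thresholds, round(ratio, 4))
```

On real images the inner exhaustive search would be replaced by a meta-heuristic, and the tolerance would need to be validated per application domain.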
7.3. Personal Insights & Critique
This survey is a valuable resource for anyone entering or working within the field of multi-level thresholding image segmentation. Its strength lies in its focused scope and the systematic organization of a large body of recent literature (2017-2023). The detailed breakdown of objective functions and their underlying mathematical principles, along with the comprehensive lists of datasets, evaluation metrics, and meta-heuristic algorithms, provides a solid foundation for understanding the domain. The explicit identification of open issues serves as a clear roadmap for aspiring researchers.
However, a critical perspective might note a few points:
- Reproducibility Challenge: While the survey highlights the importance of setup environment details (Table 6), many reviewed papers still lack this information (NR in the table). This indicates a broader issue in the field regarding reproducibility, which the survey effectively points out without being able to solve.
- MATLAB Dominance: The heavy reliance on MATLAB (Figure 9) in the reviewed literature suggests a potential bias or limitation. While MATLAB is excellent for prototyping, Python with its open-source libraries (e.g., OpenCV, scikit-image, TensorFlow, PyTorch) has become increasingly prevalent for deployment and large-scale applications due to its flexibility and community support. Future research might shift towards Python-based implementations.
- Depth of Fuzzy Entropy Formula: The severe corruption in Equation 23 for Fuzzy Entropy in the original paper is a notable flaw. This analysis reproduces the equation as printed, and the problem underscores the occasional challenges in relying solely on published formulas and the need for rigorous self-correction and clarity in academic writing.
- Meta-heuristic vs. Deep Learning Balance: The survey primarily focuses on meta-heuristic optimization for thresholding. While it acknowledges deep learning in the Related Works section, it doesn't deeply integrate how MLT might combine with deep learning approaches, beyond using MLT as a preprocessing step. Future research might explore hybrid models that blend the strengths of meta-heuristics for optimal threshold selection with deep learning for feature extraction or semantic understanding in a more integrated manner.
- Practical Applicability: The recurring limitations like time complexity and static thresholding settings indicate that while many methods achieve high performance on metrics, their practical deployment in real-time or dynamic scenarios remains challenging. The call for automatic determination of threshold levels and handling complex images is crucial for translating research into real-world impact.

Overall, this paper provides an invaluable, structured overview that effectively condenses a fragmented body of knowledge, making it significantly easier for newcomers to grasp the complexities and current frontiers of multi-level thresholding segmentation. Its direct critique of existing literature, especially on limitations and future work, is a strong contribution to guiding the field forward.