A Split-Merge DP-means Algorithm to Avoid Local Minima

Extension of DP-means Clustering Algorithm · Split-Merge Technique · Local Minima Avoidance · Nonparametric Bayesian Clustering · Hard Clustering Approximation
This analysis is AI-generated and may not be fully accurate. Please refer to the original paper.

TL;DR Summary

This paper extends DP-means with a split-merge technique to avoid local minima by dynamically splitting dense clusters, enabling near-optimal cluster counts with improved robustness demonstrated on multiple datasets.

Abstract

We present an extension of the DP-means algorithm, a hard-clustering approximation of nonparametric Bayesian models. Although a recent work [6] reports that the DP-means can converge to a local minimum, the condition for the DP-means to converge to a local minimum is still unknown. This paper demonstrates one reason the DP-means converges to a local minimum: the DP-means cannot assign the optimal number of clusters when many data points exist within small distances. As a first attempt to avoid the local minimum, we propose an extension of the DP-means by the split-merge technique. The proposed algorithm splits clusters when a cluster has many data points to assign the number of clusters near to optimal. The experimental results with multiple datasets show the robustness of the proposed algorithm.

English Analysis

1. Bibliographic Information

  • Title: A Split-Merge DP-means Algorithm to Avoid Local Minima
  • Authors: Shigeyuki Odashima, Miwa Ueki, and Naoyuki Sawasaki. All authors are affiliated with Fujitsu Laboratories Ltd., Kawasaki, Japan.
  • Journal/Conference: The paper does not explicitly name the publication venue. Based on its structure and style, it is likely a conference paper in the field of data mining or machine learning.
  • Publication Year: The paper does not specify a publication year, but citations go up to 2017, suggesting it was published around or after that time.
  • Abstract: The paper presents an extension to the DP-means algorithm, a hard-clustering method derived from nonparametric Bayesian models. While a prior study [6] noted that DP-means can get trapped in local minima, the reasons were unknown. This paper identifies a key cause: DP-means fails to find the optimal number of clusters when many data points are concentrated in a small area. To address this, the authors propose a split-merge extension. This new algorithm splits clusters that become too dense, helping to find a near-optimal number of clusters. Experiments on multiple datasets demonstrate the proposed algorithm's improved robustness and ability to find better solutions.
  • Original Source Link: /files/papers/68f8705e81e5ec727a02b076/paper.pdf. The paper appears to be formally published, given its polished format and citations.

2. Executive Summary

  • Background & Motivation (Why):

    • The core problem is that DP-means, a popular algorithm for automatically determining the number of clusters in a dataset, can fail by converging to a local minimum. This means it finds a suboptimal clustering solution with fewer clusters than is ideal.
    • This problem is significant because DP-means is valued for its efficiency and ability to adapt model complexity (the number of clusters) to the data, a feature inherited from nonparametric Bayesian models. A previous paper [6] observed this failure but did not explain why it happens. This paper fills that gap.
    • The paper's fresh angle is to provide a theoretical analysis identifying a specific condition for this failure: DP-means gets stuck when a large number of data points exist within a small spatial distance. It cannot create new clusters in such dense regions, even if doing so would lead to a better overall solution.
  • Main Contributions / Findings (What):

    • The paper makes three primary contributions:
      1. Analysis of a Local Minimum Condition: It formally demonstrates that DP-means can converge to a solution with a single cluster in a dense region, even when a solution with multiple clusters would have a lower (better) cost.
      2. A Novel Split-Merge DP-means Algorithm: It proposes an extension to DP-means that incorporates split and merge operations. The algorithm splits a cluster when it becomes too dense (containing too many points within its range) and merges clusters that were over-split, allowing it to escape the identified local minima.
      3. Empirical Validation: It provides experimental results on four real-world datasets, showing that the proposed split-merge DP-means (SMD) algorithm consistently finds clustering solutions with lower cost than standard DP-means variants.

3. Prerequisite Knowledge & Related Work

  • Foundational Concepts:

    • Clustering: The unsupervised machine learning task of grouping a set of objects such that objects in the same group (called a cluster) are more similar to each other than to those in other groups.
    • k-means Clustering: A classic "hard-clustering" algorithm where each data point belongs to exactly one cluster. It aims to partition $n$ data points into $k$ clusters by minimizing the sum of squared distances from each point to its assigned cluster's center (centroid). Its main drawback is that the number of clusters, $k$, must be specified in advance, and it is sensitive to the initial placement of centroids, often getting stuck in local minima.
    • Nonparametric Bayesian Models: Statistical models where the complexity of the model (e.g., the number of parameters or components) is not fixed in advance but is determined from the data. For clustering, this means the model can infer the number of clusters automatically. The Dirichlet Process (DP) is a common example used for this purpose. These models are flexible but often computationally expensive.
    • DP-means (Dirichlet Process Means): An algorithm that acts as a computationally efficient, hard-clustering approximation of a Dirichlet Process Mixture Model. It was derived using a technique called Small-Variance Asymptotics (SVA), which simplifies the complex probabilistic Bayesian model into a deterministic algorithm similar to k-means. Unlike k-means, DP-means can automatically determine the number of clusters by adding a penalty of $\lambda^2$ to the objective for each new cluster. A new cluster is created if a data point is farther than $\lambda$ from all existing cluster centroids.
    • Split-Merge Techniques: A class of methods used to improve clustering algorithms. To escape local minima, a split operation divides one cluster into two (or more), while a merge operation combines two (or more) clusters into one. These are common in MCMC-based Bayesian inference but less so for SVA-based methods like DP-means.
  • Previous Works:

    • The paper positions itself relative to three areas:
      1. DP-means and extensions: Many extensions to DP-means exist for different applications (e.g., topic models, time-series data, subspace clustering) and for efficiency (distributed, online, coreset-based). However, the authors state that none of these have analyzed the local minimum problem. The work by Bachem et al. [6] is a key reference, as it first reported that DP-means can get stuck but did not explain the cause.
      2. Hard-clustering algorithms: k-means is the most famous example. Research in this area, like k-means++, has focused on better initialization to avoid local minima. This paper's analysis of DP-means local minima could inform similar improvements for this class of algorithms.
      3. Split-merge clustering algorithms: These techniques have been applied to both hard clustering and, more recently, to fully Bayesian models optimized with MCMC or variational inference. This paper is the first to apply a split-merge approach to a hard-clustering method derived from small-variance asymptotics.
  • Differentiation: The key innovation is twofold: (1) it is the first work to provide a theoretical analysis explaining why DP-means gets trapped in local minima, linking it to data density, and (2) it is the first to propose a split-merge algorithm specifically designed to solve this problem for DP-means and related SVA-based methods.

4. Methodology (Core Technology & Implementation)

The paper's methodology is divided into two parts: first, an analysis of why the standard DP-means algorithm fails, and second, the proposal of the split-merge solution.

Analysis of Existing DP-means Approaches

The DP-means algorithm aims to minimize the following cost function:

$$\mathrm{cost}_{DP}(\mathcal{X}, \mathcal{C}) = \sum_{\pmb{x} \in \mathcal{X}} \min_{\pmb{\mu} \in \mathcal{C}} \|\pmb{x} - \pmb{\mu}\|^2 + \lambda^2 k$$

  • $\mathcal{X}$ is the set of $n$ data points.

  • $\mathcal{C}$ is the set of $k$ cluster centroids $\pmb{\mu}$.

  • $\|\pmb{x} - \pmb{\mu}\|^2$ is the squared Euclidean distance between a data point and a centroid.

  • $\lambda^2$ is a hyperparameter that penalizes the creation of new clusters.

  • $k$ is the number of clusters.

    The first term is the standard k-means objective (quantization error), while the second term is a penalty that controls the number of clusters. DP-means works like k-means but adds a new cluster centered at a data point $\pmb{x}_i$ if its distance to the nearest existing cluster centroid $\pmb{\mu}_c$ is greater than $\lambda$ (i.e., $\|\pmb{x}_i - \pmb{\mu}_c\|^2 > \lambda^2$).
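To make the rule concrete, here is a minimal batch DP-means sketch in Python under the cost function above. The function name `dp_means` and the variable `lam2` (standing for $\lambda^2$) are illustrative, and the loop is a simplification rather than the paper's exact Algorithm 1.

```python
import numpy as np

def dp_means(X, lam2, n_iters=20):
    """Minimal batch DP-means sketch (illustrative, not the paper's Algorithm 1).

    A point whose squared distance to every centroid exceeds lam2 (= lambda^2)
    seeds a new cluster; otherwise it joins its nearest cluster. Centroids are
    recomputed after each pass over the data.
    """
    centroids = [X.mean(axis=0)]                # DP-means starts from one cluster
    assign = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        for i, x in enumerate(X):
            d2 = np.array([np.sum((x - mu) ** 2) for mu in centroids])
            j = int(np.argmin(d2))
            if d2[j] > lam2:                    # farther than lambda from all centroids
                centroids.append(x.copy())      # -> open a new cluster at x
                j = len(centroids) - 1
            assign[i] = j
        labels = np.unique(assign)              # drop empty clusters, relabel
        centroids = [X[assign == j].mean(axis=0) for j in labels]
        assign = np.searchsorted(labels, assign)
    cost = sum(np.sum((X[assign == j] - mu) ** 2) for j, mu in enumerate(centroids))
    return np.array(centroids), assign, cost + lam2 * len(centroids)
```

Calling `dp_means(np.random.randn(500, 2), lam2=4.0)`, for instance, returns centroids, assignments, and the DP-means cost defined above.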

The authors introduce a simplified scenario to analyze the algorithm's behavior.

Definition 1 (easy case for DP-means clustering): Data is in the "easy case" if the maximum squared Euclidean distance between any two data points is less than $\lambda^2$. This means all data points are contained within a hypersphere of radius $\lambda$.

Fig. 2. Analysis of DP-means. (a) The easy case: sample points are distributed within a hypersphere of radius $\lambda$, and DP-means always converges to a single cluster. (b) A case where a solution with two clusters has lower DP-means cost: samples are concentrated at two points (here $\lambda = 10$). (c) Comparison of clustering examples for different sample sizes (20 points vs. 2000 points).

  • Image 2 Analysis: This figure illustrates the core problem. (a) shows the "easy case," where all points are close. (b) presents a specific example where two groups of points are close to each other but distinct; DP-means would group them into one cluster. (c) shows that as the number of points in these two groups increases, our intuition (and the underlying Bayesian model) suggests they should be two separate clusters, but DP-means fails to do this.

    Based on this, the authors present two lemmas leading to their main theorem on local minima.

  • Lemma 1: In the "easy case," both batch and online DP-means will always converge to a single cluster whose centroid is the mean of all data points.

    • Reasoning: Since all points are within distance $\lambda$ of each other, the condition to create a new cluster ($\|\text{point} - \text{centroid}\|^2 > \lambda^2$) is never met after the first cluster is created. All subsequent points are simply assigned to this single cluster.
  • Lemma 2: In the "easy case," there can exist a situation where a solution with two clusters has a lower DP-means cost than the one-cluster solution.

    • Proof by Example: Consider 1000 points at $(-1, 0)$ and 1000 points at $(1, 0)$, with $\lambda^2 = 100$. This is an "easy case" since the maximum squared distance is $(1 - (-1))^2 = 4$, which is less than 100.
      • Cost with 1 cluster: The centroid is at $(0, 0)$. The cost is $2000 \times 1^2 + \lambda^2 \times 1 = 2000 + 100 = 2100$.
      • Cost with 2 clusters: The centroids are at $(-1, 0)$ and $(1, 0)$. The cost is $1000 \times 0^2 + 1000 \times 0^2 + \lambda^2 \times 2 = 0 + 200 = 200$.
    • Since $200 < 2100$, the two-cluster solution is optimal (see the numeric check after Theorem 1 below).
  • Theorem 1: In the "easy case," DP-means can converge to a local minimum with fewer than the optimal number of clusters.

    • Proof: Lemma 1 shows DP-means finds one cluster. Lemma 2 shows the optimal solution can be two (or more) clusters. Therefore, DP-means gets stuck in the suboptimal one-cluster solution. The fundamental reason is that the algorithm doesn't account for the number of data points (density) when deciding whether to keep a region as a single cluster.
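As a sanity check on the Lemma 2 arithmetic, the short snippet below (names are illustrative) recomputes both costs for the 2000-point configuration:

```python
import numpy as np

lam2 = 100.0                                    # lambda^2 from the example
X = np.vstack([np.tile([-1.0, 0.0], (1000, 1)),
               np.tile([ 1.0, 0.0], (1000, 1))])

def dp_cost(X, centroids, lam2):
    # DP-means cost: quantization error plus lambda^2 per cluster
    d2 = ((X[:, None, :] - np.asarray(centroids)[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).sum() + lam2 * len(centroids)

print(dp_cost(X, [[0.0, 0.0]], lam2))               # one cluster  -> 2100.0
print(dp_cost(X, [[-1.0, 0.0], [1.0, 0.0]], lam2))  # two clusters ->  200.0
```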

Split-Merge DP-means

To fix this, the authors propose an algorithm that explicitly considers cluster density to decide when to split.

3.1 Condition for Splitting One Cluster into Two

The authors derive a condition for when splitting a cluster is beneficial. They make a simplifying assumption that the data points within a cluster are uniformly distributed.

Consider a cluster with $w$ points and a range $\pmb{\sigma} = (\sigma_1, \ldots, \sigma_d)$ in each dimension.

  1. Expected cost with one cluster: The expected cost is derived from the variance of a uniform distribution. $$\mathrm{Exp}\big(\mathrm{cost}_{DP}(\mathcal{X}, \mathcal{C}_1)\big) = w \sum_{l=1}^{d} \frac{\sigma_l^2}{12} + \lambda^2$$
  2. Expected cost with two clusters (split along dimension $j$): If the cluster is split in half along dimension $j$, the new clusters each have a range of $\sigma_j / 2$ in that dimension. The expected cost becomes: $$\mathrm{Exp}\big(\mathrm{cost}_{DP}(\mathcal{X}, \mathcal{C}_2)\big) = w \left( \sum_{l \neq j} \frac{\sigma_l^2}{12} + \frac{\sigma_j^2}{48} \right) + 2\lambda^2$$
  3. Splitting Condition: A split is beneficial if $\mathrm{Exp}(\text{cost with 1 cluster}) > \mathrm{Exp}(\text{cost with 2 clusters})$. This simplifies to: $$w > 16 \left( \frac{\lambda}{\sigma_j} \right)^2$$
    • Interpretation: A cluster should be split if it has many data points ($w$ is large) or if it is wide (range $\sigma_j$ is large). The algorithm splits along the dimension $j$ with the maximum range, as this satisfies the condition first (the algebra behind the threshold is spelled out just after this list).
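For completeness, the algebra behind this threshold is a direct comparison of the two expected costs: the terms for dimensions $l \neq j$ cancel, leaving

$$w \frac{\sigma_j^2}{12} + \lambda^2 > w \frac{\sigma_j^2}{48} + 2\lambda^2 \;\Longleftrightarrow\; w \sigma_j^2 \left( \frac{1}{12} - \frac{1}{48} \right) > \lambda^2 \;\Longleftrightarrow\; \frac{w \sigma_j^2}{16} > \lambda^2 \;\Longleftrightarrow\; w > 16 \left( \frac{\lambda}{\sigma_j} \right)^2 .$$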

3.2 Split DP-means

Algorithm 3 is an online algorithm that incorporates this splitting rule. It maintains not just the centroid $\pmb{\mu}$ and point count $w$ for each cluster, but also its range $\pmb{\sigma}$ and min/max boundaries $\pmb{p}, \pmb{q}$. When a new data point arrives:

  1. It is assigned to the nearest cluster.
  2. The cluster's statistics ($\pmb{\mu}, w, \pmb{\sigma}, \pmb{p}, \pmb{q}$) are updated.
  3. The algorithm checks if the split condition (Eq. 4) is now met for the updated cluster. If so, the cluster is split into two new clusters according to predefined rules (Eq. 6) that approximate the centroids and statistics of the two new sub-clusters.
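The following Python sketch illustrates one online update with the split check. The per-cluster statistics follow the description above (centroid, count, per-dimension min/max), but the split bookkeeping is a crude stand-in for the paper's Eq. (6), so treat it as an illustration of the mechanism rather than the exact Algorithm 3.

```python
import numpy as np

class Cluster:
    """Per-cluster summary: centroid mu, count w, and per-dimension min p / max q
    (the range sigma = q - p follows from p and q)."""
    def __init__(self, x):
        self.mu, self.w = x.copy(), 1
        self.p, self.q = x.copy(), x.copy()

def online_step(clusters, x, lam2):
    # 1. assign x to the nearest cluster, or open a new one if it is too far
    if clusters:
        d2 = [np.sum((x - c.mu) ** 2) for c in clusters]
        j = int(np.argmin(d2))
    if not clusters or d2[j] > lam2:
        clusters.append(Cluster(x))
        return clusters
    c = clusters[j]
    # 2. update the running statistics of the chosen cluster
    c.w += 1
    c.mu = c.mu + (x - c.mu) / c.w
    c.p, c.q = np.minimum(c.p, x), np.maximum(c.q, x)
    sigma = c.q - c.p
    jmax = int(np.argmax(sigma))
    # 3. split check (cf. Eq. 4): split if w > 16 * (lambda / sigma_j)^2
    if sigma[jmax] > 0 and c.w > 16.0 * lam2 / sigma[jmax] ** 2:
        left, right = Cluster(c.mu), Cluster(c.mu)
        # crude approximation of the two sub-clusters: shift the centroids by a
        # quarter of the range along the widest dimension and halve the counts
        left.mu[jmax] -= sigma[jmax] / 4.0
        right.mu[jmax] += sigma[jmax] / 4.0
        left.w = right.w = max(c.w // 2, 1)
        left.p, left.q = c.p.copy(), c.q.copy()
        right.p, right.q = c.p.copy(), c.q.copy()
        left.q[jmax] = right.p[jmax] = c.p[jmax] + sigma[jmax] / 2.0
        clusters[j:j + 1] = [left, right]
    return clusters
```

Starting from `clusters = []` and streaming the data with `for x in X: clusters = online_step(clusters, x, lam2)` mimics the online pass (with `X` an array of points).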

3.3 Merge DP-means

The Split DP-means can be too aggressive and create too many clusters. The merge step (Algorithm 4) corrects this.

  1. Merging Condition: A condition is derived for when merging two clusters, $C_L$ and $C_R$, into one, $C_M$, reduces the total DP-means cost. The change in cost from merging is: $$\Delta\mathrm{cost}_m(C_L, C_R) = w_L d_L^2 + w_R d_R^2 - \lambda^2$$

    • $w_L, w_R$ are the numbers of points in each cluster.
    • $d_L^2, d_R^2$ are the squared distances from the original cluster centroids ($\pmb{\mu}_L, \pmb{\mu}_R$) to the new merged centroid $\pmb{\mu}_M$.
    • A merge is performed if $\Delta\mathrm{cost}_m < 0$.
  2. Algorithm: The Merge DP-means algorithm takes the clusters produced by the Split DP-means algorithm. It greedily and iteratively merges the pair of clusters that provides the largest cost reduction (most negative $\Delta\mathrm{cost}_m$) until no more beneficial merges are found (a short sketch of this greedy pass appears after the figure analysis below).

    Fig. 3. Example result of split-merge DP-means for synthetic 2D data, showing the input data points, the Split DP-means result, and the final split-merge DP-means clustering, and highlighting how the algorithm behaves when clusters contain different numbers of points.

  • Image 3 Analysis: This figure provides a visual example of the full split-merge process. The initial data has five natural groups. (a) Standard DP-means might find only two clusters because the groups are close. (b) The Split DP-means algorithm is run. As more data arrives, the dense clusters on the left and right satisfy the split condition and are broken down into smaller clusters, resulting in more clusters than necessary. (c) Finally, the Merge DP-means algorithm cleans up the result by merging the over-split clusters, leading to a final solution that correctly identifies the five underlying groups.
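To make the greedy merge pass concrete, here is a short sketch reusing the hypothetical `Cluster` objects from the split sketch above; it is a simplified reading of the merge criterion, not the paper's Algorithm 4.

```python
import numpy as np

def merge_gain(cL, cR, lam2):
    """Cost change from merging two clusters (Delta cost_m); negative = beneficial.
    d_L^2 and d_R^2 are squared distances from the old centroids to the merged one."""
    wM = cL.w + cR.w
    muM = (cL.w * cL.mu + cR.w * cR.mu) / wM
    dL2 = np.sum((cL.mu - muM) ** 2)
    dR2 = np.sum((cR.mu - muM) ** 2)
    return cL.w * dL2 + cR.w * dR2 - lam2

def merge_pass(clusters, lam2):
    # greedily merge the pair with the largest cost reduction until none helps
    while len(clusters) > 1:
        best, pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                g = merge_gain(clusters[i], clusters[j], lam2)
                if g < best:
                    best, pair = g, (i, j)
        if pair is None:
            break
        i, j = pair
        cL, cR = clusters[i], clusters[j]
        wM = cL.w + cR.w
        cL.mu = (cL.w * cL.mu + cR.w * cR.mu) / wM            # merged centroid
        cL.p, cL.q = np.minimum(cL.p, cR.p), np.maximum(cL.q, cR.q)
        cL.w = wM
        del clusters[j]                                        # keep merged cluster at i
    return clusters
```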

5. Experimental Setup

  • Datasets: Four real-world datasets with varying sizes and dimensions were used:

    1. USGS: 59,209 earthquake locations (3 dimensions).
    2. MNIST: 70,000 handwritten digit images, reduced to 10 dimensions via PCA.
    3. KDD2004BIO: 145,751 protein sequence features (74 dimensions).
    4. SUN SCENES 397: 198,500 image GIST features (512 dimensions).
  • Evaluation Metrics:

    1. DP-means Cost: The primary metric for solution quality.
      • Conceptual Definition: It measures the overall quality of the clustering by balancing the compactness of clusters (quantization error) against the number of clusters (model complexity). A lower cost is better.
      • Mathematical Formula: $$\mathrm{cost}_{DP}(\mathcal{X}, \mathcal{C}) = \sum_{\pmb{x} \in \mathcal{X}} \min_{\pmb{\mu} \in \mathcal{C}} \|\pmb{x} - \pmb{\mu}\|^2 + \lambda^2 k$$
      • Symbol Explanation:
        • $\mathcal{X}$: The set of all data points.
        • $\mathcal{C}$: The set of all cluster centroids.
        • $\pmb{x}$: An individual data point.
        • $\pmb{\mu}$: An individual cluster centroid.
        • $k$: The total number of clusters.
        • $\lambda^2$: The penalty parameter for creating a new cluster.
    2. Computation Time: Measured in seconds, to evaluate the efficiency of the algorithms.
  • Baselines: The proposed algorithms were compared against two standard DP-means implementations:

    1. BD: Batch DP-means (Algorithm 1).
    2. OD: Online DP-means (Algorithm 2).

    The proposed methods are:
    3. SD: Split DP-means (Algorithm 3).
    4. SMD: Split-merge DP-means (Algorithms 3 and 4 combined).

6. Results & Analysis

The experiments were run five times with different data orders to measure robustness.

Core Results (Tables 1-4)

The results across all four datasets consistently show that the proposed SMD algorithm achieves the lowest DP-means cost.

  • Manual Transcription of Table 1: DP-means cost (values in units of $10^2$) and computation time in seconds for the USGS data.

    | $\lambda^2$ | BD cost | OD cost | SD cost | SMD cost | BD time | OD time | SD time | SMD time |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 0.1 | 4.68 ± 0.26 | 7.06 ± 1.13 | 1.54 ± 0.03 | 1.20 ± 0.01 | 2217.9 | 45.9 | 525.6 | 604.4 |
    | 0.32 | 18.0 ± 4.31 | 24.8 ± 3.35 | 3.39 ± 0.15 | 2.50 ± 0.06 | 507.1 | 17.2 | 404.1 | 446.6 |
    | 1.0 | 64.1 ± 3.15 | 80.7 ± 23.6 | 7.02 ± 0.40 | 5.07 ± 0.20 | 119.8 | 6.3 | 308.8 | 328.5 |
    | 3.2 | 460 ± 0 | 422 ± 105 | 14.5 ± 0.22 | 10.2 ± 0.22 | 1.7 | 2.1 | 234.3 | 242.5 |
  • Manual Transcription of Table 2: DP-means cost (values in units of $10^5$) and computation time in seconds for the MNIST data.

    | $\lambda^2$ | BD cost | OD cost | SD cost | SMD cost | BD time | OD time | SD time | SMD time |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | $8.0 \times 10^0$ | 1.31 ± 0.01 | 1.73 ± 0.04 | 1.14 ± 0.01 | 1.12 ± 0.00 | 58255.7 | 134.5 | 1739.0 | 2379.1 |
    | $4.0 \times 10^1$ | 7.00 ± 0 | 6.91 ± 0.26 | 1.86 ± 0.07 | 1.71 ± 0.04 | 2.1 | 2.6 | 1006.0 | 1180.3 |
    | $2.0 \times 10^2$ | 7.00 ± 0 | 7.00 ± 0 | 2.78 ± 0.13 | 2.50 ± 0.07 | 2.1 | 2.4 | 440.2 | 458.4 |
    | $1.0 \times 10^3$ | 7.01 ± 0 | 7.01 ± 0 | 3.96 ± 0.15 | 3.60 ± 0.12 | 2.1 | 2.4 | 154.4 | 155.7 |
  • Analysis of Tables 1 & 2:

    • In many cases, especially with larger $\lambda^2$ values (e.g., MNIST with $\lambda^2 \ge 40$), the Batch DP-means (BD) cost has a deviation of ± 0 across runs. This indicates it gets stuck in the exact same local minimum regardless of data order, confirming the paper's central hypothesis.

    • The SMD algorithm consistently achieves a significantly lower cost, demonstrating its ability to escape these local minima.

    • The SD algorithm alone often improves over BD and OD, but its cost is higher than SMD's, showing that the merge step is crucial for refining the solution.

    • Computation time for SD and SMD is often higher than OD and sometimes BD. This is expected, as SD tends to create many intermediate clusters, increasing the complexity of the nearest-neighbor search.

      (Tables 3 and 4 show similar trends and are transcribed below for completeness.)

  • Manual Transcription of Table 3: DP-means cost (values in units of $10^{11}$) and computation time in seconds for the KDD2004BIO data.

    | $\lambda^2$ | BD cost | OD cost | SD cost | SMD cost | BD time | OD time | SD time | SMD time |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | $3.2 \times 10^7$ | 1.91 ± 0.01 | 3.37 ± 0.13 | 1.57 ± 0.05 | 1.45 ± 0.03 | 30349.0 | 49.6 | 2340.9 | 2522.5 |
    | $1.0 \times 10^8$ | 2.75 ± 0.00 | 4.68 ± 0.03 | 2.12 ± 0.12 | 1.86 ± 0.06 | 10810.1 | 27.1 | 1543.8 | 1593.9 |
    | $3.2 \times 10^8$ | 3.19 ± 0.04 | 5.69 ± 0.34 | 2.89 ± 0.19 | 2.45 ± 0.09 | 7723.6 | 20.2 | 948.9 | 960.8 |
    | $1.0 \times 10^9$ | 7.33 ± 0.01 | 9.27 ± 3.62 | 4.49 ± 0.75 | 3.59 ± 0.43 | 255.5 | 10.2 | 661.3 | 665.5 |
  • Manual Transcription of Table 4: DP-means cost (values in units of $10^4$) and computation time in seconds for the SUN SCENES 397 data.

    | $\lambda^2$ | BD cost | OD cost | SD cost | SMD cost | BD time | OD time | SD time | SMD time |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 0.250 | 1.15 ± 0.01 | 1.37 ± 0.01 | 1.13 ± 0.00 | 1.13 ± 0.00 | 279623.7 | 605.5 | 5084.7 | 5268.8 |
    | 0.354 | 1.25 ± 0.01 | 1.47 ± 0.02 | 1.17 ± 0.00 | 1.17 ± 0.00 | 101439.4 | 173.3 | 4223.2 | 4346.9 |
    | 0.500 | 1.41 ± 0.02 | 1.56 ± 0.04 | 1.21 ± 0.00 | 1.21 ± 0.00 | 12746.6 | 58.1 | 3518.2 | 3593.1 |
    | 0.707 | 1.79 ± 0 | 1.81 ± 0.01 | 1.24 ± 0.01 | 1.24 ± 0.01 | 421.6 | 23.1 | 2966.5 | 3013.5 |

Number of Clusters (Table 5)

This table compares the number of clusters found by each method to the "optimal" number reported in a previous study [6], which used an exhaustive grid search.

  • Manual Transcription of Table 5: Numbers of clusters for each dataset in original data order.

    | Dataset | BD | OD | SD | SMD | Optimal [6] |
    | --- | --- | --- | --- | --- | --- |
    | USGS ($\lambda^2 = 1.0$) | 8 | 6 | 529 | 312 | 156 |
    | MNIST ($\lambda^2 = 1.0 \times 10^3$) | 1 | 1 | 140 | 87 | 65 |
    | KDD2004BIO ($\lambda^2 = 1.0 \times 10^9$) | - | 3 | 202 | 122 | 55 |
  • Analysis:

    • BD and OD find drastically fewer clusters than the optimal number, confirming they get stuck in local minima with too few clusters. For MNIST, they both find only 1 cluster when the optimum is 65.
    • SD (Split DP-means) overcompensates, creating far more clusters than optimal (e.g., 529 for USGS vs. an optimum of 156).
    • SMD (Split-merge DP-means) successfully corrects this over-splitting. The number of clusters it finds (e.g., 312 for USGS, 87 for MNIST) is much closer to the optimal number than any other DP-means variant, though still an overestimation. This demonstrates the synergy of the split and merge steps: splitting escapes the local minimum, and merging refines the solution towards the correct complexity.

7. Conclusion & Reflections

  • Conclusion Summary: The paper successfully diagnoses a key failure mode of the DP-means algorithm: its inability to create new clusters in dense regions of data, leading to convergence in local minima with a suboptimal number of clusters. The authors provide a clear theoretical analysis of this phenomenon. They then propose a novel split-merge DP-means algorithm that directly addresses this issue by splitting dense clusters and merging over-split ones. Empirical results convincingly show that this new algorithm finds better clustering solutions (lower cost) than standard DP-means across a variety of real-world datasets.

  • Limitations & Future Work: The authors acknowledge three main limitations in their proposed algorithm:

    1. The splitting and merging criteria are derived assuming a uniform distribution within clusters, which may not be accurate for real-world data.
    2. The online update rules lose detailed information about the data point distribution within a cluster, as only summary statistics (mean, range) are kept.
    3. The split operation is restricted to a single dimension (the one with the maximum range), which may not be optimal for clusters with complex, multi-dimensional shapes.
    • Future work aims to create more exact versions of the algorithm that relax these simplifying approximations.
  • Personal Insights & Critique:

    • Strengths: The paper's main strength is its clear, intuitive, and elegant analysis of the DP-means local minimum problem. The "easy case" scenario is a powerful tool for demonstrating the core issue. The proposed solution is practical and directly motivated by the analysis. It moves beyond just identifying a problem to offering a constructive algorithm to solve it.
    • Weaknesses: The reliance on the uniform distribution assumption is a significant simplification. Real-world clusters are rarely hyper-rectangles. A Gaussian assumption, as hinted by the authors, would be more realistic but also more complex to implement. The greedy nature of the merge step also does not guarantee finding the global optimum, but it is a standard and practical heuristic.
    • Overall Impact: This work is a valuable contribution to the literature on DP-means and hard clustering. It provides a deeper understanding of the algorithm's failure modes and offers a robust method to mitigate them. The insights could inspire similar split-merge or density-aware improvements for other clustering algorithms derived from small-variance asymptotics. The trade-off between higher computational cost and better solution quality is clearly highlighted, providing practical guidance for users.
