MambaOut: Do We Really Need Mamba for Vision?
TL;DR Summary
This paper questions Mamba's necessity in vision, hypothesizing that its core component (SSM) is crucial only for long-sequence tasks. By creating `MambaOut` (Mamba without SSM), experiments showed it surpasses visual Mamba models in image classification but underperforms in detection/segmentation, confirming that the SSM is valuable only for long-sequence visual tasks.
Abstract
Mamba, an architecture with RNN-like token mixer of state space model (SSM), was recently introduced to address the quadratic complexity of the attention mechanism and subsequently applied to vision tasks. Nevertheless, the performance of Mamba for vision is often underwhelming when compared with convolutional and attention-based models. In this paper, we delve into the essence of Mamba, and conceptually conclude that Mamba is ideally suited for tasks with long-sequence and autoregressive characteristics. For vision tasks, as image classification does not align with either characteristic, we hypothesize that Mamba is not necessary for this task; Detection and segmentation tasks are also not autoregressive, yet they adhere to the long-sequence characteristic, so we believe it is still worthwhile to explore Mamba's potential for these tasks. To empirically verify our hypotheses, we construct a series of models named MambaOut through stacking Mamba blocks while removing their core token mixer, SSM. Experimental results strongly support our hypotheses. Specifically, our MambaOut model surpasses all visual Mamba models on ImageNet image classification, indicating that Mamba is indeed unnecessary for this task. As for detection and segmentation, MambaOut cannot match the performance of state-of-the-art visual Mamba models, demonstrating the potential of Mamba for long-sequence visual tasks. The code is available at https://github.com/yuweihao/MambaOut
In-depth Reading
English Analysis
1. Bibliographic Information
- Title: MambaOut: Do We Really Need Mamba for Vision?
- Authors:
- Weihao Yu (National University of Singapore)
- Xinchao Wang (National University of Singapore)
- Journal/Conference: The paper is available on arXiv, a preprint server. This means it has not yet undergone formal peer review for a conference or journal, but it allows for rapid dissemination of research findings.
- Publication Year: 2024 (Initial submission in May 2024).
- Abstract: The paper investigates the Mamba architecture, which uses a State Space Model (SSM) as its token mixer, for vision tasks. The authors observe that Mamba's performance in vision is often lackluster compared to established CNN and attention-based models. They conceptually argue that Mamba is best suited for tasks that are both long-sequence and autoregressive. Since image classification on ImageNet has neither characteristic, they hypothesize Mamba's core component (SSM) is unnecessary. For detection and segmentation, which are long-sequence but not autoregressive, they suggest Mamba still holds potential. To test this, they create MambaOut, a model that removes the SSM from Mamba blocks. Experiments show MambaOut surpasses visual Mamba models in ImageNet classification but falls short in detection and segmentation, thus supporting their hypotheses.
- Original Source Link:
- arXiv Link: https://arxiv.org/abs/2405.07992
- PDF Link: http://arxiv.org/pdf/2405.07992v3
- Publication Status: Preprint on arXiv.
2. Executive Summary
- Background & Motivation (Why):
- Core Problem: The Mamba architecture, originally successful in natural language processing for its linear-time complexity in handling long sequences, has been adapted for computer vision. However, these "visual Mamba" models have generally failed to outperform state-of-the-art Convolutional Neural Networks (CNNs) and Transformers.
- Importance & Gap: This performance gap raises a fundamental question: Is the core mechanism of Mamba, the State Space Model (SSM), truly suitable for standard vision tasks? Prior work focused on how to adapt Mamba for vision, but this paper asks if we should adapt it at all, or at least, for which tasks.
- Innovation: Instead of proposing a new, better visual Mamba, the paper takes a step back to perform a critical, first-principles analysis. The innovation lies in deconstructing the Mamba block, identifying its essential properties (long-sequence handling, autoregressive nature), and systematically evaluating whether these properties align with the demands of different vision tasks. The creation of MambaOut serves as an elegant ablation study to isolate and test the contribution of the SSM itself.
- Main Contributions / Findings (What):
- Conceptual Analysis: The paper provides a clear conceptual framework, arguing that Mamba's strengths are best utilized in tasks characterized by long sequences and an autoregressive nature (where output at a given step depends only on previous inputs).
- Task-Specific Hypotheses: It applies this framework to vision, leading to two key hypotheses:
- Hypothesis 1: The SSM is unnecessary for ImageNet classification, which involves short sequences and does not require autoregressive modeling.
- Hypothesis 2: The SSM may be beneficial for object detection and segmentation, which involve processing high-resolution images (long sequences) even though they are not autoregressive.
- Empirical Validation via MambaOut: The authors introduce MambaOut, a simple model architecture that is identical to a visual Mamba but with the core SSM component removed. Experiments validate the hypotheses:
- On ImageNet classification, MambaOut outperforms all existing visual Mamba models, suggesting the SSM adds no value and can even be detrimental.
- On COCO detection and ADE20K segmentation, MambaOut is outperformed by the best visual Mamba models, confirming that the SSM's long-sequence modeling capability is indeed valuable for these tasks.
3. Prerequisite Knowledge & Related Work
- Foundational Concepts:
- Transformer: An influential deep learning architecture, originally from NLP, that relies on a mechanism called self-attention to process sequences of data (like words in a sentence or patches of an image). Its main drawback is that the computational cost of self-attention grows quadratically with the sequence length, O(L²), making it inefficient for very long sequences.
- RNN (Recurrent Neural Network): A type of neural network designed for sequential data. It processes data one step at a time, maintaining a hidden state (or "memory") that summarizes past information. This makes it efficient for long sequences (linear complexity, O(L)), but it can struggle with long-range dependencies and is harder to train in parallel compared to Transformers.
- State Space Model (SSM): A concept from control theory adapted for deep learning. It models a sequence by mapping an input signal to an output signal through a latent hidden state. Modern SSMs, like those in Mamba, are structured to be computationally efficient while capturing long-range dependencies, combining the strengths of RNNs and CNNs.
- Mamba: An architecture that uses a selective SSM as its core building block. It achieves linear-time complexity like an RNN but can be trained in parallel like a Transformer. Its "selective" nature means it can dynamically decide which information to focus on or ignore, making it powerful for modeling complex data.
- Gated CNN (Gated Convolutional Network): A convolutional architecture that uses gating mechanisms (similar to those in LSTMs/GRUs) to control the flow of information through the network. The paper highlights that the Mamba block is an extension of the Gated CNN block, with the SSM being the key addition.
- Previous Works & Technological Evolution:
- The field of computer vision backbones has evolved from CNNs (e.g., ResNet, ConvNeXt) to Transformers (e.g., ViT, Swin Transformer). The main driver has been the search for more powerful ways to model relationships between different parts of an image.
- Transformers brought global context modeling but at a high computational cost. This led to a wave of "efficient Transformers" that tried to reduce this cost.
- More recently, Mamba emerged as a promising alternative from the NLP world, offering a potential "best of both worlds" solution: the linear scaling of RNNs and the modeling power of Transformers.
- This prompted a flurry of research to apply Mamba to vision, resulting in models like Vision Mamba (Vim), VMamba, and LocalMamba. These works focused on adapting Mamba's 1D sequence processing for 2D images, often by "flattening" image patches into a sequence and scanning them in different directions.
- Differentiation:
- Unlike previous works that aimed to build better visual Mamba models, this paper asks a more fundamental question: Is Mamba even the right tool for the job?
- It stands out by performing a critical analysis rather than a constructive one. The proposed MambaOut model is not meant to be a new state-of-the-art architecture but rather an experimental tool (a baseline) to rigorously test their hypotheses about the necessity of SSMs in vision. This approach follows the principle of Occam's razor: do not use a more complex model (Mamba) if a simpler one (Gated CNN, i.e., MambaOut) suffices.
4. Methodology (Core Technology & Implementation)
The paper's methodology is divided into a conceptual discussion followed by the proposal of MambaOut for empirical verification.
- Principles: What Tasks is Mamba Suitable For? The authors argue that Mamba's core component, the SSM, is fundamentally an RNN-like mechanism. This gives it two defining characteristics that determine its ideal use case.
- Long-Sequence Processing:
- The SSM updates its hidden state based on the previous state $h_{t-1}$ and the current input $x_t$. This recurrent update has constant computational and memory cost per step, regardless of how long the sequence gets.
- In contrast, causal attention must store and access the keys and values of all previous tokens, so its memory and computation grow with the sequence length.
- Conclusion: Mamba's advantage over attention becomes significant only when dealing with long sequences, where attention's quadratic complexity becomes prohibitive. (A toy comparison of the two memory mechanisms is sketched below.)
- Figure (from the paper): a schematic contrasting causal attention with RNN-like models in terms of memory. Causal attention keeps the key-value pairs (k, v) of all previous tokens, which is lossless but makes computation grow with sequence length; RNN-like models compress history into a fixed-size hidden state, which is lossy but has a per-step cost independent of sequence length, making them well suited to long sequences.
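To make the memory contrast concrete, here is a toy sketch (not from the paper; the matrices and sizes are illustrative, and real Mamba uses input-dependent, selective SSM parameters) comparing a fixed-size recurrent state with a causal-attention cache that grows with the sequence:

```python
import torch

d_state, d_model, seq_len = 16, 64, 1000

# RNN/SSM-style memory: a fixed-size hidden state, updated once per token.
A = torch.randn(d_state, d_state) * 0.01   # illustrative state-transition matrix
B = torch.randn(d_state, d_model) * 0.01   # illustrative input projection
C = torch.randn(d_model, d_state) * 0.01   # illustrative output projection

h = torch.zeros(d_state)                   # memory size stays constant: d_state
for t in range(seq_len):
    x_t = torch.randn(d_model)
    h = A @ h + B @ x_t                    # O(1) memory per step (lossy compression)
    y_t = C @ h

# Causal-attention-style memory: keep every past token in a cache.
kv_cache = []
for t in range(seq_len):
    x_t = torch.randn(d_model)
    kv_cache.append(x_t)                   # memory grows linearly with t (lossless)
    context = torch.stack(kv_cache)        # attending costs O(t) per step
    attn = torch.softmax(context @ x_t / d_model**0.5, dim=0)
    y_t = attn @ context

print(h.numel(), len(kv_cache))            # 16 vs 1000: fixed state vs growing cache
```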
- Autoregressive (Causal) Nature:
- The recurrent nature of the SSM means the output at any step can only depend on the current and past inputs ($x_1, \ldots, x_t$). This is known as the causal mode of token mixing. This mode is essential for generative tasks like language modeling, where the next word is predicted based on preceding words.
- However, vision tasks are typically understanding tasks, where the model can see the entire image at once. The optimal approach here is the fully-visible mode, where every output token can draw information from all input tokens.
- Imposing a causal constraint on an understanding task is unnecessarily restrictive and can hurt performance, as demonstrated by the paper's experiment in Figure 3(b), where a ViT with causal attention performs worse than one with default fully-visible attention. (A mask-level sketch of the two modes is given below.)
- Conclusion: Mamba is inherently suited for tasks requiring causal token mixing.
- Figure (from the paper): (a) illustrates the two token mixing modes: in the fully-visible mode, each output token can access all input tokens (as in BERT and ViT attention); in the causal mode, each output token depends only on the current and preceding input tokens (as in GPT attention and Mamba's SSM). (b) shows that switching ViT's attention from fully-visible to causal degrades ImageNet classification accuracy, indicating that causal mixing is unnecessary for understanding tasks.
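For illustration, a minimal sketch (not the paper's code) of how the two modes differ only in the attention mask applied to the same scores:

```python
import torch

N = 6  # number of tokens

# Fully-visible mode (BERT/ViT-style): every token may attend to every token.
fully_visible_mask = torch.zeros(N, N)

# Causal mode (GPT/Mamba-style): token t may only attend to tokens <= t.
causal_mask = torch.full((N, N), float("-inf")).triu(diagonal=1)

scores = torch.randn(N, N)                                   # toy attention logits
attn_fully_visible = torch.softmax(scores + fully_visible_mask, dim=-1)
attn_causal = torch.softmax(scores + causal_mask, dim=-1)

print(attn_causal[0])          # row 0 attends only to token 0; later columns are zero
print(attn_fully_visible[0])   # row 0 spreads attention over all 6 tokens
```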
- Steps & Procedures: Analyzing Vision Tasks. The paper then analyzes standard vision benchmarks against these two characteristics.
- ImageNet Classification:
- Sequence Length: With a standard $224^2$ image and a $16^2$ patch size, the sequence length is only $14^2 = 196$ tokens. The paper's heuristic threshold (based on where attention's quadratic term begins to dominate its linear term) classifies this as a short sequence.
- Autoregressive: Classification is an understanding task. The model sees the whole image. It is not autoregressive.
- Verdict: Fails on both characteristics.
- COCO Detection & ADE20K Segmentation:
- Sequence Length: These tasks use higher-resolution inputs (e.g., around $800 \times 1280$ for COCO inference), resulting in sequence lengths of roughly 4,000 tokens. This qualifies as a long sequence. (A quick token-count check is sketched after this list.)
- Autoregressive: Like classification, these are understanding tasks and are not autoregressive.
- Verdict: Meets the long-sequence characteristic but not the autoregressive one.
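A quick back-of-the-envelope check of these sequence lengths, assuming $16 \times 16$ patches (the resolutions below are commonly used inference settings and are meant as illustrative approximations):

```python
def num_tokens(height, width, patch=16):
    """Number of patch tokens for an image of the given resolution."""
    return (height // patch) * (width // patch)

print(num_tokens(224, 224))    # ImageNet classification: 14 * 14 = 196 tokens (short)
print(num_tokens(800, 1280))   # COCO-style detection input: 50 * 80 = 4000 tokens (long)
print(num_tokens(512, 2048))   # ADE20K-style input: 32 * 128 = 4096 tokens (long)
```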
- Hypotheses: This analysis leads to the paper's two central hypotheses:
- Hypothesis 1: SSM is not necessary for image classification on ImageNet.
- Hypothesis 2: SSM may be beneficial for detection and segmentation tasks due to their long-sequence nature.
- Mathematical Formulas & Key Details: The MambaOut Model. To test the hypotheses, the authors construct MambaOut. The key insight is that a Mamba block is a Gated CNN block plus an SSM; MambaOut is simply a model built by stacking Gated CNN blocks.
- Figure (from the paper): (a) shows the structures of the Gated CNN block and the Mamba block, where the Mamba block adds a state space model (SSM) on top of the Gated CNN block; (b) compares MambaOut with various visual Mamba models on ImageNet classification in terms of accuracy, compute (MACs), and model size, showing that MambaOut, which removes the SSM, surpasses the other Mamba models in accuracy.
The meta-architecture shared by both blocks is:
$$X' = \mathrm{Norm}(X)$$
$$Y = \big( \mathrm{TokenMixer}(X'W_1) \odot \sigma(X'W_2) \big)\, W_3 + X$$
Where:
- $X \in \mathbb{R}^{N \times D}$ is the input tensor with $N$ tokens and $D$ channels.
- $\mathrm{Norm}(\cdot)$ is a normalization layer (e.g., LayerNorm).
- $W_1$, $W_2$, $W_3$ are learnable weight matrices for linear projections.
- $\sigma$ is an activation function (e.g., GELU).
- $\odot$ denotes element-wise multiplication (the gating mechanism).
- $\mathrm{TokenMixer}(\cdot)$ is the module that mixes information across tokens.
The only difference between the two blocks is the definition of the TokenMixer:
- For Gated CNN (MambaOut): $\mathrm{TokenMixer}(Z) = \mathrm{Conv}(Z)$
- For Mamba: $\mathrm{TokenMixer}(Z) = \mathrm{SSM}\big(\sigma(\mathrm{Conv}(Z))\big)$
MambaOut's architecture follows a standard hierarchical design, similar to ResNet and Swin Transformer, with four stages of decreasing spatial resolution and increasing channel depth.
Figure (from the paper): (a) shows the overall MambaOut framework for visual recognition: the input image passes through four stages of hierarchical downsampling and Gated CNN blocks, with the channel dimension growing stage by stage; (b) shows the structure of the Gated CNN block, which contains linear projections, a depthwise convolution, normalization, and a gating mechanism, differing from the Mamba block only in lacking the SSM.
The paper provides a simple PyTorch implementation of the Gated CNN block in Algorithm 1, reproduced below with the imports it needs.
```python
import torch
import torch.nn as nn
from functools import partial


class GatedCNNBlock(nn.Module):
    def __init__(self, dim, expension_ratio=8/3, kernel_size=7, conv_ratio=1.0,
                 norm_layer=partial(nn.LayerNorm, eps=1e-6), act_layer=nn.GELU,
                 drop_path=0.):
        super().__init__()
        self.norm = norm_layer(dim)
        hidden = int(expension_ratio * dim)
        # fc1 produces both the gating branch and the value branch.
        self.fc1 = nn.Linear(dim, hidden * 2)
        self.act = act_layer()
        conv_channels = int(conv_ratio * dim)
        # Split fc1's output into gate g, identity part i, and conv part c.
        self.split_indices = (hidden, hidden - conv_channels, conv_channels)
        # Depthwise convolution is the token mixer (no SSM).
        self.conv = nn.Conv2d(conv_channels, conv_channels, kernel_size=kernel_size,
                              padding=kernel_size // 2, groups=conv_channels)
        self.fc2 = nn.Linear(hidden, dim)
        # Note: drop_path is accepted for API compatibility but omitted in this
        # simplified transcription.

    def forward(self, x):  # x: [B, H, W, C] (channels-last)
        shortcut = x
        x = self.norm(x)
        g, i, c = torch.split(self.fc1(x), self.split_indices, dim=-1)
        c = c.permute(0, 3, 1, 2)  # [B, H, W, C] -> [B, C, H, W]
        c = self.conv(c)
        c = c.permute(0, 2, 3, 1)  # [B, C, H, W] -> [B, H, W, C]
        x = self.fc2(self.act(g) * torch.cat((i, c), dim=-1))
        return x + shortcut
```
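As a quick sanity check (a usage sketch, not from the paper; the channel width and input size are arbitrary), the block maps a channels-last feature map to an output of the same shape:

```python
block = GatedCNNBlock(dim=96)       # hypothetical channel width
x = torch.randn(2, 14, 14, 96)      # [B, H, W, C], channels-last as the block expects
y = block(x)
print(y.shape)                      # torch.Size([2, 14, 14, 96])
```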
5. Experimental Setup
- Datasets:
- ImageNet-1K: A large-scale dataset for image classification with ~1.3 million training images and 50,000 validation images across 1,000 object categories. It is the standard benchmark for pre-training vision models.
- COCO 2017: A benchmark for object detection and instance segmentation. It contains over 118k training images and 5k validation images with 80 object categories. It is challenging due to multiple objects per image, varying scales, and complex scenes.
- ADE20K: A scene parsing dataset for semantic segmentation, containing 20k training images and 2k validation images with 150 semantic categories (e.g., wall, sky, car).
- Evaluation Metrics:
- Image Classification (ImageNet):
- Conceptual Definition: Top-1 Accuracy measures the percentage of test images for which the model's prediction with the highest confidence score is the correct label. It is a straightforward measure of classification correctness (a minimal computation sketch follows this metric block).
- Mathematical Formula: $\text{Top-1 Accuracy} = \frac{N_{\text{correct}}}{N_{\text{total}}} \times 100\%$
- Symbol Explanation: $N_{\text{correct}}$ is the number of images whose highest-confidence prediction matches the ground-truth label; $N_{\text{total}}$ is the total number of test images.
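A minimal sketch of how Top-1 accuracy is computed from model logits (illustrative values, not the paper's evaluation code):

```python
import torch

logits = torch.tensor([[2.0, 0.5, 0.1],     # predicted class 0
                       [0.2, 0.1, 3.0],     # predicted class 2
                       [1.0, 2.5, 0.3]])    # predicted class 1
labels = torch.tensor([0, 2, 2])            # ground-truth classes

top1 = (logits.argmax(dim=-1) == labels).float().mean() * 100
print(f"Top-1 accuracy: {top1:.1f}%")       # 2 of 3 correct -> 66.7%
```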
- Object Detection & Instance Segmentation (COCO):
- Conceptual Definition: Average Precision (AP) is the primary metric. It is the area under the precision-recall curve, calculated for each class and then averaged. A higher AP indicates better performance. The paper reports box AP ($\mathrm{AP}^b$) for object detection and mask AP ($\mathrm{AP}^m$) for instance segmentation. Variants like $\mathrm{AP}_{50}$ and $\mathrm{AP}_{75}$ refer to AP calculated at a specific Intersection over Union (IoU) threshold of 0.5 and 0.75, respectively. The main AP metric is averaged over multiple IoU thresholds (from 0.5 to 0.95). A simplified computation sketch follows this metric block.
- Mathematical Formula (for a single class): $\mathrm{AP} = \sum_{k=1}^{N} P(k)\,\Delta r(k)$
- Symbol Explanation: $N$ is the total number of ranked detections, $P(k)$ is the precision over the top-$k$ detections, and $\Delta r(k)$ is the change in recall from the $(k-1)$-th to the $k$-th detection, computed after sorting predictions by confidence.
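A simplified sketch of AP for a single class at a single IoU threshold (COCO's official metric additionally uses 101-point interpolation and averages over IoU thresholds from 0.5 to 0.95; detections here are assumed to be already matched to ground truth):

```python
import numpy as np

# Detections for one class, sorted by descending confidence;
# True marks a detection matched to a ground-truth box at the chosen IoU threshold.
is_true_positive = np.array([True, True, False, True, False])
num_ground_truth = 4

tp = np.cumsum(is_true_positive)
fp = np.cumsum(~is_true_positive)
precision = tp / (tp + fp)              # P(k): precision over the top-k detections
recall = tp / num_ground_truth          # r(k): recall over the top-k detections

# AP = sum_k P(k) * (r(k) - r(k-1)), the area under the precision-recall curve.
ap = np.sum(precision * np.diff(recall, prepend=0))
print(f"AP: {ap:.3f}")
```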
- Semantic Segmentation (ADE20K):
- Conceptual Definition: mean Intersection over Union (mIoU) is the standard metric. For each class, IoU is the ratio of the area of overlap to the area of union between the predicted segmentation mask and the ground truth mask. mIoU is the average of these IoU values across all classes (a minimal computation sketch follows this metric block).
- Mathematical Formula: $\mathrm{mIoU} = \frac{1}{C} \sum_{c=1}^{C} \frac{TP_c}{TP_c + FP_c + FN_c}$
- Symbol Explanation: $C$ is the number of classes; $TP_c$, $FP_c$, and $FN_c$ are the number of true positive, false positive, and false negative pixels for class $c$, respectively.
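A minimal sketch of per-class IoU and mIoU computed from predicted and ground-truth label maps (illustrative; real evaluators typically also handle an ignore label):

```python
import numpy as np

num_classes = 3
pred = np.array([[0, 0, 1],
                 [2, 1, 1],
                 [2, 2, 0]])
gt   = np.array([[0, 1, 1],
                 [2, 1, 1],
                 [2, 2, 2]])

ious = []
for c in range(num_classes):
    tp = np.sum((pred == c) & (gt == c))   # pixels correctly labeled as class c
    fp = np.sum((pred == c) & (gt != c))   # pixels wrongly labeled as class c
    fn = np.sum((pred != c) & (gt == c))   # class-c pixels the model missed
    ious.append(tp / (tp + fp + fn) if (tp + fp + fn) > 0 else np.nan)

miou = np.nanmean(ious)
print(ious, f"mIoU: {miou:.3f}")
```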
- Baselines: The paper compares MambaOut against a comprehensive set of models:
- Visual Mamba Models: Vim, VMamba, LocalMamba, PlainMamba, EfficientVMamba. These represent the direct competitors that use SSMs.
- CNN Models: ConvNeXt, VAN, InternImage, HorNet. These are modern, high-performing convolutional models.
- Attention-based Models (Transformers): DeiT, Swin, CSWin, Focal. These are the dominant vision backbones that Mamba aims to challenge.
- Hybrid Models (Conv + Attn): CoAtNet, CAFormer, TransNeXt. These models combine the strengths of both convolutions and attention.
6. Results & Analysis
The experimental results are presented in three main tables, which I will transcribe and analyze.
- Core Results: ImageNet Classification (Table 1). This table compares MambaOut with a wide range of models on ImageNet (manual transcription of Table 1 from the paper; all models are tested at $224^2$ resolution, and '-' marks values not reported).

| Model | Token Mixing Type | Param (M) | MAC (G) | Top-1 Acc (%) |
| --- | --- | --- | --- | --- |
| VAN-B0 [28] | Conv | 4 | 0.9 | 75.4 |
| MogaNet-T [45] | Conv | 5 | 1.1 | 79.0 |
| FasterNet-T1 [7] | Conv | 8 | 0.9 | 76.2 |
| InceptionNeXt-A [93] | Conv | 4 | 0.5 | 75.3 |
| DeiT-Ti [73] | Attn | 6 | 1.3 | 72.2 |
| T2T-ViT-7 [94] | Attn | 4 | 1.1 | 71.7 |
| PVTv2-B0 [80] | Conv + Attn | 3 | 0.6 | 70.5 |
| MobileViTv3-XS [77] | Conv + Attn | 3 | 0.9 | 76.7 |
| EMO-6M [101] | Conv + Attn | 6.5 | 1.0 | 79.0 |
| Vim-Ti [104] | Conv + SSM | 7 | 1.5 | 76.1 |
| LocalVim-T [37] | Conv + SSM | 8 | 1.5 | 76.2 |
| EfficientVMamba-T [58] | Conv + SSM | 6 | 0.8 | 76.5 |
| EfficientVMamba-S [58] | Conv + SSM | 11 | 1.3 | 78.7 |
| MambaOut-Femto | Conv | 7 | 1.2 | 78.9 |
| PoolFormer-S24 [91] | Pool | 21 | 3.4 | 80.3 |
| ConvNeXt-T [52] | Conv | 29 | 4.5 | 82.1 |
| VAN-B2 [28] | Conv | 27 | 5.0 | 82.8 |
| ConvFormer-S18 [92] | Conv | 27 | 3.9 | 83.0 |
| MogaNet-S [45] | Conv | 25 | 5.0 | 83.4 |
| InternImage-T [79] | Conv | 30 | 5 | 83.5 |
| InceptionNeXt-T [93] | Conv | 28 | 4.2 | 82.3 |
| DeiT-S [73] | Attn | 22 | 4.6 | 79.8 |
| T2T-ViT-14 [94] | Attn | 22 | 4.8 | 81.5 |
| Swin-T [51] | Attn | 29 | 4.5 | 81.3 |
| Focal-Tiny [90] | Attn | 29 | 4.9 | 82.2 |
| CSWin-T [22] | Attn | 23 | 4.3 | 82.7 |
| CoAtNet-0 [16] | Conv + Attn | 25 | 4.2 | 81.6 |
| iFormer-S [70] | Conv + Attn | 20 | 4.8 | 83.4 |
| MOAT-0 [87] | Conv + Attn | 28 | 5.7 | 83.3 |
| CAFormer-S18 [92] | Conv + Attn | 26 | 4.1 | 83.6 |
| SG-Former-S [65] | Conv + Attn | 23 | 4.8 | 83.2 |
| TransNeXt-Tiny [69] | Conv + Attn | 28 | 5.7 | 84.0 |
| Vim-S [104] | Conv + SSM | 26 | 5.1 | 80.5 |
| VMamba-T [50] | Conv + SSM | 22 | 5.6 | 82.2 |
| Mamba-2D-S [44] | Conv + SSM | 24 | - | 81.7 |
| LocalVim-S [37] | Conv + SSM | 28 | 4.8 | 81.2 |
| LocalVMamba-T [37] | Conv + SSM | 26 | 5.7 | 82.7 |
| EfficientVMamba-B [58] | Conv + SSM | 33 | 4.0 | 81.8 |
| PlainMamba-L1 [88] | Conv + SSM | 7 | 3.0 | 77.9 |
| VMambaV9-T* [50] | Conv + SSM | 31 | 4.9 | 82.5 |
| ConvNeXt-S [52] | Conv | 50 | 8.7 | 83.1 |
| VAN-B3 [28] | Conv | 45 | 9.0 | 83.9 |
| ConvFormer-S36 [92] | Conv | 40 | 7.6 | 84.1 |
| InternImage-S [79] | Conv | 50 | 8 | 84.2 |
| MogaNet-B [45] | Conv | 44 | 9.9 | 84.3 |
| T2T-ViT-19 [94] | Attn | 39 | 8.5 | 81.9 |
| Swin-S [51] | Attn | 50 | 8.7 | 83.0 |
| Focal-Small [90] | Attn | 51 | 9.1 | 83.5 |
| CSWin-S [22] | Attn | 35 | 6.9 | 83.6 |
| MViTv2-S [46] | Attn | 35 | 7.0 | 83.6 |
| CoAtNet-1 [16] | Conv + Attn | 42 | 8.4 | 83.3 |
| UniFormer-B [43] | Conv + Attn | 50 | 8.3 | 83.9 |
| CAFormer-S36 [92] | Conv + Attn | 39 | 8.0 | 84.5 |
| SG-Former-M [65] | Conv + Attn | 39 | 7.5 | 84.1 |
| TransNeXt-Small [69] | Conv + Attn | 50 | 10.3 | 84.7 |
| VMamba-S [50] | Conv + SSM | 44 | 11.2 | 83.5 |
| LocalVMamba-S [37] | Conv + SSM | 50 | 11.4 | 83.7 |
| PlainMamba-L2 [88] | Conv + SSM | 25 | 8.1 | 81.6 |
| VMambaV9-S [50] | Conv + SSM | 50 | 8.7 | 83.6 |
| MambaOut-Small | Conv | 48 | 9.0 | 84.1 |
| ConvNeXt-B [52] | Conv | 89 | 15.4 | 83.8 |
| RepLKNet-31B [21] | Conv | 79 | 15.3 | 83.5 |
| ConvFormer-M36 [92] | Conv | 57 | 12.8 | 84.5 |
| HorNet-B [64] | Conv | 88 | 15.5 | 84.3 |
| MogaNet-L [45] | Conv | 83 | 15.9 | 84.7 |
| InternImage-B [79] | Conv | 97 | 16 | 84.9 |
| DeiT-B [73] | Attn | 86 | 17.5 | 81.8 |
| T2T-ViT-24 [94] | Attn | 64 | 13.8 | 82.3 |
| Swin-B [51] | Attn | 88 | 15.4 | 83.5 |
| CSwin-B [22] | Attn | 78 | 15.0 | 84.2 |
| MViTv2-B [46] | Attn | 52 | 10.2 | 84.4 |
| CoAtNet-2 [16] | Conv + Attn | 75 | 15.7 | 84.1 |
| iFormer-L [70] | Conv + Attn | 87 | 14.0 | 84.8 |
| MOAT-2 [87] | Conv + Attn | 73 | 17.2 | 84.7 |
| CAFormer-M36 [92] | Conv + Attn | 56 | 13.2 | 85.2 |
| TransNeXt-Base [69] | Conv + Attn | 90 | 18.4 | 84.8 |
| VMamba-B [50] | Conv + SSM | 75 | 18.0 | 83.7 |
| Mamba-2D-B [44] | Conv + SSM | 92 | - | 83.0 |
| PlainMamba-L3 [88] | Conv + SSM | 50 | 14.4 | 82.3 |
| VMambaV9-B [50] | Conv + SSM | 89 | 15.4 | 83.9 |