A generalized e-value feature detection method with FDR control at multiple resolutions
TL;DR Summary
This paper introduces the Stabilized Flexible e-Filter Procedure (SFEFP) for detecting significant features and groups across multiple resolutions while controlling the false discovery rate (FDR). SFEFP outperforms existing methods by flexibly integrating detection procedures suited to each resolution and stabilizing their generalized e-values, which improves power and stability while preserving multilayer FDR control.
Abstract
Multiple resolutions arise across a range of explanatory features due to domain-specific structures, leading to the formation of feature groups. Consequently, the simultaneous detection of significant features and groups associated with a specific response under false discovery rate (FDR) control is a crucial issue, for example in spatial genome-wide association studies. Nevertheless, existing methods such as the multilayer knockoff filter (MKF) generally require a uniform detection approach across resolutions to achieve multilayer FDR control, which can be underpowered or even inapplicable in several settings. To address this issue effectively, this article develops a novel method, the stabilized flexible e-filter procedure (SFEFP), by constructing unified generalized e-values, developing a generalized e-filter, and adopting a stabilization treatment. This method flexibly incorporates a wide variety of base detection procedures that operate effectively across different resolutions to provide stable and consistent results, while controlling the false discovery rate at multiple resolutions simultaneously. Furthermore, we investigate the statistical theory of SFEFP, encompassing multilayer FDR control and a stability guarantee. We develop several examples of SFEFP such as the eDS-filter and the eDS+gKF-filter. Simulation studies demonstrate that the eDS-filter effectively controls FDR at multiple resolutions while either maintaining or enhancing power compared to MKF. The superiority of the eDS-filter is also demonstrated through the analysis of HIV mutation data.
In-depth Reading
1. Bibliographic Information
1.1. Title
A generalized e-value feature detection method with FDR control at multiple resolutions
1.2. Authors
- Chengyao Yu
- Ruixing Ming
- Min Xiao
- Zhanfeng Wang
- Bingyi Jing
Affiliations:
- School of Statistics and Mathematics, Zhejiang Gongshang University
- School of Management, University of Science and Technology of China
- Department of Statistics and Data Science, Southern University of Science and Technology
- Emails provided indicate affiliations with Zhejiang Gongshang University and Southern University of Science and Technology.
1.3. Journal/Conference
This paper is published as a preprint on arXiv. Venue Reputation: arXiv is a popular open-access repository for preprints of scientific papers in various fields, including mathematics, physics, computer science, and quantitative biology. It allows researchers to disseminate their work quickly before formal peer review and publication, making it a highly influential platform for early research sharing.
1.4. Publication Year
Published at (UTC): 2024-09-25T15:46:46.000Z, implying a publication year of 2024.
1.5. Abstract
The paper addresses the crucial issue of simultaneously detecting significant features and groups (feature groups formed due to domain-specific multi-resolution structures) with false discovery rate (FDR) control. Existing methods, such as the multilayer knockoff filter (MKF), often require a uniform detection approach across resolutions for multilayer FDR control, which can be less powerful or inapplicable in certain scenarios. To overcome this, the authors propose a novel method called the stabilized flexible e-filter procedure (SFEFP). SFEFP involves constructing unified generalized e-values, developing a generalized e-filter, and applying a stabilization treatment. This method offers flexibility by incorporating diverse base detection procedures that operate effectively at different resolutions, aiming to provide stable, consistent, and powerful results while simultaneously controlling the FDR at multiple resolutions. The paper also provides theoretical guarantees for SFEFP, including multilayer FDR control and stability. Practical examples, such as the eDS-filter and eDS+gKF-filter, are developed. Simulation studies demonstrate that the eDS-filter effectively controls FDR at multiple resolutions while matching or exceeding the power of MKF. The efficacy of the eDS-filter is further validated through analysis of HIV mutation data.
1.6. Original Source Link
- Original Source Link: https://arxiv.org/abs/2409.17039v4
- PDF Link: https://arxiv.org/pdf/2409.17039v4.pdf
- Publication Status: Preprint.
2. Executive Summary
2.1. Background & Motivation
Core Problem
The paper addresses the challenge of identifying relevant features and groups of features that influence a specific response variable, particularly when these features exhibit multi-resolution structures (i.e., they can be grouped in different meaningful ways). The core problem is to achieve this detection while rigorously controlling the false discovery rate (FDR) simultaneously across all these resolutions (individual features and different levels of feature groups).
Importance and Challenges
- Domain-Specific Structures: Many scientific domains naturally present data with multi-resolution structures. For example, in genome-wide association studies (GWAS), one might be interested in individual SNPs (single nucleotide polymorphisms) and the genes that harbor them.
- Scientific Significance: Discoveries across different resolutions are often of substantial interest to domain scientists, as identifying an important feature also implies the significance of the group it belongs to, and vice-versa.
- Reproducibility and False Discoveries: Ensuring reproducibility of scientific findings necessitates rigorous control of false discoveries. FDR is a suitable statistical measure for this in large-scale multiple testing.
- Limitations of Existing Methods:
  - Uniform Approach: Current methods like the p-filter or multilayer knockoff filter (MKF) often demand a uniform detection strategy across all resolutions. This uniformity can lead to sub-optimal performance, being either not powerful (failing to detect true signals) or not applicable in diverse settings.
  - P-value Challenges: p-value based methods (like the p-filter) struggle with constructing valid p-values in high-dimensional settings and can lose power when handling dependencies by reshaping p-values.
  - Knockoff Limitations: MKF, while offering multilayer FDR control, can be overly conservative due to decoupling dependencies among cross-layer knockoff statistics. When features are highly correlated, knockoff-based procedures can suffer severe power loss. Additionally, model-X knockoff methods often require estimating the joint distribution of features, which is computationally intractable in many scenarios.
  - "One-bit" Problem: Existing e-filter methods, specifically e-MKF (which leverages one-bit knockoff e-values), can suffer from a "zero-power dilemma": if there are conflicts in discoveries across layers, the method might declare all features unimportant, leading to no selections.
Paper's Innovative Idea
The paper's entry point is to fix the limitations of existing multi-resolution FDR control methods by introducing a flexible and stabilized framework that doesn't rely on a single, uniform detection approach. It proposes to leverage e-values (an alternative to p-values that is simpler to construct and integrate) and allow for the integration of various state-of-the-art detection techniques at different resolutions. The key innovation is to introduce "generalized e-values" that can be derived from any FDR-controlled procedure, combine them through a "generalized e-filter," and apply a "stabilization treatment" to overcome the "one-bit" problem and enhance power.
2.2. Main Contributions / Findings
The paper makes several significant contributions:
- Generalized E-Filter and Unified Generalized E-Values:
  - Develops a novel generalized e-filter procedure.
  - Proposes a unified construction of generalized e-values (including relaxed e-values, asymptotic e-values, and asymptotic relaxed e-values). This construction is more generic than prior e-value methods, allowing integration of a broad class of FDR-controlled procedures (beyond just p-values or knockoffs) as base detection methods for each layer. This flexibility enables users to choose the most suitable technique for each resolution.
- Stabilized Flexible E-Filter Procedure (SFEFP):
  - Introduces SFEFP, which merges these generalized e-values with a stabilization treatment. SFEFP overcomes the "zero-power dilemma" of previous e-filter methods (like e-MKF with one-bit e-values) by producing non-binary generalized e-values that better reflect the ranking of feature/group importance. This significantly improves detection power and stability.
- Theoretical Guarantees:
  - Investigates the statistical theory of SFEFP, establishing rigorous multilayer FDR control guarantees.
  - Provides a stability guarantee for SFEFP under finite samples, demonstrating that as the number of replications increases, the selection set converges almost surely to a fixed set.
- Practical Examples and Enhanced Performance:
  - Develops concrete examples of SFEFP instantiations:
    - eDS-filter: Extends the DS (Data Splitting) method to multiple resolutions, offering a powerful alternative to MKF and e-MKF in highly correlated settings.
    - eDS+gKF-filter: A hybrid method combining DS for individual features and group knockoffs for feature groups, suitable for scenarios with varying correlation structures.
  - Simulation studies demonstrate that the eDS-filter effectively controls FDR at multiple resolutions while maintaining or enhancing power compared to MKF.
  - Analysis of HIV mutation data further confirms the superiority of the eDS-filter over MKF and e-MKF in real-world applications, achieving higher power with robust FDR control.

In summary, the paper offers a flexible, robust, and powerful framework for feature detection with FDR control at multiple resolutions, addressing critical shortcomings of existing approaches and opening avenues for incorporating diverse, state-of-the-art detection techniques.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To understand this paper, a reader needs familiarity with concepts in multiple hypothesis testing, particularly regarding false discovery rate (FDR) control, and the idea of e-values as an alternative to p-values.
Multiple Hypothesis Testing
In many scientific studies, researchers test numerous hypotheses simultaneously. For example, a genome-wide association study (GWAS) might test millions of single nucleotide polymorphisms (SNPs) for association with a disease. When performing many tests, the probability of obtaining false positives (incorrectly rejecting a true null hypothesis) increases dramatically.
Null Hypothesis (H0) and Alternative Hypothesis (H1)
- Null Hypothesis (H0): A statement that there is no effect or no relationship between variables. For example, the null hypothesis for feature j states that the feature is irrelevant to the response given all other features.
- Alternative Hypothesis (H1): A statement that contradicts the null hypothesis, suggesting an effect or relationship exists. If the null hypothesis for feature j is not true, then the feature is considered relevant.
Type I Error and False Discovery Rate (FDR)
- Type I Error (False Positive): Rejecting a true null hypothesis. In multiple testing, controlling the family-wise error rate (FWER) (the probability of making at least one Type I error) can be too conservative, leading to low power (failing to detect true effects).
- False Discovery Rate (FDR) [7]: A less stringent error rate that is more powerful for large-scale multiple testing. FDR is defined as the expected proportion of false discoveries among all discoveries. If V is the number of false discoveries (true nulls rejected) and R is the total number of rejections, then $\mathrm{FDR} = \mathbb{E}[V / (R \vee 1)]$ (where V/R is taken as 0 if R = 0). Controlling FDR at a level $\alpha$ means that, on average, no more than an $\alpha$ proportion of rejections will be false positives. This paper focuses on FDR control.
e-values
An e-value is a non-negative random variable that serves as an alternative to p-values for hypothesis testing.
- Definition: An e-value for a null hypothesis is a non-negative test statistic e such that its expectation under the null hypothesis is at most 1, i.e., $\mathbb{E}[e] \leq 1$.
- Rejection Rule: For a given significance level $\alpha$, the null hypothesis is rejected if $e \geq 1/\alpha$.
- Type I Error Control: This rejection rule controls the Type I error rate. By Markov's inequality, $\mathbb{P}(e \geq 1/\alpha) \leq \alpha \, \mathbb{E}[e] \leq \alpha$.
- Advantages over p-values: e-values are often simpler to combine and interpret than p-values, especially when merging evidence from different sources or procedures. They can directly quantify evidence for the alternative hypothesis.
- e-BH Procedure [44]: An analog of the classic Benjamini-Hochberg (BH) procedure for e-values. Given a set of e-values for the hypotheses, the e-BH procedure sorts them in descending order and rejects hypotheses based on a dynamic threshold to control FDR.
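To make the e-BH rule concrete, here is a minimal sketch (the function name and inputs are illustrative, not from the paper): it rejects the k hypotheses with the largest e-values, where k is the largest integer whose k-th largest e-value reaches G/(αk).

```python
import numpy as np

def e_bh(e_values, alpha):
    """Toy sketch of the e-BH procedure: reject the k hypotheses with the
    largest e-values, where k is the largest integer such that the k-th
    largest e-value is at least G / (alpha * k), with G hypotheses in total."""
    e = np.asarray(e_values, dtype=float)
    G = len(e)
    order = np.argsort(-e)                    # indices sorted by descending e-value
    sorted_e = e[order]
    thresholds = G / (alpha * np.arange(1, G + 1))
    ks = np.nonzero(sorted_e >= thresholds)[0]
    if ks.size == 0:
        return np.array([], dtype=int)        # no rejections
    k = ks.max() + 1                          # largest k satisfying the condition
    return np.sort(order[:k])                 # indices of rejected hypotheses

# usage at FDR level 0.1: rejects hypotheses 0 and 2 in this toy example
print(e_bh([60.0, 0.5, 30.0, 1.0, 0.2], alpha=0.1))
```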
Knockoff Filter [2, 10, 13]
The knockoff filter is a method for FDR-controlled variable selection, particularly useful in high-dimensional settings (where the number of features is greater than the number of samples ).
- Core Idea: It creates a synthetic "knockoff" feature (X~j) for each original feature (Xj). These knockoff features are designed to mimic the correlation structure of the original features but are conditionally independent of the response variable given the original features.
- Symmetry Property: For each feature Xj, a test statistic Wj is computed based on both Xj and X~j. The key property is that for null features (those truly unrelated to the response), the distribution of Wj is symmetric around zero. For relevant features, Wj tends to be large and positive.
- FDR Control: By comparing the Wj statistics of original features with their knockoff counterparts, the knockoff filter can estimate the number of false positives and control the FDR without assuming a specific model or distribution for the true effects.
- Model-X Knockoffs [10]: An extension where knockoff features are generated from the (known or estimated) distribution of the features, making the procedure valid for any conditional distribution of the response given the features.
- Group Knockoffs [13]: Extends the knockoff idea to groups of features, allowing for FDR control at the group level.
Multi-resolution Structures
This refers to the scenario where features can be organized into different hierarchical or overlapping groups based on domain knowledge. For example:
- Individual Features: The most granular level, e.g., individual SNPs.
- Groups/Layers: Higher levels of organization, e.g., SNPs grouped into genes, or genes grouped into pathways. A feature j belongs to a group h(m, j) at layer m. The paper aims to simultaneously control FDR at both the individual feature level and across all specified group levels.
Data Splitting
Data splitting is a statistical technique where the available dataset is randomly partitioned into two or more subsets.
- Purpose in FDR control: In methods like DS (Data Splitting) [15], data splitting can be used to generate independent estimates of feature importance (e.g., regression coefficients beta) or test statistics. By splitting the data, two independent sets of coefficients can be obtained. This independence is crucial for constructing test statistics with desired symmetry properties under the null hypothesis, which in turn facilitates FDR control.
3.2. Previous Works
The paper builds upon and differentiates itself from several established methods for FDR control, especially in multi-resolution settings.
- P-filter [5, 30]:
  - Concept: A method for multilayer FDR control based on p-values. It extends BH procedures to grouped hypotheses, allowing for FDR control at both individual feature and group levels. It can also incorporate prior knowledge [30].
  - Limitation (addressed by this paper): Constructing valid p-values in high-dimensional settings can be challenging. Handling dependencies by reshaping p-values can lead to reduced power. The p-filter is designed for p-values, whereas this paper leverages e-values.
- Multilayer Knockoff Filter (MKF) [23]:
  - Concept: Integrates the knockoff framework for FDR control at multiple resolutions. It uses knockoff statistics to identify important features and groups while controlling FDR across all layers.
  - Limitation (addressed by this paper): MKF can be conservative due to decoupling dependencies among cross-layer knockoff statistics. When features are highly correlated, knockoff-based procedures often suffer severe power loss. Also, model-X knockoffs (a common knockoff variant) require estimating complex joint distributions, which is often intractable. The paper's SFEFP aims to be more powerful and applicable in such correlated and high-dimensional settings.
- e-BH Procedure [44]:
  - Concept: An analog of the Benjamini-Hochberg procedure adapted for e-values. It provides FDR control for multiple hypothesis testing using e-values.
  - Relation to this paper: This paper extends the e-BH principle to a generalized e-filter for multi-resolution settings, building on the foundation of e-values.
- Derandomizing Knockoff Procedure [33] and e-MKF [18]:
  - Derandomizing Knockoffs [33]: Proposed to address the randomness inherent in knockoff procedures (e.g., from knockoff generation or data splitting). It involves running the knockoff procedure multiple times and averaging the results, often using one-bit knockoff e-values.
  - e-MKF [18]: Extends MKF by using e-values (specifically, one-bit knockoff e-values) in an e-filter framework to enhance power and guarantee multilayer FDR control.
  - Limitation (addressed by this paper): The paper reveals that e-MKF (and generally e-filters with one-bit e-values) can suffer from a "zero-power dilemma." If conflicting signals arise across layers, the method might select nothing. The stabilization treatment in SFEFP explicitly addresses this by generating non-binary generalized e-values that offer a more nuanced ranking of importance, preventing all-or-nothing outcomes. The paper states that derandomization for a single resolution (as in [33]) often comes with a power cost, while SFEFP aims for enhanced power in multi-resolution settings.
- DS (Data Splitting) and MDS (Multiple Data Splitting) [15]:
  - Concept: Feature detection methods designed for high-dimensional regression models, particularly powerful when features are highly correlated. They use data splitting to generate independent estimates and control FDR asymptotically.
  - Relation to this paper: The paper extends the DS method to group detection and integrates it into SFEFP as the eDS-filter, demonstrating the flexibility of SFEFP to incorporate powerful base methods. The eDS-filter is also presented as an alternative to MDS.
- Gaussian Mirror (GM) method [46] and Symmetry-based Adaptive Selection (SAS) framework [45]:
  - Concept: GM uses "Gaussian mirror" statistics for FDR control. SAS uses two-dimensional statistics and their symmetry properties to define rejection regions. Both are powerful alternatives to knockoff methods.
  - Relation to this paper: The paper explicitly shows that these methods satisfy Definition 1 (the framework for controlled detection procedures) and thus can be integrated into SFEFP by constructing corresponding generalized e-values.
3.3. Technological Evolution
The field of multiple hypothesis testing has evolved from strictly controlling Type I error (like FWER control methods) to controlling FDR (starting with the Benjamini-Hochberg procedure [7]), which offers a better balance between error control and statistical power in large-scale settings. This evolution continued with the development of more sophisticated FDR control methods that account for dependencies among hypotheses, prior knowledge, or specific data structures:
- Early FDR Control (1990s-early 2000s): The initial BH procedure, extensions for dependent p-values [8], and concepts like local FDR [17].
- Model-Based and Structure-Aware Methods (2010s): Knockoff filters [2, 10] emerged to provide FDR control for variable selection in complex, high-dimensional models without strong distributional assumptions, especially in Model-X settings. P-filters [5, 30] introduced FDR control for grouped and hierarchical hypotheses using p-values, and integrated prior knowledge. The Multilayer Knockoff Filter (MKF) [23] combined knockoffs with multi-resolution structures.
- Alternative Test Statistics & Flexibility (late 2010s-present):
  - The concept of e-values gained traction as a robust and easily aggregatable alternative to p-values, leading to e-BH [44] and e-filter methods [18].
  - Methods like DS [15], GM [46], and SAS [45] developed powerful FDR control procedures for specific challenges like high correlations or two-dimensional statistics.
  - Derandomization techniques [33, 31] were introduced to address the instability of stochastic FDR procedures.

This paper's work fits into the most recent phase of this evolution, aiming to unify these diverse advancements. Instead of proposing yet another specific FDR control method, it offers a meta-framework that can flexibly combine existing state-of-the-art methods tailored to different resolutions or data characteristics.
3.4. Differentiation Analysis
The SFEFP method significantly differentiates itself from previous work through its emphasis on flexibility, stabilization, and enhanced power in multi-resolution FDR control:
- Flexibility in Base Procedures:
  - Previous: MKF and e-MKF are tied to the knockoff framework. The p-filter is tied to p-values. These methods generally require a uniform approach across resolutions.
  - SFEFP: The core innovation is the generalized e-value and generalized e-filter. This allows SFEFP to directly integrate any FDR-controlled detection procedure (e.g., knockoffs, DS, GM, SAS, or p-value based methods) as a base procedure for any given layer. This means researchers can select the optimal method for each specific resolution based on its characteristics (e.g., high correlation, sparsity, prior knowledge).
- Addressing the "One-bit" Dilemma and Enhanced Power:
  - Previous: e-MKF (and generally e-filters with one-bit e-values) can suffer from a "zero-power dilemma." If conflicting signals lead to all-or-nothing e-values at different layers, the final filter might select no features. Also, derandomization for single resolutions (e.g., Ren et al. [33]) often trades power for stability.
  - SFEFP: The stabilization treatment is crucial. By averaging generalized e-values from multiple runs (replications), SFEFP generates non-binary generalized e-values. These averaged e-values provide a more nuanced ranking of feature/group importance, better reconciling discrepancies across layers. This leads to significantly enhanced detection power and stability, avoiding the "zero-power" issue and improving upon the power trade-offs of prior derandomization schemes.
- Generality of E-value Construction:
  - Previous: Some e-value constructions (e.g., [1, 22, 25, 31]) are either less powerful, rely on specific assumptions (like mutual independence of null p-values), or are tied to the knockoff framework.
  - SFEFP: The proposed construction of generalized e-values (Equation 3) is more generic and powerful. It can be derived from any detection procedure that satisfies Definition 1 (i.e., controls FDR under finite-sample or asymptotic settings). This includes methods like DS, GM, and SAS, which go beyond the scope of prior e-value constructions.
- Multilayer FDR Control with Flexible Procedures:
  - Previous: While MKF and e-MKF offer multilayer FDR control, their conservatism and limitations with high correlations or intractable joint distributions make them less ideal in some scenarios.
  - SFEFP: Provides robust multilayer FDR control while allowing for the incorporation of highly powerful, specialized base procedures at each layer. This means SFEFP can achieve FDR control without sacrificing power in complex settings. For instance, the eDS-filter demonstrates superior performance in highly correlated feature settings compared to MKF-based methods.

In essence, SFEFP shifts the paradigm from finding a single "best" FDR control method for all resolutions to providing a flexible framework that effectively integrates the strengths of various specialized methods, thereby achieving superior and more stable detection performance across diverse multi-resolution data.
4. Methodology
4.1. Principles
The core idea behind the Stabilized Flexible E-Filter Procedure (SFEFP) is to provide a flexible and powerful framework for feature detection with simultaneous FDR control across multiple resolutions. The foundational principles are:
- Multi-Resolution Problem Formulation: Explicitly defining individual features and feature groups at multiple "layers" as hypotheses to be tested, with the goal of controlling FDR at each layer.
- Generalized E-Values for Any Detection Procedure: Instead of relying on p-values or specific knockoff statistics, the method introduces a unified way to construct "generalized e-values" from any base detection procedure that controls FDR. This allows for flexibility in choosing state-of-the-art methods tailored to specific data characteristics or resolutions.
- Generalized E-Filter for Multi-Layer Selection: A novel e-filter that operates on these generalized e-values to identify a coherent set of significant features and groups across all specified layers, while ensuring FDR control at each.
- Stabilization to Enhance Power and Resolve the "One-bit" Dilemma: A crucial step involves running the base detection procedures multiple times (replications) and averaging the resulting generalized e-values. This "stabilization" transforms potentially "one-bit" (binary) e-values into continuous, nuanced scores, providing better ranking information. This helps overcome the "zero-power dilemma" (where conflicting binary signals across layers lead to no discoveries) and generally enhances detection power and stability.
- Theoretical Guarantees: Rigorous statistical theory underpins SFEFP, demonstrating its ability to control FDR at multiple resolutions and guaranteeing stability of its selection set.
4.2. Core Methodology In-depth (Layer by Layer)
The methodology is developed in stages, starting with problem setup, introducing the Flexible E-Filter Procedure (FEFP), and then extending it to the Stabilized Flexible E-Filter Procedure (SFEFP).
4.2.1. Problem Setup
The paper considers a response variable $Y$ and a set of $N$ features $X_1, \ldots, X_N$. For $n$ i.i.d. samples, we have data $(\boldsymbol{X}, \boldsymbol{y})$, where $\boldsymbol{y}$ is the response vector and $\boldsymbol{X}$ is the design matrix.
Individual Feature Hypotheses: The feature detection problem is formalized as multiple hypothesis tests, one per feature $j \in [N]$. The null hypothesis $H_{0,j}$ states that $X_j$ provides no additional information about $Y$ given all other features. If $H_{0,j}$ is false, $X_j$ is a relevant feature.
- $\mathcal{H}_1 = \{j : H_{0,j} \text{ is false}\}$ is the set of relevant features.
- $\mathcal{H}_0 = \{j : H_{0,j} \text{ is true}\}$ is the set of irrelevant features.
Group Hypotheses at Multiple Resolutions: The features are interpreted at different resolutions. For each layer $m \in [M]$, the features are partitioned into $G^{(m)}$ groups, denoted by $\{A_g^{(m)}\}_{g=1}^{G^{(m)}}$.
- h(m, j) represents the group that feature j belongs to at the m-th layer.
- Group detection at layer m tests, for each group g, the null hypothesis $H_{0,g}^{(m)}$ that the entire group of features $A_g^{(m)}$ is irrelevant given all other features outside this group.
- The relationship between individual and group hypotheses is assumed to be $H_{0,g}^{(m)} = \bigcap_{j \in A_g^{(m)}} H_{0,j}$: a group null hypothesis is true if and only if all individual feature null hypotheses within that group are true.
- The set of null groups at layer m is $\mathcal{H}_0^{(m)} = \{g : A_g^{(m)} \subseteq \mathcal{H}_0\}$: a group is a null group if all features within it are irrelevant.
False Discovery Rate (FDR) at Multiple Resolutions: Given a selected feature set $\mathcal{S}$, the set of selected groups at layer m is $\mathcal{S}^{(m)} = \{g : A_g^{(m)} \cap \mathcal{S} \neq \emptyset\}$.
The FDR at the -th layer is defined as:
$
\mathrm{FDR}^{(m)} = \mathbb{E}[\mathrm{FDP}^{(m)}], \quad \mathrm{FDP}^{(m)} = \frac{\Big| \mathcal{S}^{(m)} \cap \mathcal{H}_0^{(m)} \Big|}{\big| \mathcal{S}^{(m)} \big| \vee 1},
$
where $|\cdot|$ measures the size of a set. The goal is to find the largest set $\mathcal{S}$ such that $\mathrm{FDR}^{(m)}$ remains below a predefined level $\alpha^{(m)}$ for all $m \in [M]$ simultaneously.
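To make the multi-resolution bookkeeping concrete, the following toy sketch (the helper name and the two-layer example are mine, not the paper's) computes the layer-wise FDP of a feature selection under a given partition.

```python
def layer_fdp(selected_features, partition, null_features):
    """Toy sketch of the layer-wise FDP: a group counts as selected if it
    contains at least one selected feature, and as null if all of its
    features are null."""
    selected_groups = {g for g, members in partition.items()
                       if set(members) & set(selected_features)}
    null_groups = {g for g, members in partition.items()
                   if set(members) <= set(null_features)}
    false = len(selected_groups & null_groups)
    return false / max(len(selected_groups), 1)

# two-layer toy example: layer 1 = singletons, layer 2 = groups of two features
layer1 = {j: [j] for j in range(4)}
layer2 = {0: [0, 1], 1: [2, 3]}
print(layer_fdp([0, 2], layer1, null_features=[2, 3]),   # individual-level FDP = 1/2
      layer_fdp([0, 2], layer2, null_features=[2, 3]))   # group-level FDP = 1/2
```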
4.2.2. Recap: Multiple Testing with e-values
An e-variable is a non-negative random variable $E$ with $\mathbb{E}[E] \leq 1$ under the null. An e-value is a realization of an e-variable.
For a significance level $\alpha$, a null hypothesis is rejected if $e \geq 1/\alpha$.
- e-BH procedure [44]: For multiple tests with e-values $e_1, \ldots, e_G$, the e-BH procedure ranks the e-values and rejects the hypotheses whose e-values are among the $\widehat{k}$ largest, where $\widehat{k}$ is the largest $k$ such that the $k$-th largest e-value is at least $G/(\alpha k)$.
- Relaxed e-values: e-values are called relaxed e-values if $\sum_{g=1}^{G} \mathbb{E}[e_g] \leq G$, which is sufficient for FDR control of e-BH.
4.2.3. FEFP: Flexible E-Filter Procedure
Framework for Controlled Detection Procedures (Definition 1): A feature (or group) detection procedure determines a rejection threshold by: $ t_{\alpha} = \operatorname*{arg\,max}_t R(t), \quad \mathrm{subject~to~} \widehat{\mathrm{FDP}}(t) = \frac{\widehat{V}(t) \vee \alpha}{R(t) \vee 1} \leq \alpha, $ where:
- $R(t)$ is the number of rejections (selected features/groups) under threshold t.
- $\widehat{V}(t)$ is the estimated number of false rejections (false positives) under threshold t.
- $\widehat{\mathrm{FDP}}(t)$ is the estimated false discovery proportion.
- $\vee$ denotes the maximum operation (e.g., $A \vee B = \max(A, B)$). The $\alpha$ in the numerator is a technical detail to avoid division by zero or very small numbers, ensuring conservative FDR control.

A procedure belongs to one of two classes:
FDRunder finite samples, ensuring . -
if it controls
FDRasymptotically as at a proper rate, ensuring .The flexibility of
FEFPcomes from selecting for each layer from either or .
Proposition 1: States that the rejection set of remains unchanged if the FDP constraint is slightly modified to . This means the term in the numerator of Definition 1 is primarily for theoretical FDR control and doesn't change the set of rejected hypotheses in practice.
Construction of Generalized e-values (Definition 2):
- Relaxed e-values: For hypotheses with non-negative test statistics $e_1, \ldots, e_G$, they are relaxed e-values if $\sum_{g=1}^{G} \mathbb{E}[e_g] \leq G$.
- Asymptotic e-values: the e-value condition $\mathbb{E}[e_g] \leq 1$ holds asymptotically as the sample size grows.
- Asymptotic relaxed e-values: the relaxed condition holds asymptotically.

Asymptotic e-values, relaxed e-values, and asymptotic relaxed e-values are collectively called generalized e-values.
Based on a base detection procedure for layer m, the generalized e-values for the group hypotheses are constructed as follows:
- For each layer m, execute the base detection procedure with an original FDR level $\alpha_0^{(m)}$. This yields a selected set $\mathcal{G}^{(m)}(\alpha_0^{(m)})$ and an estimated number of false discoveries $\widehat{V}_{\mathcal{G}^{(m)}(\alpha_0^{(m)})}$.
- Transform these results into generalized e-values: $ e_g^{(m)} = G^{(m)} \cdot \frac{\mathbb{I}\left\{g \in \mathcal{G}^{(m)}(\alpha_0^{(m)})\right\}}{\widehat{V}_{\mathcal{G}^{(m)}\left(\alpha_0^{(m)}\right)} \vee \alpha_0^{(m)}}. $
  - $e_g^{(m)}$: The generalized e-value for group g at layer m.
  - $G^{(m)}$: The total number of groups at layer m.
  - $\mathbb{I}\{\cdot\}$: The indicator function, which is 1 if the condition is true (group g is selected by the base procedure at level $\alpha_0^{(m)}$), and 0 otherwise.
  - $\widehat{V}_{\mathcal{G}^{(m)}(\alpha_0^{(m)})}$: The estimated number of false discoveries by the base procedure at the original FDR level.
  - $\alpha_0^{(m)}$: The original FDR level used by the base detection procedure.
  - $\vee \alpha_0^{(m)}$: Ensures the denominator is never less than $\alpha_0^{(m)}$, preventing e-values from becoming excessively large. This formula essentially assigns a positive e-value to selected groups and 0 to non-selected groups, with magnitude depending inversely on the estimated number of false discoveries.
Theorem 1: This theorem states that the selection set obtained from a detection procedure is precisely the set that would be selected by the generalized e-BH procedure if its input were the generalized e-values constructed by Equation (3) at the same level . This establishes the equivalence and validity of the generalized e-value construction.
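To make the construction in Equation (3) concrete, here is a toy sketch; the function and argument names are illustrative and assume the base procedure's selected groups and false-discovery estimate are already available.

```python
import numpy as np

def generalized_e_values(selected, v_hat, n_groups, alpha0):
    """Minimal sketch of Equation (3): turn the output of any FDR-controlled
    base procedure (selected group indices and its estimated number of false
    discoveries) into generalized e-values for one layer."""
    e = np.zeros(n_groups)
    denom = max(v_hat, alpha0)            # \hat{V} \vee alpha_0 guards against tiny denominators
    e[list(selected)] = n_groups / denom  # selected groups get a positive e-value, others stay 0
    return e

# usage: a base procedure selected groups {0, 3, 7} out of 10 and estimated 1 false discovery
print(generalized_e_values({0, 3, 7}, v_hat=1.0, n_groups=10, alpha0=0.5))
```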
Remark 1 (Comparison with existing constructions): The paper highlights that its generalized e-values (Equation 3) are generally more powerful than compound e-values [1, 22] (which use in the denominator instead of ). Also, this construction is more generic than knockoff e-values [31] and e-values relying on independence assumptions [25], as it can be derived from any FDR-controlled procedure satisfying Definition 1.
Leveraging the Generalized e-filter:
The generalized e-filter uses the constructed generalized e-values to identify the final selected features.
- Candidate Selection Set: To ensure consistency across layers (a feature is selected only if all groups containing it are rejected), the candidate selection set is defined for given thresholds as:
$
S(t^{(1)}, \ldots, t^{(M)}) = \left\{j : \mathrm{for~all~} m \in [M], \ e_{h(m,j)}^{(m)} \geq t^{(m)}\right\}.
$
  - This means a feature j is considered for selection only if its e-value for group h(m, j) at layer m is above the threshold $t^{(m)}$ for all layers $m \in [M]$.
- Estimated False Discovery Proportion: For the purpose of setting thresholds, the estimated FDP at layer m is approximated as: $ \widehat{\mathrm{FDP}}^{(m)}(t^{(1)}, \ldots, t^{(M)}) = \frac{G^{(m)} / t^{(m)}}{\left|S^{(m)}(t^{(1)}, \ldots, t^{(M)})\right|}. $
  - The numerator $G^{(m)} / t^{(m)}$ approximates (an upper bound on) the number of null groups passing the threshold; this is based on Markov's inequality and the e-value property.
  - The denominator is the number of selected groups at layer m using the consistent feature selection set $S(t^{(1)}, \ldots, t^{(M)})$.
- Admissible Thresholds: The set of admissible thresholds contains all threshold vectors such that for all layers .
- Final Thresholds: For each layer , the final threshold is chosen to maximize detections while staying within the admissible set:
$
\widehat{t}^{(m)} = \operatorname*{min}\left\{t^{(m)} : (t^{(1)}, \ldots, t^{(M)}) \in \mathcal{T}(\alpha^{(1)}, \ldots, \alpha^{(M)})\right\}.
$
This is an iterative process, usually starting with high thresholds and gradually lowering them to maximize discoveries while maintaining
FDR control.
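Below is a minimal sketch of this threshold search in the spirit of the generalized e-filter. It is my own simplification (it searches over the observed e-value cutoffs of each layer and tightens any layer whose estimated FDP exceeds its target), not the authors' implementation; all names are illustrative.

```python
import numpy as np

def generalized_e_filter(e_layers, group_of, alphas):
    """Sketch of a generalized e-filter threshold search (simplified).
    e_layers[m]     : array of generalized e-values for the G^(m) groups at layer m
    group_of[m][j]  : group index h(m, j) of feature j at layer m
    alphas[m]       : target FDR level alpha^(m) for layer m
    Returns the list of selected feature indices."""
    M, n_feat = len(e_layers), len(group_of[0])
    cand = [np.sort(np.unique(e[e > 0])) for e in e_layers]  # candidate cutoffs, ascending
    ptr = [0] * M                                            # start at the most liberal cutoff

    def thresholds():
        return [cand[m][ptr[m]] if ptr[m] < len(cand[m]) else np.inf for m in range(M)]

    def selected_features(t):
        return [j for j in range(n_feat)
                if all(e_layers[m][group_of[m][j]] >= t[m] for m in range(M))]

    while True:
        t = thresholds()
        feats = selected_features(t)
        tightened = False
        for m in range(M):
            sel_groups = {group_of[m][j] for j in feats}
            # estimated FDP at layer m: (G^(m)/t^(m)) / (# selected groups at layer m)
            fdp_hat = (len(e_layers[m]) / t[m]) / max(len(sel_groups), 1)
            if fdp_hat > alphas[m] and ptr[m] <= len(cand[m]) - 1:
                ptr[m] += 1            # tighten layer m by one step
                tightened = True
        if not tightened:
            return feats
```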
Algorithm 1 (FEFP: a flexible e-filter procedure for feature detection) formalizes the steps:
- Input: Data, target FDR levels, original FDR levels, and the group partitions for each layer.
- For each layer m:
  - Compute the generalized e-values using Equation (3). This involves running the base detection procedure at level $\alpha_0^{(m)}$ to get selected groups and estimated false discoveries.
- Iterative Threshold Determination (Generalized e-filter):
  - Initialize the thresholds for all layers.
  - Repeat:
    - For each layer m: update the threshold by finding the minimum value such that the FDP constraint is met. This is an implicit update, since the constraint depends on all thresholds.
  - Until all threshold values are unchanged.
- Final Selection:
  - Compute the final selection set using the converged thresholds and Equation (4).
  - If it is non-empty, output it. Otherwise, output an empty set.
Proposition 2: Confirms that the output threshold vector from Algorithm 2 (the generalized e-filter) matches the one defined by Equation (5). This ensures the algorithmic implementation correctly finds the desired
FDR-controlled thresholds.
-
Multilayer FDR Control and One-bit Property of FEFP:
- Lemma 1: Provides theoretical bounds on the layer-wise FDR of the generalized e-filter, with separate bounds for e-values, relaxed e-values, and their asymptotic counterparts. This establishes the fundamental FDR control property of the generalized e-filter.
- Theorem 2: Guarantees that FEFP controls the FDR simultaneously for all layers. This holds in finite-sample settings if the base procedures belong to the finite-sample class, and asymptotically if they belong to the asymptotic class and group sizes are uniformly bounded. This is a crucial theoretical result validating FEFP's FDR control.
- Theorem 3 (One-bit Property): This theorem reveals a critical limitation of FEFP. The generalized e-values from Equation (3) are "one-bit" or binary: a group either gets a positive e-value (if selected by the base procedure) or zero (if not selected). Define the initial candidate set
$
S_{\mathrm{init}}^{(m)} = \left\{g \in [G^{(m)}] : A_g^{(m)} \cap \left[\bigcap_{l=1}^{M} \left\{j : e_{h(l,j)}^{(l)} > 0\right\}\right] \neq \emptyset\right\}.
$
FEFP selects a specific set of features if and only if a certain condition involving the estimated number of false discoveries and the size of $S_{\mathrm{init}}^{(m)}$ is met for all layers. If this condition is not met (i.e., if there is sufficient conflict or incompatibility between the initial layer-specific discoveries), then FEFP selects no features, leading to zero power. This "one-bit" nature means all selected groups are treated as equally important at each layer, making it difficult to reconcile conflicts between layers and potentially leading to this zero-power dilemma.
4.2.4. SFEFP: Stabilized Flexible E-Filter Procedure
To address the zero-power dilemma and instability of FEFP, SFEFP introduces a stabilization treatment. It aims to generate non-one-bit generalized e-values that provide better ranking information.
Algorithm 3: SFEFP: a stabilized flexible e-filter procedure for feature detection
- Input: Same as FEFP, plus the number of replications R.
- For each replication r = 1, ..., R:
  - Compute the generalized e-value for each group at each layer for this specific run r. This is done using Equation (6), which is essentially Equation (3) but subscripted with r to denote the replication: $ e_{gr}^{(m)} = G^{(m)} \cdot \frac{\mathbb{I}\left\{g \in \mathcal{G}_r^{(m)}(\alpha_0^{(m)})\right\}}{\widehat{V}_{\mathcal{G}_r^{(m)}\left(\alpha_0^{(m)}\right)} \vee \alpha_0^{(m)}}. $
  - Here, the base detection procedure for replication r (which may have inherent randomness, or change if different deterministic procedures are fused) produces the selected set $\mathcal{G}_r^{(m)}(\alpha_0^{(m)})$ and the false-discovery estimate $\widehat{V}_{\mathcal{G}_r^{(m)}(\alpha_0^{(m)})}$.
- Compute Averaged Generalized e-values: Aggregate the generalized e-values from all replications by calculating a weighted average: $ \overline{e}_g^{(m)} = \sum_{r=1}^{R} \omega_r^{(m)} e_{gr}^{(m)}, \quad \sum_{r=1}^{R} \omega_r^{(m)} = 1. $
  - $\overline{e}_g^{(m)}$: The averaged generalized e-value for group g at layer m.
  - $\omega_r^{(m)}$: The weight for replication r at layer m; the paper suggests equal weights $\omega_r^{(m)} = 1/R$ for simplicity.
  - This averaging step is crucial: it transforms the "one-bit" (binary) e-values into continuous values that reflect how consistently a group was selected across replications, providing a more refined measure of importance.
- Apply Generalized e-filter: Use the averaged generalized e-values as input to the generalized e-filter (Algorithm 2), along with the target FDR levels, to obtain the final set of selected features.
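Putting the pieces together, a compact sketch of the SFEFP loop could look as follows. It assumes the generalized_e_values and generalized_e_filter helpers sketched earlier and a user-supplied run_base_procedure(m, r) callback returning (selected, v_hat, G_m) for layer m on replication r; all of these are illustrative assumptions, not the paper's interface.

```python
import numpy as np

def sfefp(run_base_procedure, group_of, alphas, alpha0, n_reps=50):
    """Minimal sketch of SFEFP: run the base procedures repeatedly, average the
    per-replication generalized e-values with equal weights (omega_r = 1/R),
    and feed the averaged e-values into the generalized e-filter."""
    M = len(alphas)
    avg_e = []
    for m in range(M):
        e_runs = []
        for r in range(n_reps):
            selected, v_hat, G_m = run_base_procedure(m, r)  # one run of the layer-m base method
            e_runs.append(generalized_e_values(selected, v_hat, G_m, alpha0[m]))
        avg_e.append(np.mean(e_runs, axis=0))                # stabilization by averaging
    return generalized_e_filter(avg_e, group_of, alphas)
```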
Two settings for SFEFP:
- Setting 1 (Inherent randomness): If the base procedure has inherent randomness (e.g., model-X knockoffs), different runs will produce different e-values; averaging these stabilizes the result.
- Setting 2 (No randomness): If the base procedure is deterministic, multiple procedures (e.g., using different random seeds for data generation, or different types of base procedures) are run, and their one-bit generalized e-values are averaged. This is termed a "fusion decision."
Multilayer FDR Control and Stability Guarantee:
- Theorem 4: SFEFP simultaneously controls the FDR for all layers. This holds in finite samples if the base procedures are in the finite-sample class, and asymptotically if they are in the asymptotic class (and group sizes are uniformly bounded). This extends the FDR control guarantee of FEFP to the stabilized version.
- Theorem 5 (Stability): This theorem guarantees that as the number of replications R grows, the selected set obtained by SFEFP almost surely converges to a fixed set (the selection set obtained using the true expected e-values). The probability of agreement is bounded below by an exponential term involving R and a "gap" quantity. This ensures that with enough replications, the output of SFEFP becomes stable and consistent.
Choices of Parameters:
- Original FDR level $\alpha_0^{(m)}$: This parameter influences the magnitude and number of non-zero generalized e-values.
  - For a single resolution (M = 1), a particular choice of $\alpha_0$ is suggested as optimal.
  - For multiple resolutions (M > 1), that single-resolution choice is generally not optimal because of the "one-bit" nature. Instead, SFEFP benefits from choosing $\alpha_0^{(m)}$ to maximize the number and magnitude of non-zero e-values at each layer, facilitating coordination across layers; the paper gives a common practical default.
  - A smaller $\alpha_0^{(m)}$ can lead to fewer non-zero generalized e-values but with larger magnitudes, while a larger $\alpha_0^{(m)}$ might result in more non-zero e-values but with smaller magnitudes due to inflated false discovery estimates. The optimal choice is a balance.
4.2.5. Examples for SFEFP
The paper provides concrete instantiations of SFEFP by specifying the base detection procedures.
eDS-filter: multilayer FDR control by data splitting (Section 4.1)
The eDS-filter uses the Data Splitting (DS) procedure [15] as its base detection method, extended for group detection. DS is particularly powerful for high-dimensional regression with highly correlated features.
- Review of DS method [15]:
  - Uses Lasso+OLS to estimate the regression coefficients.
  - Splits the data into two subsets to get independent coefficient estimates $\widehat{\beta}^{(1)}$ and $\widehat{\beta}^{(2)}$.
  - Assumption 1 (Symmetry): For null features $j \in \mathcal{H}_0$, the sampling distribution of $\widehat{\beta}_j^{(1)}$ or $\widehat{\beta}_j^{(2)}$ is symmetric about zero.
  - The test statistic is constructed as:
$
W_j = \mathrm{sign}(\widehat{\beta}_j^{(1)} \widehat{\beta}_j^{(2)}) f(|\widehat{\beta}_j^{(1)}|, |\widehat{\beta}_j^{(2)}|),
$
where f(u, v) is non-negative, exchangeable (f(u, v) = f(v, u)), and monotonically increasing (e.g., $f(u, v) = u + v$). A larger positive $W_j$ indicates a more relevant feature. FDR control is achieved by comparing positive to negative statistics using the threshold $ t_{\alpha} = \operatorname*{min}\left\{t > 0 : \widehat{\mathrm{FDP}}(t) = \frac{\#\{j : W_j < -t\}}{\#\{j : W_j > t\} \vee 1} \leq \alpha\right\}. $ (A simplified illustrative sketch of these statistics appears after the eDS-filter description below.)
- Data splitting for group detection:
  - For a group g at layer m, a group-level test statistic is constructed by averaging the individual feature statistics within that group: $ T_g^{(m)} = \frac{1}{\vert \mathcal{A}_g^{(m)} \vert} \sum_{j \in \mathcal{A}_g^{(m)}} W_j, \quad g \in [G^{(m)}]. $
  - Lemma 2: Under Assumption 1, the group statistic of a null group is symmetric about zero. This symmetry allows for FDR estimation.
  - The estimated FDP for groups is: $ \widehat{\mathrm{FDP}}^{(m)}(t) = \frac{\#\left\{g : T_g^{(m)} < -t\right\}}{\#\left\{g : T_g^{(m)} > t\right\} \vee 1}. $
  - Assumption 2 (Weak dependence): A technical condition on the covariance of indicator functions for null group test statistics, ensuring asymptotic FDR control.
  - Theorem 6: Under Assumption 1 and Assumption 2, the group DS procedure controls FDR asymptotically.
- eDS-filter construction:
  - At each layer m, the DS procedure is run for R replications.
  - For each run r, the group-level test statistics $T_{gr}^{(m)}$ are computed.
  - The threshold is computed for each run r and layer m: $ t_{\alpha_0^{(m)}}^r = \operatorname*{inf}\left\{t > 0 : \frac{\#\left\{g : T_{gr}^{(m)} < -t\right\}}{\#\left\{g : T_{gr}^{(m)} > t\right\} \vee 1} \leq \alpha_0^{(m)}\right\}. $
  - The DS generalized e-values are then constructed: $ e_{gr}^{(m)} = G^{(m)} \cdot \frac{\mathbb{I}\left\{T_{gr}^{(m)} \geq t_{\alpha_0^{(m)}}^r\right\}}{\#\left\{g : T_{gr}^{(m)} \leq -t_{\alpha_0^{(m)}}^r\right\} \vee \alpha_0^{(m)}}, \quad g \in [G^{(m)}]. $
  - These are then averaged (Equation 7) and fed into the generalized e-filter.
- Theorem 7: Guarantees that the eDS-filter controls the multilayer FDR asymptotically under certain conditions, confirming that its generalized e-values are asymptotic relaxed e-values.
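For intuition, here is a simplified, illustrative sketch of DS-style mirror statistics, the group-averaged statistic, and the symmetric threshold used above. It fits a cross-validated Lasso on each half rather than the paper's exact Lasso+OLS recipe, so it is only a rough stand-in for the DS procedure.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def ds_statistics(X, y, rng):
    """Illustrative DS-style mirror statistics (not the authors' code): split
    the data, fit a sparse regression on each half, and combine the two
    independent coefficient estimates into signed statistics W_j."""
    n, _ = X.shape
    idx = rng.permutation(n)
    halves = (idx[: n // 2], idx[n // 2:])
    b1, b2 = (LassoCV(cv=5).fit(X[rows], y[rows]).coef_ for rows in halves)
    # W_j = sign(b1_j * b2_j) * f(|b1_j|, |b2_j|), with f(u, v) = u + v
    return np.sign(b1 * b2) * (np.abs(b1) + np.abs(b2))

def group_statistics(W, groups):
    """T_g = average of W_j over the features in group g (Equation 10)."""
    return np.array([np.mean(W[idx]) for idx in groups])

def ds_threshold(W, alpha):
    """Smallest t with #{W_j < -t} / max(#{W_j > t}, 1) <= alpha."""
    for t in np.sort(np.abs(W[W != 0])):
        if np.sum(W < -t) / max(np.sum(W > t), 1) <= alpha:
            return t
    return np.inf
```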
eDS+gKF-filter (Section 4.2)
This is a hybrid SFEFP variant designed for settings where features within a group are highly correlated (benefiting DS), but signals within groups might be sparse (making group DS less powerful) and group knockoffs might be preferable.
- Layer 1 (individual features): Uses DS (specifically Lasso+OLS based DS) to generate generalized e-values for individual features.
- Subsequent Layers (groups): Uses the group knockoff filter [13] to generate generalized e-values for groups. Group knockoff statistics are constructed (e.g., using fixed-design group knockoffs when the sample size permits, or other model-X variants). Group knockoffs satisfy a martingale property:
$
\mathbb{E}\left[\frac{\#\left\{g \in \mathcal{H}_0^{(m)} : T_{gr}^{(m)} \geq t_{\alpha_0^{(m)}}^r\right\}}{1 + \#\left\{g \in [G^{(m)}] : T_{gr}^{(m)} \leq -t_{\alpha_0^{(m)}}^r\right\}}\right] \leq 1,
$
where the threshold is determined by:
$
t_{\alpha_0^{(m)}}^r = \operatorname*{inf}\left\{t > 0 : \frac{1 + \#\left\{g : T_{gr}^{(m)} < -t\right\}}{\#\left\{g : T_{gr}^{(m)} > t\right\} \vee 1} \leq \alpha_0^{(m)}\right\}.
$
  - Based on this, group knockoff generalized e-values are constructed (see the sketch after this list): $ e_{gr}^{(m)} = G^{(m)} \cdot \frac{\mathbb{I}\left\{T_{gr}^{(m)} \geq t_{\alpha_0^{(m)}}^r\right\}}{1 + \#\left\{g : T_{gr}^{(m)} \leq -t_{\alpha_0^{(m)}}^r\right\}}. $
- Finally, these generalized e-values (from DS for layer 1 and group knockoffs for other layers) are averaged and passed to the generalized e-filter.
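A toy sketch of the group-knockoff generalized e-values used at the group layers, assuming the group knockoff statistics $T_g$ have already been computed (names are illustrative):

```python
import numpy as np

def group_knockoff_e_values(T, alpha0):
    """Sketch of the group-knockoff generalized e-values: threshold the
    knockoff statistics at the knockoff+ style threshold and use the
    (1 + #negatives) denominator."""
    T = np.asarray(T, dtype=float)
    G = len(T)
    t_hat = np.inf
    for t in np.sort(np.abs(T[T != 0])):           # candidate thresholds
        if (1 + np.sum(T < -t)) / max(np.sum(T > t), 1) <= alpha0:
            t_hat = t
            break
    denom = 1 + np.sum(T <= -t_hat)
    return G * (T >= t_hat).astype(float) / denom
```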
Other applications and possible extensions of SFEFP (Section 4.3)
- e-MKF (stable and powerful version): The paper suggests that applying SFEFP to knockoff procedures (where e-MKF is an instantiation of FEFP with knockoff e-values) creates a stable and more powerful e-MKF, addressing the zero-power dilemma of the original e-MKF.
- GM e-values and SAS e-values: The supplementary material demonstrates how the GM [46] and SAS [45] methods can also be used as base procedures to generate asymptotic relaxed e-values for SFEFP.
- Extensions to time series data: Mentions TSKI (Time Series Knockoffs Inference) [11] and suggests SFEFP could be modified for time series by adapting to subsampling settings and constructing robust e-values for group knockoff filters.
4.2.6. Unified E-Filter (Appendix E)
The paper also presents a "unified e-filter" (Algorithm 4) as an extension to incorporate prior knowledge (penalties and priors) and handle overlapping groups and null-proportion adaptivity, similar to the p-filter [30]. This provides greater flexibility for domain experts.
- Overlapping groups: Allows a feature to belong to multiple groups at a given layer m. $g^{(m)}(i)$ is the index set of groups containing feature i.
- Leftover features: $L^{(m)}$ is the set of features not belonging to any group in partition m.
- Penalties and priors: $u_g^{(m)}$ are penalties (e.g., the cost of a false discovery for group g), and $v_g^{(m)}$ are priors (e.g., a prior probability attached to group g). These are suitably normalized.
- Null-proportion adaptivity: A weighted null proportion estimator (Equation 15) can be used to enhance power when e-values are independent at a layer (a toy sketch of this estimator appears after this list, following Theorem 10): $ \widehat{\pi}^{(m)} := \frac{|\boldsymbol{u}^{(m)} \cdot \boldsymbol{v}^{(m)}|_\infty + \sum_g u_g^{(m)} v_g^{(m)} \mathbf{1}\Big\{e_g^{(m)} < 1/\lambda^{(m)}\Big\}}{G^{(m)} \big(1 - \lambda^{(m)}\big)}. $
  - $|\boldsymbol{u}^{(m)} \cdot \boldsymbol{v}^{(m)}|_\infty$: The maximum element-wise product of the penalty and prior vectors.
  - $\lambda^{(m)}$: A user-defined constant for adaptivity.
  - $\mathbf{1}\{\cdot\}$: Indicator function.
  - This estimator refines the FDP calculation by adapting to the estimated proportion of true null hypotheses based on the observed e-values.
- Penalty-weighted FDR control: The goal is to control $\mathrm{FDR}_u^{(m)} = \mathbb{E}[\mathrm{FDP}_u^{(m)}]$, where
$
\mathrm{FDP}_u^{(m)} = \frac{\sum_{g \in \mathcal{H}_0^{(m)}} u_g^{(m)} \mathbf{1}\big\{g \in \mathcal{S}^{(m)}\big\}}{\sum_{g \in [G^{(m)}]} u_g^{(m)} \mathbf{1}\big\{g \in \mathcal{S}^{(m)}\big\}}.
$
This FDP definition incorporates the penalties $u_g^{(m)}$.
- Candidate selection set for individuals (Equation 16):
$
\begin{array}{rcl}
\mathcal{S}(\vec{k}) & = & \Big\{i : \forall m, \ \mathrm{either~} i \in L^{(m)}, \ \mathrm{or~} \exists g \in g^{(m)}(i), \\
& & \qquad e_g^{(m)} \geq \operatorname*{max}\Big\{\frac{\mathbf{1}\{\widehat{k}^{(m)} \neq 0\} \, \widehat{\pi}^{(m)} G^{(m)}}{v_g^{(m)} \alpha^{(m)} \widehat{k}^{(m)}}, \ \mathbf{1}\{\widehat{k}^{(m)} = 0\} \cdot \infty, \ \frac{1}{\lambda^{(m)}}\Big\}\Big\}
\end{array}
$
  - $\widehat{k}^{(m)}$ are rejection count thresholds.
  - This ensures internal consistency: an individual feature is selected only if, for every layer m, either it is a leftover feature, or at least one of its containing groups has an e-value exceeding a dynamically calculated threshold.
- Candidate selection set for groups (Equation 17):
$
\mathcal{S}^{(m)}(\vec{k}) = \left\{g \in [G^{(m)}] : A_g^{(m)} \cap \mathcal{S}(\vec{k}) \neq \emptyset \mathrm{~and~} e_g^{(m)} \geq \operatorname*{max}\left\{\frac{\mathbf{1}\{\widehat{k}^{(m)} \neq 0\} \, \widehat{\pi}^{(m)} G^{(m)}}{v_g^{(m)} \alpha^{(m)} \widehat{k}^{(m)}}, \ \mathbf{1}\{\widehat{k}^{(m)} = 0\} \cdot \infty, \ \frac{1}{\lambda^{(m)}}\right\}\right\}.
$
  - This defines the groups selected at layer m based on the selected individual features and their e-values, relative to thresholds that incorporate penalties, priors, and null proportion estimates.
- Algorithm 4: The unified e-filter
- Initializes and (using Equation 15).
- Repeats iteratively: For each layer , updates by finding the maximum such that the sum of penalties of selected groups for that layer is at least . The selection of groups depends on the current vector (and thus other layers' decisions).
- Outputs the final threshold vector and the corresponding rejected set .
- Theorem 10: This theorem provides FDR control guarantees for the unified e-filter, both with and without null-proportion adaptivity, for e-values and relaxed e-values, in finite-sample and asymptotic settings.
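For reference, the weighted null-proportion estimator of Equation (15) is simple to compute; a toy sketch with illustrative names (assuming non-negative penalties and priors):

```python
import numpy as np

def weighted_null_proportion(e, u, v, lam):
    """Toy sketch of the weighted null-proportion estimator (Equation 15):
    e, u, v are the e-values, penalties, and priors of one layer; lam is the
    user-chosen adaptivity constant lambda^(m)."""
    e, u, v = map(np.asarray, (e, u, v))
    G = len(e)
    numer = np.max(u * v) + np.sum(u * v * (e < 1.0 / lam))
    return numer / (G * (1.0 - lam))
```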
4.2.7. General Formatting for Formulae
For example, when constructing the generalized e-values in FEFP (Equation 3):
$
e_g^{(m)} = G^{(m)} \cdot \frac{\mathbb{I}\left\{g \in \mathcal{G}^{(m)}(\alpha_0^{(m)})\right\}}{\widehat{V}_{\mathcal{G}^{(m)}\left(\alpha_0^{(m)}\right)} \vee \alpha_0^{(m)}}.
$
- $e_g^{(m)}$: The computed generalized e-value for group g at resolution layer m.
- $G^{(m)}$: The total number of groups at resolution layer m.
- $\mathbb{I}\{\cdot\}$: An indicator function. It takes the value 1 if the condition inside the braces is true (i.e., group g is included in the selection set produced by the base detection procedure at the original FDR level), and 0 otherwise.
- $\mathcal{G}^{(m)}(\alpha_0^{(m)})$: The set of selected groups at layer m by the base detection procedure when controlling FDR at its original FDR level.
- $\widehat{V}_{\mathcal{G}^{(m)}(\alpha_0^{(m)})}$: The estimated number of false discoveries (false positives) from the output of the base detection procedure at the original FDR level.
- $\vee \alpha_0^{(m)}$: Denotes taking the maximum of the estimated number of false discoveries and $\alpha_0^{(m)}$. This term is included in the denominator to prevent it from becoming too small (if the estimate is 0 or very small), which would otherwise lead to an excessively large e-value. This ensures a conservative lower bound on the denominator, contributing to robust FDR control.
5. Experimental Setup
The experimental setup focuses on two main types of evaluations: simulation studies to rigorously test the theoretical properties and performance under controlled conditions, and real-world data analysis (HIV mutation data) to demonstrate practical applicability and superiority.
5.1. Datasets
5.1.1. Simulated Data
The simulation studies are based on a linear model to generate synthetic data: $ \pmb{y} = \pmb{X} \beta + \pmb{\epsilon}, $ where:
- $\pmb{y}$: The response vector.
- $\pmb{X}$: The design matrix.
- $\beta$: The true coefficient vector, indicating feature relevance.
- $\pmb{\epsilon}$: Random noise, sampled from a normal distribution $\mathcal{N}(\mathbf{0}, \sigma^2 \pmb{I}_n)$, where $\pmb{I}_n$ is the identity matrix.
Data Characteristics:
- Multi-resolution Structure: Two layers are considered:
  - Individual features (N features).
  - Groups (G groups), with each group containing N/G features.
- Design Matrix ($\pmb{X}$): Each row of $\pmb{X}$ is independently sampled from a multivariate normal distribution with covariance $\Sigma$ (a small data-generation sketch appears after the simulation parameters below).
  - $\Sigma$: A block-diagonal matrix composed of Toeplitz submatrices.
    - Toeplitz Matrix: A matrix where each descending diagonal from left to right is constant. This structure models autocorrelation, where elements closer together are more correlated.
    - Block-Diagonal Structure: The overall covariance matrix is composed of independent blocks. This implies that features within a group are highly correlated (due to the Toeplitz structure), but features between different groups have near-zero correlation. This is a realistic setup for many biological or genomic datasets.
  - The specific Toeplitz submatrix structure is:
$
\left[ \begin{array}{cccccc} 1 & \frac{(G'-2)\rho}{G'-1} & \frac{(G'-3)\rho}{G'-1} & \dots & \frac{\rho}{G'-1} & 0 \\ \frac{(G'-2)\rho}{G'-1} & 1 & \frac{(G'-2)\rho}{G'-1} & \dots & \frac{2\rho}{G'-1} & \frac{\rho}{G'-1} \\ \vdots & & \ddots & & & \vdots \\ 0 & \frac{\rho}{G'-1} & \frac{2\rho}{G'-1} & \dots & \frac{(G'-2)\rho}{G'-1} & 1 \end{array} \right],
$
where $G'$ is the group size, and $\rho$ is the correlation parameter controlling the strength of correlation within groups.
- Relevant Features ():
- groups are randomly selected as "signal groups".
- (number of relevant features) elements are randomly selected from within these signal groups.
- On average, a relevant group contains relevant features.
- Signal Strength (): For , is sampled from . controls how strong the signal of relevant features is.
Simulation Parameters:
- Low-dimensional settings: features split evenly into groups, with a few signal groups and, on average, 3 relevant features per signal group.
  - Correlation $\rho$: varied over a range (signal strength fixed).
  - Signal strength: varied over a range ($\rho$ fixed).
- High-dimensional settings: a larger number of features relative to the sample size, with the same grouped structure.
  - Correlation $\rho$: varied over a range (signal strength fixed).
  - Signal strength: varied over a range ($\rho$ fixed).
- All simulation results are averaged over 50 independent trials.
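The simulation design above is straightforward to reproduce in outline. The sketch below builds the block-diagonal Toeplitz covariance and a sparse, group-structured coefficient vector; the default parameter values are placeholders, not the paper's exact settings.

```python
import numpy as np

def simulate_data(n=500, N=100, G=20, rho=0.5, n_signal_groups=5, k_per_group=3,
                  signal=1.0, seed=0):
    """Illustrative sketch of the simulation design: block-diagonal covariance
    of Toeplitz blocks, with signals concentrated in a few groups."""
    rng = np.random.default_rng(seed)
    Gp = N // G                                   # group size G'
    offsets = np.arange(Gp)
    # Toeplitz block: entry at distance d is (G' - 1 - d) * rho / (G' - 1), diagonal = 1
    block = np.maximum(Gp - 1 - np.abs(offsets[:, None] - offsets[None, :]), 0) * rho / (Gp - 1)
    np.fill_diagonal(block, 1.0)
    Sigma = np.kron(np.eye(G), block)             # block-diagonal covariance
    X = rng.multivariate_normal(np.zeros(N), Sigma, size=n)
    beta = np.zeros(N)
    for g in rng.choice(G, n_signal_groups, replace=False):
        cols = g * Gp + rng.choice(Gp, k_per_group, replace=False)
        beta[cols] = rng.normal(0, signal, size=k_per_group)
    y = X @ beta + rng.normal(size=n)
    return X, y, beta
```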
5.1.2. HIV Mutation Data
This dataset was previously analyzed in various studies [2, 15, 27, 35, 40] and contains information about HIV-1 mutations associated with drug resistance.
- Domain: HIV-1 drug resistance.
- Response Variable: Log-fold increase of lab-tested drug resistance.
- Features: Binary variables indicating the presence or absence of each mutation. Different mutations at the same location are treated as distinct features.
- Multi-resolution Structure:
  - Individual Mutations: The primary features.
  - Mutation Positions: Mutations are naturally grouped based on their known genomic locations. These positions form the group layer.
- Goal: Identify mutations and their clusters (positions) that influence drug resistance, controlling individual-FDR and group-FDR simultaneously.
- Drug Classes:
  - Protease Inhibitors (PIs): APV, ATV, IDV, LPV, NFV, RTV, SQV.
  - Nucleoside Reverse Transcriptase Inhibitors (NRTIs): ABC, AZT, D4T, DDI.
- Preprocessing: For each drug, rows lacking drug resistance information are removed, and mutations appearing fewer than 3 times are excluded.
- Linear Model Assumption: Consistent with prior work, a linear model between response and features (no interaction terms) is assumed.
- Ground Truth: The treatment-selected mutation (TSM) panels [34] are used as a reference standard for evaluating performance (an approximation of ground truth).

The following are the results from Table 1 of the original paper:

TABLE 1. Sample information for the seven PI-type drugs and the three NRTI-type drugs.
| Drug type | Drug | Sample size | # mutations | # positions genotyped |
|---|---|---|---|---|
| PI | APV | 767 | 201 | 65 |
| PI | ATV | 328 | 147 | 60 |
| PI | IDV | 825 | 206 | 66 |
| PI | LPV | 515 | 184 | 65 |
| PI | NFV | 842 | 207 | 66 |
| PI | RTV | 793 | 205 | 65 |
| PI | SQV | 824 | 206 | 65 |
| NRTI | ABC | 623 | 283 | 105 |
| NRTI | AZT | 626 | 283 | 105 |
| NRTI | D4T | 625 | 281 | 104 |
| NRTI | DDI | 628 | 283 | 105 |
- Sample size: Number of HIV-1 samples available for each drug.
- # mutations: Total number of unique mutations considered as features after preprocessing.
- # positions genotyped: Number of distinct genomic locations (groups) where mutations were observed.
5.2. Evaluation Metrics
The primary evaluation metrics used are False Discovery Proportion (FDP) and Power.
5.2.1. False Discovery Proportion (FDP)
- Conceptual Definition: FDP is the proportion of false discoveries among all discoveries made in a given experiment. It directly measures the empirical performance of an FDR control procedure. If a method claims to control FDR at a level $\alpha$, the observed FDP should ideally be close to or below $\alpha$.
- Mathematical Formula: $ \mathrm{FDP} = \frac{\mathrm{V}}{\mathrm{R} \vee 1} $
- Symbol Explanation:
  - $V$: The number of false positives (false discoveries), which are true null hypotheses that were incorrectly rejected.
  - $R$: The total number of rejections (discoveries) made by the method.
  - $R \vee 1$: Denotes the maximum of $R$ and 1. This ensures that if no discoveries are made ($R = 0$), the denominator is 1 and the FDP is correctly calculated as 0, preventing division by zero.

In the context of multi-resolution testing, FDP is calculated separately for individual features (FDP (ind)) and for groups (FDP (grp)) at each layer.
5.2.2. Power
- Conceptual Definition: Power (or True Positive Rate) is the probability that a statistical test correctly rejects a false null hypothesis. In the context of feature selection, it refers to the proportion of truly relevant features (or groups) that are successfully detected by the method. Higher power indicates a more sensitive and effective method.
- Mathematical Formula: $ \mathrm{Power} = \frac{\mathrm{TP}}{\mathrm{P}} $
- Symbol Explanation:
-
: The number of true positives, which are truly relevant features (or groups) that were correctly detected.
-
: The total number of truly relevant features (or groups) available in the dataset (i.e., the size of or ).
In the simulation studies, "Power" is often shown as the raw count of true positives, or implicitly as a measure of how many truly relevant items are discovered. For the HIV data, "True (ind)" and "True (grp)" are the number of true positives identified for individual mutations and mutation groups, respectively.
-
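The following minimal Python sketch computes FDP and power from a selection set and the set of true signals at one layer; the function and variable names are illustrative and not taken from the paper's code.

```python
import numpy as np

def fdp_and_power(selected, true_signals):
    """Compute empirical FDP and power for one selection set.

    selected     : iterable of selected feature (or group) indices
    true_signals : iterable of truly relevant feature (or group) indices
    Names are illustrative, not from the paper's code.
    """
    selected, true_signals = set(selected), set(true_signals)
    R = len(selected)                        # total discoveries
    V = len(selected - true_signals)         # false discoveries
    TP = len(selected & true_signals)        # true discoveries
    fdp = V / max(R, 1)                      # FDP = V / (R v 1)
    power = TP / max(len(true_signals), 1)   # Power = TP / P
    return fdp, power

# Layer-wise evaluation at the individual and group layers (toy indices):
fdp_ind, pow_ind = fdp_and_power([1, 2, 5, 9], [1, 2, 3, 5])
fdp_grp, pow_grp = fdp_and_power([0, 1], [0, 1, 2])
print(fdp_ind, pow_ind, fdp_grp, pow_grp)  # 0.25 0.75 0.0 0.667 (approx.)
```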
5.3. Baselines
The proposed SFEFP methods (e.g., eDS-filter, eDS+gKF-filter) are compared against several established FDR control procedures, particularly those designed for multi-resolution settings or known to work well in specific contexts.
- MKF+ [23] (Multilayer Knockoff Filter):
  - Description: An extension of the knockoff filter that controls the FDR simultaneously at multiple resolutions (layers). It uses a default constant to balance conservatism and power.
  - Why representative: It is a direct competitor for multilayer FDR control.
- e-MKF [18]:
  - Description: An e-filter based method that leverages one-bit knockoff e-values for multilayer FDR control; it extends MKF using the e-value framework (a sketch of the one-bit e-value construction is given after this list).
  - Why representative: It is a direct e-value based counterpart to MKF and a predecessor to SFEFP in using e-values for multilayer control, highlighting the one-bit issue addressed by SFEFP.
- KF+ (Knockoff Filter+) [2]:
  - Description: The original knockoff filter for single-resolution FDR control of individual features; the "+" denotes the knockoff+ variant, whose adjusted threshold yields exact (rather than modified) FDR control.
  - Why representative: Included to illustrate the necessity of multilayer FDR control. It is expected to control the individual-level FDR but not necessarily the group-level FDR, or both simultaneously.
- SFEFP variants (proposed methods):
  - eDS-filter: An instantiation of SFEFP in which the base detection procedure for all layers is the DS (Data Splitting) method, extended for group detection.
  - eDS+gKF-filter: An instantiation of SFEFP in which DS is used for the individual feature layer and group knockoffs are used for the group layers.
  - KF+gDS: An instantiation of SFEFP in which the knockoff filter is used for the individual feature layer and DS is used for the group layers; this combination is not explicitly detailed in the paper but is implied as the counterpart symmetric to eDS+gKF.
- FEFP (single-replication) variants (denoted with a * prefix):
  - *e-MKF, *eDS-filter, *eDS+gKF, *KF+gDS: These denote the FEFP versions, i.e., SFEFP with a single replication (R = 1). They are crucial for demonstrating the power enhancement brought by stabilization (derandomization) in SFEFP compared to FEFP, which suffers from the one-bit dilemma. For example, *e-MKF is exactly the e-MKF method proposed by Gablenz et al. [18].
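As background for the one-bit issue discussed above, here is a minimal sketch of how a single knockoff run can be converted into knockoff e-values, in the spirit of Ren and Barber's derandomized knockoffs; the paper's generalized e-values build on this kind of construction but differ in details, so treat this purely as an illustration. `W` denotes the knockoff feature statistics and `alpha` the original FDR level; both names are assumptions.

```python
import numpy as np

def knockoff_threshold(W: np.ndarray, alpha: float) -> float:
    """Knockoff+ threshold: smallest t with (1 + #{W_j <= -t}) / max(#{W_j >= t}, 1) <= alpha."""
    candidates = np.sort(np.abs(W[W != 0]))
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= alpha:
            return t
    return np.inf

def one_bit_knockoff_evalues(W: np.ndarray, alpha: float) -> np.ndarray:
    """e_j = p * 1{W_j >= T} / (1 + #{W_j <= -T}): each e-value is either 0 or a common positive number."""
    p = len(W)
    T = knockoff_threshold(W, alpha)
    denom = 1 + np.sum(W <= -T)
    return p * (W >= T).astype(float) / denom

# Toy statistics for 8 features:
W = np.array([3.1, 2.8, 2.4, 1.9, 1.5, -0.3, 0.9, 4.0])
print(one_bit_knockoff_evalues(W, alpha=0.2))  # either 0 or 8.0 here -- the "one-bit" pattern
```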
Specifics for Base Procedures:
- Fixed-design (group) knockoffs: Used in the low-dimensional simulations, typically with a signed-max function as the test statistic.
- Model-X (group) knockoffs: Used in the high-dimensional simulations, implemented with the knockoffs R package [12].
- DS procedure (individual selection): Implemented as described in [15].
- DS procedure (group selection): The group test statistic (Equation 10) is computed by averaging the statistics produced by the DS procedure (see the sketch after this list).
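As an illustration of this group-level aggregation, the sketch below averages feature-level DS statistics within each group; treating the group statistic as the plain mean of its members' statistics is a simplifying assumption about the paper's Equation 10.

```python
import numpy as np

def group_statistics(feature_stats: np.ndarray, groups: list) -> np.ndarray:
    """Aggregate feature-level DS statistics into group statistics by averaging.

    feature_stats : array of per-feature statistics from one DS run
    groups        : list of index lists, one list of feature indices per group
    """
    return np.array([feature_stats[idx].mean() for idx in groups])

# Example: 6 features grouped into 3 groups of 2.
stats = np.array([1.2, -0.4, 0.8, 0.1, 2.0, 1.5])
print(group_statistics(stats, [[0, 1], [2, 3], [4, 5]]))  # [0.4  0.45 1.75]
```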
FDR Levels:
- Target FDR levels: Set to 0.2 for all methods across all simulations.
- Original FDR levels (for SFEFP methods): Set to a fixed original level for each layer.
- Number of replications (for SFEFP): A fixed R for the simulations and R = 50 for the HIV data.
6. Results & Analysis
The experimental results are presented through simulation studies and a real-world HIV mutation data analysis. The key aspects evaluated are FDR control (measured by FDP) and Power.
6.1. Core Results Analysis
6.1.1. Simulation Results (Low-dimensional settings)
Settings: Low-dimensional design with 10 features per group. Target FDR level 0.2; original FDR levels for the SFEFP methods as specified above. Results averaged over 50 trials.
The following figure (Figure 1 from the original paper) shows the simulation results for the nine methods under different correlations, with the signal strength fixed at 3:

[Figure 1: Power and FDR for individual features (top row) and for groups (bottom row) as the correlation varies; the eDS-filter and related methods maintain or improve power while controlling FDR.]
- Analysis of Figure 1 (varying correlation, fixed signal strength 3):
  - FDR Control (Individual and Group): All methods generally keep the FDR (or FDP) below the target level of 0.2, with one exception that shows slightly inflated FDP for individual features at higher correlations.
  - Power under High Correlations: eDS-filter and eDS+gKF (both leveraging the DS method) consistently demonstrate higher power (more true discoveries) at both the individual and group layers, especially as the correlation increases. This highlights the effectiveness of DS in handling highly correlated features. The knockoff-based methods show a noticeable drop in power as the correlation increases, confirming the known limitation of knockoff procedures in highly correlated settings.
  - Impact of Stabilization (SFEFP vs. FEFP, i.e., solid vs. dashed lines): SFEFP methods (solid lines) consistently show higher power than their FEFP counterparts (dashed lines) across almost all correlation levels. This demonstrates that the stabilization step (averaging e-values over replications) is effective in enhancing detection power and overcoming the "one-bit" limitation of FEFP.

The following figure (Figure 2 from the original paper) shows the simulation results for the nine methods under different signal strengths, with the correlation fixed at 0.6:

[Figure 2: Power and FDR for individual features (top row) and for groups (bottom row) as the signal strength varies; the eDS-filter controls FDR at multiple resolutions.]
- Analysis of Figure 2 (varying signal strength, fixed correlation 0.6):
  - FDR Control: All methods maintain FDR control well below 0.2 at both the individual and group levels across the range of signal strengths.
  - Power Enhancement with Signal Strength: As expected, the power of all methods increases with the signal strength.
  - Superiority of eDS-filter and eDS+gKF: eDS-filter and eDS+gKF continue to show superior power compared to the knockoff-based baselines, confirming that methods incorporating DS remain more effective even when signals are strong, especially in correlated settings.
  - Stabilization Benefit: The solid lines (SFEFP) again outperform their dashed counterparts (FEFP with R = 1), indicating that stabilization consistently improves power regardless of signal strength.
6.1.2. Simulation Results (High-dimensional settings)
Settings: High-dimensional design. Target FDR level 0.2; original FDR levels for the SFEFP methods as before. Results averaged over 50 trials.
The following figure (Figure 3 from the original paper) shows the simulation results for the high-dimensional setting under different correlations with fixed signal strength:

[Figure 3: Power and FDR for individual features (top row) and for groups (bottom row) in the high-dimensional setting as the correlation varies, comparing the same set of methods.]
- Analysis of Figure 3 (varying correlation, fixed signal strength, high-dimensional setting):
  - The patterns observed in the low-dimensional setting (Figure 1) are largely replicated here.
  - eDS-filter and eDS+gKF maintain their power advantage in the high-dimensional setting, particularly at higher correlations.
  - The benefit of stabilization (solid vs. dashed lines) is also consistent, again showing a power enhancement.

The following figure (Figure 4 from the original paper) shows the simulation results for the high-dimensional setting under different signal strengths with fixed correlation:

[Figure 4: Power and FDR for individual features (top row) and for groups (bottom row) in the high-dimensional setting as the signal strength varies.]
- Analysis of Figure 4 (varying signal strength, fixed correlation, high-dimensional setting):
  - Similar to the low-dimensional results (Figure 2), eDS-filter and eDS+gKF show superior power across all signal strengths.
  - Stabilization continues to provide a clear power boost.

Overall Simulation Conclusion: The simulation studies validate the two main advantages of SFEFP:
- Flexibility for Enhanced Power: By allowing different base detection procedures (DS vs. knockoffs) at different resolutions, SFEFP methods (such as eDS-filter and eDS+gKF) can leverage the strengths of these procedures, leading to significantly higher power, especially in settings with high feature correlation.
- Stabilization for Enhanced Power: The stabilization step (averaging e-values over multiple replications) consistently improves detection power across the various settings (correlation, signal strength, dimensionality) by addressing the "one-bit" dilemma and providing more robust e-values.
6.1.3. HIV Mutation Data Analysis
Goal: Identify important individual mutations and their genomic positions (clusters) associated with drug resistance, controlling FDR at both individual and group levels simultaneously.
Methods compared: KF+, MKF+, e-MKF, and eDS-filter.
Reference Standard: Treatment-selected mutation (TSM) panels [34] are used as ground truth.
Target FDR levels: 0.3 at both resolutions (slightly relaxed for the real data analysis, following the practical recommendations of [23]).
Replications for eDS-filter: R = 50.
Original FDR levels for eDS-filter: Set to a fixed original level for each layer.
The following are the results from [Table 2] of the original paper: TABLE 2 Results for the PI-type drugs. "True" represents the number of true positives, i.e., positives that are also identified in the TSM panel for the PI class of treatments. "False" represents the number of false positives. The FDP is calculated as the ratio of the number of false positives to the total number of positives. The target FDR level is 0.3 at both resolutions; for eDS-filter, R = 50. The best-performing method is highlighted in bold.
| Drug | Method | True (ind) | False (ind) | FDP (ind) | True (grp) | False (grp) | FDP (grp) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| APV | KF+ | 27 | 9 | 0.250 | 18 | 7 | 0.280 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 27 | 4 | 0.129 | 18 | 2 | 0.100 |
| ATV | KF+ | 19 | 6 | 0.240 | 19 | 1 | 0.050 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 18 | 1 | 0.053 | 18 | 0 | 0 |
| IDV | KF+ | 34 | 33 | 0.493 | 24 | 15 | 0.385 |
| | MKF+ | 26 | 3 | 0.103 | 17 | 0 | 0 |
| | e-MKF | 26 | 4 | 0.133 | 18 | 0 | 0 |
| | eDS-filter | 27 | 3 | 0.100 | 18 | 0 | 0 |
| LPV | KF+ | 27 | 8 | 0.229 | 20 | 3 | 0.130 |
| | MKF+ | 19 | 3 | 0.136 | 13 | 1 | 0.071 |
| | e-MKF | 19 | 3 | 0.136 | 13 | 1 | 0.071 |
| | eDS-filter | 23 | 3 | 0.115 | 15 | 0 | 0 |
| NFV | KF+ | 33 | 22 | 0.400 | 24 | 8 | 0.250 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 32 | 8 | 0.200 | 20 | 2 | 0.091 |
| RTV | KF+ | 19 | 5 | 0.208 | 12 | 2 | 0.143 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 25 | 7 | 0.219 | 17 | 2 | 0.105 |
| SQV | KF+ | 22 | 6 | 0.214 | 16 | 2 | 0.111 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 22 | 4 | 0.154 | 15 | 0 | 0 |
The following are the results from [Table 3] of the original paper: TABLE 3 Results for the NRTI-type drugs, with the same conventions as Table 2. For eDS-filter, we set R = 50. The best-performing method is highlighted in bold.
| Drug | Method | True (ind) | False (ind) | FDP (ind) | True (grp) | False (grp) | FDP (grp) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ABC | KF+ | 14 | 0 | 0.176 | 14 | 2 | 0.125 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 13 | 0 | 0.133 | 12 | 2 | 0.143 |
| AZT | KF+ | 17 | 0 | 0.320 | 16 | 5 | 0.238 |
| | MKF+ | 11 | 0 | 0 | 10 | 0 | 0 |
| | e-MKF | 11 | 0 | 0 | 10 | 0 | 0 |
| | eDS-filter | 15 | 1 | 0.063 | 14 | 0 | 0 |
| D4T | KF+ | 10 | 1 | 0 | 9 | 1 | 0.100 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 18 | 2 | 0.100 | 16 | 1 | 0.118 |
| DDI | KF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | MKF+ | 0 | 0 | 0 | 0 | 0 | 0 |
| | e-MKF | 0 | 0 | 0 | 0 | 0 | 0 |
| | eDS-filter | 18 | 4 | 0.182 | 17 | 0 | 0.150 |
- Analysis of Tables 2 and 3 (HIV Mutation Data):
  - KF+ Performance: The single-resolution knockoff filter often fails to control the FDP simultaneously at both the individual and group resolutions. For instance, for IDV its FDP (ind) (0.493) and FDP (grp) (0.385) are well above 0.3, and for NFV its FDP (ind) reaches 0.400; for AZT (an NRTI drug), its FDP (ind) of 0.320 exceeds even the relaxed target of 0.3. This highlights the necessity of multi-resolution FDR control.
  - MKF+ and e-MKF Performance: These methods frequently make zero discoveries (True (ind) = 0, True (grp) = 0) for several drugs (APV, ATV, NFV, RTV, SQV among the PIs; ABC, D4T, DDI among the NRTIs). This confirms the conservatism, or zero-power dilemma, that SFEFP aims to solve. When they do make discoveries (e.g., IDV, LPV), their power is often lower than that of eDS-filter.
  - eDS-filter (Proposed Method) Performance:
    - Consistent FDR Control: The eDS-filter consistently controls the FDP at both resolutions (individual and group) simultaneously, with most FDP values at or below the target of 0.3.
    - Superior Power: The eDS-filter generally achieves substantially higher power (more true positives) than MKF+ and e-MKF, especially for drugs where those methods find nothing. For example, for APV, eDS-filter finds 27 individual mutations and 18 groups, while MKF+ and e-MKF find 0; for D4T, eDS-filter finds 18 individual mutations and 16 groups, while MKF+ and e-MKF find 0.
    - Comparable or Higher Power than KF+ with Better FDP: Compared to KF+, the eDS-filter achieves similar or higher power (e.g., for RTV), but with much better FDP control at both resolutions. For instance, for IDV, KF+ has an FDP (ind) of 0.493 and an FDP (grp) of 0.385, while eDS-filter has an FDP (ind) of 0.100 and an FDP (grp) of 0.
    - Comparison with DeepPINK: The paper also mentions (in the text) that the eDS-filter achieved higher power than DeepPINK [27] (a method for individual FDR control) for all drugs except ATV, and lower FDP for several drugs.

Overall Conclusion from Real Data: The analysis of the HIV mutation data strongly supports the advantages of the eDS-filter as an instantiation of SFEFP. It demonstrates superior power and robust FDR control at multiple resolutions compared to the existing multilayer knockoff and e-filter methods, highlighting the practical benefits of the proposed framework. The flexibility of SFEFP to incorporate powerful base procedures such as DS for specific data characteristics (e.g., correlated mutations) is crucial for achieving these results.
6.2. Ablation Studies / Parameter Analysis
While the paper does not present explicit "ablation studies" in the traditional sense (removing components of SFEFP one by one), the comparison between SFEFP (solid lines) and FEFP (dashed lines, equivalent to SFEFP with a single replication, R = 1) in the simulation results serves as a crucial analysis of the effect of the stabilization step.
- Impact of Stabilization (R > 1 vs. R = 1):
  - Observation: Across all simulation figures (Figures 1, 2, 3, 4), the SFEFP methods (e.g., eDS-filter, eDS+gKF, and e-MKF with R > 1) consistently show higher power than their FEFP counterparts (prefixed with *, i.e., R = 1).
  - Interpretation: This directly demonstrates the effectiveness of the stabilization treatment. Averaging the generalized e-values over multiple runs (derandomization) transforms the "one-bit" e-values into continuous, more informative scores. This effectively addresses the zero-power dilemma and enriches the ranking information, leading to more powerful detection of true signals without compromising FDR control. The finding is particularly significant because derandomization in single-resolution settings is sometimes associated with a power loss, whereas in multi-resolution settings SFEFP shows a power gain. A sketch of the averaging-plus-selection idea is given below.
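To illustrate the stabilization idea at a single layer, the sketch below averages e-values from R replications with uniform weights and then applies an e-BH-style selection rule; the actual SFEFP filter additionally coordinates thresholds across layers, which is omitted here, and `run_base_procedure` is a hypothetical placeholder for any base detection method that returns e-values.

```python
import numpy as np

def ebh_select(e: np.ndarray, q: float) -> np.ndarray:
    """e-BH: reject the k hypotheses with the largest e-values, where k is the
    largest integer such that the k-th largest e-value is at least p / (k * q)."""
    p = len(e)
    order = np.argsort(-e)          # hypotheses sorted by decreasing e-value
    k_max = 0
    for k in range(1, p + 1):
        if e[order[k - 1]] >= p / (k * q):
            k_max = k
    return np.sort(order[:k_max])   # indices of selected hypotheses

def stabilized_evalues(run_base_procedure, R: int) -> np.ndarray:
    """Average the e-values from R independent replications with uniform weights 1/R.

    run_base_procedure(r) is a placeholder returning one vector of e-values per run."""
    return np.mean(np.vstack([run_base_procedure(r) for r in range(R)]), axis=0)

# Illustrative averaged e-values for 10 hypotheses (made-up numbers):
e_bar = np.array([60.0, 25.0, 18.0, 13.0, 11.0, 0.5, 0.0, 0.2, 0.0, 0.1])
print(ebh_select(e_bar, q=0.2))   # -> [0 1 2 3 4]
```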
- Impact of the Base Detection Procedure (e.g., DS vs. Knockoffs):
  - Observation: Comparing eDS-filter (DS-based) with e-MKF (knockoff-based) or with the variants that use knockoffs on the individual layer clearly shows that the DS-based approaches yield significantly higher power when the features are highly correlated (Figures 1, 3).
  - Interpretation: This validates the flexibility principle of SFEFP. By allowing practitioners to choose the most suitable base detection procedure for each layer (e.g., DS for highly correlated features, knockoffs in other settings), SFEFP can achieve superior performance tailored to the data's characteristics. This is a form of component analysis showing the benefit of SFEFP's modularity.
- Impact of the Original FDR Level:
  - The paper discusses how the choice of the original FDR level affects the number and magnitude of the non-zero generalized e-values. In multi-resolution settings, the default choice may not be optimal for FEFP (due to the one-bit dilemma). While no dedicated ablation study on this parameter is presented, the discussion implies that the value used in the simulations is a reasonable practical choice for balancing these factors in SFEFP.
7. Conclusion & Reflections
7.1. Conclusion Summary
This paper introduces SFEFP (Stabilized Flexible E-Filter Procedure), a novel and robust framework for simultaneously detecting significant features and feature groups while rigorously controlling the False Discovery Rate (FDR) at multiple resolutions. SFEFP addresses critical limitations of existing methods, such as the conservatism of multilayer knockoff filter (MKF) and the zero-power dilemma of e-filter procedures that use one-bit e-values.
The core innovation lies in:
- Generalized E-values: A unified construction that allows SFEFP to incorporate a wide variety of base detection procedures (e.g., knockoffs, data splitting (DS), Gaussian Mirror (GM), Symmetry-based Adaptive Selection (SAS)) at different resolutions, enabling practitioners to select the most effective method for each specific context.
- Generalized E-filter: A principled procedure that leverages these generalized e-values to make coherent selections across multiple layers while guaranteeing FDR control.
- Stabilization Treatment: This crucial step averages the generalized e-values obtained from multiple replications. It transforms binary ("one-bit") e-values into continuous scores, thereby providing richer ranking information. This stabilization effectively circumvents the zero-power dilemma and consistently enhances detection power and stability.

Theoretical results underpin SFEFP with guarantees for multilayer FDR control and stability (convergence of the selection set as the number of replications increases). Practical instantiations, such as the eDS-filter and eDS+gKF-filter, demonstrate the framework's versatility. Simulation studies show that the eDS-filter effectively controls the FDR while maintaining or surpassing the power of MKF and e-MKF, especially in settings with high feature correlation. This superiority is further confirmed through the analysis of the HIV mutation data.
In essence, SFEFP provides a powerful, flexible, and stable meta-methodology that can adapt to diverse data structures and leverages the strengths of various state-of-the-art FDR control techniques, thereby improving discovery potential in complex multi-resolution analyses.
7.2. Limitations & Future Work
The authors acknowledge several areas for further investigation and improvement:
- Impact of the Original FDR Level on Power: While the paper discusses the practical choice of the original FDR level, a deeper theoretical understanding of its optimal setting and of its impact on the power of FEFP and SFEFP is needed.
- Sharper FDR Bounds: Simulation results often show that SFEFP achieves an empirical FDR below the preset nominal level, suggesting that the current theoretical FDR bounds may be conservative. Exploring milder conditions to derive sharper FDR bounds could lead to even more powerful variants of SFEFP.
- Adaptive Weights for Replications: The current SFEFP uses uniform weights (1/R) for averaging the generalized e-values across replications. Developing data-driven or adaptive weighting schemes could further improve the reliability and performance of the results.
- Integration of Enhanced E-values: Techniques from other works on enhancing e-values (e.g., [9, 24]) could be incorporated into SFEFP to further boost its power.
7.3. Personal Insights & Critique
Inspirations
- The Power of Flexibility: The most significant inspiration from this paper is its emphasis on flexibility. Instead of developing yet another specific FDR control method, it provides a unifying framework (SFEFP) that allows researchers to plug in the best available method for each data layer or characteristic. This modularity is extremely valuable, as no single FDR control method is universally optimal, and it allows the framework to benefit from future advances in base detection procedures.
- Addressing the "One-bit" Dilemma: The clear identification of, and effective solution to, the "one-bit" problem in e-filter procedures is insightful. The stabilization treatment, by generating continuous e-values from multiple runs, is a clever way to introduce nuance and resolve conflicts across layers, turning a potential zero-power outcome into a consistent power gain. It highlights the importance of carefully considering the implications of binary decision rules in complex inference tasks.
- Bridging Theory and Practice: The paper successfully bridges theoretical FDR control with practical considerations. By integrating powerful methods such as DS (known for handling high correlations) into SFEFP, it demonstrates how theoretical guarantees can be combined with domain-specific effectiveness. The discussion on relaxing the target FDR levels in real-world scenarios also shows a pragmatic understanding of applied statistics.
Potential Issues, Unverified Assumptions, or Areas for Improvement
- Computational Cost of Stabilization: While SFEFP offers significant advantages, running the base detection procedures R times can be computationally intensive, especially for complex base methods or large datasets. The paper could elaborate more on the practical computational overhead for different choices of R and on how to reduce it (e.g., parallelization or early stopping of the replications).
- Choice of the Original FDR Level: The paper acknowledges that the choice of the original FDR level is important and that its optimal setting requires further theoretical investigation. Although a default value is suggested, this parameter may be critical for maximizing power in specific scenarios, and a more data-driven or theoretically justified way of choosing it would be highly beneficial.
- Interpretation of Averaged E-values: While averaging e-values (Equation 7) intuitively makes sense for stabilization, a deeper theoretical exploration of the properties of these averaged generalized e-values (e.g., how "tight" they remain and their precise distributional characteristics) could provide more insight into why SFEFP gains power rather than losing it, as is sometimes seen in single-resolution derandomization.
- Scalability to Many Layers: The framework is designed for M layers. While M = 2 or 3 is common, the complexity of iteratively updating the thresholds (Algorithms 2 and 3) may increase with a very large M. Exploring the convergence properties and computational efficiency for a large number of layers would be valuable.
- Generalizability of Assumptions for Base Procedures: The paper demonstrates that several methods (DS, GM, SAS) satisfy Definition 1 and that their e-values are relaxed or asymptotic relaxed e-values. Ensuring that other novel FDR procedures (especially those with complex dependence structures or non-standard nulls) also fit these definitions may require careful verification by practitioners.

Overall, SFEFP represents a significant step forward in multi-resolution FDR control, offering a highly adaptable and powerful framework. The future work suggested by the authors, particularly regarding optimal parameter choices and a deeper theoretical understanding of e-value aggregation, will further strengthen its utility and impact.