Label noise analysis meets adversarial training: A defense against label poisoning in federated learning
TL;DR Summary
Combining adversarial training with label noise analysis, this paper proposes a generative scheme that injects artificial label noise into client models and couples it with noisy-label classifiers to detect and defend against label poisoning in federated learning, validated on IoT intrusion detection.
Abstract
Knowledge-Based Systems 266 (2023) 110384. Journal homepage: www.elsevier.com/locate/knosys

Label noise analysis meets adversarial training: A defense against label poisoning in federated learning

Ehsan Hallaji (a), Roozbeh Razavi-Far (a, b, corresponding author), Mehrdad Saif (a), Enrique Herrera-Viedma (c)

(a) Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON N9B 3P4, Canada
(b) Faculty of Computer Science, University of New Brunswick, Fredericton, NB E3B 5A3, Canada
(c) Andalusian Research Institute on Data Science and Computational Intelligence, Department of Computer Science and AI, University of Granada, Granada, 18071, Spain

Article history: Received 9 October 2022; Received in revised form 22 December 2022; Accepted 8 February 2023; Available online 14 February 2023.

Keywords: Noisy labels; Federated learning; Intrusion detection systems; Label poisoning attacks; Deep learning; Adversarial training.

Abstract (truncated in the extracted text): "Data decentralization and privacy constraints in federated learning systems withhold user data from the server. As a result, intruders can take adv[...]"
1. Bibliographic Information
1.1. Title
The central topic of the paper is a defense mechanism against label poisoning attacks in federated learning environments, which integrates concepts from label noise analysis and adversarial training.
1.2. Authors
The authors are:

- Ehsan Hallaji
- Roozbeh Razavi-Far
- Mehrdad Saif
- Enrique Herrera-Viedma

Their affiliations are the Department of Electrical and Computer Engineering at the University of Windsor, Canada (Hallaji, Razavi-Far, Saif), the Faculty of Computer Science at the University of New Brunswick, Canada (Razavi-Far), and the Andalusian Research Institute on Data Science and Computational Intelligence at the University of Granada, Spain (Herrera-Viedma). Their research backgrounds are in electrical and computer engineering, with a focus on machine learning, federated learning, and security.
1.3. Journal/Conference
The paper was published in Knowledge-Based Systems (volume 266, article 110384), an Elsevier B.V. journal, as indicated by the journal header and copyright notice. Knowledge-Based Systems is a well-regarded, peer-reviewed journal in artificial intelligence and computer science.
1.4. Publication Year
The paper was received on October 9, 2022, revised on December 22, 2022, accepted on February 8, 2023, and made available online on February 14, 2023. Therefore, the publication year is 2023.
1.5. Abstract
This paper addresses the vulnerability of federated learning (FL) systems to label poisoning attacks, where malicious actors exploit data decentralization to corrupt shared models. It proposes a novel defense mechanism that combines adversarial training and label noise analysis. Specifically, the authors design a Generative Adversarial Label Poisoner (GALP) to artificially inject local models with label noise that mimics real-world backdoor and label flipping attacks. This synthetic noise helps train client models to be robust against various noise mechanisms (class-independent, class-dependent, and instance-dependent). Additionally, the paper advocates for using noisy-label classifiers within the client models. The combination of GALP and noisy-label classifiers enables the models to learn and counteract potential noise distributions, thereby neutralizing corrupted updates. The work also includes a comparative study of state-of-the-art deep noisy label classifiers. The proposed framework's effectiveness is evaluated on two Internet of Things (IoT) network datasets for intrusion detection, showing promising results.
1.6. Original Source Link
The original source link provided is /files/papers/690885161ccaadf40a4344fa/paper.pdf. This appears to be a relative path or an internal file system link, not a public URL. Given the copyright information from Elsevier B.V. and the availability date, it is likely an officially published paper, possibly behind a paywall, or an accepted manuscript from a journal.
2. Executive Summary
2.1. Background & Motivation
The core problem the paper aims to solve is the vulnerability of federated learning (FL) systems to label poisoning attacks. FL is a distributed machine learning paradigm where models are trained collaboratively across decentralized devices without sharing raw data, addressing data privacy and communication efficiency concerns inherent in centralized cloud-based training.
However, this decentralized nature and privacy feature, where the server does not directly observe client data, create a significant security loophole. Malicious actors can exploit this by manipulating their local training data (data poisoning) and sending forged updates to the central server. These corrupted updates can then degrade the global model, eventually affecting all other participants. Specifically, label poisoning (manipulating the labels of data samples) is identified as an open and critical research problem. Such attacks can lead to catastrophic consequences, especially in sensitive applications like Intrusion Detection Systems (IDS), where targeted attacks can inject backdoors (making specific malicious samples undetectable) or untargeted attacks can drastically increase false alarm rates, rendering the system dysfunctional. The challenge lies in distinguishing malicious label noise from benign, unintentional noise, as malicious noise often follows a specific, designed pattern.
The paper's entry point and innovative idea stem from treating label poisoning as a noisy label classification problem. While noisy label classifiers can handle random noise, they struggle with adversarial noise that follows specific patterns. The innovation lies in proposing a generative model (Generative Adversarial Label Poisoner or GALP) to artificially simulate these adversarial label poisoning attacks. By training noisy label classifiers on this known artificially poisoned data, the models can learn to recognize and become robust against the distribution of real-world label poisoning attacks, effectively "vaccinating" them.
2.2. Main Contributions / Findings
The paper makes several primary contributions:

- Novel Robustness Approach: Proposes a novel approach to make neural network models robust against label poisoning attacks by treating malicious labels as noisy labels. This is a shift from traditional defense mechanisms that focus on anomaly detection of updates.
- Generative Adversarial Label Poisoner (GALP) Design: Introduces and designs GALP, a Generative Adversarial Network (GAN)-based scheme. GALP artificially injects client networks with synthetic label noise that resembles real backdoor and label flipping attacks. This allows training noisy label classifiers on known label noise distributions, enabling them to learn attack patterns.
- Comparative Study of Noisy-Label Classifiers: Conducts a comprehensive comparative study of state-of-the-art deep noisy label classifiers to identify the most compatible and effective models when coupled with GALP.
- Categorization of Label Poisoning Attacks: Studies label poisoning attacks based on three distinct label noise mechanisms (class-independent, class-dependent, and instance-dependent) within the FL context. It demonstrates how backdoor and label flipping attacks map to these mechanisms, informing the GALP design.
- Empirical Evaluation on IoT Networks: Designs and tests an FL-based IDS on two real-world IoT network datasets (UNSW-NB15 and NIMS). The study investigates the effects of noise ratio and label noise mechanism on the IDS's detection performance, providing practical validation.

The key conclusion is that coupling the GALP algorithm with a robust noisy-label classifier (specifically CORES in their experiments) significantly reduces the impact of malicious noisy labels in FL systems. The findings indicate that GALP effectively neutralizes label flipping attacks (simulated by class-dependent and class-independent noise) regardless of the noise ratio. For backdoor attacks (simulated by instance-dependent noise), GALP effectively neutralizes noise ratios up to ten percent, covering typical attack scenarios. This addresses the vulnerability of FL models to label poisoning by proactively training them to recognize and mitigate adversarial noise patterns.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To understand this paper, a reader should be familiar with the following core concepts:
- Machine Learning (ML): A field of artificial intelligence that uses statistical techniques to give computer systems the ability to "learn" from data without being explicitly programmed. In ML, models are trained on datasets to find patterns and make predictions or decisions.
- Deep Learning (DL): A subfield of ML that uses artificial neural networks with multiple layers (deep networks) to learn representations of data at multiple levels of abstraction. Deep neural networks are particularly effective for complex tasks such as image recognition, natural language processing, and, in this context, intrusion detection.
- Federated Learning (FL): An ML paradigm that enables multiple decentralized edge devices or organizations to collaboratively train a shared global model without exchanging their local data.
  - Decentralization: Data remains on local devices.
  - Privacy: Raw data is not shared with a central server; only model updates (e.g., gradients or model parameters) are exchanged.
  - Aggregation: A central server aggregates the model updates from multiple clients to create an improved global model, which is then sent back to the clients.
- Intrusion Detection System (IDS): A security mechanism that monitors network traffic or system activities for malicious activities or policy violations. An IDS is crucial for protecting IoT networks from cyberattacks.
- Generative Adversarial Network (GAN): A class of algorithms, used in unsupervised learning, implemented as a system of two neural networks, a Generator (G) and a Discriminator (D), contesting with each other in a zero-sum game framework.
  - Generator (G): Learns to create new data instances that resemble the training data. Its goal is to produce fake data convincing enough to fool the Discriminator.
  - Discriminator (D): Learns to distinguish between real data from the training set and fake data produced by the Generator. Its goal is to correctly identify fake data.
  - Adversarial Training: The Generator and Discriminator are trained simultaneously. The Generator tries to minimize the Discriminator's ability to distinguish real from fake, while the Discriminator tries to maximize that ability. This adversarial process drives both networks to improve.
- Label Noise: Errors or inaccuracies in the labels (target outputs) of a dataset; instead of the true label, a data sample carries an incorrect label. The three mechanisms below are illustrated in the code sketch after this list.
  - Class-independent (Symmetric / NCAR, Noisy Completely At Random): The probability of a label being corrupted is independent of the true class and of any other variables. For example, any label has a fixed probability of being randomly swapped with another label.
  - Class-dependent (Asymmetric / NAR, Noisy At Random): The probability of a label being corrupted depends on its true class. For example, 'cat' labels might often be mislabeled as 'dog' but not vice versa, or specific classes may be more prone to mislabeling than others.
  - Instance-dependent (NNAR, Noisy Not At Random): The probability of a label being corrupted depends on both the sample's features (x) and its true class (y). This is the most complex and realistic type of noise, often indicative of a malicious attack targeting specific data patterns.
- Label Poisoning Attacks: A type of data poisoning attack where an adversary intentionally manipulates the labels of training data samples to degrade the performance of a machine learning model.
  - Targeted Attack: Aims to cause misclassification of specific target samples or classes, or to inject backdoors.
  - Untargeted Attack: Aims to degrade the overall performance of the model across all classes.
  - Label Flipping Attack: A common type of label poisoning in which the true label of a data sample is changed to an incorrect label. It can be untargeted (random swaps) or targeted (swapping specific labels to specific incorrect ones).
  - Backdoor Attack: A more sophisticated targeted label poisoning attack in which the attacker embeds a "trigger" pattern into a small subset of training data. When this trigger is present in input data at inference time, the model is forced to output a specific, incorrect prediction, while maintaining high accuracy on clean data without the trigger. This makes it highly effective for injecting vulnerabilities.
- Noisy-Label Classifiers: Machine learning models or training techniques specifically designed to be robust to label noise during training, often by identifying and correcting mislabeled samples or minimizing their impact.
- F-measure (F1-score): A metric used in the statistical analysis of classification to measure a test's accuracy. It is the harmonic mean of precision and recall, and is useful for imbalanced datasets where one class is much more frequent than the others.
- Accuracy: The proportion of total predictions that are correct, calculated as (True Positives + True Negatives) / (Total Samples).
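To make the three noise mechanisms concrete, here is a minimal Python sketch of how each type of label noise could be injected into a label vector. The function names, the random generator setup, and the confidence-based selection are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def class_independent_noise(y, n_classes, ratio):
    """NCAR: each label is flipped to a random class with probability `ratio`."""
    y = y.copy()
    flip = rng.random(len(y)) < ratio
    y[flip] = rng.integers(0, n_classes, flip.sum())  # may occasionally redraw the same class
    return y

def class_dependent_noise(y, transition, ratio):
    """NAR: the corrupted label depends on the true class via a fixed transition map."""
    y = y.copy()
    flip = rng.random(len(y)) < ratio
    y[flip] = np.array([transition[c] for c in y[flip]], dtype=y.dtype)
    return y

def instance_dependent_noise(y, confidence, ratio, target_class):
    """NNAR: only the lowest-confidence (feature-dependent) samples are relabelled,
    mimicking a backdoor-style targeted corruption."""
    y = y.copy()
    n_poison = int(ratio * len(y))
    low_conf = np.argsort(confidence)[:n_poison]  # least confident samples first
    y[low_conf] = target_class
    return y
```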
3.2. Previous Works
The paper references several prior works, primarily in the context of federated learning, label noise, and adversarial attacks.
- Federated Learning Challenges and Security:
  - [1, 2] Highlight data privacy and latency as major concerns in traditional centralized ML for IoT.
  - [3] Google's original proposal for Federated Learning to address these issues.
  - [4, 5] Discuss privacy-preserving features of FL.
  - [6] Hallaji et al. (one of the current paper's authors) survey adversaries and defense mechanisms in FL.
  - [7] Mothukuri et al. survey the security and privacy of federated learning.
  - [8] Nasr et al. analyze privacy attacks against centralized and federated learning.
  - [9, 10] Detail how backdoor attacks can be launched in FL.
  - [11] Discusses false data injection on features (distinguishing it from label poisoning).

  These works establish the context of FL's vulnerability to data poisoning, especially label poisoning, which is the focus of the current paper.

- Noisy Label Classification Techniques: The paper compares against or builds upon various state-of-the-art noisy label classifiers:
  - Co-teaching [13]: Trains two neural networks simultaneously, where each network selects "clean" samples (those confidently predicted by the other network) for training, thus being robust to noisy labels. It leverages the observation that deep neural networks tend to fit clean data before memorizing noisy labels.
  - MetaWeight-Net [14]: A meta-learning approach that learns to assign weights to samples. It extracts sample weights automatically to eliminate training bias from noisy labels.
  - Label Confidence [15]: Estimates label confidence to filter out noisy labels, treating confidence as a metric for sample reliability.
  - Label Enhancement [16]: Recovers and refines the label distribution, often based on "trusted data".
  - ProSelfLC [17]: Progressive self label correction based on minimum entropy regularization. It assumes deep models learn informative samples before fitting noise, guiding the process with "trusted data" obtained via label enhancement.
  - Stochastic Label Noise [18]: Suggests that the stochastic gradient noise induced by stochastic label noise can help combat inherent label noise.
  - Learning from Massive Noisy Labeled Data (LMNL) [19]: A framework for convolutional neural networks that models relationships between samples, class labels, and label noise using a probabilistic graphical model, even with limited clean labels.
  - Masking [20]: Incorporates human cognition of noisy class transitions by speculating on the structure of the noise transition matrix. It uses a structure-aware probabilistic model to estimate unmasked noise transition probabilities.
  - Gold Loss Correction (GLC) [21]: A semi-supervised method that uses a small set of trusted data to estimate label noise parameters and then trains a corrected classifier on the noisy labels.
  - Probabilistic End-to-End Noise Correction for Learning with Noisy Labels (PENCIL) [22]: Updates both the network parameters and the label estimations as label distributions. It is backbone-independent and does not require auxiliary clean data or prior noise information.
  - Symmetric Cross Entropy Learning (SCEL) [23]: Addresses overfitting to noisy labels on "easy classes" and under-learning on "hard classes" when using cross-entropy. It symmetrically combines cross-entropy with Reverse Cross Entropy (RCE) for robustness (a loss sketch is given after this list).
    - Cross-Entropy Loss, $H(p, q)$: Measures the difference between the true distribution $p$ and the predicted distribution $q$. For classification, if $y$ is the one-hot encoded true label and $\hat{y}$ is the model's predicted probability distribution, the cross-entropy loss for a single sample is
      $ L_{CE} = -\sum_{i=1}^{C} y_i \log(\hat{y}_i), $
      where $C$ is the number of classes.
    - Reverse Cross-Entropy Loss, $H(q, p)$: Swaps the roles of the true and predicted distributions:
      $ L_{RCE} = -\sum_{i=1}^{C} \hat{y}_i \log(y_i). $
      Since $y$ is typically one-hot, $\log(y_i)$ would be undefined for $y_i = 0$, so RCE is usually applied to a smoothed version of $y$, or within specific contexts like SCEL where the true labels are implicitly handled (e.g., through confidence estimation or a noise transition matrix). The intuition is that it penalizes predictions that are far from the true label, but in a way that is less sensitive to hard mislabeling than standard cross-entropy.
    - Symmetric Cross-Entropy (SCEL) Loss: Combines cross-entropy and reverse cross-entropy,
      $ L_{SCEL} = L_{CE} + L_{RCE}, $
      aiming to leverage the benefits of both: reducing overfitting to noisy labels while mitigating under-learning.
  - COnfidence REgularized Sample Sieve (CORES) [24]: Filters out noisy samples by learning the underlying clean distribution rather than the noisy distribution. It defines a regularization term to enhance model confidence and separates clean and noisy samples, applying a supervised loss on clean data and an unsupervised consistency loss on noisy samples.
  - Online Label Smoothing (OLS) [25]: Assumes model predictive distributions can discover inter-category relationships. It replaces label smoothing by generating soft labels that consider these relationships, treating each category as a moving label distribution.
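As an illustration of the SCEL idea referenced above, the following is a minimal PyTorch-style sketch of a symmetric cross-entropy loss. The clamp value and the alpha/beta weights are illustrative assumptions, not the settings used in [23].

```python
import torch
import torch.nn.functional as F

def symmetric_cross_entropy(logits, targets, n_classes, alpha=1.0, beta=1.0):
    # standard cross-entropy term H(y, p)
    p = F.softmax(logits, dim=1).clamp(min=1e-7)
    y = F.one_hot(targets, n_classes).float()
    ce = -(y * torch.log(p)).sum(dim=1)
    # reverse cross-entropy term H(p, y); clamp the one-hot labels to avoid log(0)
    rce = -(p * torch.log(y.clamp(min=1e-4))).sum(dim=1)
    return (alpha * ce + beta * rce).mean()
```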
3.3. Technological Evolution
The field has evolved from centralized ML with inherent privacy and latency issues to federated learning as a solution. However, FL introduced new security challenges, particularly data poisoning attacks, which are harder to detect due to data decentralization. Initial defenses often focused on feature poisoning or simple label flipping that induced class-independent noise. The evolution progressed to understanding more sophisticated label noise mechanisms (class-dependent, instance-dependent) that mimic backdoor attacks. Simultaneously, research in noisy label classification advanced, leading to various techniques to handle different types of label inaccuracies.
This paper's work fits within this evolution by bridging adversarial attack simulation (using GANs) with noisy label classification. Instead of solely relying on noisy label classifiers to implicitly handle unknown noise, it proactively generates adversarial noise to explicitly train models to be robust against targeted attack patterns. This represents a step forward in making FL more robust against realistic and sophisticated label poisoning threats, particularly those that leverage specific noise mechanisms for malicious goals like backdoors.
3.4. Differentiation Analysis
Compared to prior noisy label classification methods, this paper's core innovation is the integration of a Generative Adversarial Label Poisoner (GALP) into the federated learning training process.
- Traditional Noisy-Label Classifiers: Most existing noisy label classifiers (e.g., Co-teaching, LMNL, Masking, GLC, PENCIL, SCEL, CORES, OLS) are designed to handle unintentional or randomly distributed label noise. While some, like CORES, target instance-dependent noise, they primarily focus on detecting and correcting noise within the given dataset. They may struggle if the noise distribution is specifically crafted by an intelligent adversary to evade detection.
- This Paper's Approach (GALP + Noisy Classifier): The key differentiator is the proactive generation of adversarial label noise using GALP. GALP simulates backdoor and label flipping attacks by creating class-dependent and instance-dependent noise. By training the noisy label classifiers on this artificially generated and controlled adversarial noise, the models are "vaccinated": the noisy label classifier does not just try to recover from unknown noise, it learns the signature of malicious noise and its distribution, making it explicitly robust against such attack patterns.
- Focus on Federated Learning Security: While noisy label classification is a general ML problem, this paper specifically applies and adapts it to the FL context to address a critical security vulnerability that arises from FL's decentralized nature. GALP ensures that even attacker models, when initialized with server updates, become "vaccinated", thus neutralizing their poisoned updates.
- Comprehensive Noise Mechanisms: The paper explicitly addresses all three label noise mechanisms (class-independent, class-dependent, instance-dependent) and maps label flipping and backdoor attacks onto them, providing a more structured approach to adversarial noise generation.

In essence, while previous work provided tools to handle label noise, this paper provides a method to prepare models against maliciously crafted label noise in a federated setting by generating realistic attack scenarios for training.
4. Methodology
The proposed framework aims to mitigate label poisoning attacks in federated learning by combining a Generative Adversarial Label Poisoner (GALP) with noisy label classifiers. The core idea is to "vaccinate" local models against label poisoning by injecting them with artificially generated label noise that mimics real attack patterns. This allows the models to learn the distribution of malicious noise and become robust to it.
The overall process is illustrated in Image 2.
Image 2: Schematic diagram illustrating the proposed defense mechanism in federated learning that combines adversarial training with label noise analysis.
The process flow is as follows:
- Local Model Initialization: Each client model ($\mathcal{M}_r$) is initialized with the latest global update ($\theta_S$) from the server.
- Artificial Poisoning by GALP: On each client, the GALP component generates artificially poisoned data by injecting label noise into a subset of the client's local dataset ($T_r$). GALP uses a Generator ($\mathcal{G}$) and a Discriminator ($\mathcal{D}$) to create label noise that resembles class-dependent or instance-dependent label poisoning attacks.
- Noisy Label Classifier Training: The client's local model ($\mathcal{M}_r$), which is a chosen noisy label classifier, is then trained on a combination of its original clean data and the artificially poisoned data generated by GALP. This training makes the model robust to various noise distributions; the resulting model is referred to as a "vaccinated model".
- Local Parameter Update: After training, the vaccinated model produces updated local parameters ($\theta_{\mathrm{out}}^r$).
- Server Aggregation: These vaccinated parameters are sent to the central server. The server aggregates the parameters from all participating clients using an aggregation function ($f$) to produce a new global model update ($\theta_S$).
- Global Model Broadcast: The server broadcasts this new vaccinated global update back to all clients.
- Attack Neutralization: If an attacker attempts to poison their local data, their model will also be initialized with the vaccinated global update. Consequently, the attacker's model will be vaccinated against label poisoning, and the poisoned updates it sends to the server will have a neutralized effect on the aggregation process.

Consider a federated network with a set of $k$ client models. Each client model $\mathcal{M}_r$ operates on a training set $T_r = \{(x_i, y_i)\}$, where $x_i$ are samples, $y_i$ are their corresponding labels, and $|T_r|$ is the number of samples in $T_r$. After noisy labels are injected (either artificially by GALP or by a real attacker), $y_i$ becomes corrupted to $\tilde{y}_i$ for some $i$. The noisy label sets and corrupted training sets are denoted $\tilde{Y}_r$ and $\tilde{T}_r$, respectively. One round of this scheme is outlined in the code sketch below.
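The following is a schematic Python outline of one "vaccinated" federated round. `build_model`, `galp_poison`, and `train_noisy_label_classifier` are hypothetical client-side helpers standing in for the model initialization, GALP, and the noisy-label classifier, and `aggregate` is the server-side averaging described in Section 4.2 (a concrete sketch of it appears there); none of this is the authors' released code.

```python
import copy

def federated_round(global_params, clients, aggregate, mechanism="instance-dependent"):
    """One round of the vaccinated FL scheme described above (schematic)."""
    local_params = []
    for client in clients:
        # initialize the local model M_r with the latest server update θ_S
        model = client.build_model(copy.deepcopy(global_params))
        # GALP generates artificially poisoned data from the local training set
        poisoned = client.galp_poison(model, client.train_set, mechanism)
        # train the noisy-label classifier on clean + artificially poisoned data
        theta_out = client.train_noisy_label_classifier(model, client.train_set, poisoned)
        local_params.append(theta_out)  # "vaccinated" update θ_out^r
    # the server averages the vaccinated updates and broadcasts the result
    return aggregate(local_params)
```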
4.1. Label Poisoning using Noisy Label Injection (GALP)
The GALP is a GAN-based system designed to inject label noise using class-dependent and instance-dependent mechanisms. It consists of a Generator ($\mathcal{G}$) and a Discriminator ($\mathcal{D}$).

The core idea is that $\mathcal{G}$ tries to generate fake samples (i.e., samples whose labels are corrupted to $\tilde{y}_i$), while $\mathcal{D}$ tries to distinguish these fake samples from real samples (i.e., samples with their original labels $y_i$).

The Generator takes a random noise matrix $Z$ as input and generates fake samples $\tilde{X}$ corresponding to $Z$.

The Discriminator receives the produced corrupted training set $\tilde{T}_r$ and identifies samples whose labels have been corrupted ($\tilde{y}_i$).

The structures of $\mathcal{G}$ and $\mathcal{D}$ are defined layer by layer: $\mathcal{G}$ maps the random noise $Z$ through $L$ hidden layers to the generated samples $\tilde{X}$, and $\mathcal{D}$ maps an input pair $\{x_i, \tilde{y}_i\}$ through $L$ hidden layers to the output $P_{fake}$, where:

- $Z$: input random noise for $\mathcal{G}$.
- $x_i$: input sample for $\mathcal{D}$.
- $\tilde{y}_i$: input noisy label for $\mathcal{D}$.
- $L$: the number of hidden layers in $\mathcal{G}$ and $\mathcal{D}$.
- $u_l^{\mathcal{G}}$ and $u_l^{\mathcal{D}}$: the activations of the representation of the $l$-th layer in $\mathcal{G}$ and $\mathcal{D}$, respectively.
- $\tilde{x}$: a generated fake sample from $\mathcal{G}$.
- $P_{fake}$: the Discriminator's output, indicating whether the input is real or fake.

The computation of the hidden layer representations is given by the recursive formulation
$ h_l = \frac{u_{l-1}^T \cdot w_l - \mu_l}{\sigma_l}, \quad \text{s.t.} \quad u_0 = \{x_i, \tilde{y}_i\}, $
where:

- $u_{l-1}$: the activation of the previous layer $(l-1)$.
- $w_l$: the weights of the current layer $l$.
- $\mu_l$ and $\sigma_l$: the mean and standard deviation of $h_l$, used for batch normalization of the layer's inputs.
- For $\mathcal{D}$, the initial input is $u_0 = \{x_i, \tilde{y}_i\}$ (for $\mathcal{G}$, it is the random noise $Z$).

The activation function of the layers is defined as
$ u_l(h_l) = \begin{cases} 0.01\, h_l & \text{if } h_l < 0 \ \land \ l < L \\ h_l & \text{if } h_l \geq 0 \ \land \ l < L \\ \frac{1}{1 + e^{-h_l}} & \text{if } l = L \end{cases} $
This implies:

- For hidden layers ($l < L$): a Leaky Rectified Linear Unit (Leaky ReLU) activation is used. If $h_l$ is negative, the output is $0.01 h_l$; otherwise, it is $h_l$. Leaky ReLU mitigates the "dying ReLU" problem by allowing a small, non-zero gradient when the unit is not active.
- For the output layer ($l = L$): a Sigmoid activation function is used, mapping the output to the range between 0 and 1, suitable for the real/fake decision.

This layer rule is rendered as a short code sketch below. The GALP algorithm (Algorithm 1) outlines the training process for both the class-dependent and instance-dependent injectors.
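Before turning to the algorithm listing, here is a small NumPy rendering of the per-layer rule just described (normalized linear map, Leaky ReLU with slope 0.01 on hidden layers, sigmoid on the output layer). The per-vector normalization and the epsilon term are simplifying assumptions; in practice, batch normalization statistics are computed over a mini-batch.

```python
import numpy as np

def forward(u0, weights, eps=1e-8):
    """u0: input vector (noise Z for G, or features + noisy label for D);
    weights: list of L weight matrices, one per layer."""
    u = u0
    L = len(weights)
    for l, w in enumerate(weights, start=1):
        h = u @ w
        h = (h - h.mean()) / (h.std() + eps)   # batch-norm-style normalization
        if l < L:
            u = np.where(h < 0, 0.01 * h, h)   # Leaky ReLU on hidden layers
        else:
            u = 1.0 / (1.0 + np.exp(-h))       # sigmoid on the output layer
    return u
```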
The following are the results from Algorithm 1 of the original paper:
Algorithm 1: Generative Adversarial Label Poisoner (GALP)
Input: local data T_r, server update θ_S, noise mechanism.
Output: locally updated parameters θ_out^r.
Definitions: class-dependent error threshold t1; instance-dependent confidence threshold t2.
Initialization: initialize the client model M_r with θ_S; initialize G with random weights; form a random noise matrix Z; initialize D with θ_S.
1  Classify T_r using M_r(θ_S, X_r) = Ŷ_r
2  for each class c_j do
3      Estimate the label error E_j for c_j using Eq. (6)
4  end for
5  C_CD = {c_j | E_j > t1}
6  if mechanism = instance-dependent then
7      for each {x_i, ỹ_i} in C_CD do
8          Calculate the label confidence φ_i using Eq. (10)
9      end for
10     C_ID = {{x_i, ỹ_i} | φ_i < t2}
11 end if
12 for each epoch do
13     Train G on Z: G(Z) = X̃
14     switch mechanism do
15         case class-dependent do
16             Train D with G(Z) and the class-dependent subset: D(X̃, C_CD) = P_fake
17             Update G and D using V_CD in Eq. (9)
18         end case
19         case instance-dependent do
20             Train D with G(Z) and the instance-dependent subset: D(X̃, C_ID) = P_fake
21             Update G and D using V_ID in Eq. (11)
22         end case
23     end switch
24 end for
25 Train M_r(θ_S, {X̃, Ỹ, Y} ∪ T_r)
26 return θ_out^r
Note: the Algorithm 1 text extracted from the paper contains OCR and formatting artifacts; the listing above is interpreted from the surrounding text (Sections 4.1.1 and 4.1.2) and the standard GAN training procedure. Its key steps are initialization, error and confidence estimation to identify target classes or instances, an epoch loop of adversarial training driven by the chosen noise mechanism, and finally training of the noisy label classifier on the combined data.
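The adversarial core of Algorithm 1 can be sketched as a standard GAN training loop over the selected target subset (the error-prone classes for the class-dependent mechanism, or the low-confidence samples for the instance-dependent one). The binary-cross-entropy formulation below approximates the minimax objectives in Eqs. (9) and (11); the module and optimizer handling are assumptions, not the authors' code.

```python
import torch

def galp_adversarial_loop(G, D, target_X, latent_dim, epochs, opt_G, opt_D):
    """G, D: torch.nn.Module generator/discriminator; target_X: tensor of samples
    from the selected subset (C_CD or C_ID)."""
    bce = torch.nn.BCELoss()
    for _ in range(epochs):
        Z = torch.randn(len(target_X), latent_dim)
        X_fake = G(Z)
        # discriminator step: the target subset is treated as real, G(Z) as fake
        d_real, d_fake = D(target_X), D(X_fake.detach())
        loss_D = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
        opt_D.zero_grad(); loss_D.backward(); opt_D.step()
        # generator step: try to make D label the generated samples as real
        d_gen = D(X_fake)
        loss_G = bce(d_gen, torch.ones_like(d_gen))
        opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return X_fake.detach()  # generated samples to pair with the poisoned labels
```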
4.1.1. Class-dependent injector
The class-dependent injector aims to create label noise where the corrupted labels are statistically dependent on the true class but not on the sample features ($x$). This mimics a targeted label flipping attack.

From a security perspective, the attacker wants to minimize the probability of the injected noisy labels being detected as malicious. To achieve this, GALP identifies classes that already have a higher error rate when evaluated with the server update $\theta_S$.

First, the current client model (updated with $\theta_S$) is used to predict labels for the local data $X_r$:
$ \mathcal{M}_r(\theta_S, X_r) = \hat{Y}_r, $
where $\hat{Y}_r$ are the predicted labels.

Next, the classes with a higher risk of being noisy are determined by calculating the error rate for each class in the local dataset:
$ E_j^r = \frac{1}{|c_j|} \sum_{i=1}^{|c_j|} P(\hat{y}_i^r = y_i^r), \quad \{x_i, y_i\} \in c_j, $
where:

- $E_j^r$: the error probability for class $c_j$ on client $r$'s data.
- $c_j$: a class from the set of all possible data classes. (Note: the paper uses the same symbol for the number of clients and the number of classes, which can be ambiguous; here the index $j$ runs over classes.)
- $|c_j|$: the cardinality (number of samples) of class $c_j$.
- $P(\hat{y}_i^r = y_i^r)$: this term appears to be a misprint in the formula, as it computes the probability of a correct prediction. For an error rate, it should likely be $P(\hat{y}_i^r \neq y_i^r)$ or $1 - P(\hat{y}_i^r = y_i^r)$. Strictly adhering to the paper's formula, it computes the average accuracy of each class, so a low value indicates a class with higher error.
- $\{x_i, y_i\} \in c_j$: samples belonging to class $c_j$.

The set of classes with label errors is then identified as $C_{CD} = \{c_j \mid E_j > t_1\}$, with $t_1$ the class-dependent error threshold of Algorithm 1. The Generator is fed with random noise to produce artificially generated samples:
$ \mathcal{G}(Z) = \tilde{X}. $
The Discriminator is fed with these generated samples and the class-dependent subset $C_{CD}$:
$ \mathcal{D}(\tilde{X}, C_{CD}) = P_{fake}, $
meaning that $\mathcal{D}$ tries to identify whether $\tilde{X}$ coupled with $C_{CD}$ is fake (i.e., corresponds to poisoned labels within the targeted classes). A code sketch of the class-selection step is given after the objective below.

The class-dependent injector converges by optimizing the minimax value function $V_{CD}$:
$ \min_{\mathcal{G}} \max_{\mathcal{D}} V_{CD}(\mathcal{G}, \mathcal{D}) = \mathbb{E}\left[\log \mathcal{D}(\hat{X}, C_{CD})\right] + \mathbb{E}\left[1 - \mathcal{D}\big(\tilde{X}, C_{CD}\big)\right], $
where:

- $\min_{\mathcal{G}} \max_{\mathcal{D}}$: the minimax objective typical of GANs; $\mathcal{D}$ tries to maximize its ability to distinguish real from fake, while $\mathcal{G}$ tries to minimize $\mathcal{D}$'s ability to do so.
- $\mathbb{E}$: the expectation operator.
- $\hat{X}$: real samples (with their original labels).
- $\tilde{X}$: fake samples generated by $\mathcal{G}$.
- $C_{CD}$: the set of classes identified as having errors, used to guide both $\mathcal{G}$ and $\mathcal{D}$ to focus on class-dependent noise. The Generator participates in the loss estimation as part of the Discriminator's input.
4.1.2. Instance-dependent injector
The instance-dependent (ID) mechanism is designed to initiate a backdoor attack, meaning that the label noise depends on both the sample variables ($x$) and the class labels ($y$). The goal is to produce poisoned labels that are difficult to detect as anomalies.

In addition to finding classes with a higher probability of error (as in the class-dependent case), the instance-dependent injector also identifies the specific samples within those classes that have the least membership confidence.

Given a Softmax activation function at the final layer of the noisy classifier, the confidence of the $i$-th sample is measured as
$ \varphi_i = \frac{\sum_{j=1}^{|u_L|} \left(u_L(j) - \bar{u}_L\right)}{|u_L| - 1}, $
where:

- $\varphi_i$: the measured confidence for the $i$-th sample.
- $u_L(j)$: the output of the Softmax layer for class $j$.
- $\bar{u}_L$: the mean of the Softmax output values over all classes for that sample.
- $|u_L|$: the number of output classes (the dimension of the Softmax output).

This formula computes a value related to the variance of the activation outputs. A lower $\varphi_i$ indicates less confidence in the assigned label, making that sample a candidate for instance-dependent poisoning.

These low-confidence samples are collected as $C_{ID}$. The instance-dependent injector then uses the same GAN structure as the class-dependent injector, but with a different objective function:
$ \min_{\mathcal{G}} \max_{\mathcal{D}} V_{ID}(\mathcal{G}, \mathcal{D}) = \mathbb{E}\left[\log \mathcal{D}(\hat{X}, C_{ID})\right] + \mathbb{E}\left[1 - \mathcal{D}\big(\tilde{X}, C_{ID}\big)\right], $
where:

- $V_{ID}$: the value function of the instance-dependent injector.
- $\hat{X}$: real samples.
- $\tilde{X}$: fake samples generated by $\mathcal{G}$.
- $C_{ID}$: the set of samples identified as having low confidence, used to guide $\mathcal{G}$ and $\mathcal{D}$ to focus on instance-dependent noise.

A sketch of this confidence-based sample selection follows.
4.2. Training the Noisy Label Classifier and Aggregation
Upon convergence of the noisy label injectors (GALP), the device trains its noisy label classifier ($\mathcal{M}_r$) using the server update $\theta_S$, its original local data ($T_r$), and the artificially poisoned data generated by GALP, $\{\hat{X}, \tilde{Y}, Y\}$, where $\hat{X}$ are the original features of the poisoned samples, $\tilde{Y}$ are the poisoned labels, and $Y$ might refer to the original labels or a context for GALP:
$ \mathcal{M}_r(\theta_S, \{\hat{X}, \tilde{Y}, Y\} \cup T_r) \to \theta_{\mathrm{out}}^r, $
where $\theta_{\mathrm{out}}^r$ represents the outgoing parameter set from client $r$. This vaccinated model is then sent to the server.

On the server side, the global model parameters are obtained by an aggregation function $f$ that combines the outgoing parameters from all clients:
$ \theta_S = f\left(\{ \theta_{\mathrm{out}}^r \mid 1 \leq r \leq k \}\right). $
For simplicity, the paper defines the aggregation function as a simple average (sketched in code below):
$ f\left(\{ \theta_{\mathrm{out}}^r \mid 1 \leq r \leq k \}\right) = \frac{1}{k} \sum_{r=1}^k \theta_{\mathrm{out}}^r. $
This averaged aggregation ensures that the robustness learned by the individual vaccinated models is incorporated into the global model, which is then broadcast back to all clients for the next round.
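The simple-average aggregation can be written directly over the clients' parameter containers; treating each update as a PyTorch-style state_dict (a mapping from parameter names to tensors) is an assumption for illustration.

```python
def aggregate(state_dicts):
    """FedAvg-style simple average of k client parameter sets (dicts of tensors/arrays)."""
    k = len(state_dicts)
    return {name: sum(sd[name] for sd in state_dicts) / k for name in state_dicts[0]}
```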
5. Experimental Setup
5.1. Datasets
The experiments utilize two real-world network traffic datasets commonly used for IoT intrusion detection systems: UNSW-NB15 and NIMS.
- UNSW-NB15 Dataset [28]:
  - Source: Captured from a network environment simulating real-world traffic, including both normal and abnormal (attack) activities.
  - Characteristics: Contains pcap, Argus, Bro, and CSV files. Originally, it is a binary classification problem (normal vs. attack).
  - Modification for Experiments: To increase the complexity and the challenge for the noisy classifiers, the authors transformed the binary classification problem into a multi-class problem with 9 classes: 1 normal class and 8 categories of attacks.
  - Purpose: Evaluating network intrusion detection systems.
  - Data Sample Example (Conceptual): A row in the dataset might represent a network connection with features such as source IP, destination IP, port numbers, protocol type, duration, number of bytes transferred, and flags, together with a label indicating whether it is normal traffic or a specific type of attack (e.g., DoS, Exploits, Generic, Shellcode, Backdoor).
- NIMS Botnet Dataset [29]:
  - Source: Collected from a research lab testbed network in which various network scenarios were simulated.
  - Characteristics: Includes network traffic from a client computer connected to SSH servers outside the testbed, generating SSH connections. It emulates various application behaviors such as DNS, HTTP, FTP, P2P (limewire), and telnet.
  - Purpose: Specifically designed for botnet detection and the analysis of encrypted traffic.
  - Data Sample Example (Conceptual): A data sample represents network flow characteristics (e.g., packet size distribution, inter-arrival times, flow duration, byte counts) without relying on port numbers, IP addresses, or payload inspection, which are challenging for encrypted traffic. The label indicates whether a flow is normal or part of a botnet activity.
- Dataset Split and Client Distribution: For both datasets, 50 clients are simulated. For each client, 50 percent of the samples of each class in the original dataset are randomly selected to form the client's local data (a sketch of this split is given below). This ensures data heterogeneity across clients while maintaining class representation.
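A sketch of the client construction described above, assuming the dataset is available as NumPy arrays; the random generator setup is an assumption.

```python
import numpy as np

def build_client_datasets(X, y, n_clients=50, frac=0.5, seed=0):
    """For each client, randomly draw `frac` of the samples of every class."""
    rng = np.random.default_rng(seed)
    clients = []
    for _ in range(n_clients):
        idx = np.concatenate([
            rng.choice(np.where(y == c)[0], size=int(frac * np.sum(y == c)), replace=False)
            for c in np.unique(y)
        ])
        clients.append((X[idx], y[idx]))
    return clients
```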
5.2. Evaluation Metrics
The performance of the proposed approach and selected noisy classifiers is evaluated using two common metrics: F-measure and Accuracy.
5.2.1. Accuracy
- Conceptual Definition: Accuracy measures the proportion of correctly classified instances out of the total number of instances. It provides a general idea of how well the model performs across all classes.
- Mathematical Formula:
  $ \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} = \frac{TP + TN}{TP + TN + FP + FN} $
- Symbol Explanation:
  - TP: True Positives (correctly predicted positive instances).
  - TN: True Negatives (correctly predicted negative instances).
  - FP: False Positives (incorrectly predicted positive instances, type I error).
  - FN: False Negatives (incorrectly predicted negative instances, type II error).
5.2.2. F-measure (F1-score)
- Conceptual Definition: The F-measure (or F1-score) is the harmonic mean of Precision and Recall. It is particularly useful when dealing with imbalanced datasets (where one class significantly outnumbers the others), as it balances the concerns of Precision and Recall, giving a more robust evaluation than Accuracy alone in such scenarios. For multi-class classification, it is often calculated as a weighted or macro average across all classes.
- Mathematical Formula:
  $ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN} $
- Symbol Explanation:
  - TP: True Positives.
  - FP: False Positives.
  - FN: False Negatives.
  - Precision: The proportion of positive identifications that were actually correct ("Of all items identified as positive, how many were actually positive?").
  - Recall: The proportion of actual positives that were identified correctly ("Of all actual positive items, how many were correctly identified?").
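Both metrics can be computed with scikit-learn as below; the weighted F1 averaging for the multi-class case is an assumption, since the paper does not state which averaging it uses.

```python
from sklearn.metrics import accuracy_score, f1_score

def evaluate(y_true, y_pred):
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred, average="weighted"),
    }
```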
5.3. Baselines
The paper's method (GALP combined with a noisy label classifier) is compared against various state-of-the-art deep noisy label classifiers themselves. These noisy label classifiers serve as the baselines to show the effectiveness of adding GALP for adversarial noise robustness. The selected baselines are:
- LMNL (Learning from Massive Noisy Labeled Data) [19]
- Masking [20]
- GLC (Gold Loss Correction) [21]
- PENCIL (Probabilistic End-to-End Noise Correction for Learning with Noisy Labels) [22]
- CE (Cross Entropy): likely used as a standard baseline without explicit noise handling; since SCEL, which uses CE as a component, is listed separately, CE most likely refers to a basic cross-entropy-trained model.
- SCEL (Symmetric Cross Entropy Learning) [23]
- CORES (COnfidence REgularized Sample Sieve) [24]
- OLS (Online Label Smoothing) [25]

These baselines are representative: they cover a wide range of modern noisy label classification techniques, including probabilistic models, meta-learning, confidence-based filtering, and regularization approaches, providing a comprehensive comparison for GALP's contribution.
Experimental Settings Details:
- Runs: All experiments are repeated ten times.
- Batch Size: 100 for all noisy classifiers and GALP.
- GALP Structure: The Generator (G) and Discriminator (D) each have three hidden layers.
- Optimizer: The Adam optimizer is used for the GALP parameters.
- Hyperparameter Search (see the sketch after Table 1):
  - Learning rate: searched within {0.001, 0.01, 0.02, 0.1}.
  - Number of hidden layers: selected from {2, 3, 4, 5} empirically.

The following are the results from Table 1 of the original paper:

| Algorithm | Optimizer | Activations | # hidden layers | Learning rate |
| --- | --- | --- | --- | --- |
| LMNL | AdaMax | ReLU and Softmax | 3 | 0.01 |
| Masking | Adam | Sigmoid and Softmax | 3 | 0.01 |
| GLC | Adam | ReLU and Softmax | 3 | 0.001 |
| PENCIL | Adam | ReLU and LogSoftmax | 4 | 0.02 |
| CE | SGD | Convolution + ReLU and Softmax | 5 | 0.1 |
| CORES | SGD | Convolution + LeakyReLU and Softmax | 5 | 0.01 |
| OLS | SGD | Convolution + LeakyReLU and Softmax | 5 | 0.02 |

Table 1 provides the specific parameter settings for each noisy classifier used in the comparative study, including its optimizer, activation functions, number of hidden layers, and learning rate.
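The reported hyperparameter search can be reproduced as a simple grid over the two tuned settings; `train_and_validate` is a hypothetical placeholder for one training/validation run and is not part of the paper.

```python
from itertools import product

LEARNING_RATES = [0.001, 0.01, 0.02, 0.1]
HIDDEN_LAYER_COUNTS = [2, 3, 4, 5]

def grid_search(train_and_validate):
    """Return the best (score, learning_rate, n_hidden_layers) found on the grid."""
    best = None
    for lr, n_layers in product(LEARNING_RATES, HIDDEN_LAYER_COUNTS):
        score = train_and_validate(lr=lr, n_hidden_layers=n_layers)
        if best is None or score > best[0]:
            best = (score, lr, n_layers)
    return best
```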
6. Results & Analysis
6.1. Core Results Analysis
The experimental results demonstrate the effectiveness of the proposed approach, particularly when GALP is combined with robust noisy label classifiers. The analysis focuses on how different label noise mechanisms and noise ratios affect the performance of various noisy classifiers in an FL setting for intrusion detection.
The following figures show the accuracy and F-measure results under different label noise mechanisms and ratios.
Image 3: Accuracy of noisy classifiers for different label noise mechanisms on UNSW-NB15 and NIMS datasets.
Image 4: F-measure of noisy classifiers for different label noise mechanisms on UNSW-NB15 and NIMS datasets.
- Impact of Noise Mechanism: Class-independent (CI) noise is consistently the easiest for noisy classifiers to handle. For instance, GLC's accuracy on UNSW-NB15 degrades by roughly 15% with CI noise (Figure 3a), whereas class-dependent (CD) noise causes up to 30% degradation for the same technique and dataset (Figure 3b). This confirms that CD noise, which GALP can generate, represents a more challenging and effective label flipping attack than traditional CI noise. Instance-dependent (ID) noise is generally the most challenging, as it mimics backdoor attacks that are highly specific to data instances.
- Sensitivity to Noise Ratio: GLC shows high sensitivity to CI and CD noise ratios but surprisingly high robustness against ID noise on UNSW-NB15 (Figure 3c). CORES consistently exhibits the highest robustness against the noise ratio across most noise mechanisms and datasets (e.g., Figures 3d, 3e, 3f).
- Dataset Challenge: UNSW-NB15 appears to be more challenging for the noisy classifiers than the NIMS dataset, as evidenced by generally lower performance and steeper degradation curves.
- F-measure vs. Accuracy: The F-measure results (Image 4) largely confirm the accuracy analysis. However, Masking's F-measure degrades at a higher rate than its accuracy, indicating a potential issue with class imbalance for this method.

The following figures show the distribution and variance of the obtained accuracy and F-measure values.

Image 5: Box plots comparing accuracy and F1-score across different noise types for various methods.

- Stability (Variance) of Methods:
  - LMNL shows the lowest variance when dealing with CI and CD labels, suggesting high stability for these noise types.
  - Masking is the least stable for the CI and CD noise mechanisms.
  - PENCIL and SCEL show similar stability.
  - For ID noise, the stability varies by dataset. GLC is stable in terms of accuracy on NIMS (Figure 5c) but most stable in terms of F-measure (Figure 5f), indicating that the dataset distribution plays a role in method stability.
  - CORES makes the most accurate predictions regardless of the noise mechanism, aligning with its robustness to the noise ratio.

The following are the results from Table 2 of the original paper:

| Algorithm | Accuracy | F-measure | Rank |
| --- | --- | --- | --- |
| LMNL | 0.7811 ± 0.0229 | 0.6895 ± 0.0229 | 6 |
| Masking | 0.8529 ± 0.1011 | 0.6909 ± 0.1011 | 5 |
| GLC | 0.7861 ± 0.0626 | 0.6354 ± 0.0626 | 7 |
| PENCIL | 0.9038 ± 0.0740 | 0.7929 ± 0.0740 | 4 |
| SCEL | 0.9059 ± 0.0637 | 0.8259 ± 0.0637 | 2 |
| CORES | 0.9306 ± 0.0547 | 0.8599 ± 0.0551 | 1 |
| OLS | 0.9218 ± 0.650 | 0.8199 ± 0.632 | 3 |

Table 2 provides an overall ranking of the noisy label classifiers based on their average Accuracy and F-measure across all experimental conditions (datasets, noise types, and ratios).

- Overall Ranking:
  - CORES outperforms all other competitors with the highest average Accuracy (0.9306) and F-measure (0.8599), suggesting it is the best choice to combine with GALP. Its design for handling instance-dependent label noise is a key factor.
  - SCEL ranks second with a strong F-measure (0.8259), indicating good performance in handling class imbalance.
  - OLS ranks third with a high Accuracy (0.9218).
  - PENCIL and Masking rank fourth and fifth, respectively.
  - LMNL and GLC rank sixth and seventh. LMNL has a better F-measure than GLC despite GLC having slightly higher Accuracy, which matters given the significance of the F-measure for imbalanced data.
- Effectiveness of GALP: The results imply that coupling GALP with a classifier like CORES significantly reduces the impact of malicious noisy labels. While GALP generates CI, CD, and ID noise, many techniques are primarily designed for CI and CD; the superior performance of CORES is attributed to its design specifically targeting instance-dependent noise, which is harder to handle.
  - For label flipping attacks (CI and CD noise), GALP almost offsets their effect, regardless of the noise ratio.
  - For backdoor attacks (ID mechanism), GALP effectively neutralizes noise ratios smaller than 10%. For higher noise ratios, the data distribution becomes a more critical factor. However, the authors argue that corrupting more than ten percent of traffic data using backdoor attacks is unlikely in most large-scale FL applications.
6.2. Ablation Studies / Parameter Analysis
To explicitly demonstrate the effect of GALP in making noisy classifiers robust, an ablation study was performed by removing GALP from the training process. The best-performing algorithm, CORES, was chosen for this comparison.
The following figure illustrates the effect of GALP on mitigating label poisoning.
Image 6: Effect of GALP on mitigating label poisoning, comparing CORES trained with and without GALP's artificial poisoning.
- Comparison of CORES (with GALP) vs. CORES (without GALP): CORES without GALP was trained using symmetric noise (randomly generated, typically class-independent) and then tested against class-dependent and instance-dependent label poisoning. This is a crucial distinction: CORES on its own may handle random noise well, but not necessarily adversarial noise patterns it has not seen during training.
  - Figure 6a (Class-Dependent Attack): The accuracy of CORES without GALP (orange line) degrades significantly as the noise ratio increases. In contrast, CORES with GALP (blue line) maintains a much higher accuracy, showing robustness.
  - Figure 6b (Instance-Dependent Attack): A similar trend is observed. CORES without GALP suffers substantial performance drops, while CORES with GALP demonstrates remarkable resilience, especially at higher noise ratios.
- Conclusion from Ablation: The results clearly indicate that GALP plays a vital role in vaccinating the models. The effect of GALP becomes more significant as the ratio of label noise in poisoning attacks increases. This validates GALP's contribution by showing that simply using a robust noisy label classifier is not enough to defend against adversarial label poisoning without explicitly training it on adversarially generated noise.
7. Conclusion & Reflections
7.1. Conclusion Summary
This paper successfully proposes and evaluates a novel defense mechanism against label poisoning attacks in federated learning environments. The core of their approach is the Generative Adversarial Label Poisoner (GALP), which artificially generates label noise mimicking realistic class-dependent and instance-dependent label flipping and backdoor attacks. By coupling GALP with state-of-the-art noisy label classifiers, the framework effectively "vaccinates" local models, enabling them to learn malicious noise distributions and robustly withstand poisoned updates. The comprehensive comparative study identified CORES as the most compatible noisy label classifier for this framework. Evaluated on two IoT intrusion detection datasets (UNSW-NB15 and NIMS), the proposed approach demonstrated significant effectiveness in mitigating the impact of label poisoning attacks, especially for label flipping attacks and backdoor attacks with reasonable noise ratios.
7.2. Limitations & Future Work
The authors implicitly or explicitly highlight several limitations and potential future research directions:
- Instance-Dependent Noise at High Ratios: While GALP is effective for instance-dependent noise below 10%, its effectiveness decreases for higher noise ratios, where the data distribution becomes a critical factor. This implies that for extremely aggressive backdoor attacks, the defense might still be challenged.
- Scope of Attacks: The paper focuses on label poisoning. Future work could explore integrating defenses against other types of data poisoning, such as feature poisoning (false data injection), or model poisoning, where attackers manipulate model parameters directly.
- Generative Model Complexity: Training GANs can be challenging and computationally intensive. The paper does not delve into the computational overhead or convergence stability of GALP in a real-world FL setting with many clients and varying computational resources. This could be a practical limitation.
- Generalizability of GALP: The effectiveness of GALP relies on its ability to accurately simulate real attack patterns. While it models class-dependent and instance-dependent noise, the GAN's ability to perfectly capture the nuances of all potential future adversarial strategies might be limited. Future work might explore adaptive GALP designs.
- Homogeneous Networks: The paper mentions "homogeneous federated networks", implying that clients have similar data distributions and model architectures. Extending the defense to heterogeneous FL (e.g., varying data schemas, different local model types) could be a challenge.
- Server-Side Defenses: The paper focuses on client-side vaccination. While the server aggregates vaccinated models, more robust aggregation rules or anomaly detection mechanisms at the server could further enhance overall security.
7.3. Personal Insights & Critique
This paper presents a highly relevant and innovative approach to a critical problem in federated learning. The idea of actively generating adversarial noise to "vaccinate" models is conceptually elegant and addresses a key vulnerability.
- Innovation: The integration of GANs for adversarial noise generation in the FL context is a strong contribution. It moves beyond passive noise handling to proactive defense. The mapping of backdoor and label flipping attacks to specific label noise mechanisms provides a structured framework for understanding and simulating these threats.
- Applicability: The methods and conclusions are highly transferable. The GALP concept could be adapted to other data poisoning scenarios (e.g., feature poisoning, or even model poisoning by generating malicious model updates). The vaccination paradigm could inspire defenses in other distributed machine learning or multi-agent systems where trust among participants is limited.
- Potential Issues/Areas for Improvement:
  - Practical Deployment Overhead: While conceptually sound, deploying GALP on resource-constrained IoT edge devices might be computationally expensive; training a GAN itself requires significant resources. The paper does not detail the computational cost per client, which is crucial for FL in IoT.
  - Threshold Selection: The determination of the class-dependent error threshold and the instance-dependent confidence threshold (mentioned in Algorithm 1, though not explicitly defined in the text) is critical. How these thresholds are chosen and their sensitivity to different datasets or attack scenarios could impact performance. This might introduce new hyperparameters that need careful tuning.
  - Dynamic Nature of Attacks: Attackers might evolve their strategies. While GALP learns known noise distributions, a constantly evolving attacker could develop new poisoning patterns. GALP would need to be continuously updated or made adaptive to such zero-day attack patterns, which might require re-training or more sophisticated GAN architectures.
  - Data Availability for GALP: The paper implies that GALP uses local client data to generate poisoned versions. If a client's local data is extremely limited or highly skewed, GALP's ability to generate diverse and realistic adversarial noise might be constrained.
  - Aggregation Robustness: While the paper mentions simple averaging for aggregation, exploring more robust aggregation rules (e.g., Krum, trimmed mean) in conjunction with GALP-vaccinated models could offer even stronger defenses. This could be a fruitful area for future empirical work.

Overall, this paper provides a robust foundation for building more resilient federated learning systems in the face of increasingly sophisticated adversarial attacks. The vaccination metaphor aptly captures the proactive defense mechanism, which is a valuable shift in perspective.