
A constant-time sampling algorithm for binary Gaussian distribution over the integers

Published: 01/21/2022
This analysis is AI-generated and may not be fully accurate. Please refer to the original paper.

TL;DR Summary

This paper presents a novel constant-time sampling algorithm for the binary Gaussian distribution, a building block of lattice-based cryptography. It requires no pre-computed storage, uses hardware-friendly bitwise operations, has lower computational complexity than the Knuth-Yao method, and consumes less entropy than CDT-based methods.

Abstract

Information Processing Letters 176 (2022) 106246. Yusong Du, Baoying Fan, Baodian Wei. Guangdong Province Key Laboratory of Information Security Technology, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China. Received 21 January 2020; accepted 16 January 2022; available online 21 January 2022; communicated by Abhi Shelat. Keywords: cryptography; randomized algorithms; discrete Gaussian sampling; binary Gaussian distribution; timing attack.

Discrete Gaussian sampling over the integers is one of the fundamental operations in lattice-based cryptography. The binary Gaussian distribution D_{\mathbb{Z}^+, \sigma_2} is a special discrete Gaussian distribution with \sigma_2 = \sqrt{1/(2 \ln 2)} and \mu = 0 over the set of non-negative integers.


In-depth Reading


1. Bibliographic Information

  • Title: A constant-time sampling algorithm for binary Gaussian distribution over the integers
  • Authors: Yusong Du, Baoying Fan, Baodian Wei (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China)
  • Journal/Conference: Information Processing Letters. This is a reputable peer-reviewed scientific journal that publishes short articles on various topics in computer science.
  • Publication Year: 2022 (Accepted January 16, 2022; Available online January 21, 2022)
  • Abstract: The paper addresses the problem of discrete Gaussian sampling, a crucial operation in lattice-based cryptography. It focuses on a specific distribution called the binary Gaussian distribution (D_{\mathbb{Z}^+, \sigma_2}), which can serve as a base for sampling from more general Gaussian distributions. The authors propose a new constant-time sampling algorithm for this distribution. Key features of their algorithm are that it requires no pre-computed storage, relies mainly on simple bitwise operations (making it hardware-friendly), has lower computational complexity than the standard Knuth-Yao method, and consumes less entropy (randomness) than cumulative distribution table (CDT) based methods. Additionally, its security analysis is simpler.
  • Original Source Link: /files/papers/68ef2fe8e77486f6f3192e66/paper.pdf (This link is a relative path from the source file. The paper is published and accessible through the journal's official channels.)

2. Executive Summary

  • Background & Motivation (Why):

    • Core Problem: Many modern cryptographic systems, especially those designed to be secure against quantum computers (lattice-based cryptography), rely on sampling numbers from a discrete Gaussian distribution. A major security vulnerability in cryptographic implementations is the timing attack, where an attacker can infer secret information by measuring how long different operations take. To prevent this, cryptographic algorithms must run in constant time, meaning their execution time is independent of any secret data. Existing algorithms for discrete Gaussian sampling are often either not constant-time or are inefficient when made so.
    • Importance & Gaps: The binary Gaussian distribution is a fundamental building block for more complex samplers. The original algorithm for it is not constant-time. Making standard methods like the Knuth-Yao or CDT methods constant-time results in significant performance penalties or high memory/randomness costs. There is a need for a sampling algorithm for the binary Gaussian distribution that is secure (constant-time), efficient, and practical for hardware implementations.
    • Innovation: This paper introduces a novel algorithm (Algorithm 2) that achieves constant-time sampling through a pure rejection sampling approach. Its design avoids large pre-computed tables and complex arithmetic, relying instead on simple bit-level operations, making it uniquely suited for hardware.
  • Main Contributions / Findings (What):

    • A Novel Constant-Time Algorithm: The paper presents Algorithm 2, a new algorithm for sampling from the binary Gaussian distribution that runs in constant time per attempt.
    • Efficiency: The proposed algorithm is shown to be more efficient than existing constant-time alternatives. It has lower computational complexity (fewer bits to scan) than the full-tree Knuth-Yao method and lower entropy consumption (fewer random bits needed) than the full-table-access CDT method.
    • No Precomputation: Unlike CDT or Knuth-Yao methods, this algorithm does not require any pre-computed tables to be stored, saving memory and simplifying implementation.
    • Simplified Security Analysis: The security of the algorithm can be analyzed using Rényi divergence without the complexities of "precision issues" that affect table-based methods. This makes it easier to formally prove its security for a given parameter set.

3. Prerequisite Knowledge & Related Work

  • Foundational Concepts:

    • Lattice-Based Cryptography: A field of public-key cryptography based on the presumed difficulty of solving certain mathematical problems on lattices (grids of points). It is a leading candidate for post-quantum cryptography, i.e., cryptographic systems that can resist attacks from both classical and quantum computers.
    • Discrete Gaussian Distribution (D_{\mathbb{Z}, \sigma, \mu}): A probability distribution defined over the integers. A random integer x is drawn with probability proportional to a Gaussian (bell curve) function centered at \mu with standard deviation \sigma. The probability of picking an integer x is given by D_{\mathbb{Z}, \sigma, \mu}(x) = \frac{\exp(-(x-\mu)^2 / 2\sigma^2)}{\sum_{k \in \mathbb{Z}} \exp(-(k-\mu)^2 / 2\sigma^2)}. Sampling from this distribution is a core operation in many lattice-based schemes.
    • Binary Gaussian Distribution (D_{\mathbb{Z}^+, \sigma_2}): A specific, simplified case of the discrete Gaussian distribution over non-negative integers (\mathbb{Z}^+) with center \mu = 0 and a fixed standard deviation \sigma_2 = \sqrt{1/(2\ln 2)}. With this specific value of \sigma_2, the relative probability of sampling an integer x simplifies to \rho_{\sigma_2}(x) = 2^{-x^2}. This distribution is important because it can be used as a simple "base sampler" to construct samplers for more general discrete Gaussian distributions.
    • Timing Attack: A type of side-channel attack where an attacker analyzes the time taken to execute a cryptographic algorithm. If the execution time depends on secret data (like a secret key), the attacker can deduce information about that data.
    • Constant-Time Algorithm: An algorithm whose execution time is independent of the values of its inputs, especially secret inputs. This is the primary defense against timing attacks.
    • Rejection Sampling: A general technique for generating samples from a target distribution. It works by sampling from a simpler, proposal distribution and then "accepting" or "rejecting" the sample based on a probabilistic test. If accepted, the sample is returned; if rejected, the process repeats.
    • Inversion Sampling (CDT Method): A sampling method that uses a Cumulative Distribution Table (CDT). The CDT stores the cumulative probabilities P(X \le k) for each possible outcome k. To sample, one generates a uniform random number u \in [0, 1) and finds the smallest k such that P(X \le k) \ge u. A constant-time version requires scanning the entire table every time (full-table access).
    • Knuth-Yao Method (DDG Tree): A method that uses a binary tree (a Discrete Distribution Generating tree) where a path from the root to a leaf, determined by a sequence of random bits, corresponds to sampling a specific value. A constant-time version requires traversing the entire tree structure every time (full-tree Knuth-Yao).
    • Rényi Divergence: A mathematical function that measures the "distance" or divergence between two probability distributions. In cryptography, it is used to quantify the security loss when an ideal probability distribution (e.g., a perfect Gaussian) is replaced by a practical, imperfect one (e.g., a truncated or approximated distribution). A smaller divergence means the two distributions are very similar, and the security loss is minimal.
  • Previous Works:

    • Ducas et al. [5]: Introduced the binary Gaussian distribution and proposed the first sampling algorithm for it (Algorithm 1 in this paper). However, their algorithm is not constant-time because the number of loops and random bits it consumes depends on the output value. For example, sampling 0 is much faster than sampling a larger number.
    • CDT-based Constant-Time Samplers [6]: To achieve constant time with the inversion method, one must always scan the entire pre-computed CDT. This is efficient in software but can consume a large number of random bits (high entropy consumption) to achieve the required precision.
    • Knuth-Yao-based Constant-Time Samplers [12]: To make the Knuth-Yao method constant-time, one must effectively traverse the entire DDG tree structure (or scan the entire probability matrix). This incurs a significant performance overhead, especially in hardware.
    • Polynomial Approximation [6, 7]: A modern technique where the binary Gaussian sampler is used as a base. Samples from the base are used as inputs to a pre-computed polynomial that approximates the target distribution's sampling ratio. This whole process can be constant-time if the base sampler is constant-time. This context motivates the need for a better constant-time binary Gaussian sampler.
  • Differentiation: The proposed algorithm differs from previous works in several key ways:

    • vs. Ducas et al. [5]: The new algorithm is constant-time, fixing the main security flaw of the original.
    • vs. CDT/Knuth-Yao [6, 12]: It requires no pre-computed tables, saving memory. It is also more computationally efficient (fewer operations/scans) and has lower entropy consumption than its constant-time counterparts.
    • vs. Boolean Function Samplers [12, 14]: While samplers based on Boolean functions are also constant-time and efficient, they require complex precomputation and large memory to store the functions, resulting in a larger code or circuit size. The proposed algorithm is simpler in this regard.
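To make the baseline concrete, here is a minimal Python sketch of the full-table-access CDT pattern described above. The table contents and the use of exact fractions are our illustrative choices, not the paper's or [6]'s implementation; real samplers store fixed-precision integers and accumulate the index branchlessly.

```python
from fractions import Fraction

def build_cdt(N1):
    # Cumulative probabilities P(X <= x) for the tailcut binary Gaussian
    # over {0, ..., N1}, where each weight is rho(x) = 2^(-x^2).
    weights = [Fraction(1, 2 ** (x * x)) for x in range(N1 + 1)]
    total = sum(weights)
    cdf, acc = [], Fraction(0)
    for w in weights:
        acc += w
        cdf.append(acc / total)
    return cdf

def sample_cdt_full_scan(cdf, u):
    # Inversion sampling with full table access: every entry is read on
    # every call, so the memory-access pattern is independent of u.
    idx = 0
    for c in cdf:
        idx += int(u >= c)  # a branchless accumulate in real implementations
    return idx

cdf = build_cdt(9)
print(sample_cdt_full_scan(cdf, Fraction(9, 10)))  # u falls in the P(X <= 1) band
```

The scan visits all N_1 + 1 entries regardless of where u lands, which is exactly what makes the method constant-time but entropy- and scan-hungry.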

4. Methodology (Core Technology & Implementation)

The core of the paper is Algorithm 2, a constant-time sampling algorithm for the binary Gaussian distribution.

Algorithm 2: Sampling from D_{\mathbb{Z}^+, \sigma_2}
Input: Integer N_1 ≥ 7 and N_0 = N_1 · (N_1 − 1)
Output: A non-negative integer approximately from D_{\mathbb{Z}^+, \sigma_2}
1: sample N_1 + 1 random bits b_1, b_2, ..., b_{N_1+1}
2: let n_1 = 0 if b_1 = 0; otherwise let n_1 be the maximum integer such that b_1 = b_2 = ... = b_{n_1} = 1
3: sample N_0 random bits b'_1, b'_2, ..., b'_{N_0}
4: let n_0 = 0 if b'_1 = 1; otherwise let n_0 be the maximum integer such that b'_1 = b'_2 = ... = b'_{n_0} = 0
5: set n ← (n_1 − 1) · n_1
6: return n_1 if n_0 ≥ n; otherwise go to Step 1
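A minimal, non-constant-time Python sketch of the six steps above, with the "go to Step 1" written as a loop; the function name is ours, not the paper's:

```python
import random

def sample_binary_gaussian(N1=9, rng=random):
    """Rejection sampler following Algorithm 2: outputs n1 with probability
    proportional to 2^(-n1^2) over {0, ..., N1}."""
    N0 = N1 * (N1 - 1)
    while True:  # Step 6's "go to Step 1" becomes a loop
        # Steps 1-2: n1 = number of leading 1s among N1 + 1 random bits
        bits1 = [rng.getrandbits(1) for _ in range(N1 + 1)]
        n1 = 0
        while n1 < len(bits1) and bits1[n1] == 1:
            n1 += 1
        # Steps 3-4: n0 = number of leading 0s among N0 random bits
        bits0 = [rng.getrandbits(1) for _ in range(N0)]
        n0 = 0
        while n0 < len(bits0) and bits0[n0] == 0:
            n0 += 1
        # Steps 5-6: accept n1 with probability 2^(-n1(n1-1))
        if n0 >= (n1 - 1) * n1:
            return n1

counts = [0] * 10
rng = random.Random(2022)
for _ in range(20000):
    counts[sample_binary_gaussian(9, rng)] += 1
print(counts[:4])  # heavily concentrated on 0, 1, 2
```

Note the edge case: if all N_1 + 1 bits are 1, then n = (N_1 + 1)N_1 > N_0 ≥ n_0, so the attempt is always rejected and the output stays in {0, ..., N_1}. A constant-time variant would scan all the bits and perform the comparison branchlessly, as discussed later in this section.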
  • Principles: The algorithm is a clever implementation of rejection sampling. The goal is to produce an integer x with probability proportional to 2^{-x^2}. The algorithm achieves this by breaking down the probability into two components that are easy to generate using random bits. The overall probability of outputting a value n_1 in a single run of the loop (before a potential restart) is constructed to be proportional to 2^{-n_1^2}.

  • Steps & Procedures:

    1. Generate n_1 (Steps 1-2):
      • N_1 + 1 random bits are sampled. n_1 is determined by the number of leading 1s.
      • If the first bit b_1 is 0, n_1 is 0. The probability of this is 1/2.
      • If the first n_1 bits are 1s and the (n_1+1)-th bit is 0, the value n_1 is chosen. The probability of this event is 2^{-(n_1+1)} = (1/2) \cdot 2^{-n_1}. This step generates a number n_1 with probability proportional to 2^{-n_1}.
    2. Generate n_0 (Steps 3-4):
      • N_0 random bits are sampled. n_0 is determined by the number of leading 0s.
      • The probability of getting a specific value n_0 is 2^{-(n_0+1)}, but what matters here is the probability that n_0 is at least some value n.
      • The probability that n_0 \ge n (i.e., the first n bits are all 0s) is exactly 2^{-n}.
    3. The Rejection Test (Steps 5-6):
      • A target value n = n_1(n_1-1) is calculated.
      • The algorithm accepts and returns n_1 if n_0 \ge n. The probability of this acceptance condition, given n_1, is P(\text{accept} \mid n_1) = P(n_0 \ge n_1(n_1-1)) = 2^{-n_1(n_1-1)}.
    4. Overall Probability in One Loop:
      • The total probability of generating and accepting a value n_1 in a single iteration is the product of the probability of generating n_1 and the probability of accepting it: P(\text{output } n_1) = P(\text{generate } n_1) \times P(\text{accept} \mid n_1) = \left( \frac{1}{2} \cdot 2^{-n_1} \right) \times 2^{-n_1(n_1-1)} = \frac{1}{2} \cdot 2^{-n_1 - n_1^2 + n_1} = \frac{1}{2} \cdot 2^{-n_1^2}
      • This shows that the probability of outputting n_1 in one iteration is proportional to the target probability density \rho_{\sigma_2}(n_1) = 2^{-n_1^2}. The "go to Step 1" handles the rejection case, ensuring that eventually a sample is produced from the correct final distribution.
  • Mathematical Formulas & Key Details (Theorem 1):

    • Theorem 1 states that Algorithm 2 produces a tailcut binary Gaussian distribution. This means it samples from the binary Gaussian distribution restricted to the set of integers [N_1] = \{0, 1, \ldots, N_1\}.
    • Proof Intuition:
      • Let q(x) = \frac{1}{2} \cdot 2^{-x^2} be the probability of successfully outputting x in one iteration.
      • The total probability of success in one iteration is S = \sum_{x=0}^{N_1} q(x). The probability of failure (rejection) is Q_{N_1} = 1 - S.
      • The final probability of outputting a value x is the sum of the probabilities of succeeding on the first try, or failing once and then succeeding, and so on: P(\text{final output is } x) = q(x) + Q_{N_1} q(x) + Q_{N_1}^2 q(x) + \dots = q(x) \sum_{i=0}^{\infty} Q_{N_1}^i = \frac{q(x)}{1 - Q_{N_1}} = \frac{q(x)}{S}
      • Substituting the expressions for q(x) and S: P(\text{final output is } x) = \frac{\frac{1}{2} \cdot 2^{-x^2}}{\sum_{n_1=0}^{N_1} \frac{1}{2} \cdot 2^{-n_1^2}} = \frac{2^{-x^2}}{\sum_{n_1=0}^{N_1} 2^{-n_1^2}} = \frac{\rho_{\sigma_2}(x)}{\rho_{\sigma_2}([N_1])}
      • This final expression is precisely the definition of the tailcut binary Gaussian distribution over the set [N_1].
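The per-iteration probability q(x) = (1/2) \cdot 2^{-x^2} used in the proof can be checked exactly from the bit-level probabilities with rational arithmetic (a sanity check we add, not from the paper):

```python
from fractions import Fraction

# P(n1 = x) = 2^-(x+1) (x leading 1s, then a 0); P(n0 >= x(x-1)) = 2^-x(x-1).
# Their product must equal q(x) = (1/2) * 2^(-x^2), since (x+1) + x(x-1) = x^2 + 1.
for x in range(32):
    p_generate = Fraction(1, 2 ** (x + 1))
    p_accept = Fraction(1, 2 ** (x * (x - 1)))
    assert p_generate * p_accept == Fraction(1, 2 ** (x * x + 1))
print("q(x) identity holds for x = 0..31")
```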
  • Constant-Time Implementation Details:

    • The algorithm can be made constant-time because the number of bits sampled in each loop is fixed: N_1 + 1 and N_0.
    • Steps 2 and 4 (finding the number of leading 1s or 0s) can be implemented by always scanning all N_1 + 1 or N_0 bits.
    • Step 6 (the comparison n_0 \ge n) can be implemented in constant time using a signed integer subtraction and checking the sign bit.
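As an illustration of these two tricks, here is a sketch of data-independent scanning and sign-bit comparison. Python integers are not actually constant-time; the point is the access pattern, which carries over directly to C or hardware. The helper names are ours.

```python
def leading_ones_ct(bits):
    # Scan every bit unconditionally; `run` drops to 0 at the first 0 and
    # stays there, so `count` ends up as the length of the 1-prefix.
    run, count = 1, 0
    for b in bits:
        run &= b
        count += run
    return count

def ge_ct(a, b, width=32):
    # Constant-time a >= b for width-bit non-negative values:
    # take the two's-complement sign bit of (a - b).
    diff = (a - b) & ((1 << width) - 1)
    sign = (diff >> (width - 1)) & 1  # 1 iff a < b
    return 1 - sign

print(leading_ones_ct([1, 1, 0, 1]))  # 2
print(ge_ct(3, 72), ge_ct(72, 72))    # 0 1
```

Counting leading 0s for Step 4 is the same scan with `run &= 1 - b`.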

5. Experimental Setup

The paper's evaluation is theoretical, focusing on security and complexity analysis rather than empirical benchmarks on specific hardware.

  • Datasets: No datasets are used. The analysis is purely algorithmic.

  • Evaluation Metrics:

    1. Rényi Divergence:
      • Conceptual Definition: A measure of how different two probability distributions are. A value close to 1 indicates the distributions are very similar. It is used here to measure the "statistical distance" between the algorithm's actual output distribution (the tailcut one, \mathcal{D}') and the ideal, infinite binary Gaussian distribution (\mathcal{D}). This quantifies the security impact of using an approximate sampler.
      • Mathematical Formula: The Rényi divergence of order a is: R_a(\mathcal{P} \| \mathcal{Q}) = \left( \sum_{x \in \mathrm{Supp}(\mathcal{P})} \frac{\mathcal{P}(x)^a}{\mathcal{Q}(x)^{a-1}} \right)^{\frac{1}{a-1}}
      • Symbol Explanation:
        • \mathcal{P}, \mathcal{Q}: Two probability distributions. In this paper, \mathcal{P} = D_{[N_1], \sigma_2} (the algorithm's output) and \mathcal{Q} = D_{\mathbb{Z}^+, \sigma_2} (the ideal distribution).
        • a: The order of the divergence, where a \in (1, \infty). For cryptographic security analysis, it is often set to a = 2\lambda, where \lambda is the bit-security level.
        • \mathrm{Supp}(\mathcal{P}): The support of the distribution \mathcal{P}, i.e., the set of all possible outcomes.
    2. Computational Complexity:
      • G: The average number of random bits that need to be generated to produce one output sample.
      • S: The average number of bits that need to be scanned to produce one output sample. In the constant-time implementation of Algorithm 2, G = S.
  • Baselines:

    • Full-tree Knuth-Yao: A constant-time sampler based on the Knuth-Yao method, where the entire probability matrix M is scanned for each sample.
    • Full-table access CDT: A constant-time sampler based on inversion sampling, where the entire Cumulative Distribution Table is read for each sample.

6. Results & Analysis

The paper presents its results in two tables, which are transcribed and analyzed below.

  • Core Results: Security Analysis (Table 1)

    The following is a transcription of Table 1 from the paper.

     N_1 = 7: R_a(D_{[N_1], \sigma_2} \| D_{\mathbb{Z}^+, \sigma_2}) < 1 + 2^{-64}; 256-bit security up to q = 2^{60} queries or q_s = 2^{49} signatures
     N_1 = 8: R_a(D_{[N_1], \sigma_2} \| D_{\mathbb{Z}^+, \sigma_2}) < 1 + 2^{-81}; 256-bit security up to q = 2^{77} queries or q_s = 2^{66} signatures
     N_1 = 9: R_a(D_{[N_1], \sigma_2} \| D_{\mathbb{Z}^+, \sigma_2}) < 1 + 2^{-100}; 256-bit security up to q = 2^{96} queries or q_s = 2^{85} signatures
  • Analysis of Table 1:

     • This table evaluates the security impact of the "tailcut" (truncation) for different values of the output range parameter N_1. The analysis is for a security level of \lambda = 256 bits, so the Rényi divergence order is a = 2\lambda = 512.
     • The second column shows that the Rényi divergence is extremely close to 1, indicating the output distribution is statistically very close to the ideal one. For N_1 = 9, the divergence is less than 1 + 2^{-100}, a negligible difference for any practical purpose.
     • The third and fourth columns translate this divergence into practical security terms using Lemma 1. They show the maximum number of queries (q) or digital signatures (q_s) an adversary can make before the 256-bit security level might drop to 255 bits. For N_1 = 9, the scheme remains secure even after an impossibly large number of operations (2^{85} signatures).
     • Key Insight: A small value of N_1 (such as 9) is sufficient to achieve a very high level of security. This is important because N_1 directly impacts the algorithm's complexity.
     • Simpler Analysis: The authors highlight that this analysis is simpler than for CDT/Knuth-Yao methods because it only involves the tailcut probability. It does not have to deal with the additional errors introduced by finite-precision representation of probabilities in tables, known as the "precision issue".
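The divergence column can be reproduced with exact rational arithmetic. Because the tailcut distribution is exactly proportional to the ideal one on its support, R_a collapses to the ratio of the two normalization constants for every order a, so the check below (our sketch, not the paper's script) does not even depend on a = 512:

```python
from fractions import Fraction

def renyi_gap(N1, tail_terms=40):
    # R_a(D_[N1] || D_Z+) - 1 = (tail mass) / S_N1, where
    # S_N1 = sum_{x=0}^{N1} 2^(-x^2). The tail sum is truncated after
    # `tail_terms` terms, far beyond any detectable contribution.
    rho = lambda x: Fraction(1, 2 ** (x * x))
    S = sum(rho(x) for x in range(N1 + 1))
    tail = sum(rho(x) for x in range(N1 + 1, N1 + 1 + tail_terms))
    return tail / S

for N1, bound in [(7, 64), (8, 81), (9, 100)]:
    assert renyi_gap(N1) < Fraction(1, 2 ** bound)  # matches Table 1's bounds
print("Table 1 divergence bounds verified")
```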
  • Core Results: Complexity Analysis (Table 2)

    The following is a transcription of Table 2 from the paper.

     N_1 = 7: Algorithm 2: G = S = 63.9; full-tree Knuth-Yao (p = 64): G = 64, S = 256
     N_1 = 8: Algorithm 2: G = S = 83.1; full-tree Knuth-Yao (p = 96): G = 96, S = 432
     N_1 = 9: Algorithm 2: G = S = 104.8; full-tree Knuth-Yao (p = 128): G = 128, S = 640
  • Analysis of Table 2:

     • This table compares the computational complexity of Algorithm 2 with the full-tree Knuth-Yao (K.Y.) method. Complexity is measured in bits generated (G) and bits scanned (S).
     • For Algorithm 2, the complexity (G = S) is the expected number of bits needed per sample, calculated as 2(N_1 + N_0 + 1) / \sum_{n_1=0}^{N_1} 2^{-n_1^2}. For N_1 = 9, this is about 104.8 bits.
     • For the Knuth-Yao method, to sample from \{0, \dots, N_1\}, the probability matrix has \lambda = N_1 + 1 rows. The number of columns p determines the precision. For N_1 = 9 (so \lambda = 10 rows) and a typical precision of p = 128, the sampler must generate 128 random bits (G = 128) and scan a large number of bits in the matrix (S is estimated as \lambda \cdot p / 2 = 10 \cdot 128 / 2 = 640, to account for optimizations).
     • Key Insight: Algorithm 2 is significantly more efficient. For N_1 = 9, it requires fewer random bits (104.8 vs. 128) and far fewer scanned bits (104.8 vs. 640). This makes it much faster, especially in hardware, where scanning memory can be costly.
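The G = S figures for Algorithm 2 can be reproduced from the formula above (a quick check we add; float precision is ample here):

```python
def expected_bits(N1):
    # Bits consumed per iteration: (N1 + 1) for n1 plus N0 for n0.
    # Per-iteration acceptance probability: (1/2) * sum_{x=0}^{N1} 2^(-x^2).
    N0 = N1 * (N1 - 1)
    accept = sum(2.0 ** -(x * x) for x in range(N1 + 1)) / 2
    return (N1 + 1 + N0) / accept

for N1 in (7, 8, 9):
    print(N1, round(expected_bits(N1), 1))  # 63.9, 83.1, 104.8 as in Table 2
```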
  • Comparison with CDT-based Sampler:

     • The paper also compares entropy consumption with a constant-time CDT-based sampler from Zhao et al. [6], which outputs integers from 0 to 9. That sampler requires 126 random bits per sample.
     • Algorithm 2 with N_1 = 9 provides the same output range but requires only 104.8 bits of entropy on average, a roughly 17% reduction in required randomness, which can be a significant advantage.

7. Conclusion & Reflections

  • Conclusion Summary: The paper successfully introduces a new constant-time sampling algorithm for the binary Gaussian distribution. This algorithm is a strong alternative to existing methods, offering superior performance in terms of computational complexity and entropy consumption. Its key advantages are that it requires no pre-computed storage, is well-suited for hardware implementation due to its reliance on bitwise operations, and allows for a more straightforward security analysis. It serves as an excellent base sampler for building more complex, constant-time discrete Gaussian samplers in lattice-based cryptography.

  • Limitations & Future Work:

    • The paper does not provide an empirical performance analysis on specific hardware (like FPGAs or ASICs), which would be a valuable next step to concretely demonstrate its hardware-friendliness.
    • The total execution time is still probabilistic, as the algorithm might loop multiple times due to rejection. While each loop is constant-time (protecting against simple timing attacks), an attacker observing the total time over many operations might still gain some statistical information. However, the probability of rejection is constant and independent of any secret, making this a very difficult attack vector.
    • The choice of N_1 is a trade-off between the output range, security (statistical distance), and performance. The paper provides clear guidance for this trade-off, but it remains a parameter that implementers must choose carefully.
  • Personal Insights & Critique:

    • The algorithm's design is elegant and insightful. It cleverly decomposes the target probability 2^{-x^2} into two components, 2^{-x} and 2^{-x(x-1)}, which can be sampled probabilistically by counting leading bits. This is a non-obvious and highly effective approach.
    • The most significant practical contribution is arguably the combination of constant-time security, high efficiency, and zero precomputation. In resource-constrained environments (like smart cards or IoT devices), eliminating the need for storing large tables (CDT or probability matrices) is a major advantage.
    • The simplification of the security proof is a noteworthy academic contribution. By sidestepping the "precision issue" inherent in table-based methods, the authors make it easier for cryptographers to validate and deploy implementations with confidence in their security bounds. This work is a solid step forward in the practical implementation of secure post-quantum cryptography.
