
LeanCert.Engine.Affine.Vector

Affine Arithmetic: Vector Operations #

This file provides vectorized affine operations for efficient neural network verification. The key insight is that shared noise symbols track correlations across vector elements.

The Dependency Problem in Vectors #

Consider LayerNorm: y_i = (x_i - μ) / σ where μ = mean(x).

With standard interval arithmetic: x_i and μ are treated as independent quantities, so the subtraction x_i - μ adds their radii and the resulting bounds are roughly twice as wide as necessary.

With Affine Arithmetic: x_i and μ are built from the same noise symbols, so the correlated terms cancel in x_i - μ and the bounds stay tight.
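
As a minimal numeric sketch of this cancellation, here is a toy affine form over `Float` (not the LeanCert API, whose forms and names differ):

```lean
-- Toy affine form: x = c + Σ aᵢ·εᵢ with each εᵢ ∈ [-1, 1].
structure Toy where
  c : Float
  coeffs : List Float

-- Subtraction combines coefficients symbol-by-symbol, so shared symbols cancel.
def Toy.sub (x y : Toy) : Toy :=
  { c := x.c - y.c, coeffs := List.zipWith (· - ·) x.coeffs y.coeffs }

def Toy.radius (x : Toy) : Float :=
  (x.coeffs.map Float.abs).foldl (· + ·) 0

-- x₀ = 0 + 1·ε₀ represents [-1, 1]; μ = (x₀ + x₁)/2 = 0.5·ε₀ + 0.5·ε₁.
def x0 : Toy := ⟨0, [1, 0]⟩
def mu : Toy := ⟨0, [0.5, 0.5]⟩

-- x₀ - μ = 0.5·ε₀ - 0.5·ε₁ has radius 1, while naive interval
-- subtraction [-1,1] - [-1,1] = [-2,2] has radius 2.
#eval (x0.sub mu).radius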

Main definitions #

Affine Vector #

@[reducible, inline]

A vector of affine forms sharing the same noise symbol space.

All elements should have the same coeffs.length (number of noise symbols). This ensures correlations are properly tracked.

Equations
Instances For

    Create an affine vector from intervals, assigning each a unique noise symbol.

    If intervals = [I₀, I₁, I₂], creates:

    • x₀ = mid(I₀) + rad(I₀)·ε₀
    • x₁ = mid(I₁) + rad(I₁)·ε₁
    • x₂ = mid(I₂) + rad(I₂)·ε₂
    Equations
    • One or more equations did not get rendered due to their size.
    Instances For
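
A hedged sketch of this construction, using `Float` pairs in place of intervals and coefficient lists in place of affine forms (hypothetical helper names, not this module's API):

```lean
def mid (iv : Float × Float) : Float := (iv.1 + iv.2) / 2
def rad (iv : Float × Float) : Float := (iv.2 - iv.1) / 2

-- Element i gets coefficient rad Iᵢ on its own fresh symbol εᵢ and 0 elsewhere,
-- so the inputs start out uncorrelated but share one symbol space.
def fromIntervalsSketch (is : List (Float × Float)) : List (Float × List Float) :=
  ((List.range is.length).zip is).map fun (i, iv) =>
    (mid iv, (List.range is.length).map fun j => if j = i then rad iv else 0)
```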

      Linear Operations #

      Aggregation Operations #

      LayerNorm Components #

      Compute (x - μ) for each element, where μ = mean(x).

      This is where Affine Arithmetic shines: the subtraction properly cancels the correlated parts, giving tight bounds.

      Equations
      Instances For
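
Concretely, writing each input as x_i = m_i + r_i·ε_i, the centered element works out to

```latex
x_i - \mu
  = \Bigl(m_i - \tfrac{1}{n}\sum_j m_j\Bigr)
    + \Bigl(1 - \tfrac{1}{n}\Bigr) r_i \,\varepsilon_i
    - \tfrac{1}{n}\sum_{j \neq i} r_j \,\varepsilon_j
```

so its radius is (1 - 1/n)·r_i + (1/n)·Σ_{j≠i} r_j, which is tighter than the interval result r_i + (1/n)·Σ_j r_j by 2·r_i/n.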

        Layer Normalization: (x - μ) / √(σ² + ε) * γ + β

        Parameters:

        • v: input vector (affine)
        • gamma: scale parameters
        • beta: shift parameters
        • eps: numerical stability constant
        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
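
For reference, the plain `Float` computation that the affine version bounds can be sketched as follows (a hypothetical helper, not part of this module):

```lean
-- Pointwise LayerNorm on Floats: center, rescale by 1/√(σ² + ε), then affine map.
def layerNormRef (x gamma beta : List Float) (eps : Float) : List Float :=
  let n := Float.ofNat x.length
  let mu := x.foldl (· + ·) 0 / n
  let var := (x.map fun xi => (xi - mu) * (xi - mu)).foldl (· + ·) 0 / n
  let invStd := 1 / Float.sqrt (var + eps)
  List.zipWith (· + ·)
    (List.zipWith (fun xi g => (xi - mu) * invStd * g) x gamma) beta
```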

          Softmax Components #

          Compute exp(x_i - max(x)) for numerical stability

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For
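
A plain `Float` reference of the max-subtraction trick (hypothetical helper name): every argument to `exp` is ≤ 0, so overflow is avoided.

```lean
def stableExpRef (x : List Float) : List Float :=
  let m := x.foldl max (x.headD 0)  -- max over the list
  x.map fun xi => Float.exp (xi - m)
```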

            Softmax using algebraic cancellation.

            softmax(x)_i = exp(x_i) / Σ exp(x_j) = 1 / Σ exp(x_j - x_i)

            By computing differences first, we track correlations better.

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For
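
The algebraic identity being used follows by dividing both numerator and denominator by exp(x_i):

```latex
\mathrm{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_j e^{x_j}}
  = \frac{1}{\sum_j e^{x_j - x_i}}
```

The differences x_j - x_i are computed as affine subtractions, so they keep the shared noise symbols and cancel the correlated parts before `exp` is applied.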

              Attention Components #

              Scaled dot-product attention scores for a single query.

              Computes softmax(q · K^T / √d_k) where K is a list of key vectors.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For
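
A plain `Float` reference of the score computation (hypothetical helper names, assuming every key vector has the query's length):

```lean
def dot (a b : List Float) : Float :=
  (List.zipWith (· * ·) a b).foldl (· + ·) 0

-- softmax over q·k / √d_k, using the max-subtraction trick for stability
def attnScoresRef (q : List Float) (keys : List (List Float)) : List Float :=
  let logits := keys.map fun k => dot q k / Float.sqrt (Float.ofNat q.length)
  let m := logits.foldl max (logits.headD 0)
  let e := logits.map fun s => Float.exp (s - m)
  let z := e.foldl (· + ·) 0
  e.map (· / z)
```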

                Apply attention weights to values

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For
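
A `Float` sketch of the weighted sum Σ wᵢ·vᵢ (hypothetical helper, assuming equal-length value vectors):

```lean
def applyAttnRef (w : List Float) (vs : List (List Float)) : List Float :=
  (List.zipWith (fun wi v => v.map (wi * ·)) w vs).foldl
    (fun acc v => List.zipWith (· + ·) acc v)
    (List.replicate (vs.headD []).length 0)
```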

                  GELU #

                  GELU activation: x · Φ(x) ≈ 0.5 · x · (1 + tanh(√(2/π) · (x + 0.044715 · x³)))

                  Using affine arithmetic preserves correlations in x · tanh(f(x)).

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For
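
A `Float` reference of the tanh approximation (hypothetical helper name):

```lean
def geluRef (x : Float) : Float :=
  let c := Float.sqrt (2 / 3.14159265358979323846)  -- √(2/π)
  0.5 * x * (1 + Float.tanh (c * (x + 0.044715 * x * x * x)))
```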

                    Soundness #

                    Membership for affine vectors

                    Equations
                    Instances For
                      theorem LeanCert.Engine.Affine.AffineVector.mem_sum {v_real : List ℝ} {v : AffineVector} {eps : AffineForm.NoiseAssignment} (_hvalid : AffineForm.validNoise eps) (hmem : mem v_real v eps) :
                      v.sum.mem_affine eps v_real.sum

                      Sum is sound

                      theorem LeanCert.Engine.Affine.AffineVector.mem_mean {v_real : List ℝ} {v : AffineVector} {eps : AffineForm.NoiseAssignment} (hvalid : AffineForm.validNoise eps) (hmem : mem v_real v eps) (hne : ¬List.isEmpty v = true) :
                      v.mean.mem_affine eps (v_real.sum / ↑v_real.length)

                      Mean is sound

                      theorem LeanCert.Engine.Affine.AffineVector.mem_centered {v_real : List ℝ} {v : AffineVector} {eps : AffineForm.NoiseAssignment} (hvalid : AffineForm.validNoise eps) (hmem : mem v_real v eps) :
                      have μ := v_real.sum / ↑v_real.length; have centered_real := List.map (fun (x : ℝ) => x - μ) v_real; mem centered_real v.centered eps

                      Centered is sound: x - mean(x)

                      theorem LeanCert.Engine.Affine.AffineVector.mem_variance {v_real : List ℝ} {v : AffineVector} {eps : AffineForm.NoiseAssignment} (hvalid : AffineForm.validNoise eps) (hmem : mem v_real v eps) (hne : ¬List.isEmpty v = true) :
                      have μ := v_real.sum / ↑v_real.length; have var_real := (List.map (fun (x : ℝ) => (x - μ) * (x - μ)) v_real).sum / ↑v_real.length; v.variance.mem_affine eps var_real

                      Variance is sound

                      theorem LeanCert.Engine.Affine.AffineVector.mem_layerNorm {v_real : List ℝ} {v : AffineVector} {eps : AffineForm.NoiseAssignment} (gamma beta : List ℝ) (epsilon : ℝ) (heps_pos : 0 < epsilon) (hvalid : AffineForm.validNoise eps) (hmem : mem v_real v eps) (hne : ¬List.isEmpty v = true) (hlen_eps : List.length eps = (v.variance.add (AffineForm.const epsilon)).coeffs.length) (hlen_sqrt : List.length eps = (v.variance.add (AffineForm.const epsilon)).sqrt.coeffs.length) (hsqrt_pos : 0 < (v.variance.add (AffineForm.const epsilon)).sqrt.toInterval.lo) :
                      have μ := v_real.sum / ↑v_real.length; have variance_real := (List.map (fun (x : ℝ) => (x - μ) * (x - μ)) v_real).sum / ↑v_real.length; have std_real := Real.sqrt (variance_real + epsilon); have inv_std_real := 1 / std_real; have output_real := List.zipWith3 (fun (xi : ℝ) (g b : ℝ) => (xi - μ) * inv_std_real * g + b) v_real gamma beta; mem output_real (v.layerNorm gamma beta epsilon) eps

                      LayerNorm is sound for the complete operation.

                      Note: This theorem requires additional hypotheses that would typically be verified at the call site:

                      • The variance + epsilon must be positive (guaranteed since variance ≥ 0 and epsilon > 0)
                      • Length constraints for gamma and beta
                      • The affine form for var + eps must have positive lower bound for inv

                      The proof composition uses:

                      1. mem_centered for centering
                      2. mem_variance for variance
                      3. mem_add for adding epsilon
                      4. mem_sqrt for standard deviation
                      5. mem_inv for reciprocal
                      6. mem_layerNorm_elem for each output element
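
The composition might be sketched as follows; the lemma names come from this module, but the exact argument shapes are hypothetical and not re-checked here:

```lean
-- Hypothetical proof skeleton (illustrative only):
-- have h_centered := mem_centered hvalid hmem
-- have h_var      := mem_variance hvalid hmem hne
-- have h_veps     := AffineForm.mem_add h_var (AffineForm.mem_const epsilon)
-- have h_sqrt     := AffineForm.mem_sqrt h_veps hlen_eps
-- have h_inv      := AffineForm.mem_inv h_sqrt hsqrt_pos hlen_sqrt
-- then conclude with mem_layerNorm_elem applied to each output coordinate
```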