Documentation

LeanCert.ML.LayerNormAffine

Affine Arithmetic LayerNorm for ML #

This file bridges the ML Transformer module with the Affine Arithmetic module, providing tight bounds for LayerNorm by preserving correlations.

The Problem with Standard Interval Arithmetic #

Standard interval arithmetic loses correlations between variables. In LayerNorm:

Example: If x_i ∈ [0.9, 1.1] for all i:

Solution: Affine Arithmetic #

Affine forms track symbolic dependencies via noise symbols:

Main Definitions #

References #

Affine LayerNorm Forward Pass #

Forward pass using affine arithmetic for tight bounds.

This converts the input intervals to affine forms, computes LayerNorm while preserving correlations, then extracts the resulting intervals.

Key advantage: The centering step x - μ preserves correlations, giving much tighter bounds than standard interval arithmetic.

Equations
Instances For

    Convert affine output back to intervals.

    This extracts conservative interval bounds from the affine forms. The bounds are tight because correlations were preserved during computation.

    Equations
    Instances For

      Comparison: Interval vs Affine Bounds #

      Compute both interval and affine bounds for comparison.

      Returns (interval_bounds, affine_bounds) for the same input. The affine bounds should be tighter, especially for centering.

      Equations
      Instances For

        Measure the tightness improvement from affine arithmetic.

        Returns the ratio of interval width to affine width for each output dimension. Values > 1 indicate affine is tighter.

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For

          Soundness Theorem #

          The affine LayerNorm is sound: if inputs are in the affine forms, then outputs are in the resulting affine forms.

          This follows from composition of:

          • AffineVector.mem_centered - centering preserves membership
          • AffineVector.mem_variance - variance is sound
          • AffineForm.mem_add - addition is sound
          • AffineForm.mem_sqrt - square root is sound
          • AffineForm.mem_inv - inversion is sound
          • AffineForm.mem_mul, AffineForm.mem_scale, AffineForm.mem_add - final combination

          The proof requires additional hypotheses about positivity of variance + epsilon and compatibility of lengths, which are handled in the implementation.

          FFNBlock and TransformerBlock with Affine Arithmetic #

          FFNBlock forward pass using affine arithmetic where beneficial.

          Uses affine arithmetic for LayerNorm (where correlation matters), standard intervals for linear layers (where it doesn't).

          Equations
          Instances For

            TransformerBlock forward pass with tight LayerNorm bounds.

            Uses affine arithmetic for LayerNorm to avoid the dependency problem.

            Equations
            Instances For