Quantum Finite Automata and the Stinespring Dilation Theorem in Lean

Bjørn Kjos-Hanssen and Swarnalakshmi Lakshmanan

Wiedijk has a famous list of 100 mathematical theorems whose formalization is sought in a proof assistant. David and Palsberg recently announced a similar list restricted to quantum information theory. The latter subject has been intensively formalized by Meiburg. In this article our aim is twofold: (1) to answer the challenge of David and Palsberg by supplying a solution to a finite-dimensional version of their problem #54: formalize the Stinespring dilation theorem; (2) answer an unpublished challenge by Meiburg to formalize quantum finite automata.

1 Introduction

In this project we formalize parts of the article Unbounded length minimal synchronizing words for quantum channels over qutrits [ 5 ] and notions in quantum information theory.

In particular we formalize the notion of a measure-once one-way general quantum finite automaton (MO-1gQFA) introduced by Li, Qiu, Zou, Li, Wu, Mateus [ 6 ] in 2012. As an alternative to “MO-1gQFA” we refer to these simply as Kraus operator automata. In particular we formalize cut-point languages as in [ 9 ] who call the automata “RT-QFA” (real-time QFA).

The paper addresses two needs or request by other researchers:

Marco David and Jens Palsberg posted a list of Top 100 Quantum Theorems on March 11, 2026, at https://marcodavid.net/top100. We include a matrix version of #54 on their list, the Stinespring Dilation Theorem.

Meiburg, Lessa and Soldati [ 8 ] formalized a result from Hayashi and Yamasaki [ 2 ] in the proof assistant Lean, giving a corrected proof of the Generalized Quantum Stein’s Lemma. The latter is an important result in operational quantum information theory going back to Hiai and Petz [ 3 ] .

In the course of formalizing this monumental result they build a library Lean-QuantumInfo [ 7 ] containing large parts of quantum information theory.

One notable missing and sought-after ingredient in Lean-Quantuminfo is the theory of quantum finite automata. In the project Kraus [ 4 ] we partially remedy this. In particular, we formalize the language of a measure-once real-time QFA [ 9 , 6 ] in terms of Mathlib’s probability mass functions (PMF) and compute some examples related to the Kraus operators in this paper and in Grudka et al.’s paper [ 1 ] . Most results are proved for the two fields \(\mathbb R\) and \(\mathbb C\), encapsulated in Lean’s mathematics library Mathlib4 by the RCLike typeclass.

The list collects 100 important theorems in quantum physics, quantum information and quantum computing. It was inspired by the list of the top 100 mathematical theorems of which Freek Wiedijk tracks the formalization progress (Figure 1).

3 55

The Spectral Theorem
Eigenstates in an infinite square well
The Quantum Harmonic Oscillator
Ehrenfest’s Theorem
Quantum tunneling through a finite potential barrier
Conservation of the probability (4-)current
Eigenstates of a δ-distribution well
Heisenberg Uncertainty Relation
Baker–Campbell–Hausdorff Formula
Trotter Product Formula (Lie–Trotter–Kato)
Simultaneous diagonalization of commuting observables
Noether’s Theorem
Wigner’s Theorem
Stone–von Neumann Theorem
Spectrum of the hydrogen atom (Coulomb potential)
Addition of angular momentum (e.g. Clebsch–Gordan)
Wigner–Eckart Theorem
Power-series spectrum of a perturbed time-independent Hamiltonian
Dyson series for time-dependent perturbation theory
Correctness of the Rayleigh–Ritz method
Correctness of the Born–Oppenheimer approximation
The Adiabatic Theorem
Berry phase (Geometric phase)
Superdense coding
Quantum teleportation with shared entanglement
No-Cloning Theorem / No-Broadcast Theorem / No-Teleportation Theorem
No-Communication Theorem
No-Deleting Theorem
Schrödinger–HJW Theorem (purification of mixed states)
Schmidt decomposition for Hilbert spaces
Bell’s Theorem (QM violates CHSH inequality)
Monogamy of entanglement
Tsirelson bound
Correctness of the Deutsch–Jozsa algorithm
Correctness of Shor’s algorithm
Correctness of Grover’s algorithm
The Bennett–Bernstein–Brassard–Vazirani Theorem (BBBV)
Quantum key distribution by BB84
Impossibility of quantum bit commitment (Mayers–Lo–Chau)
Solovay–Kitaev Theorem
Universality of sets of two-qubit quantum gates
Universality of the Toffoli and the Hadamard
Five two-qubit gates are necessary and sufficient to implement a Toffoli gate
Six two-qubit linear nearest-neighbor gates are necessary and sufficient to implement a Toffoli gate
3n CX gates are necessary to implement a multi-control Toffoli gate with n-1 controls
Adiabatic and gate-based quantum computing are equivalent up to polynomial factors
The Threshold Theorem (quantum fault-tolerance)
Petz recovery map
Choi’s Theorem on completely positive maps
Choi–Jamiołkowski isomorphism (channel-state duality)
Gelfand–Naimark–Segal construction
Gelfand–Naimark Theorem
Krein–Milman Theorem
Stinespring’s Factorization Theorem / Naimark’s Dilation Theorem
Continuous Functional Calculus
Gleason’s Theorem
Holevo’s Theorem
Pusey–Barrett–Rudolph Theorem
Kochen–Specker Theorem
The Spectral Theorem for PVMs
Strong subadditivity of quantum entropy
Entanglement-assisted classical capacity of quantum channels
Quantum state discrimination for two states (Ivanović–Dieks–Peres Limit)
The Lloyd–Shor–Devetak Theorem for quantum channel capacity
Eastin–Knill Theorem
Gottesman–Knill Theorem
Magic state distillation
Correctness of distillation of Bell pairs
1D gapped Hamiltonians have area-law ground states
The Lieb–Robinson bound
Onsager’s solution to the 2D Ising model
MIP* = RE
BQP \(\subseteq \) PP
QIP = PSPACE
Quantum Stein Lemma
Quantum 3-SAT is QMA1-complete
Computing the Jones polynomial at roots of unity is BQP-hard
Uniqueness of 4-dimensional representations of the Clifford algebra (up to unitary equivalence)
Spin-statistics Theorem
Reeh–Schlieder Theorem
Wick’s Theorem
Elitzur’s Theorem
2D TQFTs are equivalent to Frobenius algebras
Chern–Simons theory can compute the Jones polynomial
The CFT central charge is its entanglement entropy
Osterwalder–Schrader Reconstruction Theorem
4D Gaussian free field theory satisfies the Osterwalder-Schrader axioms
Bisognano–Wichmann Theorem
Wood–Spekkens–Bell Theorem
Correctness of block encoding
Correctness of Hamiltonian simulation by Trotterization
Correctness of Hamiltonian simulation by Linear Combination of Unitaries
Correctness of Hamiltonian simulation by Qubitization
Qubit reuse is NP-complete
Correctness of a quantum circuit optimizer
Correctness of the toric code
Correctness of entanglement-assisted stabilizer codes
The Knill–Laflamme conditions
Correctness of approximate circuit synthesis
Correctness of Simon’s algorithm

Figure 1 The Top 100 Quantum Theorems.

2 Formalizing QFAs

The following definitions and lemmas were generated by the Gemini 3 agent when asked to translate our Lean code from [ 4 ] into L^aT_eX and English.

First, operator “sandwiches”.

Definition 1 Kraus Map

✓

Let \(R\) be a star-ring with an additive commutative monoid structure. Given a family of matrices \(\{ K_i\} _{i=1}^r \in M_q(R)\), the Kraus application map \(\Phi : M_q(R) \to M_q(R)\) is defined as the completely positive map:

\begin{equation} \Phi (\rho ) = \sum _{i=0}^{r-1} K_i \rho K_i^\dagger \end{equation}

where \(K_i^\dagger \) denotes the conjugate transpose of the \(i\)-th Kraus operator \(K_i\).

Lemma 2 Kraus Map Preserves PSD

Let \(\rho \in M_q(R)\) be a positive semi-definite matrix (\(\rho \ge 0\)). Then for any Kraus operators \(\{ K_i\} \), the result of the map

\[ \Phi (\rho ) = \sum _{i} K_i \rho K_i^\dagger \]

is also positive semi-definite.

Proof ▶

For each \(i\), since \(\rho \ge 0\), the conjugation \(K_i \rho K_i^\dagger \) is positive semi-definite. Because the sum of positive semi-definite matrices is also positive semi-definite, it follows that \(\sum _{i} K_i \rho K_i^\dagger \ge 0\).

Quantum State and Operation Definitions

Definition 3 Quantum Channel / TP Condition

A set of Kraus operators \(\{ K_i\} _{i=1}^r \subset M_q(\mathbb {C})\) defines a quantum channel (or a trace-preserving map) if they satisfy the completeness relation:

\begin{equation} \sum _{i=1}^r K_i^\dagger K_i = I_q \end{equation}

where \(I_q\) is the \(q \times q\) identity matrix.

Definition 4 Quantum Operation

A set of Kraus operators \(\{ K_i\} _{i=1}^r \subset M_q(\mathbb {C})\) defines a quantum operation if they satisfy:

\begin{equation} \sum _{i=1}^r K_i^\dagger K_i \le I_q \end{equation}

In this context, \(\le \) denotes the Loewner order, meaning \(I_q - \sum K_i^\dagger K_i\) is a positive semi-definite matrix.

Definition 5 Density Matrix

A matrix \(\rho \in M_q(\mathbb {C})\) is a density matrix (or a normalized state) if it satisfies:

\(\rho \ge 0\) (Positive semi-definite)
\(\operatorname {Tr}(\rho ) = 1\) (Normalization)

Definition 6 Sub-normalized Density Matrix

A matrix \(\rho \in M_q(\mathbb {C})\) is a sub-normalized density matrix if it satisfies:

\(\rho \ge 0\) (Positive semi-definite)
\(\operatorname {Tr}(\rho ) \le 1\)

These often represent states after a non-deterministic measurement.

Lemma 7 Trace Preservation of Quantum Channels

Let \(\Phi : M_q(\mathbb {C}) \to M_q(\mathbb {C})\) be a map defined by a set of Kraus operators \(\{ K_i\} _{i=1}^r\) satisfying the quantum channel condition \(\sum _i K_i^\dagger K_i = I_q\). Then for any \(\rho \in M_q(\mathbb {C})\), the trace is preserved:

\begin{equation} \operatorname {Tr}(\Phi (\rho )) = \operatorname {Tr}(\rho ) \end{equation}

Proof ▶

By the linearity and cyclic property of the trace, \(\operatorname {Tr}(\sum _i K_i \rho K_i^\dagger ) = \sum _i \operatorname {Tr}(K_i^\dagger K_i \rho ) = \operatorname {Tr}((\sum _i K_i^\dagger K_i) \rho ) = \operatorname {Tr}(I_q \rho ) = \operatorname {Tr}(\rho )\).

Proposition 8 Trace-Preserving Property

Let \(\Phi \) be a quantum channel defined by Kraus operators \(\{ K_i\} \) such that \(\sum _i K_i^\dagger K_i = I\). If \(\rho \) is a normalized matrix with \(\operatorname {Tr}(\rho ) = 1\), then the output of the channel is also normalized:

\begin{equation} \operatorname {Tr}(\Phi (\rho )) = 1 \end{equation}

Definition 9 Quantum Channel Map on Density Matrices

Let \(\mathcal{D}_q\) denote the set of \(q \times q\) density matrices (positive semi-definite matrices with unit trace). A quantum channel \(\Phi \) induces a well-defined map:

\begin{equation} \Phi : \mathcal{D}_q \to \mathcal{D}_q \end{equation}

mapping \(\rho \mapsto \sum _i K_i \rho K_i^\dagger \). This map is well-defined because it preserves both the positive semi-definite property and the unit trace of the input state.

Definition 10 Quantum Word Map

Let \(\Sigma \) be an alphabet and \(\mathcal{K} : \Sigma \to (\text{Fin } r \to M_q(R))\) be a function mapping each symbol to a Kraus family. For a word \(w\) of length \(n\), the iterated application map \(\delta ^* : (\text{Fin } n \to \Sigma ) \to M_q(R) \to M_q(R)\) is defined recursively by:

\begin{equation} \delta ^*(w, \rho ) = \begin{cases} \rho & \text{if } n = 0 \\ \Phi _{w(n-1)} \left( \delta ^*(\text{init}(w), \rho ) \right) & \text{if } n {\gt} 0 \end{cases} \end{equation}

where \(\Phi _{a}(\sigma ) = \sum _{i \in \text{Fin } r} K_{a,i} \sigma K_{a,i}^\dagger \) is the completely positive map associated with the symbol \(a \in \Sigma \), and \(\text{init}(w)\) denotes the prefix of the word of length \(n-1\).

Theorem 11 Stability of Quantum Word Processing

Let \(\Sigma \) be an alphabet where each symbol \(a \in \Sigma \) is associated with a quantum channel \(\Phi _a\) (satisfying \(\sum _i K_{a,i}^\dagger K_{a,i} = I\)). For any word \(w \in \Sigma ^n\) and any initial density matrix \(\rho \), the state after processing the word, \(\rho _w = \delta ^*(w, \rho )\), satisfies:

\(\rho _w \ge 0\) (Positive Semi-definiteness)
\(\operatorname {Tr}(\rho _w) = 1\) (Trace Preservation)

In other words, \(\delta ^*(w, \cdot )\) is a well-defined map from \(\mathcal{D}_q\) to \(\mathcal{D}_q\).

Proof ▶

By induction on the length of the word \(n\):

Base case (\(n=0\)): \(\delta ^*(\epsilon , \rho ) = \rho \), which is a density matrix by hypothesis.
Inductive step: Assume \(\rho _{w'}\) is a density matrix for a word of length \(m\). For a word \(w = w'a\), \(\rho _w = \Phi _a(\rho _{w'})\). Since \(\Phi _a\) is a quantum channel, it preserves both the PSD property and the unit trace of \(\rho _{w'}\).

Proceeding this way, eventually we get to defining positive operator valued measure in terms of a PMF (probability mass function) where the probability of outcome \(e_i\) is the trace of the product of the density matrix and the corresponding pure state:

Definition 12 Computational Basis Measurement PMF

Let \(\rho \in M_k(\mathbb {C})\) be a density matrix (\(\rho \ge 0\) and \(\operatorname {Tr}(\rho ) = 1\)). Let \(\{ \mathbf{e}_i\} _{i=1}^k\) be the standard basis vectors in \(\mathbb {C}^k\), where \(\mathbf{e}_i\) has a 1 at index \(i\) and 0 elsewhere. The pure state projection associated with outcome \(i\) is defined as:

\begin{equation} P_i = \mathbf{e}_i \mathbf{e}_i^\top \end{equation}

The resulting probability mass function (PMF) over the set of outcomes \(\{ 1, \dots , k\} \) is defined by the assignment:

\begin{equation} \text{Prob}(i) = \operatorname {Re}(\operatorname {Tr}(P_i \rho )) \end{equation}

Next we turn this into a Bernoulli probability measure by declaring that a specific \(e_{\text{acc}}\) is the accepted subspace:

Definition 13 Binary Acceptance PMF

✓

Let \(\rho \in M_k(\mathbb {R})\) be a density matrix, and let \(\text{acc} \in \{ 1, \dots , k\} \) be the designated acceptance index. The probability distribution over the outcomes \(\{ 0, 1\} \) (Reject, Accept) is given by the PMF:

\begin{equation} \text{Pr}(i) = \begin{cases} \operatorname {Tr}(P_{\text{acc}} \rho ) & \text{if } i = 1 \text{ (Accept)} \\ \operatorname {Tr}((I - P_{\text{acc}}) \rho ) & \text{if } i = 0 \text{ (Reject)} \end{cases} \end{equation}

where \(P_{\text{acc}} = \mathbf{e}_{\text{acc}} \mathbf{e}_{\text{acc}}^\top \) is the projection onto the accepting basis state.

Finally we get to the measure-once language accepted by our quantum finite automaton, with cutpoint 1/2:

Definition 14 MO-QFA Language Acceptance

✓

Let \(\mathcal{A} = (\Sigma , \mathcal{K}, \rho _0, P_{\text{acc}})\) be a Measure-Once Quantum Finite Automaton. The language accepted by \(\mathcal{A}\) with cutpoint \(\lambda = \frac{1}{2}\) is the set of words:

\begin{equation} L(\mathcal{A}) = \left\{ w \in \Sigma ^* \mid \operatorname {Tr}(P_{\text{acc}} \, \delta ^*(w, \rho _0)) {\gt} \frac{1}{2} \right\} \end{equation}

where:

\(\delta ^*(w, \rho _0)\) is the state after processing word \(w\).
\(P_{\text{acc}} = \mathbf{e}_{\text{acc}} \mathbf{e}_{\text{acc}}^\top \) is the projection onto the accepting state.
The index \(1\) in the probability mass function represents the “Accept” outcome.

Lemma 15

✓

A probability mass function on two orthogonal projections \(P, P^{\bot }\), with measure equal to the trace of the density matrix \(\rho \) times the projection.

Definition 16

✓

Transition function \(\delta ^*\) corresponding to a word over an alphabet, where each symbol is mapped to a completely positive map in Kraus form, of rank at most \(r\).

Some other formal lemmas, that we only describe stenographically, include:

Lemma 17

✓

A projection-valued measure, with measure equal to the trace of the density matrix \(\rho \) times the projection.

Lemma 18

✓

If \(A\) and \(B\) are PSD then \(AB\) has nonnegative trace.

Lemma 19

✓

If \(P\) is a projection and \(\rho \) is PSD then the trace of \(P\rho \) is nonnegative.

Definition 20

✓

Projection-valued measure.

Definition 21

✓

Quantum channel.

Lemma 22

✓

Kraus operators preserve the PSD property.

Lemma 23

✓

An automaton based on a quantum channel maps density matrices to density matrices while reading a single letter.

Lemma 24

✓

An automaton based on a quantum channel maps density matrices to density matrices while reading a word.

Lemma 25

✓

A basis state density matrix has trace one.

Lemma 26

✓

A pure state is PSD.

3 Stinespring dilation

We now switch gears to consider the Stinespring dilation, in matrix form.

Stinespring Dilation and Partial Trace

Definition 27 Partial Trace

Let \(\rho \in M_{mn}(\mathbb {C})\) be a density matrix on a composite system \(\mathcal{H}_A \otimes \mathcal{H}_B\). The partial trace over the right subsystem (\(\mathcal{H}_B\)) is the map \(\text{Tr}_B : M_{mn}(\mathbb {C}) \to M_m(\mathbb {C})\) defined by the components:

\begin{equation} [\text{Tr}_B(\rho )]_{ij} = \sum _{k=1}^n \rho _{(i,k),(j,k)} \end{equation}

Definition 28 Stinespring Operator

Given a Kraus family \(\{ K_i\} _{i=1}^r \subset M_m(\mathbb {C})\), the Stinespring operator \(V \in M_{mr \times m}(\mathbb {C})\) is defined as:

\begin{equation} V = \sum _{i=1}^r K_i \otimes \mathbf{e}_i \end{equation}

where \(\mathbf{e}_i\) are the standard basis vectors of the environment space \(\mathbb {C}^r\).

Definition 29 Stinespring Dilation

The Stinespring dilation of a state \(\rho \) relative to the operator \(V\) is the matrix in the enlarged space:

\begin{equation} \text{Stinespring}(\rho , V) = V \rho V^\dagger \end{equation}

It follows that the original Kraus map is recovered by \(\Phi (\rho ) = \text{Tr}_B(V \rho V^\dagger )\).

Here is the main result towards answering Question #54 of David and Palsberg.

Theorem 30 Equivalence of Kraus and Stinespring Forms

Let \(\{ K_i\} _{i=1}^r \subset M_m(\mathbb {C})\) be a family of Kraus operators and \(\rho \in M_m(\mathbb {C})\) be a density matrix. Let \(V = \sum _{i} K_i \otimes \mathbf{e}_i\) be the Stinespring operator. Then:

\begin{equation} \operatorname {Tr}_B(V \rho V^\dagger ) = \sum _{i=1}^r K_i \rho K_i^\dagger \end{equation}

where \(\operatorname {Tr}_B\) denotes the partial trace over the environment space \(\mathbb {C}^r\).

Proof ▶

By the definition of the Stinespring operator \(V\), we have:

\[ V \rho V^\dagger = \left( \sum _i K_i \otimes \mathbf{e}_i \right) \rho \left( \sum _j K_j^\dagger \otimes \mathbf{e}_j^\top \right) = \sum _{i,j} (K_i \rho K_j^\dagger ) \otimes (\mathbf{e}_i \mathbf{e}_j^\top ) \]

Applying the partial trace \(\operatorname {Tr}_B\) to each term in the sum:

\[ \operatorname {Tr}_B((K_i \rho K_j^\dagger ) \otimes (\mathbf{e}_i \mathbf{e}_j^\top )) = (K_i \rho K_j^\dagger ) \operatorname {Tr}(\mathbf{e}_i \mathbf{e}_j^\top ) \]

Since \(\operatorname {Tr}(\mathbf{e}_i \mathbf{e}_j^\top ) = \delta _{ij}\) (the Kronecker delta), the double sum collapses to:

\[ \sum _{i} K_i \rho K_i^\dagger \]

which is exactly the Kraus representation \(\Phi (\rho )\).

We end with a sophisticated construction addressing a common problem in quantum information: if you have a Quantum Operation (which is trace-decreasing, \(\sum K_i^\dagger K_i \le I\)), how do you “complete” it into a full Quantum Channel (which is trace-preserving, \(\sum K_j^\dagger K_j = I\))?

Definition 31 Kraus Completion

Let \(\{ K_i\} _{i=1}^r \subset M_m(\mathbb {C})\) be a family of operators satisfying the trace-non-increasing (TNI) condition \(\sum _i K_i^\dagger K_i \le I\). The orthogonal CPTP completion is the operator \(V_{comp} \in M_{m(r+1) \times m}(\mathbb {C})\) defined block-wise as:

\begin{equation} V_{comp} = \begin{pmatrix} K_1 \\ \vdots \\ K_r \\ \Delta \end{pmatrix} \quad \text{where} \quad \Delta = \sqrt{I - \sum _{i=1}^r K_i^\dagger K_i} \end{equation}

In Lean notation, this is represented by branching on the index \(x.2\): using the Stinespring operator for the first \(r\) indices and the deficit root \(\Delta \) for the last index.

Theorem 32 Isometry Property

If the initial operators satisfy \(\sum _i K_i^\dagger K_i \le I\), then the completion \(V_{comp}\) is an isometry, i.e.,

\begin{equation} V_{comp}^\dagger V_{comp} = I \end{equation}

4 Conclusion

We have chosen a somewhat concrete approach to formalization of quantum information theory using matrices. This has the advantage that it is reasonably feasible to construct examples (such as those used to obtain synchronizing words in [ 5 ] ). On the other hand, for eventual inclusion in Lean’s Mathlib, a higher level of generality is needed. A natural next step for the Stinespring dilation is to formalize the unitary extension, also known as “going to the church of the larger Hilbert space”. For quantum finite automata there is a lot that can be done including formalizing a version of quantum automatic complexity.

Acknowledgments

This work was partially supported by grants from the Simons Foundation (#704836 to Bjørn Kjos-Hanssen) and Majesco Inc. (University of Hawai‘i Foundation Account #129-4770-4).

Bibliography

1: Andrzej Grudka, Marcin Karczewski, Paweł Kurzyński, Jedrzej Stempin, Jan Wójcik, and Antoni Wójcik. Quantum synchronizing words: Resetting and preparing qutrit states, 2025.
2: Masahito Hayashi and Hayata Yamasaki. Generalized quantum Stein’s lemma and second law of quantum resource theories, 2025.
3: Fumio Hiai and Dénes Petz. The proper formula for relative entropy and its asymptotics in quantum probability. Communications in Mathematical Physics, 143(1):99–114, 1991.
4: Bjørn Kjos-Hanssen. Kraus: formalization of quantum finite automata, 2026. Project website (https://bjoernkjoshanssen.github.io/kraus/).
5: Bjørn Kjos-Hanssen and Swarnalakshmi Lakshmanan. Unbounded length minimal synchronizing words for quantum channels over qutrits, 2026.
6: Lvzhou Li, Daowen Qiu, Xiangfu Zou, Lvjun Li, Lihua Wu, and Paulo Mateus. Characterizations of one-way general quantum finite automata. Theoret. Comput. Sci., 419:73–91, 2012.
7: Alex Meiburg. Quantum Information in Lean. https://github.com/Meiburg/Lean-QuantumInfo, 2024.
8: Alex Meiburg, Leonardo A. Lessa, and Rodolfo R. Soldati. A formalization of the generalized quantum Stein’s lemma in Lean, 2025.
9: Abuzer Yakaryılmaz and A. C. Cem Say. Unbounded-error quantum computation with small space bounds. Inform. and Comput., 209(6):873–892, 2011.