Into the Continuum

Homological Product Codes

2014-04-15T16:15:00.002-07:00

This post is my report, reproduced verbatim, for a quantum error correction course taught by Daniel Gottesman and Robert König that I took in 2014 at the Institute for Quantum Computing / University of Waterloo.

The report is an exposition and summary of results in the paper "Homological Product Codes" by Sergey Bravyi and Matthew B. Hastings (2013). They are very very smart, unlike me, so assume any mistakes here are mine.

Abstract

Using the concept of a chain complex from homology theory, a means of constructing CSS codes from chain complexes is described. The homological product of two chain complexes can be used to define a product of the respective CSS codes, referred to as a homological product code and introduced in \cite{Bravyi}. Code parameters of the constructed CSS codes are determined, and randomization techniques are used to construct random CSS codes from random chain complexes. The main result is that, with high probability, a random homological product code on $n$ physical qubits has code parameters $[[n,k,d,w]]$ given by $[[n,\Omega(n),\Omega(n),O(\sqrt{n})]]$, where $w$ denotes the stabilizer weight. Hence, homological product codes are one of the first examples of good quantum codes having relatively low stabilizer weight $O(\sqrt{n})$.

Quantum Error Correction

Computation, often considered entirely in the abstract, is a physical process, and although ideal faultless computations are an interesting idea to entertain, the need for active error correction seems inevitable. As physical systems undergoing physical processes, computers are subject to internal and external noise. This is even more undeniable when comparing quantum computation to classical computation. Despite offering a seemingly more powerful paradigm of computing, controlling quantum computation is a much more challenging task.

The objective of quantum error correction is to encode desired information in a subspace of a larger space by means of redundancy. Such an encoding can serve the purpose of protecting quantum states and transforming quantum states in a hopefully robust manner. Various error correcting codes achieve this in different ways. The primary means used to describe a code is through its code parameters. These parameters are the number of actual physical qubits $n$ used in the system, the number of effectively encoded logical qubits $k$ (where $2^k$ is the dimension of the codespace), and the code distance $d$ which specifies certain limitations on what the code can correct. For a quantum error correcting code, denote its parameters as $[[n,k,d]]$. Other parameters may also be defined in terms of these or independent of them.

``Good'' Codes

Given the parameters $[[n,k,d]]$ of a quantum error correcting code, some notion of what it means for the code to be good or bad ought to be determined. A formal criterion for the properties of a ``good'' quantum code is presented here. An important issue to address is how the code parameters scale as the number of physical qubits $n$ increases. A code can be characterized by considering how the various parameters behave in the asymptotic limit as $n$ grows large. In this regard, consider the ratio $k/n$ called the encoding rate and the ratio $d/n$ called the relative distance. A $[[n,k,d]]$ quantum code is defined to be good if both of the following relations hold between the parameters in the limit as $n$ tends to infinity:
\[
k/n=\Omega(1) \Longleftrightarrow k=\Omega(n)
\]
and
\[
d/n=\Omega(1) \Longleftrightarrow d=\Omega(n).
\]
Thus, a code is good if both the encoding rate and relative distance are at least constant. This is equivalent to the condition that both $k$ and $d$ must grow at least linearly in $n$. Good codes defined in this way capture some desirable properties of a code. For instance, suppose it is required that additional logical qubits be involved in some error correcting scheme, or that the distance $d$ of the code needs to be increased to protect from more severe errors. Then for a good quantum code, both $k$ and $d$ can be increased at least linearly by only adding more physical qubits to the code. Good codes ensure that in order to increase parameters like $k$ and $d$, a large overhead in the number of physical qubits is not necessary in order to achieve the desired increase in parameter values.

Stabilizer Weight: $\mathbf{w}$

In addition to the standard code parameters $[[n,k,d]]$, another parameter known as the stabilizer weight $w$ becomes relevant for practical purposes. The stabilizer weight of a code is a concept that is defined for CSS codes and more generally for stabilizer codes. Intuitively, the stabilizer weight measures the degree of interaction between stabilizer operators and the physical qubits of the code. Each stabilizer operator acts nontrivially on some number of physical qubits, and each physical qubit is generally acted on by some number of stabilizer operators. A stabilizer code is said to have stabilizer weight $w$ (where $w$ is the smallest such number), if no stabilizer operator acts on more than $w$ physical qubits, and no more than $w$ different stabilizer operators act on any particular physical qubit.

In the CSS formalism, code spaces $C^X$ and $C^Z$ are spanned by columns of binary parity check matrices $A^X$ and $ A^Z$, which specify stabilizer operators consisting of only Pauli X or only Pauli Z operators, respectively. In this setting, the code $CSS(C^X,C^Z)$ is said to have stabilizer weight $w$ if every row and column of the parity check matrices $A^X$ and $A^Z$ has weight at most $w$. Here, the weight of a binary vector is given by the number of $1$s that comprise the vector.
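As a small illustration of this definition, the following Python sketch computes the largest row or column weight of a binary matrix. The function name `stabilizer_weight` and the sample matrix are hypothetical and chosen only for illustration; for a CSS code one would take the larger of the values obtained for $A^X$ and $A^Z$.

```python
def stabilizer_weight(matrix):
    """Largest Hamming weight of any row or column of a binary matrix."""
    row_weights = [sum(row) for row in matrix]
    col_weights = [sum(col) for col in zip(*matrix)]
    return max(row_weights + col_weights)


# Hypothetical 3x4 binary matrix, used only to exercise the definition.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
]

w = stabilizer_weight(A)
print(w)
```

Here the heaviest row has three ones and the heaviest column also has three ones, so the reported weight is 3.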

LDPC Codes

Consider a code having stabilizer weight $w$. Then if it is the case that $w=O(1)$, the code is called an LDPC code (Low Density Parity Check code). Thus, in an LDPC code each stabilizer operator acts on at most a constant number of physical qubits. Moreover, each physical qubit is only acted on by a constant number of stabilizer operators.

For practical purposes, LDPC codes are ideal when compared to non-LDPC codes whose stabilizer weight may scale with $n$. The reason is ultimately due to fault tolerance considerations. Error correction generally requires measurements of some form. In the quantum setting, this is quite literally a sensitive issue. In practice operations and gates have a nonzero error probability, and the more gates or operations that are involved, the more this error is expected to accumulate. A common strategy for measuring quantum systems for error correction is to couple the system with an ancilla, and then measure the corresponding ancilla in order to read out syndrome information. This coupling is often achieved through, say, a two-qubit $CNOT$ gate which has the potential of propagating errors throughout the system. In an LDPC code, where the number of operators or qubits involved in direct interaction is small, there will naturally be less coupling required to actively perform error correction. Furthermore, with regard to fault tolerant computation, it has been shown that if the code distance of an LDPC code satisfies $d= \Omega(\log n)$, then there exists a constant error threshold for stochastic error models \cite{Got}, \cite{Kovalev}.

In the classical framework, LDPC codes have had tremendous success. In fact, there exist classical LDPC codes which are also good in the technical sense. This serves as motivation for generalizing the notion of LDPC codes to the quantum setting. In the table below various quantum LDPC codes are given with their parameters $k,d$, and $w$, and their dependence on $n$. Note, however, that none of these codes are good, since the code distance of each grows strictly slower than linearly in $n$. It can also be seen that some of the codes have an encoding rate $k/n$ which approaches zero asymptotically.

Various examples of LDPC codes. References: [7]\cite{Kitaev}, [11] \cite{Zemor}, [13]\cite{Delfosse}, [6]\cite{Freedman}, [10]\cite{Tillich}, [5]\cite{FreedmanHas}.

Main Results

The principal objective of this paper is to introduce and outline new quantum error correcting codes known as homological product codes. By borrowing the mathematical framework and terminology of homology theory and algebraic topology, in particular the idea of chain complexes, CSS codes can be constructed in a straightforward manner. Moreover, the notion of a homological product of two chain complexes can be used to also define a homological product of CSS codes, in which a new code is constructed using two old codes. How these constructions are carried out, and what the parameters of these resulting codes are, will be discussed in more detail in what follows.

By defining random chain complexes, random CSS codes can be considered. Using these randomized constructions, it will be argued that with high probability the random CSS codes derived from random chain complexes are good codes. Likewise, it will also be argued that the product CSS code resulting from taking a product of two random chain complexes also gives a good code. However, these product codes are not LDPC; they have stabilizer weight $w=O(\sqrt{n})$. Thus, in a certain sense it can be said that these codes are ``almost'' LDPC. Besides a distinctively new way to construct new codes from old codes, the main contribution offered by homological product codes is that they are the first family of good codes to be introduced whose stabilizer weight grows strictly slower than linearly in $n$.

Homology Theory

In this section, some basic concepts and terminology in homology theory will be introduced. The main mathematical entity from homology theory, known as a chain complex or just complex for short, will be necessary to define the construction of CSS codes. In that regard, the notions to be introduced will be limited and highly specialized to suit these purposes. However, it should be acknowledged that many of these notions can be generalized to a broader setting.

The central object of study in homology theory is a chain complex $(C_i, \partial_i)$, which consists of:

A sequence of vector spaces $C_i$.

Linear transformations called boundary operators, $\partial_i:C_i\to C_{i-1}$, such that $\partial_i\partial_{i+1}=0$,

which can be schematically visualized as
\[
\dots \overset{\partial_{i+2}}\longrightarrow C_{i+1}\overset{\partial_{i+1}}\longrightarrow C_i \overset{\partial_{i}}\longrightarrow C_{i-1} \overset{\partial_{i-1}}\longrightarrow \dots
\]
Here, $\partial_i\partial_{i+1}=0$ says that the composition of two consecutive maps is the zero mapping sending all elements to the zero element in the appropriate space. It is worth remarking that the restriction of $(C_i,\partial_i)$ to vector spaces and linear transformations is a specialization of the definition of a chain complex. More generally, one could consider groups or modules for the $C_i$ and the appropriate structure preserving homomorphisms in their respective categories as the transformations $\partial_i$.

To understand the consequences of the seemingly peculiar condition $\partial_i\partial_{i+1}=0$
imposed on the boundary operators, define the following subspaces of $C_{i-1}$ and $C_i$, respectively:
\[
\begin{aligned}
im\partial_i &:=\{\partial_i\vec{v}\in C_{i-1} \mid \vec{v}\in C_i\}, \\
ker\partial_i &:=\{\vec{w}\in C_i \mid \partial_i\vec{w}=\vec{0}\in C_{i-1}\}.
\end{aligned}
\]
Thus, for any $\vec{v}\in C_{i+1}$ we have $\vec{w}:=\partial_{i+1}\vec{v}\in im\partial_{i+1}\subseteq C_i$. Then the condition $\partial_i\partial_{i+1}=0$ implies that $\partial_i\partial_{i+1}\vec{v}=\partial_i\vec{w}=0$ so $\vec{w}\in ker\partial_i$. Hence, for any chain complex it is always the case that $im\partial_{i+1}\subseteq ker\partial_i$. Now since $im\partial_{i+1}$ is a vector subspace of $ker\partial_i$, the quotient space $ker\partial_i/im\partial_{i+1}$ can be defined as the set of equivalence classes $[\vec{w}]$, for $\vec{w}\in ker\partial_i$, where two classes satisfy $[\vec{w}]=[\vec{w}']$ if and only if $\vec{w}-\vec{w}'\in im\partial_{i+1}$. The space $ker\partial_i/im\partial_{i+1}$, called the $i^{th}$ \emph{homology space}, can be given a vector space structure by defining vector addition and scalar multiplication as $[\vec{v}]+[\vec{w}]=[\vec{v}+\vec{w}]$ and $\alpha[\vec{w}]=[\alpha\vec{w}]$ for scalar elements $\alpha$ in the underlying field of $C_i$. Understanding these homology spaces in some chain complex is one of the main focuses of homology theory.
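To make the homology space concrete, here is a minimal Python sketch that computes $dim(ker\partial_1/im\partial_2)$ for a triangle, once hollow and once filled in. Working over $\mathbb{F}_2$, the triangle example, the helper `rank_gf2`, and all variable names are assumptions made here for illustration and are not taken from \cite{Bravyi}.

```python
def rank_gf2(matrix):
    """Rank of a binary matrix over F_2 (rows packed into integers)."""
    rows = [sum(bit << j for j, bit in enumerate(row)) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit serves as the pivot position
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


def homology_dimension(d_in, d_out, dim_middle):
    """dim(ker d_out / im d_in) = (dim_middle - rank d_out) - rank d_in."""
    return (dim_middle - rank_gf2(d_out)) - rank_gf2(d_in)


# Edges e1=(v1,v2), e2=(v2,v3), e3=(v1,v3); column j of d1 marks the endpoints of edge j.
d1 = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
d2_filled = [[1], [1], [1]]  # one face whose boundary consists of all three edges
d2_hollow = [[], [], []]     # no faces: C_2 is the zero space

h1_hollow = homology_dimension(d2_hollow, d1, 3)
h1_filled = homology_dimension(d2_filled, d1, 3)
print(h1_hollow, h1_filled)
```

The hollow triangle has one nontrivial cycle (the loop around its edges), while filling in the face turns that cycle into a boundary, so the homology dimension drops from 1 to 0.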

In a similar manner, to every complex $(C_i,\partial_i)$ there is an associated cocomplex $(C_i,\partial^{T}_i)$, defined in terms of the transpose maps $\partial^{T}_i: C_{i-1}\to C_i$ (or more generally the adjoints of $\partial_i$), which satisfy $\partial^{T}_{i+1}\partial^{T}_{i}=0$. This can essentially be understood in terms of the standard complex by simply reversing arrows in the chain diagram:
\[
\dots \overset{\partial^{T}_{i+2}}\longleftarrow C_{i+1}\overset{\partial^{T}_{i+1}}\longleftarrow C_i \overset{\partial^{T}_{i}}\longleftarrow C_{i-1} \overset{\partial^{T}_{i-1}}\longleftarrow \dots
\]
In this way, the condition $\partial^{T}_{i+1}\partial^{T}_{i}=0$ implies that $im\partial^{T}_i \subseteq ker\partial^{T}_{i+1}$ so that the $i^{th}$ cohomology group is given by the quotient $ker\partial^T_{i+1}/im\partial^{T}_i$.

CSS codes from Complexes

General Construction: $\mathbf{CSS(C_i,\partial_i)}$

In what follows, when referring to some chain complex $(C_i,\partial_i)$, all vector spaces $C_i$ will be over the binary field $\mathbb{F}_2$ where addition and multiplication are carried out modulo $2$. Furthermore, for the purposes of defining a CSS code from the chain complex, it will suffice to restrict the discussion to short chain complexes of the form
\[
C_{2}\overset{\partial_{2}}\longrightarrow C_1 \overset{\partial_{1}}\longrightarrow C_{0},
\] consisting of just three spaces $C_0,C_1,C_2$ and boundary maps $\partial_1,\partial_2$ in the chain satisfying $\partial_1\partial_2=0$. Temporarily refraining from any
motivation, consider making the following definitions
\[
C^Z:=im\partial_2 \ \ \text{and} \ \ C^X:=im\partial^{T}_1,
\]
where $\partial_2: C_2\to C_1$ and $\partial^{T}_1 : C_0\to C_1$. Note that $C^Z,C^X\subseteq C_1$.

Suppose $C_1=\mathbb{F}^{n}_2$ for some $n$, and define the standard inner product between $\vec{v},\vec{w}\in C_1$ as
\[
\langle\vec{v},\vec{w}\rangle:=\sum^{n}_{i=1}v_i w_i \pmod{2},
\]
where $\vec{v}=(v_1,\dots,v_n)$ and $\vec{w}=(w_1,\dots,w_n)$ are the corresponding vector components in some basis. With respect to this inner product and for any subspace $S\subseteq C_1$ define the orthogonal complement of $S$ as the space
\[
S^\perp:=\{\vec{v}\in C_1 \mid \langle\vec{v},\vec{w}\rangle=0 \ \text{for all } \ \vec{w}\in S\} .
\]
The space $S^\perp$ is indeed a vector subspace of $C_1$.

It will now be shown that with the definitions given above, $C^Z\subseteq (C^X)^\perp$. For any vectors $\vec{v}\in C^Z:=im\partial_2$ and $\vec{w}\in C^X:=im\partial^{T}_1$, there exist $\vec{x}\in C_2$ and $\vec{y}\in C_0$ such that $\partial_2\vec{x}=\vec{v}$ and $\partial^{T}_1\vec{y}=\vec{w}$. Then the inner product between $\vec{v}$ and $\vec{w}$ satisfies
\[
\langle\vec{v},\vec{w}\rangle=\langle\partial_2\vec{x},\partial^{T}_1\vec{y}\rangle=\langle\partial_1\partial_2\vec{x},\vec{y}\rangle=\langle\vec{0},\vec{y}\rangle=0
\]
since $\partial_1\partial_2=0$. Thus for any $\vec{v}\in C^Z$ it is also the case that $\vec{v}\in(C^X)^\perp$.
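The argument above can be checked numerically on a small case. The sketch below reuses a filled triangle as a hypothetical short chain complex (one face, three edges, three vertices); the matrices `d1` and `d2` and the helper names are illustrative assumptions, not objects from \cite{Bravyi}.

```python
def matmul_gf2(A, B):
    """Matrix product with entries reduced modulo 2."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in cols] for row in A]


def inner_gf2(v, w):
    """Standard inner product on F_2^n."""
    return sum(a * b for a, b in zip(v, w)) % 2


# d2 : C_2 -> C_1 sends the single face to the sum of its three edges.
d2 = [[1], [1], [1]]
# d1 : C_1 -> C_0 sends each edge (a column) to the sum of its two endpoints.
d1 = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]

composition = matmul_gf2(d1, d2)               # should be the zero map
z_generators = [list(col) for col in zip(*d2)]  # columns of d2 span C^Z
x_generators = [list(row) for row in d1]        # columns of d1^T span C^X

all_orthogonal = all(
    inner_gf2(z, x) == 0 for z in z_generators for x in x_generators
)
print(composition, all_orthogonal)
```

Checking orthogonality on spanning sets suffices, since the inner product is bilinear.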

The boundary operator condition $\partial_1\partial_2=0$ implies that $C^Z\subseteq(C^X)^{\perp}$, or equivalently that $C^X\subseteq (C^Z)^\perp$. This condition, together with the underlying assumption that the spaces under consideration are over the field $\mathbb{F}_2$ (so that $C^X,C^Z\subseteq C_1=\mathbb{F}^{n}_2$), is enough to completely specify a valid CSS code on $n=dim(C_1)$ physical qubits with code spaces given by $C^X$ and $C^Z$. Let $CSS(C^X,C^Z)$ or $CSS(C_i,\partial_i)$ denote the CSS code that results from the complex $(C_i,\partial_i)$. By choosing a basis for the code spaces $C^X$ and $C^Z$, the basis vectors can be used to specify stabilizer generators acting on the $n$ physical qubits consisting of either Pauli $X$ operators or Pauli $Z$ operators, respectively. In this case, since the code spaces are defined in terms of the images of $\partial_2$ and $\partial^{T}_1$, a basis can be specified by taking linearly independent columns of $\partial_2$ as a basis of $C^Z:=im\partial_2$, and linearly independent columns of $\partial^{T}_1$ to form a basis for $C^X=im\partial^{T}_1$. In this way, the essential condition for CSS codes $C^Z\subseteq(C^X)^{\perp}$ implies that all $X$-type stabilizers commute with all $Z$-type stabilizers, which is a necessary condition in the stabilizer formalism of quantum error correcting codes. More details regarding this construction will be explained in what follows.

The Single-Sector Theory: $\mathbf{CSS(C,\partial)}$

In the general construction of the code $CSS(C_i,\partial_i)$ from a chain complex, three different spaces $C_0,C_1,$ and $C_2$ were at play. The rest of this paper will be restricted to a more specialized setting known as the single-sector theory. In the \emph{single-sector theory}, only a single vector space $C=\mathbb{F}^{n}_2$ and a single boundary operator $\partial: C\to C$ are considered, with $C=C_0=C_1=C_2$ and $\partial^2=0$. This gives a chain complex $(C,\partial)$ of the form
\[
   C\overset{\partial}\longrightarrow C \overset{\partial}\longrightarrow C ,
\]
which in the present context can be viewed as
\[
   C\overset{\partial}\longrightarrow C \overset{\partial^{T}}\longleftarrow C.
\]
Then in this case the code spaces of the corresponding CSS code are defined to be
\[
C^Z:=im\partial \ \ \text{and} \ \ C^X:=im\partial^{T}.
\]
Moreover, it can be shown that
\[
(C^Z)^\perp=ker\partial^T \ \ \text{and} \ \ (C^X)^\perp=ker\partial.
\]
Let $A^Z$ and $A^X$ be the parity check matrices spanning $C^Z$ and $C^X$, and make the identification
\[
A^Z=\partial \ \ \text{and} \ \ A^X=\partial^T.
\]
Thus, the columns of $\partial$ span the code space $C^Z$ and the columns of $\partial^T$ (or equivalently the rows of $\partial$) span the code space $C^X$. Note that the columns and rows of $\partial$ may not be linearly independent. If this is the case then not every column of the parity check matrices $A^Z$ and $A^X$ will yield an independent stabilizer. In some conventions, parity check matrices that generate a CSS code are required to have full rank (where all columns are linearly independent). This restriction will not be imposed in this paper when defining the parity check matrices in terms of $\partial$. In general, since it is possible to have distinct boundary operators $\partial\neq\partial'$ with $im\partial=im\partial'$ and $im\partial^T=im\partial'^T$, the correspondence in this translation of chain complexes $(C,\partial)$ to codes $CSS(C,\partial)$ is many-to-one. That is, different chain complexes may define equivalent CSS codes.

To make the construction of $CSS(C,\partial)$ more explicit consider, for example, the boundary operator $\partial : C\to C$, where $C=\mathbb{F}^{5}_2$, is expressed in some basis as
\[
\partial=\begin{pmatrix}
1&1&1&0&0 \\
0&0&1&1&1 \\
1&1&0&1&1 \\
1&1&1&0&0 \\
0&0&1&1&1 \\
\end{pmatrix} .
\]
Regard each column of $\partial$ as a vector $\vec{v}=(v_1,\dots, v_5)$. Then a $Z$-type stabilizer acting on the $dim(C)=5$ qubit space can be obtained using a column $\vec{v}$ as
\[
S^Z=Z^{v_1}_1Z^{v_2}_2Z^{v_3}_3Z^{v_4}_4Z^{v_5}_5.
\]
Hence, $S^Z$ acts nontrivially on qubits that correspond to the nontrivial support of $\vec{v}$. From the columns of $A^Z=\partial$, three $Z$-type stabilizers are obtained in this way:
\[
\begin{aligned}
S^{Z}_1&=Z_1Z_3Z_4, \\
S^{Z}_2&=Z_2Z_3Z_5, \\
S^{Z}_3&=S^{Z}_1S^{Z}_2.
\end{aligned}
\]
Similarly, from the columns of $A^X=\partial^{T}$ three $X$-type stabilizers are obtained:
\[
\begin{aligned}
S^{X}_1&=X_1X_2X_3, \\
S^{X}_2&=X_3X_4X_5, \\
S^{X}_3&=S^{X}_1S^{X}_2.
\end{aligned}
\]
For this particular $\partial$, it is seen that there are only two linearly independent columns and rows: $rank(\partial)=rank(\partial^T)=2$. This corresponds to the fact that there are only two independent stabilizer generators of each type. Given some stabilizer group $S$, recall that there are generally multiple ways to specify a generating set for $S$. This freedom implies that a conversion that begins with an arbitrary CSS code and translates to a corresponding chain complex $(C,\partial)$ will also be many-to-one in general.
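The worked example can be reproduced with a few lines of Python. This is only a sketch: the helper names `matmul_gf2` and `pauli_string` are introduced here for illustration, and qubits are labelled from 1 to match the text.

```python
def matmul_gf2(A, B):
    """Matrix product with entries reduced modulo 2."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in cols] for row in A]


def pauli_string(vec, letter):
    """Write the support of a binary vector as a product of single-qubit Paulis."""
    return "".join(f"{letter}{i + 1}" for i, bit in enumerate(vec) if bit)


# The 5x5 boundary operator from the example above.
boundary = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]

squared = matmul_gf2(boundary, boundary)

# Columns of the boundary operator give Z-type stabilizers; its rows
# (the columns of the transpose) give X-type stabilizers.
# dict.fromkeys drops repeated columns/rows while keeping their order.
z_stabilizers = list(dict.fromkeys(pauli_string(c, "Z") for c in zip(*boundary)))
x_stabilizers = list(dict.fromkeys(pauli_string(r, "X") for r in boundary))

print(squared == [[0] * 5 for _ in range(5)])
print(z_stabilizers)
print(x_stabilizers)
```

The third distinct operator of each type has support on qubits 1, 2, 4, 5, which is the product of the other two, in agreement with $S_3=S_1S_2$.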

Justification for why a complex $(C,\partial)$ defined in this setting naturally translates to a corresponding CSS code can be given by observing that the underlying condition $\partial^2=0$ is equivalent to the commutativity constraints demanded by CSS codes and stabilizer codes. Write $\partial=(\partial_{ij})_{ij}$ to express the matrix coefficients of $\partial$ in a particular basis, and note that $A^Z=\partial=(\partial_{ij})_{ij}$ and $A^X=\partial^T=(\partial_{ji})_{ij}$. Then
\[
\partial^2=0 \Longleftrightarrow \left(A^Z(A^X)^{T}\right)_{ij}=\sum_{k}\partial_{ik}\partial_{kj}\equiv 0 \pmod{2} \ \text{for all } i,j \Longleftrightarrow S^{X}_iS^{Z}_j=S^{Z}_jS^{X}_i \ \text{for all } i,j.
\]

Here, the first statement is the relevant constraint in the context of chain complexes, while the second statement can be interpreted as a condition that implies $C^Z\subseteq(C^X)^\perp$ in the setting of CSS codes. The last statement gives the commutativity constraints imposed on the different types of stabilizer generators that result from the construction described above: an $X$-type and a $Z$-type operator commute exactly when their supports overlap on an even number of qubits.

Equivalent Terminology

      Outlined below is analogous terminology of various entities in homology theory and the corresponding concepts in the quantum error correction setting. For a chain complex, $(C,\partial)$, the main spaces of interest are $ker\partial$, $im\partial$, and the quotient $ker\partial/im\partial$, which correspond to $Z$-type operators of the code $CSS(C,\partial)$; and the same spaces for $\partial^T$ that correspond to $X$-type operators of the code. In the table below, the middle column gives the nomenclature used for elements in these spaces.

     The relations that hold between the various spaces in the homology setting are analogous to the relationships between the different groups that arise in the stabilizer formalism. Let $S_Z$ be the stabilizer group defined by the $Z$-type part of $CSS(C,\partial)$, and denote $N(S_Z)$ as the normalizer group of $S_Z$ in the Pauli group for $n=dim(C)$ qubits. In this way, elements of $ker\partial$, called cycles, correspond to elements of $N(S_Z)$. Elements of $im\partial\subseteq ker\partial$ are trivial cycles that correspond to operators that are elements of the stabilizer $S_Z\subseteq N(S_Z)$. Furthermore, $ker\partial\backslash im\partial$ contains all cycles in $ker\partial$ that are not in $im\partial$, which corresponds to the nontrivial logical operators in $N(S_Z)$ that are not in $S_Z$. The homology space $ker\partial/im\partial$ is analogous to the quotient group $N(S_Z)/S_Z$, where in the stabilizer setting different elements of the quotient correspond to an equivalence class of logical operators. Operators in the same class can be distinct operators when expressed in terms of Pauli operators (up to a product of stabilizer elements), but in effect have the same \emph{logical} action on the encoded qubits.

A similar translation between concepts and their interpretations can be made for the $X$-type part of $CSS(C,\partial)$ by simply replacing the boundary operator with the transpose operator $\partial^T$ as indicated in the table. The acquainted reader may recognize some of these concepts coming from homology theory from their experience with the toric code \cite{Kitaev}, where historically speaking, some of the intimate relations between homology theory and error correction were first manifested.

Code Parameters of $\mathbf{CSS(C,\partial)}$

Having developed the framework to construct a code $CSS(C,\partial)$ from a chain complex $(C,\partial)$, let us determine the code parameters $[[n,k,d,w]]$ of $CSS(C,\partial)$. The number of physical qubits $n$ that define the global space of the code is given by
\[
n=dim(C)=dim(\mathbb{F}^{n}_2).
\]
This can equivalently be determined through $\partial$ by simply counting the number of rows/columns that comprise $\partial$.

For a stabilizer code with $r$ generators the number of encoded qubits is given by $k=n-r$, and since the number of $X$-type and $Z$-type generators are both given by the number $rank(\partial)$ of linearly independent rows/columns of $\partial$ we have
\[
k=n-2rank(\partial).
\]
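For the $5\times 5$ example given earlier, this formula can be evaluated directly. The sketch below computes the rank over $\mathbb{F}_2$ by Gaussian elimination; the helper `rank_gf2` is written here for illustration and is not part of any particular library.

```python
def rank_gf2(matrix):
    """Rank of a binary matrix over F_2 (rows packed into integers)."""
    rows = [sum(bit << j for j, bit in enumerate(row)) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit serves as the pivot position
        # Clear the pivot bit from every remaining row.
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


boundary = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]

n = len(boundary)
r = rank_gf2(boundary)
k = n - 2 * r
print(n, r, k)
```

With $n=5$ and $rank(\partial)=2$, the example code encodes $k=1$ logical qubit.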

The stabilizer weight $w$ and distance $d$ of $CSS(C,\partial)$ are not so readily determined from $\partial$. By definition, the stabilizer weight is the smallest $w$ such that every column and row of $\partial$ has weight at most $w$. In regards to the LDPC criterion, $CSS(C,\partial)$ will be an LDPC code if $\partial$ is a ``sparse'' matrix. More specifically, $CSS(C,\partial)$ is LDPC if each row and column of $\partial$ has $O(1)$ nonzero entries.

To determine the distance $d=min\{d^X,d^Z\}$, it is necessary to know the respective distances $d^X$ and $d^Z$, which are defined in this setting to be
\[
\begin{aligned}
d^Z&:=min\{weight(\vec{v}) \mid \vec{v} \in ker\partial \backslash im\partial \}, \\
d^X&:=min\{weight(\vec{v}) \mid \vec{v} \in ker\partial^T \backslash im\partial^T \}.
\end{aligned}
\]
Thus, $d^Z$ and $d^X$ represent the minimum weights of nontrivial cycles in $ker\partial$ and $ker\partial^T$, respectively. It will be argued later that, when $\partial$ is drawn uniformly at random from the possible boundary operators, the corresponding code $CSS(C,\partial)$ has linear distance ($d=\Omega(n)$) with high probability.
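For very small $n$ these definitions can be evaluated by exhaustive search. The following sketch does this for the $5\times 5$ example given earlier; the function names are hypothetical, and the search visits all $2^n$ vectors, so it is only meant as an illustration and not as a practical method.

```python
from itertools import product


def apply_gf2(matrix, vec):
    """Matrix-vector product over F_2, returned as a tuple."""
    return tuple(sum(m * x for m, x in zip(row, vec)) % 2 for row in matrix)


def min_nontrivial_cycle_weight(matrix):
    """Minimum weight of a vector in ker(matrix) but not in im(matrix), or None."""
    n = len(matrix)
    vectors = list(product((0, 1), repeat=n))
    image = {apply_gf2(matrix, v) for v in vectors}
    zero = (0,) * n
    weights = [
        sum(v) for v in vectors
        if apply_gf2(matrix, v) == zero and v not in image
    ]
    return min(weights) if weights else None


boundary = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]
transpose = [list(col) for col in zip(*boundary)]

d_z = min_nontrivial_cycle_weight(boundary)
d_x = min_nontrivial_cycle_weight(transpose)
d = min(d_x, d_z)
print(d_z, d_x, d)
```

For instance, $(1,1,0,0,0)$ lies in $ker\partial$ but not in $im\partial$, and no weight-one vector lies in $ker\partial$ because $\partial$ has no zero column, so $d^Z=2$ for this example.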


Homological Dimension $\mathbf{H(\partial)}$

Here we introduce the homological dimension of a complex $(C,\partial)$ and its relation to the number of encoded qubits $k$ of the code $CSS(C,\partial)$, which will become more relevant in later discussions. Define the homological dimension $H(\partial)$ of a complex $(C,\partial)$ to be the dimension of the homology space of the complex:
\[
H(\partial):=dim(ker\partial/im\partial).
\]
The following relations hold:
\[
\begin{aligned}
H(\partial)&=dim(ker\partial/im\partial) \\
&=dim(ker\partial)-dim(im\partial) \\
&=(n-dim(im\partial))-dim(im\partial) \\
&=n-2dim(im\partial) \\
&=n-2rank(\partial) \\
&=k.
\end{aligned}
\]

Hence, the homological dimension of the complex is equal to the number of encoded qubits of the code: $H(\partial)=k$. The second identity is a general relationship that holds for quotient spaces, which essentially follows from the fact that, for a vector subspace $V\subseteq W$, a basis of $W$ can be obtained from a basis of $V$ together with representatives of a basis of $W/V$. The third identity follows from the rank-nullity theorem of linear algebra, which states that $dim(ker\partial)+dim(im\partial)=n$. The last two identities follow merely from the definition $rank(\partial):=dim(im\partial)$ and the previous relationship derived for $k$.


Homological Product

Given two chain complexes $(C_1,\partial_1)$ and $(C_2,\partial_2)$, a natural construction to consider is some notion of a product defined on the complexes that yields another chain complex $(C,\partial)$. Note that from here on the subscripts label two separate single-sector complexes rather than positions within one chain. The homological product to be introduced here achieves precisely this. The product complex $(C_1,\partial_1)\times (C_2,\partial_2)=:(C,\partial)$ will express the space $C$ and boundary operator $\partial$ in terms of the spaces $C_1,C_2$ and boundary operators $\partial_1,\partial_2$ of the factors. Moreover, the K\"{u}nneth formula will provide a means of expressing $ker\partial$ in terms of $ker\partial_1$ and $ker\partial_2$, and the homological dimension $H(\partial)$ in terms of $H(\partial_1)$ and $H(\partial_2)$. This theory will then be applied to define a product of two codes $CSS(C_1,\partial_1)$ and $CSS(C_2,\partial_2)$ and to calculate its parameters.


$\mathbf{(C_1,\partial_1)\times (C_2,\partial_2)=(C,\partial)}$

    The homological product, denoted by $\times$, of two complexes $(C_1,\partial_1)$ and $(C_2,\partial_2)$ gives a product complex
\[
    (C,\partial):=(C_1,\partial_1)\times (C_2,\partial_2),
\]
    where $C=C_1\otimes C_2$ is given by the tensor product of $C_1$ and $C_2$, and the boundary operator by
\[
    \partial=\partial_1\otimes I +I\otimes \partial_2.
\]
    To verify that $\partial$ is indeed a valid boundary operator observe the following:
\[
\begin{aligned}
\partial^2&=(\partial_1)^2\otimes I +2\,\partial_1\otimes\partial_2 + I\otimes (\partial_2)^2 \\
&=0\otimes I +0\cdot\partial_1\otimes\partial_2 + I\otimes 0 \\
&=0.
\end{aligned}
\]
    Here, the assumptions $\partial^{2}_1=0$ and $\partial^{2}_2=0$ were used together with $2\equiv 0 \ (mod \ 2)$ since all spaces under consideration are taken to be over the field $\mathbb{F}_2$.
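The construction of the product boundary operator is easy to sketch in code. The following Python snippet builds $\partial=\partial_1\otimes I+I\otimes\partial_2$ with entries reduced modulo 2 and checks that it squares to zero; the helpers `kron`, `identity`, and `product_boundary` are written here for illustration, and the $2\times 2$ factor is a hypothetical example.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matmul_gf2(A, B):
    """Matrix product with entries reduced modulo 2."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in cols] for row in A]


def product_boundary(d1, d2):
    """Boundary operator d1 (x) I + I (x) d2 over F_2."""
    left = kron(d1, identity(len(d2)))
    right = kron(identity(len(d1)), d2)
    return [[(x + y) % 2 for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


# Hypothetical 2x2 factor that squares to zero.
d1 = [
    [0, 1],
    [0, 0],
]
# The 5x5 boundary operator from the earlier example.
d2 = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]

d = product_boundary(d1, d2)
is_boundary = all(entry == 0 for row in matmul_gf2(d, d) for entry in row)
print(len(d), is_boundary)
```

The product acts on a space of dimension $2\cdot 5=10$, and the cross term cancels only because arithmetic is carried out modulo 2.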


K\"{u}nneth Formula

The K\"{u}nneth formula is a standard result in homology theory that, for a product complex $(C,\partial)=(C_1,\partial_1)\times (C_2,\partial_2)$, expresses $ker\partial$ and the homological dimension $H(\partial)$ in terms of the corresponding entities of $(C_1,\partial_1)$ and $(C_2,\partial_2)$. The theorem statement reads:

    For any boundary operators $\partial_1,\partial_2$ and $\partial=\partial_1\otimes I +I\otimes \partial_2$,
\[
    ker\partial=ker\partial_1\otimes ker\partial_2+im\partial
\]
    and
\[
H(\partial)=H(\partial_1)H(\partial_2).
\]

Although a proof of the K\"{u}nneth formula will not be given here, the results will be used to derive certain properties of the product complex and resulting product code. In particular, the identity $H(\partial)=H(\partial_1)H(\partial_2)$ will be useful. The first statement says that any element $\vec{v}\in ker\partial$ can be expressed in the form $\vec{v}=\vec{u}+\vec{w}$, where $\vec{w}\in im\partial$ and $\vec{u}\in ker\partial_1\otimes ker\partial_2$ is a sum of product vectors $\vec{v_1}\otimes\vec{v_2}$ with $\vec{v_i}\in ker\partial_i$, and that any vector of such form is always in $ker\partial$.



Homological Product Codes

    Previously, it was described how a complex $(C,\partial)$ can be used to specify a code $CSS(C,\partial)$. By considering a product complex $(C,\partial)=(C_1,\partial_1)\times (C_2,\partial_2)$ that results from taking the homological product of two chain complexes, and then translating the resulting product complex $(C,\partial)$ into the corresponding code $CSS(C,\partial)$ a homological product code can be defined. In the following sections, the parameters of the product code will be calculated in terms of the parameters of the input codes.


$\mathbf{CSS(C_1,\partial_1)\times CSS(C_2,\partial_2)=CSS(C,\partial)}$

Consider two codes $CSS(C_1, \partial_1)$ and $CSS(C_2,\partial_2)$ defined by two complexes $(C_1,\partial_1)$ and $(C_2,\partial_2)$. Define the homological product of $CSS(C_1, \partial_1)$ and $CSS(C_2,\partial_2)$ to be the code
\[
CSS(C,\partial):=CSS(C_1,\partial_1)\times CSS(C_2,\partial_2),
\]
where $(C,\partial)=(C_1,\partial_1)\times (C_2,\partial_2)$ so that $C=C_1\otimes C_2$ and $\partial=\partial_1\otimes I +I\otimes \partial_2$. Even though the resulting product code does share some properties in common with codes constructed through code concatenation, the homological product offers a way to create a new code from two older codes in a way that is distinct from code concatenation. How the parameters of the product code are related to the code parameters of the input codes is the focus of the following sections.


Parameters of the Product Code $\mathbf{CSS(C,\partial)}$

Let $[[n_i,k_i,d_i,w_i]]$ be the code parameters of two codes $CSS(C_i,\partial_i)$ for $i=1,2$, and let $[[n,k,d,w]]$ be the parameters of the resulting product code $CSS(C,\partial)=CSS(C_1,\partial_1)\times CSS(C_2,\partial_2)$. The objective here will be to determine the parameters $[[n,k,d,w]]$ in terms of the parameters of the input codes.

Since $C=C_1\otimes C_2$, the number of physical qubits $n$ in the product code is given by

\[
n=dim(C)=dim(C_1\otimes C_2)=dim(C_1)dim(C_2)=n_1n_2.
\]
The number of encoded qubits $k$ of the product code is calculated through the K\"{u}nneth formula:
\[
k=H(\partial)=H(\partial_1)H(\partial_2)=k_1k_2.
\]
    Hence, the homological product of two codes always increases the number of physical and logical qubits provided that both input codes have parameters $n_i,k_i > 1$.

    Recall from the definition that if $CSS(C_i,\partial_i)$ has stabilizer weight $w_i$, then the weight of every row and column of $\partial_i$ is no greater than $w_i$. Since taking the tensor product of an operator with the identity does not increase row or column weight, the stabilizer weight of $\partial_1\otimes I$ is no greater than $w_1$ and the stabilizer weight of $I\otimes \partial_2$ is no greater than $w_2$. Therefore the stabilizer weight of the product boundary operator $\partial=\partial_1\otimes I +I\otimes \partial_2$ must satisfy $w\leq w_1+w_2$, implying that the stabilizer weight of the product code is no greater than the sum of the stabilizer weights of the input codes.
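The relations $n=n_1n_2$, $k=k_1k_2$ and $w\leq w_1+w_2$ can be checked numerically on small examples. The following is a minimal sketch in plain Python; the two input boundary operators `d1` and `d2` are hypothetical toy matrices, chosen only because they square to zero over $\mathbb{F}_2$, and all helper names are illustrative rather than taken from \cite{Bravyi}.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def add_mod2(A, B):
    return [[(a + b) % 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul_mod2(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in cols] for row in A]

def gf2_rank(A):
    """Rank over F_2, treating each row as an integer bitmask."""
    rows = [int("".join(map(str, r)), 2) for r in A]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def homological_dim(d):
    """H(d) = dim ker d - dim im d = n - 2 rank(d), valid when d^2 = 0."""
    return len(d) - 2 * gf2_rank(d)

def max_weight(d):
    """Largest row or column weight of d."""
    return max(max(map(sum, d)), max(map(sum, zip(*d))))

def product_boundary(d1, d2):
    """d = d1 (x) I + I (x) d2 over F_2."""
    return add_mod2(kron(d1, identity(len(d2))), kron(identity(len(d1)), d2))

# Hypothetical toy inputs, chosen only so that d_i^2 = 0 over F_2.
d1 = [[0, 0, 0], [1, 1, 1], [1, 1, 1]]                          # n_1 = 3, k_1 = 1
d2 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]  # n_2 = 4, k_2 = 2
d = product_boundary(d1, d2)

is_complex = all(x == 0 for row in matmul_mod2(d, d) for x in row)
n, k, w = len(d), homological_dim(d), max_weight(d)
print(is_complex, n, k, w)
```

The cross terms $\partial_1\otimes\partial_2$ appear twice in $\partial^2$ and cancel modulo $2$, which is why the product is again a valid boundary operator in this sketch.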

    Calculating the distance $d$ of the product code $CSS(C,\partial)$ is highly nontrivial, and only bounds are known. Let $d^{Z}_i, d^{X}_i$ be the $Z$- and $X$-type distances of $CSS(C_i,\partial_i)$ for $i=1,2$, so that the distances are given by $d_i=min\{d^{Z}_i, d^{X}_i\}$. Without complete proof we state the following relationship:
\[
    max\{d^{\alpha}_1,d^{\alpha}_2\}\leq d^{\alpha} \leq d^{\alpha}_1d^{\alpha}_2 \ \ (\text{for} \ \alpha=X,Z).
\]
    Then the distance of the product code will be given by $d=min\{d^X,d^Z\}$. To justify the upper bound, consider nontrivial cycles $\vec{v}_i\in ker\partial_i\backslash im\partial_i$ of minimum weight. Then $\vec{v}_1\otimes\vec{v}_2\in C$ will be a nontrivial cycle in $ker\partial\backslash im\partial$. Since the weight of a tensor product is the product of the weights, if $\vec{v}_i$ has weight $d^{\alpha}_i$ for type $\alpha$, then $\vec{v}_1\otimes\vec{v}_2$ will have weight $d^{\alpha}_1d^{\alpha}_2$. Thus, $d^{\alpha}\leq d^{\alpha}_1d^{\alpha}_2$. Proving the lower bound is a little more involved, but can be done by invoking the K\"{u}nneth formula. This bound can be interpreted as stating that the distance $d^{\alpha}$ of the product code is no worse than the best of the distances $d^{\alpha}_1$ and $d^{\alpha}_2$ of the input codes.
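For very small complexes the distance bounds can be checked by brute force. The sketch below is an illustration only: it enumerates all vectors, so it is feasible for roughly $n\leq 20$, and the toy boundary operator `d1` is a hypothetical example, not one taken from \cite{Bravyi}. It computes the minimum weight of a vector in $ker\partial\backslash im\partial$; applying the same function to the transposed matrix gives the corresponding cocycle distance.

```python
from itertools import product

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def add_mod2(A, B):
    return [[(a + b) % 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def apply_mod2(d, x):
    """Matrix-vector product over F_2, returned as a tuple."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in d)

def cycle_distance(d):
    """Minimum weight of a nontrivial cycle, i.e. of a vector in ker d \\ im d."""
    n = len(d)
    vectors = list(product((0, 1), repeat=n))
    image = {apply_mod2(d, x) for x in vectors}
    zero = (0,) * n
    return min(sum(x) for x in vectors
               if apply_mod2(d, x) == zero and x not in image)

# Hypothetical toy input with d1^2 = 0 over F_2 (n_1 = 3, k_1 = 1).
d1 = [[0, 0, 0], [1, 1, 1], [1, 1, 1]]
# Product of the complex with itself: 9 "qubits", so 2^9 vectors to enumerate.
d = add_mod2(kron(d1, identity(3)), kron(identity(3), d1))

dist_in = cycle_distance(d1)
dist_prod = cycle_distance(d)
print(dist_in, dist_prod)
```

The printed product distance should fall between $max\{d_1,d_2\}$ and $d_1d_2$ for the chosen input, in line with the bound stated above.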

    The parameters $n$ and $k$ regarding the number of physical and encoded qubits of the product code $CSS(C,\partial)$ are readily determined from the parameters of the input code. On the contrary, the stabilizer weight $w$ and the code distance $d$ of the product code are more subtle matters. Determining exact values for $w$ and $d$ is specific to the more detailed structure of the input boundary operators $\partial_1$ and $\partial_2$ and how they combine to form the product boundary operator $\partial$. However, it will be shown in the following section that with high probability, most homological product codes have a linear distance: $d=\Omega(n)$.


Distance Bounds on $\mathbf{CSS(C,\partial)}$

In order to determine a stronger lower bound on the distance $d$ of a homological product code, it is necessary to make use of probabilistic arguments regarding random chain complexes $(C,\partial)$ with fixed $n=dim(C)$ and homological dimension $k=H(\partial)$. The main objective of this section is to sketch a proof that the homological product of two random complexes whose corresponding CSS codes have linear distance will yield a product code that also has linear distance $d=\Omega(n)$ with high probability. The statistical nature of this argument shows that such a result holds in a ``typical case''. First, what is meant by a random complex will be made more precise. Then, a few logical reductions involved in the analysis of the distance bounds will be discussed. The interested reader can find further details regarding the proof in \cite{Bravyi}.

Random Complexes

A way of generating a random ensemble of complexes satisfying certain properties is discussed in what follows. Fix integers $H$ and $L$, and consider an arbitrary complex $(C,\partial)$ having homological dimension $H(\partial)=H$ and $rank(\partial)=L$. Define a \emph{canonical boundary operator} as the block matrix
\[\hat{\partial}=\begin{pmatrix}0&0&0 \\
0&0&I \\
0&0&0 \\
\end{pmatrix},
\]
where the rows and columns are grouped into blocks of sizes $H$, $L$, $L$. Then it can be shown \cite{Bravyi} that there exists an invertible matrix $U$ such that $\partial=U\hat{\partial}U^{-1}$. In this way, starting with a canonical boundary operator $\hat{\partial}$, a random boundary operator can be generated by choosing a random invertible matrix $U$ having the appropriate dimension $M:=H+2L$. If such a $U$ is chosen uniformly at random, then the resulting boundary operator $\partial=U\hat{\partial}U^{-1}$ will also be distributed uniformly. Then by starting with random complexes $(C,\partial)$, random codes $CSS(C,\partial)$ can be generated.
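The sampling procedure just described can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the invertible matrix $U$ is drawn by rejection sampling (uniformly random binary matrices are redrawn until one is invertible), and the function names are hypothetical, not taken from \cite{Bravyi}.

```python
import random

def matmul_mod2(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in cols] for row in A]

def gf2_rank(A):
    """Rank over F_2, treating each row as an integer bitmask."""
    rows = [int("".join(map(str, r)), 2) for r in A]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def gf2_inverse(U):
    """Gauss-Jordan elimination over F_2; returns None if U is singular."""
    n = len(U)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(U)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c]), None)
        if pivot is None:
            return None
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c and aug[r][c]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def canonical_boundary(H, L):
    """Block matrix with blocks of sizes H, L, L and an identity in block (2, 3)."""
    M = H + 2 * L
    d = [[0] * M for _ in range(M)]
    for i in range(L):
        d[H + i][H + L + i] = 1
    return d

def random_boundary(H, L, rng):
    """Return U d_hat U^{-1} for a uniformly random invertible U over F_2."""
    M = H + 2 * L
    while True:
        U = [[rng.randint(0, 1) for _ in range(M)] for _ in range(M)]
        U_inv = gf2_inverse(U)
        if U_inv is not None:
            break
    return matmul_mod2(matmul_mod2(U, canonical_boundary(H, L)), U_inv)

H, L = 2, 3
d = random_boundary(H, L, random.Random(0))
squares_to_zero = all(x == 0 for row in matmul_mod2(d, d) for x in row)
print(squares_to_zero, len(d) - 2 * gf2_rank(d))  # second value is H(d)
```

Conjugation by $U$ preserves both $\partial^2=0$ and the rank, so the sampled operator keeps the homological dimension $H$ of the canonical one.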

Consider the ``encoding rate'' $H/M$, and the limiting scenario where $H,M\to\infty$ while $H/M$ remains constant. In this limit, a random complex $(C,\partial)$ with $M=dim(C)$ and homological dimension $H(\partial)=H$ will translate to a code $CSS(C,\partial)$ that has linear distance $d=\Omega(M)$ with high probability. This can be argued by observing that the expected number of low weight cycles in $ker\partial$ is small, which implies that with high probability $ker\partial$ does not contain any low weight cycles. Although this result shows that a random code $CSS(C,\partial)$ tends to have linear distance, it is not immediately obvious that the homological product of two random codes will also have linear distance. More work must be done in order to determine this.

Random Product Complexes

Here, the notion of random complexes discussed in the previous section is extended to random product complexes. Consider two canonical boundary operators $\hat{\partial}_1$ and $\hat{\partial}_2$, and two invertible matrices $U_1$ and $U_2$ chosen independently at random. Then two random boundary operators can be constructed as
\[
\partial_1=U_1\hat{\partial}_1U^{-1}_1 \ \ \text{and} \ \ \partial_2=U_2\hat{\partial}_2U^{-1}_2,
\]
and the boundary operator $\partial=\partial_1\otimes I + I\otimes \partial_2$ resulting from the homological product will be randomly distributed. Equivalently, a \emph{canonical boundary operator} can be defined as
\[
\hat{\partial}=\hat{\partial}_1\otimes I + I\otimes \hat{\partial}_2,
\]
and a random boundary operator can then be constructed through the transformation
\[
\partial=(U_1\otimes U_2)\hat{\partial}(U^{-1}_1\otimes U^{-1}_2),
\]
where $U_1$ and $U_2$ are random invertible matrices. Note that from these relations it readily follows that
\[ker\partial=(U_1\otimes U_2)ker\hat{\partial} \ \ \text{and} \ \ im\partial=(U_1\otimes U_2)im\hat{\partial}.\]

The Event $\mathbf{E_c}$

In what follows, we consider two random complexes $(C_i,\partial_i)$ (as discussed in the previous section) having the same size $M=dim(C_i)$ and homological dimension $H=H(\partial_i)$ for $i=1,2$. Let $r=H/M$ be the encoding rate. The random complex $(C,\partial)=(C_1,\partial_1)\times(C_2,\partial_2)$ then has size $n=M^2$. The desired result pertaining to the code distance of the random product code $CSS(C,\partial)$ can then be stated as follows: for sufficiently small constants $c,r>0$, the random product code has distance $d>cM^2$ with high probability. Hence, for this to be the case it must be shown that every nontrivial cycle $\vec{v}\in ker\partial\backslash im\partial$ satisfies $weight(\vec{v})>cM^2$ with high probability.

In this regard, for some constant $c$, define the event
\[E_c=\{\exists \vec{v}\in ker\partial\backslash im\partial \mid weight(\vec{v})<cM^2\}.
\]
Then if it can be shown that the probability $Pr[E_c]<1/2$ for sufficiently small $c$ and large enough $M$, the desired result will follow. The objective then comes down to bounding the probability $Pr[E_c]$ from above by an amount that is exponentially small in $M$. Note that this result can be equivalently obtained by considering the complementary event
\[\overline{E}_c=\{weight(\vec{v})\geq cM^2 \ \ \text{for all} \ \ \vec{v}\in ker\partial\backslash im\partial\},\]
and showing instead that the probability $Pr[\overline{E}_c]>1/2$ and that it tends to $1$ in the limiting case. However, in the analysis that follows it will be preferable to work with $E_c$ as opposed to $\overline{E}_c$.

As described above, consider the random product complex that results from taking the canonical bipartite boundary operator $\hat{\partial}$ and transforming it to $\partial=(U_1\otimes U_2)\hat{\partial}(U^{-1}_1\otimes U^{-1}_2)$ for random invertible matrices $U_1$ and $U_2$. Recall that $ker\partial=(U_1\otimes U_2)ker\hat{\partial}$. Then the probability of interest can be bounded by
\[Pr[E_c]\leq \displaystyle\sum^{}_{\vec{v}\in ker\hat{\partial}\backslash im\hat{\partial}} Pr[weight( \ (U_1\otimes U_2)\vec{v} \ )<cM^2].\]

Since $\vec{v}\in C=C_1\otimes C_2$, interpret $\vec{v}$ as a $M\times M$ matrix $v$ with rows corresponding to the space $C_1$ and columns corresponding to $C_2$. Throughout, we will freely interpret $\vec{v}\in C$ as either a vector or a matrix depending on context. In the latter situation when the vector $\vec{v}$ is to be interpreted as a matrix it will be written as $v$ without the $ \vec{} $ designation. Suppose the matrix $v$ has rank $R$. Then $(U_1\otimes U_2)v$ will be distributed uniformly over all rank $R$ matrices. Now let $\eta(R)$ be the number of rank $R$ "matrices" of size $M\times M$ in $ker\hat{\partial}\backslash im\hat{\partial}$, and $Pr[R]$ be the probability that such a matrix has weight less than $cM^2$. Then
\[Pr[E_c]\leq \displaystyle\sum^{}_{R\geq 1} \eta(R)Pr[R].\]
Although the quantities $\eta(R)$ and $Pr[R]$ are relatively straightforward to compute, expressing the bound on the probability $Pr[E_c]$ in this way results in a sum that is exponentially large in $M$.

This failure in attempting to bound the probability $Pr[E_c]$ is partly due to the possibility that one of the input codes may be bad. If this is the case, then the resulting product code will have exponentially many low weight cycles. In a certain sense, bounding the probability as above exponentially amplifies bad choices of the input codes. In order to overcome this issue, it will be necessary to define a more refined condition that is only satisfied when both input codes are good.

Uniform Low Weight Condition
For some $M\times M$ matrix $A$, say that $A$ has \emph{uniform low weight} with constant $c$ if and only if every row and column of $A$ has weight at most $cM$. To denote that $A$ has uniform low weight, we will write $A$ has $ULW(c)$ for short.

Motivated by the uniform low weight condition, for a constant $c>0$ define the event
\[
E^{ULW}_c=\{\exists \vec{v}\in ker\partial\backslash \vec{0} \mid v \ \ \text{has} \ \ ULW(c)\},
\]
and now let $\eta(R)$ be the number of rank $R$ matrices in $ker\partial\backslash \vec{0}$ with $Pr[R]$ denoting the probability that a random rank $R$ matrix of size $M$ has $ULW(c)$. Then
\[
Pr[E^{ULW}_c]\leq \displaystyle\sum^{}_{R\geq 1} \eta(R)Pr[R].
\]
Although this time the sum bounding the probability is exponentially small in $M$ as desired, the issue now is that the original low weight event $E_c$ does not imply this uniform low weight event $E^{ULW}_c$. Hence, attempting to bound the probability in this way is also insufficient for our purposes.

To further refine the analysis, let $t$ be some constant such that $c<t<1$. Suppose that $\vec{v}\in ker\partial$ is a nontrivial cycle with $weight(\vec{v})<cM^2$. Observe that, as a matrix, $v$ has at least $(1-t)M$ rows and at least $(1-t)M$ columns with weight at most $\frac{c}{t}M$, since otherwise the total weight would exceed $cM^2$. Let $M'=(1-t)M$, and consider the \emph{reduced matrix} $v_{red}$ of size $M'$ that consists of $M'$ such rows and $M'$ such columns. Then $v_{red}$ has $ULW(c')$ where $c'=\frac{c}{t(1-t)}$.
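The counting argument behind the reduced matrix can be made concrete with a short sketch. The code below is illustrative only: the function name `reduced_matrix` is hypothetical, the test matrix is a random low weight matrix rather than an actual cycle, and when more than $M'$ light rows or columns are available the first $M'$ are kept, which is an arbitrary choice made for this example.

```python
import random
from fractions import Fraction

def reduced_matrix(v, c, t):
    """Keep M' = (1 - t) M rows and columns of weight at most (c / t) M."""
    M = len(v)
    assert sum(map(sum, v)) < c * M * M, "v must have weight below c M^2"
    threshold = c / t * M
    M_red = int((1 - t) * M)
    light_rows = [i for i in range(M) if sum(v[i]) <= threshold]
    light_cols = [j for j in range(M) if sum(row[j] for row in v) <= threshold]
    # Fewer light rows would force more than t M rows of weight above
    # (c / t) M, and hence a total weight above c M^2.
    assert len(light_rows) >= M_red and len(light_cols) >= M_red
    rows, cols = light_rows[:M_red], light_cols[:M_red]
    return [[v[i][j] for j in cols] for i in rows]

# Exact rational parameters avoid floating point issues at the thresholds.
M, c, t = 20, Fraction(1, 20), Fraction(1, 2)
rng = random.Random(1)
v = [[0] * M for _ in range(M)]
for p in rng.sample(range(M * M), 19):  # weight 19 < c M^2 = 20
    v[p // M][p % M] = 1

v_red = reduced_matrix(v, c, t)
M_red = len(v_red)
c_red = c / (t * (1 - t))  # the constant c' from the text
row_ok = all(sum(row) <= c_red * M_red for row in v_red)
col_ok = all(sum(col) <= c_red * M_red for col in zip(*v_red))
print(M_red, row_ok, col_ok)
```

Restricting to a submatrix can only lower row and column weights, so the bound $\frac{c}{t}M=c'M'$ on the selected rows and columns carries over to $v_{red}$.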

The relevance of this reduced submatrix satisfying the uniform low weight condition is made apparent through the following lemma proved in \cite{Bravyi} using results from \cite{Terhal}. If the two random input codes have distances greater than $M-M'+1$, and $\vec{v}\in ker\partial$ is a cycle such that $v$ has a vanishing reduced matrix $v_{red}$, then $\vec{v}\in im\partial$ is a trivial cycle. This statement can be equivalently interpreted as saying: if $\vec{v}\in ker\partial$ is a nontrivial cycle, then $v$ must contain a nonvanishing reduced matrix $v_{red}$. Note that a similar result holds for cocycles.

By defining a reduced low weight event
\[E^{RULW}_c=\{\exists \vec{v}\in ker\partial\backslash \vec{0} \mid v_{red} \ \ \text{has}   \ \ ULW(c)\},\]
and applying the aforementioned lemma, it is seen that the originally defined event $E_c$ implies the event $E^{RULW}_c$. Therefore $Pr[E_c]\leq Pr[E^{RULW}_c]$, and any upper bound on $Pr[E^{RULW}_c]$ yields an upper bound on $Pr[E_c]$. Now let $\eta(R)$ be the number of reduced submatrices $v_{red}$ that can be extended to a valid matrix $v$ that corresponds to a cycle $\vec{v}\in ker\partial$, and let $Pr[R]$ be the probability that such a rank $R$ matrix of size $M'$ has $ULW(c)$. Then,
\[
Pr[E^{RULW}_c]\leq \displaystyle\sum^{}_{R\geq 1} \eta(R)Pr[R].
\]

In \cite{Bravyi}, bounds on the quantity $\eta(R)$ are calculated to be
\[
\eta(R)\leq O(1)\cdot 2^{(M+H)R-R^2} \ \ \text{if} \ \ R\leq H
\]
and
\[\eta(R)\leq O(1)\cdot 2^{(M+H/2)R-R^2/2} \ \ \text{if} \ \ R\geq H,\]
where $H=H(\partial_1)=H(\partial_2)$ is the homological dimension of the input codes used to construct the product code. Furthermore, it is calculated that the probability $Pr[R]$ is bounded by
\[Pr[R]\leq O(1)\cdot 2^{R^2-2(1-\epsilon)MR},\]
for any $\epsilon>0$ provided that $c$ is chosen accordingly.
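
As a rough consistency check, the exponents of the two quoted bounds can be combined directly. This is only a sketch that takes the stated bounds at face value and assumes $R\leq M$; it is not the argument given in \cite{Bravyi}. Writing $r=H/M$, for $R\leq H$ the $R^2$ terms cancel:
\[
\eta(R)Pr[R]\leq O(1)\cdot 2^{(M+H)R-2(1-\epsilon)MR}=O(1)\cdot 2^{-(1-r-2\epsilon)MR},
\]
while for $R\geq H$, using $R^2/2\leq MR/2$,
\[
\eta(R)Pr[R]\leq O(1)\cdot 2^{(M+H/2)R+R^2/2-2(1-\epsilon)MR}\leq O(1)\cdot 2^{-\frac{1}{2}(1-r-4\epsilon)MR}.
\]
If $r+4\epsilon<1$, both exponents are at most $-aMR$ with $a=\frac{1}{2}(1-r-4\epsilon)>0$, so the sum over $R\geq 1$ is dominated by a geometric series:
\[
\displaystyle\sum^{}_{R\geq 1}\eta(R)Pr[R]\leq O(1)\cdot\frac{2^{-aM}}{1-2^{-aM}}=O(2^{-aM}).
\]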

These two results imply that $Pr[E^{RULW}_c]$, and hence $Pr[E_c]$, are bounded above by an exponentially small quantity in $M$. Therefore, with high probability the product code will not contain nontrivial cycles having weight less than $cM^2$. Since the same argument holds for the cocycles of the code, this implies that the product code of size $n=M^2=dim(C)$ has linear distance $d=\Omega(n)$ with high probability as desired.

Conclusion

We have seen here how chain complexes from homology theory can naturally be used to define CSS codes, and how the homological product of two complexes can be used to define a product code with nice parameters. Code parameters for the CSS codes resulting from both a single chain complex and a product of two complexes were derived. By considering random complexes, it was argued that a single random complex yields a good code with linear distance with high probability. Through a more involved analysis it was also shown that under some typical conditions, the code that results through the homological product will also have linear distance with high probability. In summary, if $[[n_i,k_i,d_i,w_i]]$ are the parameters of two CSS codes constructed from two chain complexes, then the parameters of the resulting product code are given by $[[n_1n_2,k_1k_2,d, w]]$ where in general $d\leq d_1d_2$ and $w\leq w_1+w_2$. By taking a pair of random CSS codes with $n_1=n_2$ physical qubits, $k_1=k_2$ logical qubits such that $k_i=cn_i$ for some small constant $c$, the resulting product code on $n=n_1n_2$ physical qubits has (with high probability) parameters of the form $[[n,\Omega(n),\Omega(n), O(\sqrt{n})]]$. Thus, homological product codes are an example of good quantum codes, which are almost LDPC.

Transversal gates in permutation-invariant codes and CSS codes

2014-03-29T22:48:00.000-07:00

Let $S_n$ be the group of permutations of the $n$-element set $[n]:=\{1 , 2, \dots, n\}$, and let $\pi\in S_n$. Define the unitary $U_\pi: (\mathbb{C}^2)^{\otimes n}\to(\mathbb{C}^2)^{\otimes n}$ by
\[
U_\pi(\ket{\varphi_1}\otimes\cdots\otimes\ket{\varphi_n})=\ket{\varphi_{\pi^{-1}(1)}}\otimes\cdots\otimes\ket{\varphi_{\pi^{-1}(n)}}
\]
for all product states $\ket{\varphi_1}\otimes\cdots\otimes\ket{\varphi_n}$ (and extended linearly to all of $(\mathbb{C}^2)^{\otimes{n}}$). Denote by $(i j)$ the transposition of $i,j\in[n]$ for $i\neq j$, and define the subspace
\[
Q_n=\{\ket{\Psi}\in(\mathbb{C}^2)^{\otimes{n}} \mid U_{(i j)}\ket{\Psi}=\ket{\Psi} \ \text{for all} \ i\neq j \}.
\]

Let $\vec{b}=b_1b_2\dots b_n \in\{0,1\}^n$ with $b_i\in\{0,1\}$, and let $\ket{\vec{b}}\in(\mathbb{C}^2)^{\otimes n}$ be a $n$-qubit computational basis state. More explicitly, $\ket{\vec{b}}=\bigotimes_{i=1}^{n}\ket{b_i}$, describes the $n$-qubit state where each qubit is either in the state $\ket{0}$ or $\ket{1}$. Now define the map
\[
\omega: \{0,1\}^n\to \{0,1,\dots,n\} \ \ \text{given as} \ \ \omega(\vec{b})=\SUM{i=1}{n}b_i,
\]
which simply counts the number of $1$s appearing in the bit string $\vec{b}$, and call $\omega(\vec{b})$ the \emph{weight} of $\vec{b}$.

It was shown in a previous post that a basis of $Q_n$ is given by the $n+1$ states of the form

\[
\ket{\omega_k}=\frac{1}{\sqrt{\binom{n}{k}}}\SUM{ \ \ \vec{b} \ : \ \omega(\vec{b})=k}{}\ket{\vec{b}},
    \]

where each basis state $\ket{\omega_k}$, for $k\in\{0,1,\dots, n\}$, consists of an equally weighted superposition of the $\binom{n}{k}$ computational basis states $\ket{\vec{b}}$ with weight $\omega(\vec{b})=k$.

Consider the two dimensional subspace of $Q_n$ which is spanned by the following two states:
\[\begin{align*}
\ket{\overline{0}}&:=\ket{\omega_0}=\TENSOR{i=1}{n}\ket{0}=\ket{\vec{0}}, \\
\ket{\overline{1}}&:=\frac{1}{\sqrt{2^n-1}}\SUM{k=1}{n}\sqrt{\binom{n}{k}}\ket{\omega_k} \\
&=\frac{1}{\sqrt{2^n-1}}\SUM{\vec{b}\neq\vec{0}}{}\ket{\vec{b}},
\end{align*} \]
and note that $\ip{\overline{0}}{\overline{1}}=0$. Moreover, since each of $\ket{\overline{0}}$ and $\ket{\overline{1}}$ are given by superpositions of basis states of $Q_n$ each state is left invariant under the action of $U_{(ij)}$ for any transposition $(ij)\in S_n$. Therefore, the two dimensional subspace spanned by $\ket{\overline{0}}$ and $\ket{\overline{1}}$ is a valid subcode of $Q_n$.

Now, observe the action of the transversal Hadamard gate applied to each of the $n$ qubits of the state $\ket{\overline{0}}$

\[ \begin{align*}
H^{\otimes n}\ket{\overline{0}}&= H^{\otimes n}\ket{\vec{0}} \\
&=\frac{1}{\sqrt{2^n}}\SUM{\vec{b}\in\{0,1\}^n}{}\ket{\vec{b}} \\
&=\frac{1}{\sqrt{2^n}}\ket{\vec{0}}+\frac{1}{\sqrt{2^n}}\SUM{\vec{b}\neq \vec{0}}{}\ket{\vec{b}} \\
&=\frac{1}{\sqrt{2^n}}\ket{\vec{0}}+\frac{\sqrt{2^n-1}}{\sqrt{2^n}}\left(\frac{1}{\sqrt{2^n-1}}\SUM{\vec{b}\neq \vec{0}}{}\ket{\vec{b}}\right) \\
&=\sqrt{2^{-n}}\ket{\overline{0}}+\sqrt{1-2^{-n}}\ket{\overline{1}}. \\
\end{align*}\]

Instead, observe the action of $H^{\otimes n}$ on the state $\ket{\overline{1}}$:
\[ \begin{align*}
H^{\otimes n}\ket{\overline{1}}&= \frac{1}{\sqrt{2^n-1}}\SUM{\vec{b}\neq\vec{0}}{}H^{\otimes n}\ket{\vec{b}} \\
&= \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\in\{0,1\}^n}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}} \\
&= \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{0}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}} \\
&= \frac{2^n-1}{\sqrt{2^n-1}\sqrt{2^n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}} \\
&= \sqrt{1-2^{-n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}} \\
&= \sqrt{1-2^{-n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{c}\neq\vec{0}}{}\left(\SUM{ \ \vec{b}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\right)\ket{\vec{c}} \\
&= \sqrt{1-2^{-n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n}}\frac{1}{\sqrt{2^n-1}}\SUM{\vec{c}\neq\vec{0}}{}\left(-1\right)\ket{\vec{c}} \\
    &= \sqrt{1-2^{-n}}\ket{\overline{0}} - \sqrt{2^{-n}}\ket{\overline{1}}. \\
\end{align*}\]

Thus, in summary the action of $H^{\otimes n}$ on $\ket{\overline{0}}$ and $\ket{\overline{1}}$ is given by:
\[\begin{align*}
\ket{\overline{0}}&\overset{H^{\otimes n}}\mapsto \sqrt{2^{-n}}\ket{\overline{0}}+\sqrt{1-2^{-n}}\ket{\overline{1}} \\
\ket{\overline{1}}&\overset{H^{\otimes n}}\mapsto \sqrt{1-2^{-n}}\ket{\overline{0}}-\sqrt{2^{-n}}\ket{\overline{1}},
\end{align*}\]

which corresponds to the logical operation $\overline{U}$ whose matrix in the logical basis is
\[
\overline{U}=\begin{pmatrix}
\sqrt{2^{-n}} & \sqrt{1-2^{-n}}\\
\sqrt{1-2^{-n}}&-\sqrt{2^{-n}} \\
\end{pmatrix}.
\]
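
The matrix $\overline{U}$ can be checked numerically for a small number of qubits. The following plain Python sketch uses $n=3$ (an arbitrary choice for illustration) and applies $H^{\otimes n}$ through a fast Walsh-Hadamard transform; the helper names are hypothetical.

```python
from math import sqrt, isclose

def hadamard_all(state):
    """Apply H to every qubit of a state vector with 2**n real amplitudes."""
    out = list(state)
    h = 1
    while h < len(out):
        for i in range(0, len(out), 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = (a + b) / sqrt(2), (a - b) / sqrt(2)
        h *= 2
    return out

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

n = 3
dim = 2 ** n
zero_L = [1.0] + [0.0] * (dim - 1)                 # |0...0>
one_L = [0.0] + [1 / sqrt(dim - 1)] * (dim - 1)    # uniform over all b != 0
basis = [zero_L, one_L]

# Matrix elements <i_L| H^{(x) n} |j_L> in the logical basis.
U_bar = [[inner(basis[i], hadamard_all(basis[j])) for j in range(2)]
         for i in range(2)]
expected = [[sqrt(2 ** -n), sqrt(1 - 2 ** -n)],
            [sqrt(1 - 2 ** -n), -sqrt(2 ** -n)]]

# Each column has unit norm exactly when the image stays inside the subcode.
column_norms = [U_bar[0][j] ** 2 + U_bar[1][j] ** 2 for j in range(2)]
print(U_bar, column_norms)
```

Unit column norms indicate that $H^{\otimes n}$ maps the two dimensional subcode to itself, so no amplitude leaks into the rest of $Q_n$.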

Transitioning now into the setting of CSS codes, let $H_1, H_2$ be parity check matrices such that $Q=CSS(H_1,H_2)$ is a $[[n,k,d]]$-CSS code. The matrix of stabilizers for this code is given by the binary symplectic matrix
\[
S= \left(\begin{array}{c | c}
0&H_1 \\
H_2 & 0
\end{array}\right)
\]

The action of conjugation by a transversal Hadamard applied to the symplectic matrix $S$ transforms it to
    \[
S'=H^{\otimes n}SH^{\otimes n \dagger} = \left(\begin{array}{c | c}
0&H_2 \\
H_1 & 0
\end{array}\right)
\]

Suppose that $H^{\otimes n}$ preserves the code space so that $H^{\otimes n}Q=Q$. This action must send codewords of $Q$ to codewords of $Q$ and only permute the set of stabilizers of the code. Since $H^{\otimes n}$ effectively turns $X$-type generators into $Z$-type generators, then it must be the case that any row $r_i$ of $H_1$ lies in the span of the rows of $H_2$, and likewise that any row $r'_i$ of $H_2$ lies in the span of the rows of $H_1$. That is, $Row(H_1)\subseteq Row(H_2)$ and $Row(H_2)\subseteq Row(H_1)$, which is equivalent to the condition that
\[
Row(H_1)=Row(H_2),
\]
where $Row(H_i)$ represents the space spanned by the rows of $H_i$.

Now, suppose instead that $Row(H_1)=Row(H_2)$. This implies that a row of $H_1$ (or $H_2$) can be expressed in terms of the rows of $H_2$ (or $H_1$). Then after the action of a transversal Hadamard $H^{\otimes n }$, each $X$-type (or $Z$-type) stabilizer will be transformed into a $Z$-type ($X$-type) stabilizer that can be generated by the original set of $Z$-type ($X$-type) stabilizers. Hence, the action of $H^{\otimes n}$ preserves the codespace: $H^{\otimes n}Q=Q$.

Therefore, the condition that $Row(H_1)=Row(H_2)$ is both a necessary and sufficient condition for $H^{\otimes n}Q=Q$. If $H_1=H_2$, then trivially $Row(H_1)=Row(H_2)$ so that $H^{\otimes n}Q=Q$.
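
The condition $Row(H_1)=Row(H_2)$ is easy to test mechanically, since two row spaces coincide exactly when stacking the matrices does not increase the rank over $\mathbb{F}_2$. The sketch below is illustrative; it uses the parity check matrix of the $[7,4]$ Hamming code (which gives the Steane code when used for both $H_1$ and $H_2$) as an example input, and the function names are hypothetical.

```python
def gf2_rank(A):
    """Rank over F_2, treating each row as an integer bitmask."""
    rows = [int("".join(map(str, r)), 2) for r in A]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def same_row_space(H1, H2):
    """Row(H1) = Row(H2) iff rank(H1) = rank(H2) = rank of the stacked matrix."""
    r1, r2 = gf2_rank(H1), gf2_rank(H2)
    return r1 == r2 == gf2_rank(H1 + H2)  # list concatenation stacks the rows

hamming = [[1, 0, 1, 0, 1, 0, 1],
           [0, 1, 1, 0, 0, 1, 1],
           [0, 0, 0, 1, 1, 1, 1]]
# A different generating set for the same row space (row 1 replaced by row 1 + row 2).
hamming_alt = [[(a + b) % 2 for a, b in zip(hamming[0], hamming[1])],
               hamming[1], hamming[2]]

preserved = same_row_space(hamming, hamming_alt)
# A pair of check matrices with different row spaces, for contrast.
not_preserved = same_row_space([[1, 1, 0, 0]], [[0, 0, 1, 1]])
print(preserved, not_preserved)
```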

Suppose that $H=H_1=H_2$ and $n$ is odd. Let $Row(H)$ be the space spanned by the rows of $H$, and assume that $|v|=\sum_{j=1}^{n}v_j=0 \ (mod \ 2)$ for every $v\in Row(H)$. Note that this implies that $\vec{1}\in Row(H)^\perp\backslash Row(H)$, where $\vec{1}=(1,\dots,1)$ ($n$ times).

Therefore, the operators given by $X^{\otimes n}$ and $Z^{\otimes n}$ are in the normalizer of the stabilizer for $Q$, because each stabilizer will commute with $X^{\otimes n}$ and $Z^{\otimes n}$ as the size of the intersection of the supports of any stabilizer with either $X^{\otimes n}$ or $Z^{\otimes n}$ is even. Moreover, $X^{\otimes n}$ and $Z^{\otimes n}$ are not in the stabilizer since every stabilizer acts nontrivially only on an even number of physical qubits whereas $X^{\otimes n}$ and $Z^{\otimes n}$ act on all $n$ qubits (where $n$ is odd). Hence, the operators $X^{\otimes n}$ and $Z^{\otimes n}$ yield a logical operation on some encoded qubit of $Q$. Without loss of generality associate these logical operations to act on the first encoded qubit:
\[\begin{align*}
\overline{X}_1&=X^{\otimes n} \\
\overline{Z}_1&=Z^{\otimes n}
\end{align*}\]

Consider then the logical basis states for the $1$st encoded qubit: $\ket{\overline{0}}_1$ and $\ket{\overline{1}}_1$. Within the code space, the projectors onto these states can be expressed as:
\[\begin{align*}
\ket{\overline{0}}_1\bra{\overline{0}}_1&=\frac{I+\overline{Z}}{2} \\
\ket{\overline{1}}_1\bra{\overline{1}}_1&=\frac{I-\overline{Z}}{2}.
\end{align*}\]

Then since $HXH^\dagger=Z$ and $HZH^\dagger=X$, this implies that $H^{\otimes n}\overline{X}H^{\otimes n \dagger}=\overline{Z}$ and $H^{\otimes n}\overline{Z}H^{\otimes n \dagger}=\overline{X}$. Therefore,
\[\begin{align*}
H^{\otimes n}\left(\frac{I+\overline{Z}}{2}\right)H^{\otimes n \dagger}&=\frac{I+\overline{X}}{2}=\ket{\overline{+}}_1\bra{\overline{+}}_1 \\
H^{\otimes n}\left(\frac{I-\overline{Z}}{2}\right)H^{\otimes n \dagger}&=\frac{I-\overline{X}}{2}=\ket{\overline{-}}_1\bra{\overline{-}}_1,
\end{align*}\]
where $\ket{\overline{+}}=\frac{1}{\sqrt{2}}(\ket{\overline{0}}_1+\ket{\overline{1}}_1)$ and $\ket{\overline{-}}=\frac{1}{\sqrt{2}}(\ket{\overline{0}}_1-\ket{\overline{1}}_1)$. Hence, the action of $H^{\otimes n}$ on the logical computational basis of the first encoded qubit is given by a logical Hadamard on that encoded qubit:
\[ \begin{align*}
\ket{\overline{0}}&\overset{\overline{H}}\mapsto\ket{\overline{+}},\\
\ket{\overline{1}}&\overset{\overline{H}}\mapsto\ket{\overline{-}}.
\end{align*}\]

In regards to the action of $H^{\otimes n}$ on the rest of the $k-1$ encoded qubits, recall that $H^{\otimes n}$ preserves the code space. Therefore, $H^{\otimes n}$ merely permutes the individual stabilizers. More generally, $H^{\otimes n}$ is in the Clifford group $C_n$. Thus, it must be the case that $H^{\otimes n}=\overline{H}\otimes \overline{C}$, where $\overline{C}$ is some logical Clifford operation since $H^{\otimes n}$ is a Clifford operation.

Probabilistic implementation of gates by cyclic permutation and measurement

2014-03-22T22:38:00.000-07:00

Given an $n$-qubit Pauli operator $P=P_1\otimes \cdots \otimes P_n$, with $P_j\in\{I,X,Y,Z\}$, let $\eta(P)=P_2\otimes P_3\otimes\cdots\otimes P_n\otimes P_1$ be the Pauli operator obtained by cyclically permuting the factors. Let $S$ be a stabilizer with generators $\{M_j\}_j$, and let $\{\overline{X}_l,\overline{Z}_l\}_l$ be generators of $N(S)/S$. Consider the stabilizer $\eta(S)$ defined by generators $\{M'_j:=\eta(M_j)\}_j$, and the associated generators $\{\overline{X}'_l=\eta(\overline{X}_l), \overline{Z}'_l=\eta(\overline{Z}_l)\}_l$ of $N(\eta(S))/\eta(S)$.
Consider the following process:

Start with some encoded state $\ket{\overline{\Psi}}$ for $S$.
Measure all the stabilizer generators of $\eta(S)$.
Output `success' and the post-measurement state if all outcomes are $+1$.

Let $S$ be the stabilizer of the $[[5,1,3]]$-code, and let $\ket{\overline{\Psi}}\in\mathcal{T}(S)$ be an arbitrary encoded state.

The stabilizer generators $S=\langle M_1,M_2,M_3,M_4\rangle$ for this 5-qubit code are given by
\[\begin{align*}
M_1=&X \ Z \ Z \ X \ I \\
M_2=& I \ X \ Z \ Z \ X \\
M_3=& X \ I \ X \ Z \ Z \\
M_4=& Z \ X \ I \ X \ Z ,
\end{align*}\]
and so the resulting generators for $\eta(S)=\langle M'_1,M'_2,M'_3,M'_4\rangle$ are given by
\[\begin{align*}
M'_1=&Z \ Z \ X \ I \ X \\
M'_2=&X \ Z \ Z \ X \ I \\
M'_3=& I \ X \ Z \ Z \ X \\
M'_4=& X \ I \ X \ Z \ Z. \\
\end{align*}\]

Observe that $M'_2=M_1$, $M'_3=M_2$, $M'_4=M_3$, and that $M'_1=M_1M_2M_3M_4$. Thus, each of the generators $M'_j$ can be expressed in terms of the generators of $S$, which implies that $\eta(S)=S$. Since each $M\in S=\eta(S)$ satisfies $M\ket{\overline{\Psi}}=\ket{\overline{\Psi}}$, measuring the stabilizer generators $M'_j$ will always yield an outcome corresponding to $+1$, and the state will be left invariant as $\ket{\overline{\Psi}}$. In regards to the procedure outlined above, the success probability will be $1$, and the post-measurement state will be $\ket{\overline{\Psi}}$.
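
The claim $\eta(S)=S$ can be double checked in the binary symplectic picture, where a Pauli string is mapped to a bit vector $(x|z)$ and products of Paulis become sums modulo $2$. The sketch below is illustrative and tracks operators only up to sign and phase, so it confirms membership in the group generated by the $M_j$ up to a phase rather than the sign itself; the function names are hypothetical.

```python
def to_symplectic(pauli):
    """Map a Pauli string to its (x | z) bit vector, ignoring the overall phase."""
    x = [int(p in "XY") for p in pauli]
    z = [int(p in "ZY") for p in pauli]
    return x + z

def eta(pauli):
    """Cyclic permutation P_1 P_2 ... P_n -> P_2 ... P_n P_1."""
    return pauli[1:] + pauli[0]

def gf2_rank(A):
    """Rank over F_2, treating each row as an integer bitmask."""
    rows = [int("".join(map(str, r)), 2) for r in A]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

gens = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
shifted = [eta(g) for g in gens]

S = [to_symplectic(g) for g in gens]
S_shifted = [to_symplectic(g) for g in shifted]

# Adding the shifted generators must not increase the rank if eta(S) = S.
rank_S = gf2_rank(S)
rank_both = gf2_rank(S + S_shifted)

# Product M_1 M_2 M_3 M_4 as a bitwise sum of the symplectic vectors.
product_all = [sum(col) % 2 for col in zip(*S)]
print(shifted, rank_S, rank_both, product_all == S_shifted[0])
```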

Consider the set of stabilizers $S=\langle M_1,M_2\rangle$ and generators of $N(S)/S=\langle\overline{X}_1,\overline{Z}_1,\overline{X}_2,\overline{Z}_2\rangle$ given by
\[\begin{align*}
M_1=&X \ X \ X \ I \\
M_2=& I \ Z \ Z \ I \\
\overline{X}_1=& I \ X \ X \ I \\
\overline{Z}_1=& Z \ I \ Z \ I \\
\overline{X}_2=& I \ I \ I \ X \\
\overline{Z}_2=&I \ I \ I \ Z.
\end{align*}\]

Then the stabilizer generators of $\eta(S)=\langle M'_1,M'_2\rangle$ are given by
\[\begin{align*}
M'_1=&X \ X \ I \ X \\
M'_2=& Z \ Z \ I \ I. \\
\end{align*}\]
It is readily seen that $[M'_1,M_1]=0$ but $\{M'_1,M_2\}=0$, while $[M'_2,M_i]=0$ for $i=1,2$. This implies that $M'_1\notin N(S)$, and that $M'_2\in N(S)$ but $M'_2\notin S$.
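
Commutation relations between Pauli strings such as these can be checked mechanically: two strings commute exactly when they differ on an even number of positions at which both act nontrivially. The following is a small illustrative sketch, with a hypothetical helper name, that prints the full table for this example.

```python
def commute(p, q):
    """True if the Pauli strings p and q commute, False if they anticommute."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

stabilizers = {"M1": "XXXI", "M2": "IZZI"}
logicals = {"X1": "IXXI", "Z1": "ZIZI", "X2": "IIIX", "Z2": "IIIZ"}
permuted = {"M1'": "XXIX", "M2'": "ZZII"}

# Commutation of each permuted generator with every stabilizer and logical operator.
table = {(a, b): commute(p, q)
         for a, p in permuted.items()
         for b, q in {**stabilizers, **logicals}.items()}
for (a, b), value in table.items():
    print(a, b, "commute" if value else "anticommute")
```

A permuted generator lies in $N(S)$ exactly when it commutes with every stabilizer generator of $S$, which can be read off from the first rows of the printed table.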

Consider the set of stabilizer generators of $S=\langle M_1,M_2\rangle$ and generators of $N(S)/S=\langle\overline{X}_1,\overline{Z}_1,\overline{X}_2,\overline{Z}_2\rangle$ given by
\[\begin{align*}
M_1 =& X \ X \ X \ I \\
M_2 =& I \ Z \ Z \ I \\
\overline{X}_1=& I \ X \ X \ I \\
\overline{Z}_1=& Z \ I \ Z \ I \\
\overline{X}_2=& I \ I \ I \ X \\
\overline{Z}_2=& I \ I \ I \ Z, \\
\end{align*}\]

and the generators of $\eta(S)$ and $N(\eta(S))/\eta(S)$ given by

\[\begin{align*}
M'_1 =& X \ X \ I \ X \\
M'_2 =& Z \ Z \ I \ I \\
\overline{X}'_1=& X \ X \ I \ I \\
\overline{Z}'_1=& I \ Z \ I \ Z \\
\overline{X}'_2=& I \ I \ X \ I \\
\overline{Z}'_2=& I \ I \ Z \ I. \\
\end{align*}\]

Suppose the initial state is an encoded state $\ket{\overline{\Psi}}=\ket{\overline{+}}_1\otimes\ket{\overline{0}}_2$. Here, $\ket{\overline{+}}_1$ is interpreted as an encoded state of $3$ physical qubits, and the other encoded state $\ket{\overline{0}}_2$ in the tensor product is an encoded state represented by a single qubit.

More explicitly, consider the state $\frac{1}{\sqrt{2}}(\ket{000}+\ket{111})$ and observe that
\[
\overline{X}_1\frac{1}{\sqrt{2}}(\ket{000}+\ket{111})=\frac{1}{\sqrt{2}}(\ket{011}+\ket{100}),
\]
which suggests the encoded basis is given by
\[
\ket{\overline{0}}_1= \frac{1}{\sqrt{2}}(\ket{000}+\ket{111}) \ \ \text{and} \ \ \ket{\overline{1}}_1=\frac{1}{\sqrt{2}}(\ket{011}+\ket{100}).
\]
For the second encoded qubit, it must be that $\ket{\overline{0}}_2=\ket{0}$ and $\ket{\overline{1}}_2=\ket{1}$.

Hence, the initial encoded state is of the form
\[ \begin{align*}
\ket{\overline{\Psi}}=&\ket{\overline{+}}_1\ket{\overline{0}}_2 \\
=&\frac{1}{\sqrt{2}}(\ket{\overline{0}}_1+\ket{\overline{1}}_1)\ket{\overline{0}}_2 \\
=&\frac{1}{2}(\ket{000}+\ket{111}+\ket{011}+\ket{100})\ket{0} \\
=&\frac{1}{2}(\ket{0000}+\ket{1110}+\ket{0110}+\ket{1000}).
\end{align*}\]

Consider the two permuted stabilizer generators $M'_1$ and $M'_2$. Note that $M'_1$ anticommutes with the stabilizer generator $M_2$, so that $M'_1\notin N(S)$. On the contrary, $M'_2$ commutes with both $M_1$ and $M_2$, yet is not in the stabilizer $S$ since it cannot be generated by $M_1$ and $M_2$. Therefore, $M'_2\in N(S)\backslash S$, and it can be expressed as a logical operation as $M'_2=\overline{Z}_1M_2$.

Now suppose the operator $M'_2$ is measured on the encoded state $\ket{\overline{\Psi}}$. Since $M'_2=\overline{Z}_1M_2$ has the effect of applying a logical $Z$ operation on the first encoded qubit, which is in the state $\ket{\overline{+}}$, the probability of observing a $+1$ eigenvalue is $1/2$. In this case, the state after measuring $M'_2$ can be determined by applying the projector onto the $+1$ eigenspace of $M'_2$:
\[ \begin{align*}
\frac{1}{2}(I+M'_2)\ket{\overline{\Psi}}=&\frac{1}{2}(I+\overline{Z}_1M_2)(\ket{\overline{+}}_1\ket{\overline{0}}_2)\\
=&\frac{1}{2}(\ket{\overline{+}}_1\ket{\overline{0}}_2+\ket{\overline{-}}_1\ket{\overline{0}}_2) \\
=&\frac{1}{\sqrt{2}}\ket{\overline{0}}_1\ket{\overline{0}}_2
\end{align*} \]

Thus, after renormalization the resulting state is given by $\ket{\overline{0}}_1\ket{\overline{0}}_2$.

Now, consider measuring $M'_1$. Since $M'_1\notin N(S)$, it anticommutes with a stabilizer of the current state (namely $M_2$), so the outcomes $+1$ and $-1$ are equally likely. This gives a $1/2$ probability of measuring a $+1$ eigenvalue. Hence, the overall success probability of measuring a $+1$ outcome for both operators $M'_1$ and $M'_2$ is given by $1/4$.

Moreover, the resulting state after this second measurement is made is determined via the projector:
\[ \begin{align*}
\frac{1}{2}(I+M'_1)\ket{\overline{0}}_1\ket{\overline{0}}_2=&\frac{1}{2\sqrt{2}}(I+M'_1)(\ket{000}\ket{0}+\ket{111}\ket{0}) \\
=&\frac{1}{2\sqrt{2}}(\ket{000}\ket{0}+\ket{111}\ket{0}+\ket{110}\ket{1}+\ket{001}\ket{1})\\
\rightarrow& \ \frac{1}{2}(\ket{000}\ket{0}+\ket{111}\ket{0}+\ket{110}\ket{1}+\ket{001}\ket{1}),
\end{align*} \]
where the resulting state has been normalized accordingly.

In the context of the permuted stabilizer $\eta(S)$, the $2$nd encoded qubit now appears physically in the $3$rd register of the system. Then by effectively permuting the qubits cyclically, the final state can be expressed in the form
\[\begin{align*}
\ket{\psi_{final}}=& \frac{1}{2}(\ket{000}\ket{0}+\ket{011}\ket{1}+\ket{111}\ket{0}+\ket{100}\ket{1}) \\
=&\frac{1}{2}((\ket{000}+\ket{111})\ket{0}+(\ket{011}+\ket{100})\ket{1})\\
=&\frac{1}{\sqrt{2}}(\ket{\overline{0}}_1\ket{\overline{0}}_2+\ket{\overline{1}}_1\ket{\overline{1}}_2),
\end{align*}\]
showing that the state is entangled between logical qubits.

Clifford circuits for stabilizer codes

2014-03-22T22:33:00.000-07:00

Consider the quantum code $T(S)$ defined by the encoding circuit shown below:

The initial state is of the form $\ket{\psi}\ket{0}\ket{+}$, where $\ket{\psi}$ is an arbitrary single qubit state to be encoded. Two stabilizers $M_1$ and $M_2$ acting on the three qubits can be associated to this initial state such that $\ket{0}$ is a $+1$ eigenstate of $M_1$ and $\ket{+}$ is a $+1$ eigenstate of $M_2$. Since $Z\ket{0}=\ket{0}$ and $X\ket{+}=\ket{+}$, it follows that $M_1=IZI$ and $M_2=IIX$. The ``logical'' Pauli operators in this case are simply $\overline{X}=XII$ and $\overline{Z}=ZII$.

When passing through some gate $U$ in an encoding circuit the stabilizer generators and logical Pauli operators after the action of the gate $U$ can be updated to new stabilizer generators $M'_1, M'_2$ and new logical Paulis $\overline{X}', \overline{Z}'$ given by conjugation by $U$:
\[
M'_1=UM_1U^\dagger , M'_2=UM_2U^\dagger, \overline{X}'=U\overline{X}U^\dagger, \overline{Z}'=U\overline{Z}U^\dagger.
\]

In the table shown below, such an updating procedure is shown for every gate in the encoding circuit. The following relations were used throughout
\[\begin{align*}
&HXH=Z, \ \ HZH=X , \ \ \ \ RXR^\dagger=Y, \ \ RZR^\dagger=Z \\
&CNOT(X\otimes I)CNOT=X\otimes X, \ \ CNOT(I\otimes X)CNOT=I\otimes X \\
&CNOT(Z\otimes I)CNOT=Z\otimes I, \ \ CNOT(I\otimes Z)CNOT=Z\otimes Z
\end{align*}\]
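The conjugation relations above can be checked numerically. The following is a minimal sketch in plain Python, assuming $R=\mathrm{diag}(1,i)$ for the phase gate and a CNOT whose control is the first qubit (both conventions are assumptions, since they are not spelled out here); the helper names are hypothetical.

```python
# Dense-matrix check of the single- and two-qubit conjugation rules.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    # conjugate transpose
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    # Kronecker (tensor) product, first factor is the most significant
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def conj(u, p):
    # conjugation U P U^dagger
    return matmul(matmul(u, p), dagger(u))

s = 2 ** -0.5
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
H = [[s, s], [s, -s]]
R = [[1, 0], [0, 1j]]                      # assumed phase gate diag(1, i)
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0],
        [0, 0, 0, 1], [0, 0, 1, 0]]        # assumed: control = first qubit

checks = {
    "HXH = Z": close(conj(H, X), Z),
    "HZH = X": close(conj(H, Z), X),
    "RXR^dag = Y": close(conj(R, X), Y),
    "RZR^dag = Z": close(conj(R, Z), Z),
    "X(x)I -> X(x)X": close(conj(CNOT, kron(X, I2)), kron(X, X)),
    "I(x)X -> I(x)X": close(conj(CNOT, kron(I2, X)), kron(I2, X)),
    "Z(x)I -> Z(x)I": close(conj(CNOT, kron(Z, I2)), kron(Z, I2)),
    "I(x)Z -> Z(x)Z": close(conj(CNOT, kron(I2, Z)), kron(Z, Z)),
}
print(all(checks.values()))
```

Each entry of `checks` corresponds to one of the relations listed above, so a failing convention (for instance a CNOT with the control on the second qubit) would show up as a `False` entry.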

Therefore, the stabilizer generators of $S$ are $M_1=-YZY$ and $M_2=-XXY$, and the logical Pauli operators corresponding to generators of $N(S)/S$ are $\overline{X}=IYX$ and $\overline{Z}=-XXY$.

Let $\delta$ be an $n\times n$ matrix and $S$ a $2n\times 2n$ matrix given as
\[
S=\begin{pmatrix}
0 & \delta \\
\delta & 0
\end{pmatrix},
\]
where here $0$ is an $n\times n$ matrix. By definition, $S$ is symplectic if and only if $J=S^TJS$, where
\[
J=\begin{pmatrix}
0 &I \\
I & 0
\end{pmatrix}.
\]
Then since
\[
\begin{pmatrix}
0 &I \\
I & 0
\end{pmatrix}=\begin{pmatrix}
0 &\delta^T \\
\delta^T & 0
\end{pmatrix}\begin{pmatrix}
0 &I \\
I & 0
\end{pmatrix}
\begin{pmatrix}
0 &\delta \\
\delta & 0
\end{pmatrix}
=\begin{pmatrix}
0 &\delta^T\delta \\
\delta^T\delta & 0
\end{pmatrix},
\]
in order for $S$ to be a symplectic matrix it must be the case that $I=\delta^T\delta$. Let the columns of the matrix $\delta$ be given by $\vec{r}_i\in\mathbb{Z}^n_2$ for $i\in\{1,\dots,n\}$. The condition $I=\delta^T\delta$ can then be interpreted as requiring each column $\vec{r}_i$ to have unit norm, $\vec{r}_i\cdot\vec{r}_i=1 \pmod{2}$, and distinct columns to be orthogonal, $\vec{r}_i\cdot\vec{r}_j=0 \pmod{2}$ for $i\neq j$.

Now let the matrix $\delta=(d_{a,b})_{a,b}$ be given by
\[
d_{a,b}=
\left\{
\begin{array}{ll}
1 & \text{for} \ \ a \neq b\\
0 & \text{for} \ \ a=b.
\end{array}
\right.
\]

Since $\delta$ is symmetric, $\delta^T\delta=\delta^2$, and the condition $\delta^T\delta=I$ reads $\left(\sum_{i=1}^{n}d_{a,i}d_{i,b}\right)_{a,b}=I$. On the diagonal, $\sum_{i=1}^{n}d_{a,i}d_{i,a}=\sum_{i=1}^{n}d^2_{a,i}=n-1 \pmod{2}$, which must equal $1$. Off the diagonal, for $a\neq b$, $\sum_{i=1}^{n}d_{a,i}d_{i,b}=n-2 \pmod{2}$, which must equal $0$. Both conditions hold exactly when $n$ is even. Hence the matrix
\[
S=\begin{pmatrix}
0 & \delta \\
\delta & 0
\end{pmatrix},
\]
is symplectic if and only if $n$ is even.
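As a quick sanity check of this parity condition, the sketch below builds $\delta$ for several values of $n$ and tests $\delta^T\delta=I$ modulo $2$. It is only an illustration; the function names are hypothetical.

```python
def delta(n):
    """All-ones n x n matrix with a zero diagonal (the delta defined above)."""
    return [[0 if a == b else 1 for b in range(n)] for a in range(n)]

def gram_mod2(d):
    """Return d^T d with entries reduced mod 2."""
    n = len(d)
    return [[sum(d[i][a] * d[i][b] for i in range(n)) % 2 for b in range(n)]
            for a in range(n)]

def is_symplectic(n):
    """S = [[0, delta], [delta, 0]] is symplectic iff delta^T delta = I mod 2."""
    identity = [[1 if a == b else 0 for b in range(n)] for a in range(n)]
    return gram_mod2(delta(n)) == identity

results = {n: is_symplectic(n) for n in range(2, 9)}
print(results)
```

The dictionary `results` maps each tested $n$ to whether the condition holds, which should agree with the parity of $n$.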

Consider the case $n=4$ with $\delta$ defined as above. Then
\[
S=\begin{pmatrix}
0 &0 & 0& 0 & 0 & 1 & 1&1 \\
0 &0 & 0& 0 & 1 & 0 & 1&1 \\
0 &0 & 0& 0 & 1 & 1 & 0&1 \\
0 &0 & 0& 0 & 1 & 1 & 1&0 \\
0 &1 & 1& 1 & 0 & 0 & 0&0 \\
1 &0 & 1& 1 & 0 & 0 & 0&0 \\
1 &1 & 0& 1 & 0 & 0 & 0&0 \\
1 &1 & 1& 0 & 0 & 0 & 0&0 \\
\end{pmatrix}
\]
The objective here will be to find symplectic matrices $V_{L,k},V_{R,k}\in\{H, CNOT\}$ such that
\[
\left(\PROD{k=1}{g_L}V_{L,k} \right)S\left(\PROD{k=1}{g_R}V_{R,k} \right)=I.
\]
This would then imply that
\[
\left(\PROD{k=g_L}{1}V^\dagger_{L,k} \right)\left(\PROD{k=g_R}{1}V^\dagger_{R,k} \right)=S,
\]
giving a decomposition of $S$ in terms of $H$ and $CNOT$ gates alone. Actually, in what follows, only matrices acting from the left will be considered, so that $g_R=0$ and
\[
\left(\PROD{k=1}{g_L}V_{L,k} \right)S=I \ \ \text{so that} \ \ \left(\PROD{k=g_L}{1}V^\dagger_{L,k} \right)=S
\]
In general, for a symplectic matrix in the block form:
\[
U=\begin{pmatrix}
A & B \\
C & D
\end{pmatrix},
\]
the action of multiplying on the left by $H_i$ has the effect of switching the $i$th rows of $(A | B )$ and $(C | D )$. The action of $CNOT_{i,j}$ has the effect of adding the $i$th row of $(A |B)$ to the $j$th row of $(A | B )$, and adding the $j$th row of $( C | D ) $ to the $i$th row of $(C|D)$.

Consider the following sequence of transformations:
\[\begin{align*}
S=\begin{pmatrix}
0 &0 & 0& 0 & 0 & 1 & 1&1 \\
0 &0 & 0& 0 & 1 & 0 & 1&1 \\
0 &0 & 0& 0 & 1 & 1 & 0&1 \\
0 &0 & 0& 0 & 1 & 1 & 1&0 \\
0 &1 & 1& 1 & 0 & 0 & 0&0 \\
1 &0 & 1& 1 & 0 & 0 & 0&0 \\
1 &1 & 0& 1 & 0 & 0 & 0&0 \\
1 &1 & 1& 0 & 0 & 0 & 0&0 \\
\end{pmatrix}
\overset{CN_{4,1}CN_{3,1}CN_{2,1}}\rightarrow&
\begin{pmatrix}
0 &0 & 0& 0 & 1 & 1 & 1&1 \\
0 &0 & 0& 0 & 1 & 0 & 1&1 \\
0 &0 & 0& 0 & 1 & 1 & 0&1 \\
0 &0 & 0& 0 & 1 & 1 & 1&0 \\
0 &1 & 1& 1 & 0 & 0 & 0&0 \\
1 &1 & 0& 0 & 0 & 0 & 0&0 \\
1 &0 & 1& 0 & 0 & 0 & 0&0 \\
1 &0 & 0& 1 & 0 & 0 & 0&0 \\
\end{pmatrix}
\\
\overset{CN_{1,4}CN_{1,3}CN_{1,2}}\rightarrow&
\begin{pmatrix}
0 &0 & 0& 0 & 1 & 1 & 1&1 \\
0 &0 & 0& 0 & 0 & 1 & 0&0 \\
0 &0 & 0& 0 & 0 & 0 & 1&0 \\
0 &0 & 0& 0 & 0 & 0 & 0&1 \\
1 &0 & 0& 0 & 0 & 0 & 0&0 \\
1 &1 & 0& 0 & 0 & 0 & 0&0 \\
1 &0 & 1& 0 & 0 & 0 & 0&0 \\
1 &0 & 0& 1 & 0 & 0 & 0&0 \\
\end{pmatrix}
\\
\overset{CN_{4,1}CN_{3,1}CN_{2,1}}\rightarrow&
\begin{pmatrix}
0 &0 & 0& 0 & 1 & 0 & 0&0 \\
0 &0 & 0& 0 & 0 & 1 & 0&0 \\
0 &0 & 0& 0 & 0 & 0 & 1&0 \\
0 &0 & 0& 0 & 0 & 0 & 0&1 \\
1 &0 & 0& 0 & 0 & 0 & 0&0 \\
0 &1 & 0& 0 & 0 & 0 & 0&0 \\
0 &0 & 1& 0 & 0 & 0 & 0&0 \\
0 &0 & 0& 1 & 0 & 0 & 0&0 \\
\end{pmatrix}
\\
\overset{H_{1}H_{2}H_{3}H_{4}}\rightarrow&
\begin{pmatrix}
1 &0 & 0& 0 & 0 & 0 & 0&0 \\
0 &1 & 0& 0 & 0 & 0 & 0&0 \\
0 &0 & 1& 0 & 0 & 0 & 0&0 \\
0 &0 & 0& 1 & 0 & 0 & 0&0 \\
0 &0 & 0& 0 & 1 & 0 & 0&0 \\
0 &0 & 0& 0 & 0 & 1 & 0&0 \\
0 &0 & 0& 0 & 0 & 0 & 1&0 \\
0 &0 & 0& 0 & 0 & 0 & 0&1 \\
\end{pmatrix}=I
\end{align*}\]

Recalling that $H^\dagger=H$ and $CNOT^\dagger=CNOT$, and reversing the order of the operations used above then yields a sequence of operations that implements $S$ (the rightmost operation is applied first) :

\[
S=CN_{2,1}CN_{3,1}CN_{4,1}CN_{1,2}CN_{1,3}CN_{1,4}CN_{2,1}CN_{3,1}CN_{4,1}H_{4}H_{3}H_{2}H_{1}.
\]
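The reduction of $S$ to the identity can be replayed mechanically. The sketch below assumes the row-operation rules stated earlier (left multiplication by $H_i$ swaps the $i$th rows of the upper and lower halves; $CNOT_{i,j}$ adds row $i$ to row $j$ in the upper half and row $j$ to row $i$ in the lower half) and applies the same sequence of steps for $n=4$. The helper names are hypothetical.

```python
n = 4
delta = [[0 if a == b else 1 for b in range(n)] for a in range(n)]
zero = [[0] * n for _ in range(n)]
# S = [[0, delta], [delta, 0]] stored as a list of 2n rows
S = ([zero[a] + delta[a] for a in range(n)]
     + [delta[a] + zero[a] for a in range(n)])

def add_row(m, src, dst):
    """Add row src to row dst modulo 2."""
    m[dst] = [(x + y) % 2 for x, y in zip(m[dst], m[src])]

def hadamard(m, i):
    """H_i: swap the i-th rows (1-indexed) of the upper and lower halves."""
    m[i - 1], m[n + i - 1] = m[n + i - 1], m[i - 1]

def cnot(m, i, j):
    """CNOT_{i,j}: upper row i -> upper row j, lower row j -> lower row i."""
    add_row(m, i - 1, j - 1)
    add_row(m, n + j - 1, n + i - 1)

M = [row[:] for row in S]
# the three groups of CNOTs used in the reduction (each group commutes internally)
steps = [(2, 1), (3, 1), (4, 1),
         (1, 2), (1, 3), (1, 4),
         (2, 1), (3, 1), (4, 1)]
for i, j in steps:
    cnot(M, i, j)
for i in range(1, n + 1):
    hadamard(M, i)

identity = [[1 if a == b else 0 for b in range(2 * n)] for a in range(2 * n)]
print(M == identity)
```

Because every gate used is its own inverse, reading the applied steps in reverse order gives the decomposition of $S$ stated above.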

A circuit diagram depicting this sequence is shown below:

CSS codes and the 9-qubit Shor code

2014-02-01T22:23:00.000-08:00

In the 9-qubit Shor code, the stabilizer $S$ is generated by the following operators

\[\begin{align*}
S_1=& Z_1Z_2 \\
S_2=&Z_2Z_3 \\
S_3=&Z_4Z_5 \\
S_4=&Z_5Z_6 \\
S_5=&Z_7Z_8 \\
S_6=&Z_8Z_9 \\
S_7=&X_1X_2X_3X_4X_5X_6 \\
S_8=&X_4X_5X_6X_7X_8X_9. \\
\end{align*}\]
Thus, the dimension of the code space is $2^{9-8}=2$, which implies that the code encodes only one logical qubit. In this case, $\hat{N}(S)/S\cong \hat{P}_1$, so $\hat{N}(S)/S$ will be generated by some elements $\overline{X}, \overline{Z}\in\hat{N}(S)\backslash S$ satisfying the usual Pauli commutation relations:
\[
\overline{X}^2=\overline{Z}^2=\overline{I}, \ \text{and} \ \overline{X} \ \overline{Z}=-\overline{Z} \ \overline{X}
\]
In this regard, let
\[\begin{align*}
\overline{Z}=&X_1X_2X_3X_4X_5X_6X_7X_8X_9 \\
\overline{X}=& Z_1Z_2Z_3Z_4Z_5Z_6Z_7Z_8Z_9 \\
\end{align*}\]
which satisfy the commutation relations described above. Moreover, $\overline{X},\overline{Z}\in\hat{N}(S)\backslash S$. To see this, note that
\[ \begin{align*}
[\overline{Z},S_i]=0&, \ \text{for} \ i=7,8 \\
[\overline{X},S_i]=0&, \ \text{for} \ i=1,2,3,4,5,6 \\
\end{align*}\]
because each of these is either a product of only $X$ operators or only $Z$ operators. The less trivial relations
\[ \begin{align*}
[\overline{Z},S_i]=0&, \ \text{for} \ i=1,2,3,4,5,6 \\
[\overline{X},S_i]=0&, \ \text{for} \ i=7,8 \\
\end{align*}\]

also hold, because the intersection of the supports of $\overline{Z}$ and $S_i$ (for $i=1,2,3,4,5,6$) is always an even number of qubits. Likewise, the intersection of the supports of $\overline{X}$ and $S_i$ (for $i=7,8$) is always an even number of qubits. Thus, $\overline{X},\overline{Z}\in \hat{N}(S)$.

Now, to see that $\overline{X},\overline{Z}\notin S$, observe that the only $X$-type elements of $S$ are $I$, $S_7$, $S_8$, and the product $S_7S_8=X_1X_2X_3X_7X_8X_9$, none of which equals $\overline{Z}$, which implies $\overline{Z}\notin S$. For $\overline{X}$, it is impossible to generate an operator that contains all the factors $Z_1Z_2Z_3$, since the only stabilizers that have nontrivial $Z$-type support on these first three qubits are $S_1$ and $S_2$, and yet $S_1S_2=Z_1Z_3\neq Z_1Z_2Z_3$. Therefore, $\overline{X}\notin S$.

Consider the binary matrix
\[
\delta=\begin{pmatrix}
1&1&1&0&0 \\
0&0&1&1&1\\
1&1&0&1&1 \\
1&1&1&0&0 \\
0&0&1&1&1\\
\end{pmatrix}.
\]
Associate to each row $j$ an $X$-type stabilizer
\[
M_j=\PROD{l=1}{5}X_l^{\delta_{jl}},
\]
and to each column a $Z$-stabilizer
\[
N_j=\PROD{l=1}{5}Z_l^{\delta_{lj}}.
\]
The product of the matrix $\delta$ with itself (modulo $2$) effectively computes the commutation relations between all the $M_i$ operators and all the $N_j$ operators. The $(i,j)$th coefficient of $\delta^2$ has a value of $0$ if $M_i$ and $N_j$ commute and a value of $1$ if they anticommute. In fact, since $\delta^2$ is the matrix with all coefficients $0$, each $M_i$ commutes with each $N_j$. Therefore, the set $Q$ generated by all such products of $M_j$ and $N_i$ forms an abelian subgroup of the Pauli group $\hat{P}_5$. Moreover, since each $M_j$ is a product of only $X$ operators and each $N_j$ a product of only $Z$ operators, all with sign $+1$, it is necessarily the case that $-I\notin Q$. Hence, $Q$ actually defines a stabilizer group. By choosing linearly independent operators $M_j$ and $N_i$, it is seen that the stabilizer $Q$ can be generated by $\{M_1,M_2,N_1, N_4\}$.
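The claims about $\delta^2$ and about the redundant generators can be verified directly. The following is a small sketch using the matrix $\delta$ given above; the variable names are hypothetical.

```python
delta = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]
n = len(delta)

# (delta^2)_{ij} mod 2 is the parity of the overlap between
# M_i (row i, X-type) and N_j (column j, Z-type)
delta_sq = [[sum(delta[i][l] * delta[l][j] for l in range(n)) % 2
             for j in range(n)] for i in range(n)]
all_commute = all(v == 0 for row in delta_sq for v in row)

def xor(u, v):
    """Bitwise sum modulo 2, i.e. the product of two same-type Pauli strings."""
    return [a ^ b for a, b in zip(u, v)]

rows = delta                                                # M_1, ..., M_5
cols = [[delta[l][j] for l in range(n)] for j in range(n)]  # N_1, ..., N_5

# dependencies showing that {M_1, M_2, N_1, N_4} already generate everything
deps = (
    rows[2] == xor(rows[0], rows[1]),   # M_3 = M_1 M_2
    rows[3] == rows[0],                 # M_4 = M_1
    rows[4] == rows[1],                 # M_5 = M_2
    cols[1] == cols[0],                 # N_2 = N_1
    cols[2] == xor(cols[0], cols[3]),   # N_3 = N_1 N_4
    cols[4] == cols[3],                 # N_5 = N_4
)
print(all_commute, all(deps))
```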

Let $C_1$ and $C_2$ be classical linear codes with parity check matrices $H_1$ and $H_2$ given by

\[
H_1=\begin{pmatrix}
1&0&1&1&0 \\
0&1&1&0&1
\end{pmatrix}
\ \ \text{and} \ \
H_2=\begin{pmatrix}
1&1&1&0&0 \\
0&0&1&1&1
\end{pmatrix}.
\]

Notice that the rows of $H_1$ are the two $Z$-type stabilizer generators $N_1$ and $N_4$, and the rows of $H_2$ are the two $X$-type stabilizer generators $M_1$ and $M_2$ defined above, which together generate the stabilizer group $Q$. In this way, $Q$ is the CSS code $CSS(H_1,H_2)$. Here, $Q=\langle S_1\cup S_2\rangle$, where $S_1$ and $S_2$ are the stabilizer groups generated by the $Z$-type and $X$-type operators, respectively. Namely,
\[
S_1=\begin{pmatrix}
0&0&0&0&0 \\
1&0&1&1&0 \\
0&1&1&0&1 \\
1&1&0&1&1
\end{pmatrix}
\ \ \text{and} \ \
S_2=\begin{pmatrix}
0&0&0&0&0 \\
1&1&1&0&0 \\
0&0&1&1&1\\
1&1&0&1&1 \\
\end{pmatrix}.
\]

Let $G_1$ and $G_2$ be corresponding generator matrices of the codes $C_1$ and $C_2$, respectively. For each code, since there are $5$ physical bits and $3$ encoded bits ($H_1$ and $H_2$ are $2 \times 5$ matrices), both generator matrices have dimensions $3\times 5$. These generators must satisfy the constraints
\[
G_iH_i^T=0 \ \ \text{and} \ \ H_iG_i^T=0.
\]
for $i=1,2$. For each $i$, the constraints form a system of $n-k=2$ independent linear equations over $\mathbb{Z}_2$ on $n=5$ bits, so the solution space has dimension $k=3$. It can be checked that the following two particular matrices satisfy these constraints:
\[
G_1=\begin{pmatrix}
1&0&0&1&0 \\
1&1&0&1&1 \\
0&1&1&1&0
\end{pmatrix}
\ \ \text{and} \ \
G_2=\begin{pmatrix}
1&0&1&0&1 \\
1&1&0&0&0 \\
0&1&1&1&0
\end{pmatrix}.
\]

The rows of $G_1$ and $G_2$ form bases of the code spaces $C_1$ and $C_2$, respectively, so the span of each set of rows is the entire code space. The elements of these code spaces are then
\[
C_1=\begin{pmatrix}
0&0&0&0&0 \\
1&0&0&1&0 \\
1&1&0&1&1 \\
0&1&1&1&0 \\
0&1&0&0&1 \\
1&0&1&0&1 \\
   1&1&1&0&0 \\
    0&0&1&1&1
\end{pmatrix}
\ \ \text{and} \ \
C_2=\begin{pmatrix}
0&0&0&0&0 \\
1&0&1&0&1 \\
1&1&0&0&0 \\
0&1&1&1&0 \\
   1&0&1&1&0 \\
    0&1&1&0&1 \\
     1&1&0&1&1 \\
      0&0&0&1&1 \\
\end{pmatrix}.
\]

By associating a $Z$-type (or $X$-type) operator $\overline{Z}$ (or $\overline{X}$) to each row of $C_2$ (or $C_1$), and demanding that $\overline{Z}\notin S_1$ (and $\overline{X}\notin S_2$) and also that $\overline{Z} \ \overline{X}=-\overline{X} \ \overline{Z}$, then the pair $(\overline{X},\overline{Z})$ can be made to correspond to a logical operation on a logical qubit. In this case, since there is only $1$ logical qubit a single pair satisfying these properties is all that is needed. In this regard, choose
\[
\overline{X}=X_1X_4 \ \ \text{and} \ \ \overline{Z}=Z_1Z_3Z_5
\]
corresponding to the second rows of $C_1$ and $C_2$, respectively. Then indeed, $\overline{X}\notin S_2$ and $\overline{Z}\notin S_1$, and $\overline{Z} \ \overline{X}=-\overline{X} \ \overline{Z}$. Therefore, the pair $(\overline{X},\overline{Z})$ functions as logical operators on the encoded qubit, and hence generates $\mathcal{N}(S)/S\cong P_1$.

In the classical linear code context, the distances of codes $C_1$ and $C_2$ are both given by $d_1=d_2=2$, which can be seen by examining the elements of $C_1$ and $C_2$ and finding the minimum weight of any nonzero codeword. For $CSS(H_1,H_2)$, the distance of the code $Q$ must satisfy $d\geq \min\{d_1,d_2\}=2$. In this context, the distance of $Q$ is the minimum weight of an operator in $\mathcal{N}(S)\backslash S$, which by inspection is seen to be $d=2$.
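The statements about this example (orthogonality of $G_i$ and $H_i$, the classical distances, the choice of logical operators, and the distance of $Q$) can be confirmed by brute-force enumeration. The sketch below only considers pure $X$-type and pure $Z$-type logical operators, which is enough here because a mixed operator is at least as heavy as its nontrivial $X$ or $Z$ part. All function and variable names are hypothetical.

```python
from itertools import product

H1 = [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]
H2 = [[1, 1, 1, 0, 0], [0, 0, 1, 1, 1]]
G1 = [[1, 0, 0, 1, 0], [1, 1, 0, 1, 1], [0, 1, 1, 1, 0]]
G2 = [[1, 0, 1, 0, 1], [1, 1, 0, 0, 0], [0, 1, 1, 1, 0]]

def dot(u, v):
    """Inner product modulo 2."""
    return sum(a * b for a, b in zip(u, v)) % 2

def span(rows):
    """All GF(2) linear combinations of the given rows, as a set of tuples."""
    out = set()
    for coeffs in product([0, 1], repeat=len(rows)):
        out.add(tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % 2
                      for i in range(len(rows[0]))))
    return out

C1, C2 = span(G1), span(G2)     # classical code spaces
S1, S2 = span(H1), span(H2)     # Z-type and X-type stabilizer spaces

orthogonal = all(dot(g, h) == 0
                 for G, Hm in ((G1, H1), (G2, H2)) for g in G for h in Hm)
d1 = min(sum(c) for c in C1 if any(c))
d2 = min(sum(c) for c in C2 if any(c))
# nontrivial Z-type logicals lie in C2 \ S1, X-type logicals in C1 \ S2
d = min(sum(c) for c in (C2 - S1) | (C1 - S2))

xbar = (1, 0, 0, 1, 0)          # X_1 X_4
zbar = (1, 0, 1, 0, 1)          # Z_1 Z_3 Z_5
anticommute = dot(xbar, zbar) == 1
print(orthogonal, d1, d2, d, anticommute)
```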

In what follows, an algorithm will be described which begins with two parity check matrices $H_1$ and $H_2$ of a general CSS code, and computes a set of generators for $\hat{\mathcal{N}}(S)/S$ grouped into pairs $\{(\overline{X}_i,\overline{Z}_i)\}$.

First, using $H_1$ and $H_2$ (matrices with dimensions $(n-k)\times n$), construct generator matrices $G_1$ and $G_2$ (of dimensions $k\times n$) satisfying the constraints $G_iH_i^T=0$ and $H_iG_i^T=0$ for each $i=1,2$. Moreover, the rows of each $G_i$ should be linearly independent. Such matrices can be found in general by Gaussian elimination over $\mathbb{Z}_2$, which yields a basis of the kernel of $H_i$.

Next, consider the code spaces $C_i$ defined as the span of the rows of $G_i$, and also the stabilizer spaces $S_i$ defined as the span of the rows of $H_i$. Then to find generators of $\hat{\mathcal{N}}(S)/S$, it suffices to choose pairs $(\overline{X}_l,\overline{Z}_l)$ with $\overline{Z}_l\in C_2\backslash S_1$ and $\overline{X}_l\in C_1\backslash S_2$, as in the example above, such that $\overline{X}_l$ and $\overline{Z}_l$ anticommute, and such that the operators in distinct pairs $(\overline{X}_l,\overline{Z}_l)$ and $(\overline{X}_{l'},\overline{Z}_{l'})$ with $l\neq l'$ all commute. For $k$ encoded qubits, there will be $k$ such pairs, corresponding to the logical operations applied to each of the $k$ logical qubits.
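The first step of this procedure, constructing a generator matrix from a parity check matrix, can be sketched as a kernel computation over $\mathbb{Z}_2$. The version below uses Gaussian elimination to reduced row echelon form, which is one possible choice made for this sketch rather than a prescribed method, and the function name `nullspace_gf2` is hypothetical. The pairing of logical operators is not implemented here.

```python
def nullspace_gf2(H):
    """Basis (list of rows) of {x : H x^T = 0 mod 2}, via Gaussian elimination."""
    rows = [r[:] for r in H]
    n = len(rows[0])
    pivots = []                 # pivot column of each reduced row, in order
    r = 0
    for c in range(n):
        # look for a row at or below position r with a 1 in column c
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        # clear column c in every other row (reduced row echelon form)
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [0] * n
        v[f] = 1
        # each pivot variable is determined by the chosen free variable
        for i, c in enumerate(pivots):
            v[c] = rows[i][f]
        basis.append(v)
    return basis

H1 = [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]
H2 = [[1, 1, 1, 0, 0], [0, 0, 1, 1, 1]]
G1 = nullspace_gf2(H1)
G2 = nullspace_gf2(H2)
print(G1)
print(G2)
```

The bases returned for the example need not coincide row by row with the matrices $G_1$ and $G_2$ chosen earlier, but they span the same code spaces.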

Graph states

2014-02-01T22:20:00.000-08:00

Let $G=(V,E)$ be an undirected graph with no self-loops. That is, for all $v,v'\in V$, $(v,v)\notin E$, and if $(v,v')\in E$ then also $(v',v)\in E$. Associate a qubit to each vertex in $V$. When necessary, the vertices will be indexed as $v_i$ with $i\in\{1,\dots, N\}$ where $|V|=N$. Let $I_N=I^{\otimes N}$ be the identity operator on all $N$ qubits, where $I$ is the identity operator on a single qubit. For some single qubit unitary $U$, write $U_{v_i}=I\otimes\cdots\otimes I\otimes U\otimes\cdots\otimes I$ to denote the operator on the $N$-qubit space which applies the single qubit unitary $U$ to the qubit associated to vertex $v_i$. Now, define the operators
\[
M_v=X_v\PROD{v':(v,v')\in E}{} Z_{v'}
\]
for $v\in V$, where $X_v$ and $Z_v$ are the single qubit $X$ and $Z$ operators acting on the qubit associated to vertex $v$.

For compatible operators $A$ and $B$, let $[A,B]=ABA^{-1}B^{-1}$. If $[A,B]=I$, then $A$ and $B$ \emph{commute}, and if $[A,B]=-I$ then $A$ and $B$ \emph{anticommute}. Now observe the following commutation relations:
\[
\begin{align*}
\text{for any} \ v_i,v_j\in V:& \ \ \ [I_N,X_{v_i}]=[I_N,Z_{v_i}]=[X_{v_i},X_{v_j}]=[Z_{v_i},Z_{v_j}]=I_N\\
\text{if} \ v_i\neq v_j:& \ \ \ [X_{v_i},Z_{v_j}]=I_N\\
\text{if} \ v_i=v_j:& \ \ \ [X_{v_i},Z_{v_j}] =-I_N.
\end{align*}
\]
These then imply the following commutation relations:
\[
\begin{align*}
\text{for any} \ v_i,v_j\in V:& \ \ \ \left[\PROD{v':(v_i,v')\in E}{} Z_{v'} \ , \ \PROD{v':(v_j,v')\in E}{} Z_{v'}\right]=I_N \\
\text{if} \ (v_i,v_j)\notin E:& \ \ \ \left[X_{v_i} \ , \ \PROD{v':(v_j,v')\in E}{} Z_{v'}\right]=I_N \\
\text{if} \ (v_i,v_j)\in E:& \ \ \ \left[X_{v_i} \ , \ \PROD{v':(v_j,v')\in E}{} Z_{v'}\right]=-I_N.
\end{align*}
\]

Trivially, for any graph $G$ and $v\in V$, the operator $M_v$ commutes with itself. Moreover, $M_v^2=I_N$, since the $X$ and $Z$ factors of $M_v$ act on distinct qubits and satisfy $X^2=I$ and $Z^2=I$. Therefore, consider graphs $G$ with more than one vertex so that $|V|\geq 2$. For any two $M_{v_i}, M_{v_j}\in \{M_v\}_{v\in V}$ such that $v_i\neq v_j$, there are two cases to consider: either $(v_i,v_j)\notin E$ or $(v_i,v_j)\in E$.

Suppose first that $(v_i,v_j)\notin E$. Then $ Z_{v_j}$ does not appear in $M_{v_i}$, and likewise $Z_{v_i}$ does not appear in $M_{v_j}$. Then in the product $M_{v_i}M_{v_j}$, the pairs of operators that act on any particular qubit all commute. Therefore, $M_{v_i}M_{v_j}=M_{v_j}M_{v_i}$ so that $M_{v_i}$ and $M_{v_j}$ commute as well in this case.

Instead, suppose that $(v_i,v_j)\in E$. In this case, $ Z_{v_j}$ does appear in $M_{v_i}$, and likewise $Z_{v_i}$ appears in $M_{v_j}$. Then in the product $M_{v_i}M_{v_j}$ there are two pairs of operators that anticommute, namely $X_{v_i}Z_{v_i}$ and $Z_{v_j}X_{v_j}$. Since $X_{v_i}Z_{v_i}=-Z_{v_i}X_{v_i}$ and $Z_{v_j}X_{v_j}=-X_{v_j}Z_{v_j}$, the two resulting $-1$ factors cancel. Hence, $M_{v_i}M_{v_j}=M_{v_j}M_{v_i}$, implying that $M_{v_i}$ and $M_{v_j}$ commute. To see this more explicitly, consider the calculation shown below, which makes use of the nontrivial commutation relations given above:
\[
\begin{align*}
M_{v_i}M_{v_j}=&\left(X_{v_i}\PROD{v':(v_i,v')\in E}{} Z_{v'} \right)\left(X_{v_j}\PROD{v':(v_j,v')\in E}{} Z_{v'} \right)\\
=&(-1)X_{v_i}X_{v_j}\PROD{v':(v_i,v')\in E}{} Z_{v'}\PROD{v':(v_j,v')\in E}{} Z_{v'} \\
=&(-1)X_{v_j}X_{v_i}\PROD{v':(v_j,v')\in E}{} Z_{v'}\PROD{v':(v_i,v')\in E}{} Z_{v'} \\
=&(-1)(-1)X_{v_j}\PROD{v':(v_j,v')\in E}{} Z_{v'}X_{v_i}\PROD{v':(v_i,v')\in E}{} Z_{v'} \\
=&\left(X_{v_j}\PROD{v':(v_j,v')\in E}{} Z_{v'} \right)\left(X_{v_i}\PROD{v':(v_i,v')\in E}{} Z_{v'} \right)\\
=&M_{v_j}M_{v_i}.
\end{align*}
\]
It has now been shown that all operators in the set $\{M_v\}_{v\in V}$ commute with one another.
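The commutation argument can also be phrased in the binary symplectic picture, where a Pauli operator is recorded by its $X$-support and $Z$-support and two Paulis commute exactly when their symplectic product vanishes modulo $2$. The sketch below checks this for a small example graph; the graph, the $0$-based vertex labels, and the function names are hypothetical choices made for illustration.

```python
def stabilizers(n, edges):
    """Binary (x, z) vectors of M_v = X_v prod_{v'~v} Z_{v'} for vertices 0..n-1."""
    adj = [[0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    # x-part is the indicator of v, z-part is the adjacency row of v
    return [([1 if u == v else 0 for u in range(n)], adj[v][:])
            for v in range(n)]

def commute(p, q):
    """Paulis commute iff x_p . z_q + z_p . x_q = 0 mod 2."""
    (xp, zp), (xq, zq) = p, q
    overlap = (sum(a * b for a, b in zip(xp, zq))
               + sum(a * b for a, b in zip(zp, xq)))
    return overlap % 2 == 0

# example graph: a 4-cycle with one extra chord
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
gens = stabilizers(4, edges)
all_commute = all(commute(p, q) for p in gens for q in gens)
print(all_commute)
```

For adjacent vertices the two contributions to the symplectic product are both $1$ and cancel modulo $2$, which mirrors the cancellation of the two $-1$ factors in the calculation above.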

Since each operator $M_v$ contains the operator $X_v$, and any other $M_{v'}$ with $v\neq v'$ contains a different $X_{v'}$ operator, $M_v$ cannot be generated as a product of other $M_{v'}$ operators. Thus, the set $\{M_v\}_{v\in V}$ forms an independent set of commuting Pauli operators. Then the group $S_G:=\langle M_v\rangle_{v\in V}$ generated by all $M_v$ forms a valid stabilizer group.

The stabilizer group $S_G=\langle M_v\rangle_{v\in V}$ defines the code space
\[
T(S_G):=\{\ket{\Psi}\in(\mathbb{C}^2)^{\otimes N} \mid M_v\ket{\Psi}=\ket{\Psi}, \ \text{for all} \ v\in V \}.
\]
Furthermore, since there are $|V|=N$ qubits and also $r=N$ independent generators, the dimension of $T(S_G)$ is given by $2^{N-r}=2^{0}=1$, so that no logical qubits are encoded. Therefore, every graph $G$ can be associated to a unique (up to phase) state $\ket{\Psi_G}\in T(S_G)$, the graph state of $G$.

Suppose $G=C_n=(V,E)$ is the $n$-cycle graph with $n> 4$. Here, $V=\{v_1,v_2,\dots,v_n\}$ and $E=\{(v_a,v_b)\mid a\equiv b\pm 1 \pmod{n}\}$. Let $\ket{\Psi_{G}}\in T(S_G)$ be the associated graph state, and let $U_a$ range over the $2^n$ elements of $S_G$. Then the projection onto $T(S_G)$ satisfies
\[
\ket{\Psi_G}\bra{\Psi_G}=\frac{1}{2^n}\SUM{U_a}{}U_a,
\]
since $T(S_G)$ is one dimensional. Denote by $Tr_{(i-1,i,i+1)}(\ket{\Psi_G}\bra{\Psi_G})$, the reduced density operator that results by tracing out all qubit registers except for those associated to neighboring registers $i-1, i$, and $i+1$ (where addition is taken $mod \ n$). Then
\[
Tr_{(i-1,i,i+1)}(\ket{\Psi_G}\bra{\Psi_G})=\frac{1}{2^n}\SUM{U_a}{}Tr_{(i-1,i,i+1)}(U_a).
\]
Hence, consider $Tr_{(i-1,i,i+1)}(U_a)$ for an arbitrary $U_a\in S_G$. Each $U_a$ is a product of some $M_v$ operators. If $U_a$ acts nontrivially on some vertex other than $v_{i-1},v_i$, or $v_{i+1}$, then $Tr_{(i-1,i,i+1)}(U_a)=0$, because the trace of any non-identity Pauli operator is $0$. For $n>4$, the only elements of $S_G$ supported entirely on these three vertices are $I$ and $M_{v_i}$. Hence, the only nonzero contributions come from $U_a=I$ and $U_a=M_{v_i}$, each of which picks up a factor $2^{n-3}$ from the trace over the remaining $n-3$ qubits, so that
\[\begin{align*}
Tr_{(i-1,i,i+1)}(\ket{\Psi_G}\bra{\Psi_G})=&\frac{1}{2^n}\SUM{U_a}{}Tr_{(i-1,i,i+1)}(U_a)\\
=&\frac{1}{2^n}\left(Tr_{(i-1,i,i+1)}(I)+Tr_{(i-1,i,i+1)}(M_{v_i})\right)\\
=&\frac{1}{2^{3}}\left(I+M_{v_i}\right) \\
=&\frac{1}{2^{3}}\left(I+X_{v_i}Z_{v_{i-1}}Z_{v_{i+1}}\right).
\end{align*}\]

Let $G=(V_G,E_G)$ and $G'=(V_{G'},E_{G'})$ be graphs and suppose $G'$ is obtained from $G$ by adding a vertex $v'$ which has no edges. Then $V_{G'}=V_G\cup\{v'\}$, and $E_{G'}=E_G$. Let $\ket{\Psi_G}\in T(S_G)$ and $\ket{\Psi_{G'}}\in T(S_{G'})$ be the associated graph states of $S_G$ and $S_{G'}$, respectively. In terms of $\ket{\Psi_G}$, the state $\ket{\Psi_{G'}}$ can be expressed as
\[
\ket{\Psi_{G'}}=\ket{\Psi_G}\otimes\ket{\varphi},
\]
where $\ket{\varphi}$ is a yet to be determined single qubit state.

In this case, the set of stabilizer generators for $G'$ is given by $\{M_v\}_{v\in V_G}\cup \{M_{v'}\}$, where $M_{v'}=X_{v'}$. All of the operators in $S_G$ act trivially on the state $\ket{\varphi}$ of $\ket{\Psi_{G'}}$, and the operator $M_{v'}$ acts trivially on all registers of the state $\ket{\Psi_{G'}}$ except for the qubit in the state $\ket{\varphi}$. However, since it must be the case that $M_v\ket{\Psi_{G'}}=\ket{\Psi_{G'}}$ for any $M_v\in S_{G'}$, then $\ket{\varphi}$ must be a $+1$ eigenstate of $M_{v'}=X_{v'}$. Therefore, $\ket{\varphi}=\ket{+}:=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$ as
\[
X\ket{+}=\frac{1}{\sqrt{2}}(X\ket{0}+X\ket{1})=\frac{1}{\sqrt{2}}(\ket{1}+\ket{0})=\ket{+},
\]
which shows that $\ket{+}$ is a $+1$ eigenstate of $X$; it is unique up to phase, since the other eigenvalue of $X$ is $-1$. Hence, the associated graph state of $S_{G'}$ is given by
\[
\ket{\Psi_{G'}}=\ket{\Psi_G}\otimes\ket{+}.
\]

Now suppose $G'=(V_{G'},E_{G'})$ is obtained from $G=(V_{G},E_{G})$ by adding an edge $(v_1,v_2)$, where $(v_1,v_2)\notin E_{G}$, so that $E_{G'}=E_{G}\cup\{(v_1,v_2)\}$ and $V_{G'}=V_{G}$. Let $S_G=\langle M^G_v\rangle_{v\in V_G}$ and $S_{G'}=\langle M^{G'}_v\rangle_{v\in V_{G'}}$ be the stabilizer groups generated by the operators $M^G_v$ and $M^{G'}_v$, respectively. Then the generators of $S_{G'}$ and $S_{G}$ are the same, except that the operators $M^{G'}_{v_1},M^{G'}_{v_2}\in S_{G'}$ differ from those in $S_{G}$ by including additional $Z_{v_2}$ and $Z_{v_1}$ terms, respectively. That is,
\[
M^{G'}_{v_i}=
\left\{
\begin{array}{ll}
M^{G}_{v_i} & \text{if} \ \ v_i \neq v_1, v_2 \\
M^{G}_{v_i}Z_{v_2} & \text{if} \ \ v_i=v_1\\
M^{G}_{v_i}Z_{v_1} & \text{if} \ \ v_i=v_2.
\end{array}
\right.
\]

Consider $\ket{\Psi_{G}}\in T(S_{G})$, and let $\ket{\Phi}=CZ_{v_1,v_2}\ket{\Psi_G}$, where $CZ_{v_1,v_2}$ is the two-qubit controlled-$Z$ gate applied to the qubits associated to vertices $v_1$ and $v_2$, given by
\[
CZ=\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 &1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}.
\]
Since $M^{G'}_{v_i}=M^{G}_{v_i}$ for $v_i\neq v_1, v_2$, and neither $X_{v_1}$ nor $X_{v_2}$ appears as a factor in any of the operators $M^{G'}_{v_i}$ in this case, each such $M^{G'}_{v_i}$ commutes with $CZ_{v_1,v_2}$. Therefore,
\[
M^{G'}_{v_i}\ket{\Phi}=M^{G'}_{v_i}CZ_{v_1,v_2}\ket{\Psi_G}=CZ_{v_1,v_2}M^{G'}_{v_i}\ket{\Psi_G}=CZ_{v_1,v_2}\ket{\Psi_G}=\ket{\Phi},
\]
in the case where $v_i\neq v_1, v_2$.

For the other cases, first observe that $X_{v_1}Z_{v_2}CZ_{v_1,v_2}=CZ_{v_1,v_2}X_{v_1}$. Now suppose $v_i=v_1$, then
\[\begin{align*}
M^{G'}_{v_1}CZ_{v_1,v_2}=&M^{G}_{v_1}Z_{v_2}CZ_{v_1,v_2} \\
=&M^{G}_{v_1}X_{v_1}X_{v_1}Z_{v_2}CZ_{v_1,v_2} \\
=&M^{G}_{v_1}X_{v_1}CZ_{v_1,v_2}X_{v_1}\\
=& CZ_{v_1,v_2}M^{G}_{v_1}X_{v_1}X_{v_1} \\
=&CZ_{v_1,v_2}M^{G}_{v_1}.
\end{align*} \]
Hence, $M^{G'}_{v_1}\ket{\Phi}=M^{G'}_{v_1}CZ_{v_1,v_2}\ket{\Psi_G}=CZ_{v_1,v_2}M^{G}_{v_1}\ket{\Psi_G}=CZ_{v_1,v_2}\ket{\Psi_G}=\ket{\Phi}$. The case $v_i=v_2$ follows in the same way with the roles of $v_1$ and $v_2$ exchanged. Therefore, for all $v_i\in V_{G'}$, it follows that $M^{G'}_{v_i}\ket{\Phi}=\ket{\Phi}$, implying that $\ket{\Phi}=CZ_{v_1,v_2}\ket{\Psi_G}\in T(S_{G'})$ is the graph state of $S_{G'}$.

The consequences just shown can be applied to construct the graph state $\ket{\Psi_{C_4}}$ for the cycle graph $C_4$. In this approach, a Hadamard gate is applied to each of the four qubits (that all begin in the $\ket{0}$ state) which puts each qubit into the $\ket{+}$ state. Then a controlled phase $CZ_{v_i,v_j}$ for each pair of adjacent vertices is applied. The resulting state produced by this circuit will then be $\ket{\Psi_{C_4}}$.
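This construction can be simulated directly on a $16$-dimensional state vector. The sketch below prepares $\ket{+}^{\otimes 4}$, applies a controlled-$Z$ for each edge of $C_4$, and then checks that every $M_v$ fixes the result. The $0$-based vertex labels, the bit ordering, and the helper names are assumptions of this sketch.

```python
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # cycle graph C_4 on vertices 0..3

def bit(b, q):
    """Value of qubit q in basis index b (qubit 0 is the most significant bit)."""
    return (b >> (n - 1 - q)) & 1

# a Hadamard on every qubit of |0000> gives the uniform superposition
psi = [1 / 4] * (2 ** n)                   # amplitude 2^{-n/2} = 1/4
# each CZ multiplies the amplitude of |b> by (-1)^{b_i b_j}
for i, j in edges:
    psi = [amp * (-1) ** (bit(b, i) * bit(b, j)) for b, amp in enumerate(psi)]

def apply_M(v, state):
    """Apply M_v = X_v * prod_{v'~v} Z_{v'} to a state vector."""
    nbrs = [b if a == v else a for a, b in edges if v in (a, b)]
    out = [0.0] * len(state)
    for b, amp in enumerate(state):
        sign = (-1) ** sum(bit(b, u) for u in nbrs)   # diagonal Z factors
        out[b ^ (1 << (n - 1 - v))] = sign * amp      # X_v flips qubit v
    return out

stabilized = all(apply_M(v, psi) == psi for v in range(n))
print(stabilized)
```

All amplitudes are exactly $\pm 1/4$, so the comparison can be done with exact equality rather than a numerical tolerance.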

Permutation-invariant codes

2014-02-01T22:11:00.000-08:00

Let $S_n$ be the group of permutations of the $n$-element set $[n]:=\{1 , 2, \dots, n\}$, and let $\pi\in S_n$. Define the unitary $U_\pi: (\mathbb{C}^2)^{\otimes n}\to(\mathbb{C}^2)^{\otimes n}$ by
\[
U_\pi(\ket{\varphi_1}\otimes\cdots\otimes\ket{\varphi_n})=\ket{\varphi_{\pi^{-1}(1)}}\otimes\cdots\otimes\ket{\varphi_{\pi^{-1}(n)}}
\]
for all product states $\ket{\varphi_1}\otimes\cdots\otimes\ket{\varphi_n}$ (and extended linearly to all of $(\mathbb{C}^2)^{\otimes n}$). Denote by $(i j)$ the transposition of $i,j\in[n]$ for $i\neq j$, and define the subspace
\[
Q=\{\ket{\Psi}\in(\mathbb{C}^2)^{\otimes n} \mid U_{(i j)}\ket{\Psi}=\ket{\Psi} \ \text{for all} \ i\neq j \}.
\]

An arbitrary state $\ket{\Psi}\in Q$ can be written generally as a linear combination
\[
\ket{\Psi}=\SUM{k}{}\gamma_k\ket{\xi_k},
\]
where $\ket{\xi_k}\in\mathcal{B}$ and $\mathcal{B}$ is a basis of $Q$. Since every $\ket{\xi_k}\in Q$ must also satisfy the constraint $U_{(ij)}\ket{\xi_k}=\ket{\xi_k}$ for all $(ij)\in S_n$, each state $\ket{\xi_k}$ must be built from states possessing a property that remains invariant under the $U_{(ij)}$ operations. In this regard, let $\bar{b}=b_1b_2\dots b_n \in\{0,1\}^n$ with $b_i\in\{0,1\}$, and let $\ket{\bar{b}}\in(\mathbb{C}^2)^{\otimes n}$ be an $n$-qubit computational basis state. More explicitly, $\ket{\bar{b}}=\bigotimes_{i=1}^{n}\ket{b_i}$ describes the $n$-qubit state where each qubit is either in the state $\ket{0}$ or $\ket{1}$. Now define the map
\[
\omega: \{0,1\}^n\to \{0,1,\dots,n\} \ \ \text{given as} \ \ \omega(\bar{b})=\SUM{i=1}{n}b_i,
\]
which simply counts the number of $1$s appearing in the bit string $\bar{b}$, and call $\omega(\bar{b})$ the \emph{weight} of $\bar{b}$.

For every $\bar{b}\in\{0,1\}^n$ and any $\pi\in S_n$, it will always be the case that $U_{\pi}\ket{\bar{b}}=\ket{\bar{b'}}$ for some $\bar{b'}\in\{0,1\}^n$ such that $\omega(\bar{b})=\omega(\bar{b'})$, because $U_{\pi}$ only permutes the bits $b_i$ comprising the string $\bar{b}$, which preserves the weight of the string. That is, if $\ket{\bar{b}}=\bigotimes_{i=1}^{n}\ket{b_i}$, then $U_\pi\ket{\bar{b}}=\ket{\bar{b'}}=\bigotimes_{i=1}^{n}\ket{b_{\pi^{-1}(i)}}$, where the bits of the resulting string $\bar{b'}$ are given by $b'_i=b_{\pi^{-1}(i)}$, and consequently $\omega(\bar{b})=\sum_{i=1}^{n}b_i=\sum_{i=1}^{n}b_{\pi^{-1}(i)}=\omega(\bar{b'})$.

Therefore, let
\[
\ket{\xi_k}=\frac{1}{\sqrt{\binom{n}{k}}}\SUM{ \ \ \bar{b} \ : \ \omega(\bar{b})=k}{}\ket{\bar{b}}
    \]
be an equally weighted superposition of the $\binom{n}{k}$ computational basis states $\ket{\bar{b}}$ with weight $\omega(\bar{b})=k$. Then it follows that $U_{(ij)}\ket{\xi_k}=\ket{\xi_k}$ for all $(ij)\in S_n$, since each $U_{(ij)}$ yields a bijection on the set of states $\ket{\bar{b}}$ having the same weight. Conversely, any two bit strings of the same weight are related by a permutation, so every state in $Q$ must have equal amplitudes on all $\ket{\bar{b}}$ of a given weight. Hence, a basis of $Q$ is given by $\mathcal{B}=\{\ket{\xi_k} \mid k=0,1,\dots, n\}$, consisting of $n+1$ basis vectors.

A projector $P$ onto the subspace $Q$ can be expressed in terms of the basis vectors in $\mathcal{B}$ as
\[
P=\SUM{k=0}{n}\ket{\xi_k}\bra{\xi_k}.
\]

In terms of the operators $U_\pi$, the projector can be equivalently expressed as a uniform average over all permutations as
\[
P=\frac{1}{n!}\SUM{\pi\in S_n}{}U_\pi
\]
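The equality of the two expressions for $P$ can be checked numerically for a small number of qubits. The sketch below builds both $8\times 8$ matrices for $n=3$ and compares them entry by entry; the value of $n$ and the variable names are choices made for this illustration.

```python
from itertools import permutations, product
from math import comb, factorial

n = 3
basis = list(product([0, 1], repeat=n))        # computational basis labels
index = {b: i for i, b in enumerate(basis)}
dim = len(basis)

# P1 = sum_k |xi_k><xi_k|: entry (b, c) is 1/binom(n, k) when b and c
# both have weight k, and 0 otherwise
P1 = [[(1 / comb(n, sum(b)) if sum(b) == sum(c) else 0.0) for c in basis]
      for b in basis]

# P2 = (1/n!) sum_pi U_pi, with U_pi |b_1...b_n> = |b_{pi^-1(1)}...b_{pi^-1(n)}>
P2 = [[0.0] * dim for _ in range(dim)]
for pi in permutations(range(n)):
    for b in basis:
        image = [0] * n
        for i in range(n):
            image[pi[i]] = b[i]                # bit i moves to position pi(i)
        P2[index[tuple(image)]][index[b]] += 1 / factorial(n)

same = all(abs(P1[i][j] - P2[i][j]) < 1e-12
           for i in range(dim) for j in range(dim))
trace = sum(P1[i][i] for i in range(dim))      # equals dim Q = n + 1
print(same, trace)
```

The agreement reflects the fact that the number of permutations mapping one weight-$k$ string to another is $k!(n-k)!$, which divided by $n!$ gives $1/\binom{n}{k}$.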

Consider the set of errors
\[
\mathcal{E}=\left\{\SUM{\pi\in S_n}{}a_\pi U_\pi \mid (a_\pi)_{\pi\in S_n} \subset \mathbb{C} \right\}.
\]
For the symmetric group $S_n$, recall that any permutation $\pi\in S_n$ can be expressed as a product of transpositions only. Actually, $S_n$ can be generated by the $n-1$ transpositions of the form $(i,i+1)$ for $i\in\{1, 2, \dots, n-1\}$. This algebraic structure extends to the operators $U_\pi$. That is, for any $\pi\in S_n$,
\[
U_\pi=\PROD{(ab)\in S_n}{}U_{(ab)},
\]
where $(ab)=(i,i+1)$ for some $i\in\{1, 2, \dots, n-1\}$. Note that in this product, any particular $U_{(ab)}$ can appear multiple times. Therefore, for any $\pi\in S_n$ and any $\ket{\psi}\in Q$
\[
U_\pi\ket{\psi}=\PROD{(ab)\in S_n}{}U_{(ab)}\ket{\psi}=\ket{\psi},
\]
because $U_{(ab)}\ket{\psi}=\ket{\psi}$ for any $(ab)\in S_n$.

Regarding the correctability of the error set $\mathcal{E}$, consider any $\pi,\pi'\in S_n$ and any basis vectors $\ket{\xi_k},\ket{\xi_{k'}}\in\mathcal{B}$. As a consequence of what was just shown, $U_{\pi}\ket{\xi_k}=\ket{\xi_k}$ and $U_{\pi'}\ket{\xi_{k'}}=\ket{\xi_{k'}}$. Then,
\[
\bra{\xi_{k'}}U_{\pi'}^{\dagger}U_\pi\ket{\xi_k}=\ip{\xi_{k'}}{\xi_k}=\delta_{k'k}.
\]
Since the error correction conditions for a code state that
   \[
\bra{\xi_{k'}}U_{\pi'}^{\dagger}U_\pi\ket{\xi_k}=c_{\pi'\pi}\ip{\xi_{k'}}{\xi_k},
\]
where the $c_{\pi'\pi}$ are constants that do not depend on the states $\ket{\xi_k}$ and $\ket{\xi_{k'}}$, it follows that $c_{\pi'\pi}=1$ for all $\pi,\pi'\in S_n$.
More generally, for any $\ket{\psi},\ket{\psi'}\in Q$ this implies
\[
    \bra{\psi'}U_{\pi'}^{\dagger}U_\pi\ket{\psi}=\ip{\psi'}{\psi}
\]

Therefore, each $U_\pi$ for $\pi\in S_n$ is a correctable error. Then since any linear combination of correctable errors can also be corrected, any error of the form
\[
E=\SUM{\pi\in S_n}{}a_\pi U_\pi
\]
can be corrected. Thus, the set $\mathcal{E}$ is correctable.

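Since $c_{\pi'\pi}=1$ for all $\pi,\pi'\in S_n$, the matrix $C=(c_{\pi'\pi})$ is the $n!\times n!$ all-ones matrix, which has a single nonzero eigenvalue $n!$ with eigenvector proportional to $(1,1,\dots,1)$. An operator basis $\{F_j\}_j$ of $\mathcal{E}$ which diagonalizes $C$ is therefore given by $F_0=\frac{1}{\sqrt{n!}}\SUM{\pi\in S_n}{}U_\pi$ together with combinations $F_j=\SUM{\pi\in S_n}{}u_{j\pi}U_\pi$ for $j\geq 1$, whose coefficient vectors are orthogonal to $(1,1,\dots,1)$, so that $\SUM{\pi\in S_n}{}u_{j\pi}=0$. Upon being diagonalized, the only nonzero coefficient of $C$ is $c_{00}=n!$; indeed, $F_0$ acts on $Q$ as $\sqrt{n!}$ times the identity, while every $F_j$ with $j\geq 1$ annihilates $Q$.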

For the subspace $Q$, the set of detectable errors is defined as
\[
\mathcal{E}_D:=\left\{E \mid \bra{\psi}E\ket{\phi}=c(E)\ip{\psi}{\phi}, \forall \ket{\psi},\ket{\phi}\in Q\right\}.
\]
Consider a single bit flip error, say, $E=X_1$ that applies the bit flip $X$ to the first qubit of the register. Moreover, consider the two basis states $\ket{\xi_0}$ and $\ket{\xi_1}$ of $Q$. Here, $\ket{\xi_0}=\bigotimes_{i=1}^{n}\ket{0}$ is the unique state with weight $0$, and $\ket{\xi_1}$ is the normalized superposition of all states labelled by weight $1$ bit strings. Observe that $X_1\ket{\xi_0}=\ket{1}\bigotimes_{i=2}^{n}\ket{0}$, which is a state that appears in the superposition of states expressed in $\ket{\xi_1}$. Therefore,
\[
\bra{\xi_1}X_1\ket{\xi_0}=\frac{1}{\sqrt{\binom{n}{1}}}=\frac{1}{\sqrt{n}}\neq 0,
\]
and yet $\ket{\xi_1}$ and $\ket{\xi_0}$ are orthogonal basis states so that $\ip{\xi_1}{\xi_0}=0$. Hence, the single bit flip error $E=X_1$ is undetectable by definition.
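The overlap computed above can be checked numerically for a small register. The following is a minimal sketch, assuming that $\ket{\xi_w}$ is the normalized equal superposition of all $n$-bit strings of weight $w$; the helper names `weight_state`, `apply_x`, and `inner` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations
from math import sqrt, isclose

def weight_state(n, w):
    """Normalized equal superposition of all n-bit strings of Hamming weight w."""
    strings = []
    for ones in combinations(range(n), w):
        bits = [0] * n
        for i in ones:
            bits[i] = 1
        strings.append(tuple(bits))
    amp = 1 / sqrt(len(strings))
    return {s: amp for s in strings}

def apply_x(state, qubit):
    """Apply a bit flip X to one qubit (0-indexed) of a state stored as {bitstring: amplitude}."""
    out = {}
    for bits, amp in state.items():
        flipped = list(bits)
        flipped[qubit] ^= 1
        out[tuple(flipped)] = amp
    return out

def inner(a, b):
    """Inner product <a|b> for states with real amplitudes."""
    return sum(amp * b.get(bits, 0.0) for bits, amp in a.items())

n = 5
xi0, xi1 = weight_state(n, 0), weight_state(n, 1)
overlap = inner(xi1, apply_x(xi0, 0))   # <xi_1| X_1 |xi_0>
print(overlap, 1 / sqrt(n))             # the two values should agree
print(inner(xi1, xi0))                  # orthogonal basis states
```

The nonzero matrix element together with the vanishing inner product is exactly the violation of the detectability condition described above.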

Let $E=X^{\otimes n}$, and consider the two basis states $\ket{\xi_0},\ket{\xi_n}\in\mathcal{B}$ of $Q$. Then
\[
E\ket{\xi_0}=X^{\otimes n}\TENSOR{i=1}{n}\ket{0}=\TENSOR{i=1}{n}\ket{1}=\ket{\xi_n},
\]
which implies that $\bra{\xi_n}E\ket{\xi_0}=\ip{\xi_n}{\xi_n}=1$. However, $\ip{\xi_n}{\xi_0}=0$ since they are orthogonal basis states, and thus
$E=X^{\otimes n}$ is an undetectable error.

More generally, let $U\in SU(2)$, with $U\neq I$, and consider the error $E=U^{\otimes n}$. According to Schur-Weyl duality, $[E,U_\pi]=0$ for all $U\in SU(2)$ and $\pi\in S_n$. Consider an arbitrary state $\ket{\psi}\in Q$, so that $U_\pi\ket{\psi}=\ket{\psi}$ for all $\pi\in S_n$ as argued above. Then,
\[
U_\pi E\ket{\psi}=EU_\pi\ket{\psi}=E\ket{\psi},
\]
where the first equality follows from Schur-Weyl duality. This implies that $E\ket{\psi}\in Q$. Since $\ket{\psi}$ was arbitrary and $E$ is unitary, this is equivalent to the statement $EQ=Q$.

Consider any basis state $\ket{\xi_j}$. Then since $EQ=Q$, it must be the case that $E\ket{\xi_j}\in Q$ so that $E\ket{\xi_j}$ itself can be expressed in terms of the basis $\mathcal{B}$ of $Q$. That is,
\[
E\ket{\xi_j}=U^{\otimes n}\ket{\xi_j}=\SUM{k=0}{n}\alpha_k\ket{\xi_k}.
\]
Then there must exist at least one $\alpha_{k'}\neq 0$, and perhaps more. Moreover, since $U\neq I$ by assumption, $k'\neq j$. Choose one such basis state $\ket{\xi_{k'}}$ with $\alpha_{k'}\neq 0$. With this choice,
\[
\bra{\xi_{k'}}E\ket{\xi_j}=\alpha_{k'}\neq 0.
\]
However, since $\ip{\xi_{k'}}{\xi_j}=0$, this implies that
\[
\bra{\xi_{k'}}E\ket{\xi_j}\neq c(E)\ip{\xi_{k'}}{\xi_j}.
\]
Therefore, the error $E=U^{\otimes n}$ is undetectable.

Linear Optical Quantum Computation: The KLM Proposal

2014-02-01T00:00:00.000-08:00

Abstract

A brief overview of quantum optics is given, and some early proposals for quantum computation using optics are discussed. The main focus is the Knill, Laflamme, and Milburn proposal \cite{KLM}, which achieves scalable universal quantum computation using only linear optics and the ability to prepare and detect single photon states.

(** All figures displayed in this post are borrowed from arXiv:quant-ph/0512104 **)

Introduction

Computation, although often thought of entirely in the abstract, is a physical process. Thus, a physical system able to undergo controlled transformations is a prerequisite for any successful computation. In quantum computation, the physical nature of the necessary system becomes even more relevant in order to harness sufficient quantum phenomena. Despite many distinct paradigms for physically realizing quantum computation, the setting of quantum optics offers unique advantages. In quantum optics, photons are deployed as the main carriers of information, and various physical devices such as phase shifters and beam splitters can be used to enact transformations on the photon states to perform computation. The ability to operate at high temperatures and long decoherence times are examples of the advantages gained when working with quantum optics. However, the same properties of light that offer such benefits also bring some disadvantages. In particular, it is generally difficult to make photons interact with one another. Such interactions seem necessary in order to implement multiple qubit entangling gates, a main requirement for achieving universal quantum computation.

In this post, a brief overview of quantum optics is given and various proposals for quantum computing in this optical setting are discussed. The main proposal described here is that of Knill, Laflamme and Milburn (KLM) \cite{KLM}, who offer a scheme for achieving scalable universal quantum computation using only linear optics (LOQC) that overcomes many detrimental features that existed in previous proposals.

Bosonic Modes

In quantum optics, photons are the main physical entities at play. When referring to photons in what follows, we will mean noninteracting spinless bosons. In this setting the electromagnetic field is quantized, and the energy associated to the physical system is given by the Hamiltonian
\[
\hat{H}=\displaystyle\sum_{k}^{}\hbar\omega_k\left(\cre_k\ani_k +\frac{1}{2}\right),
\]
where $\omega_k$ is the frequency of mode $k$, and $\ani_k$ and $\cre_k$ are the annihilation and creation operators associated to mode $k$. A bosonic mode $k$ can be considered as a quantum system whose state space is spanned by the number states $\ket{n}_k$, for $n=0,1,2,\dots$, where $n$ represents the number of photons in the mode $k$. The number states of a mode $k$ form a complete orthonormal set so that $\ip{n}{m}=\delta_{nm}$. The state $\ket{0}$ will denote the state in which every mode $k$ is in the state $\ket{0}_k$. The creation and annihilation operators of a particular mode $k$ act on the number states as
\[
\cre_k\ket{n}_k=\sqrt{n+1}\ket{n+1}_k
\]
and
\[\begin{align*}
\ani_k\ket{n}_k&=\sqrt{n}\ket{n-1}_k \quad\text{for } n\geq 1, \\
\ani_k\ket{0}_k&=0.
\end{align*}
\]
Thus, any number state $\ket{n}_k$ can be expressed in terms of the state $\ket{0}_k$ as
\[
\ket{n}_k=\frac{(\cre_k)^n}{\sqrt{n!}}\ket{0}_k.
\]
Moreover, the creation and annihilation operators satisfy the canonical commutation relations
\[
[\ani_k,\cre_{k'}]=\delta_{kk'} \ \ \ \text{and} \ \ \ [\ani_k,\ani_{k'}]=[\cre_k,\cre_{k'}]=0.
\]
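The ladder operator relations above are easy to check on a truncated Fock space. The sketch below assumes a single mode and an arbitrary cutoff of `CUTOFF` number states (the truncation and all function names are assumptions made for illustration); the commutation relation only holds on states strictly below the cutoff.

```python
from math import sqrt, factorial, isclose

CUTOFF = 8  # keep number states |0>, ..., |CUTOFF-1> (an assumed truncation)

def number_state(n):
    """Amplitude vector of the number state |n>."""
    vec = [0.0] * CUTOFF
    vec[n] = 1.0
    return vec

def create(vec):
    """Creation operator: a^dagger |n> = sqrt(n+1) |n+1> (top level is dropped)."""
    out = [0.0] * CUTOFF
    for n, amp in enumerate(vec[:-1]):
        out[n + 1] = sqrt(n + 1) * amp
    return out

def annihilate(vec):
    """Annihilation operator: a |n> = sqrt(n) |n-1>, with a |0> = 0."""
    out = [0.0] * CUTOFF
    for n, amp in enumerate(vec):
        if n >= 1:
            out[n - 1] = sqrt(n) * amp
    return out

# Build |3> from the vacuum as (a^dagger)^3 / sqrt(3!) |0>.
state = number_state(0)
for _ in range(3):
    state = create(state)
state = [amp / sqrt(factorial(3)) for amp in state]

# The commutator [a, a^dagger] applied to |2> should give back |2>.
two = number_state(2)
comm = [x - y for x, y in zip(annihilate(create(two)), create(annihilate(two)))]
print(state)
print(comm)
```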

Single Photon Creation and Detection

Necessary requirements for the paradigm of quantum optics are the abilities to both create single photons and to detect them reliably. In addition to the linear elements to be introduced later, controlled photon creation and detection are essential resources necessary for efficient linear optical quantum computation. Even though perfect reliable photon creation and detection are difficult to achieve in practice, this will be taken somewhat for granted in describing later protocols for LOQC where it is implicitly assumed that ideal photon creation and detection is readily available.

In regards to single photon creation, methods can be essentially classified as either deterministic or probabilistic. A probabilistic source, for instance, may involve multiple photon detections where a second photon is used to signal the creation of the first ``single'' photon. Such schemes are commonly used for quantum information processing. One way of achieving this is through spontaneous parametric down conversion (SPDC), in which photons interact with a nonlinear crystal to occasionally create a pair of correlated photons \cite{Kok}. Deterministic sources, which are able to produce single photons on demand in a controlled way, would nevertheless be beneficial.

Although SPDC is commonly used for quantum optical purposes, other means involving single photon emission through the controlled stimulus of various physical systems are also possible \cite{Migdall}. It is also worth mentioning that single photon states can be created without the use of nonlinearities using weak squeezing, where the Hamiltonian $\hat{s}_{jk}=\ani_j\ani_k+\cre_j\cre_k$ is applied to the vacuum state in order to produce specific number states in the modes. Experimental realizations of such a technique have been demonstrated in \cite{Hong}.

The matter of photon detection comes in two varieties: being able to distinguish merely between zero and some nonzero number of photons in a mode, or the more refined ability of actually being able to count the precise number of photons in a mode. This latter means of photon detection is required for various applications of LOQC. Such measurements involve some particle detector which destructively determines the presence of photons. One way to approximately count the number of photons in some mode $k$ is to use optical elements to effectively split the mode into $N$ other modes $k_1,k_2,\dots, k_N$, and then use $N$ particle detectors to count the number of photons in each of the modes $k_l$. Now suppose that mode $k$ has $n$ photons. Then the probability that at least one of the modes $k_l$ has more than one photon is given by $1-\frac{N!}{(N-n)!N^n}\leq\frac{n(n+1)}{2N}$. Thus, provided that the number of photons $n$ in the mode is not too large, or if the number of split modes $N$ is sufficiently large, the probability of any of the modes $k_l$ having more than one photon is small. Therefore, with high probability the number of actual photons in the main mode $k$ is given by the number of photon detectors (out of $N$ many) that detect a photon. Further details pertaining to photon detection and its associated errors can be found in \cite{Kok}.
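To get a feeling for how large $N$ must be, the probability above can be evaluated directly. This is a rough sketch under the assumption that each of the $n$ photons independently ends up in a uniformly random one of the $N$ split modes; the function names are hypothetical.

```python
def collision_probability(n, N):
    """Probability that at least one of N split modes receives two or more of n photons,
    assuming each photon lands in a uniformly random mode.
    Equals 1 - N! / ((N - n)! * N**n)."""
    if n > N:
        return 1.0
    no_collision = 1.0
    for i in range(n):
        no_collision *= (N - i) / N
    return 1 - no_collision

def upper_bound(n, N):
    """The bound n(n+1)/(2N) quoted in the text."""
    return n * (n + 1) / (2 * N)

for N in (10, 100, 1000):
    print(N, collision_probability(3, N), upper_bound(3, N))
```

For a fixed photon number the failure probability falls off roughly like $1/N$, which is why a modest number of extra modes and detectors already gives a usable approximate photon counter.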

Linear Optical Elements

The main dynamical elements that induce state transformations in linear optics are phase shifters and beam splitters. An important property of such passive linear optical elements is that they preserve the boson number of the states. That is, if $U$ represents the unitary operator associated with such a state transformation, then $U^\dagger\ket{0}=\ket{0}$. The effect of $U$ on the creation operators can then be described as follows:
\begin{align*}
U\cre_k\ket{0}=U\cre_kU^\dagger\ket{0}=\displaystyle\sum_{j}u_{jk}\cre_{j}\ket{0},
\end{align*}
where $(u_{jk})_{jk}$ are the matrix coefficients of $U$. Letting $\hat{b}^{\dagger}_k=U\cre_kU^\dagger$ represent the output mode, an optical element corresponding to the transformation $U$ is said to be \emph{linear} if each output mode is a linear combination of the input modes $\cre_j$:
\[
\hat{b}^{\dagger}_{k}=\displaystyle\sum_{j}u_{jk}\cre_{j}.
\]

The phase shift optical element $P_\phi$, which acts on a single mode $k$, has the effect of introducing a phase factor $e^{in\phi}$ on the state $\ket{n}_k$. This transformation can be described by the Hamiltonian $\hat{N}_k=\cre_k\ani_k$ so that the unitary representing the action of the phase shift is given by $U(P_\phi)=e^{i\phi\cre_k\ani_k}$. In an optical network, the phase shift element acting in a particular mode will be depicted schematically as shown in the figure below.

A phase shift element $P_\phi$.

The other main optical element predominantly used in linear optics is a beam splitter $B^{(jk)}_{\theta, \phi}$ that is defined to act on two different modes $j$ and $k$ of the system. The action of a beam splitter can be described by the Hamiltonian $\hat{B}_{jk}=e^{i\phi}\cre_j\ani_k+e^{-i\phi}\ani_j\cre_k$, with corresponding unitary matrix given by
\[
U(B_{\theta, \phi})=\begin{pmatrix}
\cos\theta &-e^{i\phi}\sin\theta \\
e^{-i\phi}\sin\theta & \cos\theta
\end{pmatrix},
\]
where mode indices have been suppressed. The schematic symbol for the beam splitter $B^{(jk)}_{\theta, \phi}$ is shown in the figure below.

A beam splitter element $B_{\theta,\phi}$.

Quantum Computation with Linear Optics

Given the paradigm of quantum linear optics discussed in the previous sections, the objective now is to show how quantum computation can be achieved in this setting. This requires a suitable way of encoding qubits using bosonic modes, and a means of implementing quantum gate operations on the qubits using optical elements consisting of various phase shifters and beam splitters. Again, implicit in these constructions is a means of preparing and measuring qubits through photon creation and detection. In what follows, some early schemes that either rely on nonlinear optical elements or fail to scale efficiently are discussed first, followed by a general scheme for quantum computation that does not rely on nonlinear optical elements and yet has efficient scalability properties.

Early Proposals

Traditional quantum optics models (e.g. \cite{Chuang1}, \cite{Chuang2}) may use $n$ photons to represent $n$ qubits, and allow the use of nonlinear optical elements to achieve $2$ qubit quantum gates. One example of a nonlinear element is a Kerr medium, which has the property that its refractive index contains nonlinear terms so that a beam traversing the Kerr medium undergoes a phase shift proportional to the beam's intensity. Moreover, photons in two paths incident to the Kerr medium can induce controlled-phase operations allowing for universal quantum computation when considered together with arbitrary single qubit operations. An early example concerning the implementation of a quantum optical Fredkin gate has been achieved through these means as demonstrated in \cite{Milburn}.

The issue with using nonlinear optical elements such as a natural Kerr medium is the high degree of nonlinearity that must be present in order to implement gates of interest, which makes such methods extremely difficult to implement in practice.

A proposal using only linear optics that can achieve universal computation was given by Adami and Cerf in \cite{Adami}. This proposal uses just a single photon that has access to $2^n$ distinct spatial paths or modes to encode $n$ qubits. Simple gate implementations in this scheme are achieved through particular arrangements of only linear optical elements consisting of phase shifters and beam splitters. The major drawback here is that $2^n-1$ beam splitters are needed to set up the $2^n$ paths necessary to encode $n$ qubits. Since this number is exponential in $n$, general quantum computations done this way will require an exponential amount of resources (beam splitters). Hence, the Adami-Cerf proposal does not possess the desired scaling properties for efficient quantum computation to be accomplished in general. Regardless, in \cite{Adami}, Adami and Cerf use their scheme to provide experimental realizations of the quantum teleportation protocol, a nontrivial quantum computation.

The KLM Proposal

Despite initial beliefs that scalable universal quantum computation may only be achieved when nonlinear optical elements are exploited, in \cite{KLM} Knill, Laflamme, and Milburn offered a scheme that yields efficient quantum computation using only linear optics together with single photon sources and detectors. The key insight made here is that a particular two qubit gate can be executed non-deterministically with some probability of success through the exclusive use of linear optical elements. Furthermore, this gate can be implemented through gate teleportation, which effectively reduces the problem of applying the two qubit gate to the problem of constructing a special state instead. The probability of success associated to such a procedure can be increased arbitrarily through the preparation of richer states and by using quantum error correction to eliminate certain errors. With these tools the KLM scheme offers a scalable means of quantum computation. An outline of the KLM proposal is provided in the following sections.

Qubit Encodings
Consider an optical system with two distinct modes $a$ and $b$ representing distinct spatial paths, and suppose a single photon is present in either of the two modes. The possibilities present for the state of the system can be used to encode a single qubit by letting the two computational basis states be given by
\begin{align*}
\ket{0}_q&:=\ket{0}_a\ket{1}_b=\ket{01}_{ab} \\
\ket{1}_q&:=\ket{1}_a\ket{0}_b=\ket{10}_{ab}.
\end{align*}
Here, the presence of the subscript $q$ for the qubit basis states does not refer to a mode, but is included to denote that the state $\ket{\cdot}_q$ represents a qubit. Such an encoding of qubits is often referred to in the literature as ``dual rail logic". Another possible way to encode an optical qubit is through the polarization properties of photons. If $\ket{H}$ and $\ket{V}$ represent the state of a single photon being horizontally and vertically polarized, respectively, then the computational basis states of the qubit can be alternatively identified as
\[ \ket{0}_q:=\ket{H} \ \ \ \text{and} \ \ \ \ket{1}_q:=\ket{V}. \]
Although both of these qubit representations offer their own utilities in certain contexts, the dual rail logic encoding will be the one used in what follows. However, it should be noted that one representation can be transformed to the other using linear optical elements as discussed in \cite{Myers}.

Single Qubit Gates
A natural consequence of linear optics using only phase shifters and beam splitters is that single qubit operations are easily implemented. Recall that an arbitrary single qubit unitary $U$ can be expressed as a product of rotations around the $Y$ and $Z$ axes of the Bloch sphere. More explicitly,
\[U=e^{i\alpha}R_z(\beta)R_y(\gamma)R_z(\delta),\]
where $R_z(\theta)=e^{-i\frac{\theta}{2}\pz}$ and $R_y(\phi)=e^{-i\frac{\phi}{2}\py}$ are the rotations about the corresponding axes. Here, $\pz$ and $\py$ are the standard Pauli operators. The relevance of this decomposition will become apparent after describing the quantum gates corresponding to phase shifters and beam splitters.

By applying a phase shift to, say, the top mode of a dual rail encoded qubit, a relative phase is introduced in the state. A schematic representation of such a gate is shown in the following figure.

A phase shift gate acting on the top mode of a dual rail qubit.

To see this, consider an arbitrary single qubit state $\ket{\psi}_q=\alpha\ket{0}_q+\beta\ket{1}_q$. Then the action of $U(P_\phi)$ on this state is given by
\begin{align*}
U(P_\phi)\ket{\psi}_q&=\alpha U(P_\phi)\ket{0}_q+\beta U(P_\phi)\ket{1}_q \\
&=\alpha U(P_\phi)\ket{0}_a\ket{1}_b+\beta U(P_\phi)\ket{1}_a\ket{0}_b \\
&=\alpha \ket{0}_a\ket{1}_b+\beta e^{i\phi}\ket{1}_a\ket{0}_b \\
&=\alpha\ket{0}_q+\beta e^{i\phi}\ket{1}_q
\end{align*}
Further manipulations of this transformed state gives
\begin{align*}
U(P_\phi)\ket{\psi}_q&=\alpha\ket{0}_q+\beta e^{i\phi}\ket{1}_q \\
&=e^{i\phi/2}(e^{-i\phi/2}\alpha\ket{0}_q+e^{i\phi/2}\beta\ket{1}_q) \\
&=e^{i\phi/2}e^{-i\phi Z_q/2}(\alpha\ket{0}_q+\beta\ket{1}_q) \\
&=e^{i\phi/2}R_z(\phi)(\alpha\ket{0}_q+\beta\ket{1}_q),
\end{align*}
where now $Z_q$ denotes the effective Pauli-$Z$ operation on the encoded qubit. This shows that (up to a physically irrelevant global phase factor $e^{i\phi/2}$) the action of the phase shift $P_\phi$ amounts to a rotation about the $Z$ axis of the Bloch sphere.

Now consider the beam splitter $B_{\theta,0}$ acting on the two modes of a single dual rail qubit as depicted in the figure.

A beam splitter acting on the two modes of a single dual rail qubit.

In this case, the matrix representation of $B_{\theta,0}$ is given by
\[
U(B_{\theta, 0})=\begin{pmatrix}
\cos\theta &-\sin\theta \\
\sin\theta & \cos\theta
\end{pmatrix},
\]
and its action on an arbitrary single qubit $\ket{\psi}_q=\alpha\ket{0}_q+\beta\ket{1}_q$ is given by
\begin{align*}
U(B_{\theta,0})\ket{\psi}_q&=\alpha U(B_{\theta,0})\ket{0}_q+\beta U(B_{\theta,0})\ket{1}_q \\
&=\alpha U(B_{\theta,0})\ket{01}_{ab}+\beta U(B_{\theta,0})\ket{10}_{ab} \\
&=\alpha(\cos(\theta)\ket{01}_{ab}-\sin(\theta)\ket{10}_{ab})+\beta(\cos(\theta)\ket{10}_{ab}+\sin(\theta)\ket{01}_{ab}) \\
&=\cos(\theta)(\alpha\ket{01}_{ab}+\beta\ket{10}_{ab})-\sin(\theta)(\alpha\ket{10}_{ab}-\beta\ket{01}_{ab}) \\
&=e^{i\theta Y_q}(\alpha\ket{01}_{ab}+\beta\ket{10}_{ab}) \\
&=R_y(-2\theta)\ket{\psi}_q,
\end{align*}
where $Y_q$ represents the Pauli-$Y$ operation on the qubit. Hence, the beam splitter $B_{\theta,0}$ has the effect of performing a rotation of $-2\theta$ about the $Y$ axis of the Bloch sphere.

With these results in mind, together with the decomposition of an arbitrary single qubit unitary in terms of Bloch sphere rotations about the $Z$ and $Y$ axes described above, it is readily seen that any single qubit unitary can be executed by simply using a phase shift $P_\delta$ and a beam splitter $B_{-\gamma/2,0}$ followed by another phase shift $P_\beta$.
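Both identifications can be checked numerically on a single dual rail qubit. The sketch below assumes the rotation convention $R_z(\phi)=e^{-i\phi Z/2}$ and $R_y(\phi)=e^{-i\phi Y/2}$ that appears in the phase shifter derivation, and the beam splitter matrix $U(B_{\theta,\phi})$ from earlier with $\phi=0$ acting on the single-photon amplitudes of modes $(a,b)$. The function names are hypothetical.

```python
import cmath
from math import cos, sin

def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def rz(phi):
    """R_z(phi) = exp(-i phi Z / 2) in the qubit basis (|0>_q, |1>_q)."""
    return [[cmath.exp(-1j * phi / 2), 0], [0, cmath.exp(1j * phi / 2)]]

def ry(phi):
    """R_y(phi) = exp(-i phi Y / 2) in the qubit basis (|0>_q, |1>_q)."""
    return [[cos(phi / 2), -sin(phi / 2)], [sin(phi / 2), cos(phi / 2)]]

def phase_shifter_on_qubit(phi, qubit):
    """Phase shifter on mode a: only |1>_q = |10>_ab has a photon in mode a."""
    alpha, beta = qubit
    return [alpha, cmath.exp(1j * phi) * beta]

def beam_splitter_on_qubit(theta, qubit):
    """Beam splitter B_{theta,0} acting on the single-photon amplitudes of modes (a, b)."""
    alpha, beta = qubit
    modes = [beta, alpha]  # |0>_q = |01>_ab puts the photon in b, |1>_q = |10>_ab in a
    u = [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]
    a_out, b_out = matvec(u, modes)
    return [b_out, a_out]  # back to the ordering (|0>_q, |1>_q)

def close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

psi = [0.6, 0.8j]          # an arbitrary normalized test state
phi, theta = 0.7, 0.3      # arbitrary test angles

lhs_z = phase_shifter_on_qubit(phi, psi)
rhs_z = [cmath.exp(1j * phi / 2) * x for x in matvec(rz(phi), psi)]
lhs_y = beam_splitter_on_qubit(theta, psi)
rhs_y = matvec(ry(-2 * theta), psi)
print(close(lhs_z, rhs_z), close(lhs_y, rhs_y))
```

With a different sign convention for the rotations, the same check goes through with $\theta$ replaced by $-\theta$ in the beam splitter comparison.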

A Two Qubit Gate: $\mathbf{\con{Z}}$
In the previous section it was shown how arbitrary single qubit transformations can be accomplished using only linear optical elements consisting of phase shifts and beam splitters. It is known that in order to achieve universal quantum computation, two-qubit gates are also necessary in addition to the single qubit gates. Specifically, the two-qubit gate must be an entangling gate to ensure universality. A conventional choice for such a two-qubit entangling gate is the controlled-NOT gate $\con{X}$. However, for our purposes the controlled-SIGN gate $\con{Z}$ will be implemented instead. Abstractly, the action of the $\con{Z}$ gate is given by $\ket{u}\ket{v}\mapsto(-1)^{u\cdot v}\ket{u}\ket{v}$, where $u,v\in\{0,1\}$. In this way, $\con{Z}$ is related to $\con{X}$ through the relation $\con{Z}=(I\otimes H)(\con{X})(I\otimes H)$, where $H$ is the Hadamard transform.

In order to perform the $\con{Z}$ gate, observe that a basic transformation of the form
\[ \alpha\ket{0}+\beta\ket{1}+\gamma\ket{2} \mapsto \alpha\ket{0}+\beta\ket{1}-\gamma\ket{2} \]
is needed. This operation will be called the nonlinear sign shift gate, denoted by $NS_{-1}$, for reasons that are suggested by the name. Since $NS_{-1}$ is a nonlinear gate, its action cannot be realized deterministically using only linear optical elements. Instead, the $NS_{-1}$ gate will be implemented nondeterministically (probabilistically) using just linear optics and ancilla modes. Once this is achieved for the $NS_{-1}$ gate, the probabilistically implemented $NS_{-1}$ gates will be used to construct a probabilistic implementation of $\con{Z}$.

Consider three spatial bosonic modes where mode $1$ contains the main state of interest $\ket{\psi}_1$, and modes $2$ and $3$ serve as ancilla registers initialized in the states $\ket{1}_2$ and $\ket{0}_3$. Let the initial state of mode $1$ be $\ket{\psi}_1=\alpha\ket{0}+\beta\ket{1}+\gamma\ket{2}$. Then it can be shown that the optical network shown in the figure below produces the state $NS_{-1}\ket{\psi}_1$ in the first mode provided that the two ancillary modes are measured to be precisely $\ket{1}_2\ket{0}_3$.

An optical circuit for the nonlinear sign shift gate $NS_{-1}$.

For the particular action of $NS_{-1}$ the angle parameters of the optical elements are given by
\begin{align*}
&\theta_1=22.5^{\circ},   \phi_1=0^{\circ}\\
&\theta_2=65.5302^{\circ},   \phi_2=0^{\circ} \\
&\theta_3=-22.5^{\circ},   \phi_3=0^{\circ} \\
&\phi_4=180^{\circ}.
\end{align*}
The corresponding unitary matrix $U(NS_{-1})$ acting on the three modes is given by
\[
U(NS_{-1})=\begin{pmatrix}
1-\sqrt{2} & \frac{1}{\sqrt{\sqrt{2}}} & \sqrt{\frac{3}{\sqrt{2}}-2} \\
\frac{1}{\sqrt{\sqrt{2}}} &\frac{1}{2} & \frac{1}{2}-\frac{1}{\sqrt{2}} \\
\sqrt{\frac{3}{\sqrt{2}}-2} & \frac{1}{2}-\frac{1}{\sqrt{2}} & \sqrt{2}-\frac{1}{2}
\end{pmatrix}.
\]
If the two ancillary modes after measurement are in a state different from $\ket{1}_2\ket{0}_3$, the state in the first mode is not the desired state $NS_{-1}\ket{\psi}_1$. The outcome $\ket{1}_2\ket{0}_3$ occurs with probability $1/4$ regardless of the values of $\alpha$, $\beta$, and $\gamma$; for the input $\ket{0}_1$, for example, the relevant amplitude is simply the matrix entry $u_{22}=1/2$. Hence the probability of success in this nondeterministic implementation of $NS_{-1}$ is $1/4$.
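The action of this network on the relevant number states can be checked from the matrix $U(NS_{-1})$ alone. The sketch below relies on a standard fact that is not derived in this post: for a linear optical network, the amplitude between two multi-mode number states is the permanent of a submatrix of the mode transformation (rows repeated according to the output occupations, columns according to the input occupations), divided by the square root of the product of the factorials of all occupation numbers. The function names are hypothetical.

```python
from itertools import permutations
from math import sqrt, factorial, prod

r2 = sqrt(2)
# The matrix U(NS_{-1}) quoted above, acting on modes (1, 2, 3).
U = [
    [1 - r2, 2 ** -0.25, sqrt(3 / r2 - 2)],
    [2 ** -0.25, 0.5, 0.5 - 1 / r2],
    [sqrt(3 / r2 - 2), 0.5 - 1 / r2, r2 - 0.5],
]

def permanent(m):
    """Permanent of a small square matrix by direct summation over permutations."""
    n = len(m)
    return sum(prod(m[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def amplitude(u, occ_in, occ_out):
    """Amplitude <occ_out| U |occ_in> between number states with the given occupations."""
    rows = [j for j, k in enumerate(occ_out) for _ in range(k)]
    cols = [j for j, k in enumerate(occ_in) for _ in range(k)]
    sub = [[u[r][c] for c in cols] for r in rows]
    norm = sqrt(prod(factorial(k) for k in occ_in) * prod(factorial(k) for k in occ_out))
    return permanent(sub) / norm

# Input |n>_1 |1>_2 |0>_3, postselected on finding the ancillas in |1>_2 |0>_3 again.
amps = [amplitude(U, (n, 1, 0), (n, 1, 0)) for n in (0, 1, 2)]
print(amps)

# Rows of U should be orthonormal, since U is a (real) unitary matrix.
gram = [[sum(U[i][k] * U[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(gram)
```

The three amplitudes should come out as $1/2$, $1/2$, and $-1/2$ up to floating point rounding: the sign of the $\ket{2}$ component is flipped, and every component is scaled by the same factor of magnitude $1/2$, which corresponds to the success probability of $1/4$.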

Consider two dual rail qubits $\ket{Q_1}=\alpha\ket{0}_q+\beta\ket{1}_q$ and $\ket{Q_2}=\gamma\ket{0}_q+\delta\ket{1}_q$ in arbitrary states. A nondeterministic $\con{Z}$ gate can be performed on these two qubits by making use of two $NS_{-1}$ gates as shown in the figure below. Since there are two $NS_{-1}$ gates, each requiring two ancilla modes and each succeeding with probability $1/4$, there are four ancilla modes in total and the success probability of implementing the $\con{Z}$ gate is $(1/4)^2=1/16$.

An optical network implementing the controlled-SIGN gate $\con{Z}$.

Gate Teleportation of $\mathbf{\con{Z}}$

It has now been shown how to implement, albeit probabilistically, a two-qubit gate $\con{Z}$ which is universal for quantum computation when considered together with single qubit gates. Although this implementation was accomplished using linear optical elements alone, the inherent probabilistic nature of the gate does not allow for scalable quantum computation due to low success rates. The next key insight utilized in the KLM scheme helps overcome this obstacle by increasing the success probability through the means of gate teleportation. Such a teleportation procedure was originally discussed in \cite{Got}.

In essence, by harnessing quantum teleportation the problem of needing to apply the probabilistic gate $\con{Z}$ is reduced to the problem of having to prepare a certain state ``off-line''. This has the advantage of first being able to ensure the proper state is created before carrying on with the gate implementation so that the states in the rest of the system do not get corrupted. In this way, if an error does occur during the state preparation phase it will be known. Then the state preparation can simply be repeated until the desired state is created successfully.

To understand this more thoroughly, consider two qubits $\ket{\psi_1}_q$ and $\ket{\psi_2}_q$ on which a $\con{Z}$ is to be applied. This can be accomplished, although it may seem superfluous, by first teleporting the two qubits and then applying $\con{Z}$ as suggested in the figure below.

Teleportation of two qubits followed by a $\con{Z}$ gate. The region contained inside the dashed/dotted box is the state preparation area.

Since $(\con{Z})^2=I$ consider adding the trivial action of two $\con{Z}$ gates prior to the correction procedure for the teleportation as shown in the following figure.

Teleportation with the trivial addition of two $\con{Z}$ gates before the measurement phase.

Focusing on the two middle registers of the network, and applying certain relations for when a Pauli operator is conjugated by a $\con{Z}$, it is seen that these sequences of gates containing $\con{Z}$ operations can equivalently be expressed without any $\con{Z}$ gates as
\[
(\con{Z})(Z^{m'_1}X^{m_1}\otimes Z^{m'_2}X^{m_2})(\con{Z})=Z^{m'_1}X^{m_1}Z^{m_2}\otimes Z^{m_1}Z^{m'_2}X^{m_2} ,\]
where $m_1, m'_1$ and $m_2,m'_2$ are the pairs of classical bits given by the measurement outcomes of the upper and lower teleportations, respectively. This is more clearly depicted in the following figure.

Teleportation with a $\con{Z}$ gate in the state preparation area and only Pauli gates after measurement.
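The Pauli conjugation identity used to move the $\con{Z}$ gates into the state preparation area can be verified by brute force over all sixteen combinations of measurement outcomes. This is a small sketch using plain integer matrices; the helper names are hypothetical.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def power(m, e):
    """m**e for an exponent e in {0, 1}."""
    return m if e else I2

def chain(*ms):
    """Left-to-right product of several matrices."""
    out = ms[0]
    for m in ms[1:]:
        out = matmul(out, m)
    return out

def identity_holds(m1, m1p, m2, m2p):
    """Check CZ (Z^m1' X^m1 (x) Z^m2' X^m2) CZ = Z^m1' X^m1 Z^m2 (x) Z^m1 Z^m2' X^m2."""
    lhs = chain(CZ,
                kron(chain(power(Z, m1p), power(X, m1)),
                     chain(power(Z, m2p), power(X, m2))),
                CZ)
    rhs = kron(chain(power(Z, m1p), power(X, m1), power(Z, m2)),
               chain(power(Z, m1), power(Z, m2p), power(X, m2)))
    return lhs == rhs

results = [identity_holds(*bits) for bits in product((0, 1), repeat=4)]
print(all(results))
```

Since all entries are integers, the comparison is exact, including signs.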

Thus, it is seen here that there is only a single $\con{Z}$ gate that appears anywhere in the optical circuit. Moreover, this $\con{Z}$ gate has been delegated to the state preparation phase of the teleportation procedure, and the only gates that are to be applied after measurement are simply single qubit Pauli operators. Therefore, the only probabilistic aspect that comes into play when implementing the $\con{Z}$ gate can be thought of as happening ``off-line'', so that any failure that may occur in the $\con{Z}$ implementation does not jeopardize the rest of the computation.

To complete the implementation of $\con{Z}$ using only linear optical elements, it remains to describe an optical network that executes the teleportation task. As described in \cite{Myers}, a general teleportation protocol can be accomplished using the network displayed below.

An optical network describing the standard teleportation protocol.

Now, all of the elements introduced thus far can be combined to give a complete linear optical network for the implementation of $\con{Z}$ as shown:

The complete optical network for the nondeterministic implementation of $\con{Z}$ using teleportation.

Increasing the Success probability of $\mathbf{\con{Z}}$

The success probability associated to the probabilistic implementation of $\con{Z}$ described in the previous sections cannot be made arbitrarily close to $1$ in a scalable manner. In order to achieve better scaling rates, the state constructed during the state preparation phase of the teleportation protocol can be modified. This modification involves creating more complex entangled states which involve more bosonic modes as a resource. By creating richer states to be used, the success probability of $\con{Z}$ can be made arbitrarily close to $1$ in a scalable manner. However, the catch here is that the appropriate states required can be difficult and resource intensive to create. The next improvement in the KLM proposal is to avoid having to prepare increasingly complex states by using active error correction to correct errors that may occur during this stage. By analyzing the particular nature of the errors in this context, appropriate error correcting codes can be deployed that increase the success probability of $\con{Z}$. For the sake of brevity, further details pertaining to these aspects of the KLM proposal will be omitted here. The interested reader is referred to \cite{KLM}, \cite{Kok}, and \cite{Myers}.

Conclusion

In this work, a brief general overview of quantum optics was given and various proposals realizing quantum computation through quantum optics were discussed. The main scheme focused on here was the KLM proposal, which achieves scalable linear optical quantum computation using only linear optical elements (such as phase shifters and beam splitters) together with the ability to prepare ancilla modes and make photon detections. Key features of this scheme that allow for scalability are the use of quantum teleportation and error correction to achieve arbitrarily large success probabilities in an efficient manner.

Experimental demonstrations realizing various aspects of the KLM scheme have been carried out in the field. In \cite{Okamoto}, a controlled-NOT, or $\con{X}$ gate, was successfully implemented. In \cite{Pittman}, a demonstration of quantum error correction using linear optics was described. Some more exotic experiments include the observation of anyonic features in a toric code simulation using LOQC in \cite{Pachos}, and another more recent demonstration of topological error correction in \cite{Yao}. The latter experiment makes use of the paradigm of \emph{one-way quantum computation} \cite{Walther} that involves the preparation of an initial \emph{cluster state} \cite{Nielsen}, which is then subject exclusively to various measurements conditioned on the outcomes of previous measurements. This approach to quantum computation has its own merits, and when implemented with LOQC each paradigm further benefits the other. That is, cluster state/one-way quantum computation offers yet another means for improving LOQC.

Entanglement destroying channels

2014-01-31T21:40:00.000-08:00

In a previous post, we were concerned with channels of the form $\Phi\in\Channel(\X,\Y)$ such that $\bigl(\Phi\otimes \I_{\Lin(\Z)}\bigr)(\rho) \in \Sep(\Y:\Z)$ for every complex Euclidean space $\Z$ and every density operator $\rho\in\Density(\X\otimes\Z)$. Channels of this form have the effect of destroying entanglement that exists between the register they act on and any other registers.

Theorem:
There exist two channels $\Phi_0,\Phi_1\in\Channel(\X,\Y)$, both having the property described above, such that
\[
\bigtriplenorm{\Phi_0 - \Phi_1}_1
> \bignorm{\Phi_0(\rho) - \Phi_1(\rho)}_1
\]
for every $\rho\in\Density(\X)$. (Channels like this have the strange property that they destroy entanglement, and yet evaluating them on an entangled state helps to distinguish them.)

Proof:

For $\lambda\in(0,1]$ and $n=\dim(\X)\geq 2$, consider the two channels $\Phi_0,\Phi_1\in\Channel(\X)$ defined by
\[\begin{align*}
\Phi_0(X)&=\frac{\lambda}{n+1}(\tr(X)\I_\X+X^T)+\frac{(1-\lambda)}{n}\tr(X)\I_\X \\
\Phi_1(X)&=\frac{\lambda}{n-1}(\tr(X)\I_\X-X^T)+\frac{(1-\lambda)}{n}\tr(X)\I_\X
\end{align*}\]
Then for sufficiently small $\lambda$, both of the Choi representations $J(\Phi_0)$ and $J(\Phi_1)$ lie, after normalization, in a neighborhood of the maximally mixed state that consists entirely of separable operators, which implies that they are both separable. Therefore, from the results in the previous post, we have that $\Phi_0$ and $\Phi_1$ are entanglement destroying as described in the theorem statement.

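Now, since the terms proportional to $(1-\lambda)$ cancel and $\tr(\rho)=1$,
\[
\Phi_0(\rho) - \Phi_1(\rho)=\frac{\lambda}{n+1}(\I_\X+\rho^T)-\frac{\lambda}{n-1}(\I_\X-\rho^T)=\frac{2\lambda}{(n+1)(n-1)}(n\rho^T-\I_\X)
\]
for every $\rho\in\Density(\X)$. The operator $n\rho^T-\I_\X$ is traceless with eigenvalues $n\mu_k-1$, where $\mu_1,\dots,\mu_n$ are the eigenvalues of $\rho$, so its trace norm is twice the sum of its positive eigenvalues, which is at most $2(n-1)$ (with equality when $\rho$ is pure). It follows that
\[
\bignorm{\Phi_0(\rho) - \Phi_1(\rho)}_1\leq\frac{2\lambda}{(n+1)(n-1)}\cdot 2(n-1)=\frac{4\lambda}{n+1}.
\]

Moreover,
\[\begin{align*}
\bigtriplenorm{\Phi_0 - \Phi_1}_1&=\max\{\bignorm{((\Phi_0 - \Phi_1)\otimes\I_{\Lin(\X)})(xx^\ast)}_1 \ : \ x\in S(\X\otimes\X)\} .
\end{align*}\]
Take $x=\frac{1}{\sqrt{n}}\vec(\I_\X)$. Applying the transpose to the first tensor factor of $xx^\ast$ gives $\frac{1}{n}W$, where $W$ is the swap operator on $\X\otimes\X$, while applying $X\mapsto\tr(X)\I_\X$ to the first tensor factor gives $\frac{1}{n}\I_\X\otimes\I_\X$. Hence
\[
((\Phi_0 - \Phi_1)\otimes\I_{\Lin(\X)})(xx^\ast)=\frac{2\lambda}{(n+1)(n-1)}\Bigl(W-\frac{1}{n}\I_\X\otimes\I_\X\Bigr).
\]
Since $W$ has eigenvalue $1$ with multiplicity $\binom{n+1}{2}$ and eigenvalue $-1$ with multiplicity $\binom{n}{2}$, the trace norm of $W-\frac{1}{n}\I_\X\otimes\I_\X$ is
\[
\Bigl(1-\frac{1}{n}\Bigr)\binom{n+1}{2}+\Bigl(1+\frac{1}{n}\Bigr)\binom{n}{2}=(n+1)(n-1).
\]
Therefore $\bigtriplenorm{\Phi_0 - \Phi_1}_1\geq 2\lambda$, and provided that $n\geq 2$ and $\lambda>0$,
\[
\bigtriplenorm{\Phi_0 - \Phi_1}_1\geq 2\lambda
> \frac{4\lambda}{n+1}\geq\bignorm{\Phi_0(\rho) - \Phi_1(\rho)}_1
\]
for every $\rho\in\Density(\X)$.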
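The gap between the two norms can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`phi`, `diff`, `on_first_factor`) and the values $n=3$, $\lambda=0.1$ are illustrative choices, not part of the original problem, and the script does not test separability of the Choi representations. It evaluates the difference $\Phi_0-\Phi_1$ once on an unentangled pure state and once on one half of the maximally entangled state.

```python
import numpy as np

def phi(X, lam, sign):
    """Phi_0 (sign=+1) or Phi_1 (sign=-1) from the proof, applied to an n x n matrix X."""
    n = X.shape[0]
    I = np.eye(n)
    return (lam / (n + sign)) * (np.trace(X) * I + sign * X.T) \
        + ((1 - lam) / n) * np.trace(X) * I

def diff(X, lam):
    """The difference map Phi_0 - Phi_1."""
    return phi(X, lam, +1) - phi(X, lam, -1)

def trace_norm(M):
    """Trace norm = sum of singular values."""
    return float(np.linalg.svd(M, compute_uv=False).sum())

def on_first_factor(linear_map, R, n):
    """Apply a linear map to the first tensor factor of an operator R on C^n (x) C^n."""
    R4 = R.reshape(n, n, n, n)          # R4[i, k, j, l] = <i,k| R |j,l>
    out = np.zeros_like(R4)
    for k in range(n):
        for l in range(n):
            out[:, k, :, l] = linear_map(R4[:, k, :, l])
    return out.reshape(n * n, n * n)

n, lam = 3, 0.1                          # illustrative values only

# Unentangled input: the pure state e_0 e_0^*.
rho = np.zeros((n, n))
rho[0, 0] = 1.0
plain = trace_norm(diff(rho, lam))

# Entangled input: x = vec(I) / sqrt(n), with the map acting on the first factor.
x = np.eye(n).reshape(n * n) / np.sqrt(n)
entangled = trace_norm(on_first_factor(lambda X: diff(X, lam), np.outer(x, x), n))

print(plain, entangled)
```

The second printed number should be strictly larger than the first, which is the content of the theorem for this particular pair of inputs.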

Bounding the norm of the Choi representation of a channel in terms of its operator norm

2014-01-31T21:24:00.000-08:00

Theorem:

Let $\X$ and $\Y$ be complex Euclidean spaces with $\dim(\X) = n$ and let $\Phi\in\Trans(\X,\Y)$. Let $\norm{\cdot}_1$ denote the usual trace norm of an operator, and $\triplenorm{\cdot}_1$ the completely bounded trace norm of a map. Then
\[
\triplenorm{\Phi}_1 \leq \norm{J(\Phi)}_1 \leq n
\triplenorm{\Phi}_1. \]

Proof:

The Choi representation of $\Phi$ is given by $J(\Phi)=(\Phi\otimes\I_{\Lin(\X)})(\vec(\I_\X)\vec(\I_\X)^\ast)$, so its trace norm satisfies
\[\begin{align*}
\norm{J(\Phi)}_1&=\norm{(\Phi\otimes\I_{\Lin(\X)})(\vec(\I_\X)\vec(\I_\X)^\ast)}_1 \\
&\leq \norm{(\Phi\otimes\I_{\Lin(\X)})}_1\norm{\vec(\I_\X)\vec(\I_\X)^\ast}_1 \\
&=\norm{(\Phi\otimes\I_{\Lin(\X)})}_1n \\
&=n \triplenorm{\Phi}_1,
\end{align*}\]
where the last two lines follow from $\norm{\vec(\I_\X)\vec(\I_\X)^\ast}_1=n$ and from the definition of the completely bounded trace norm, $\triplenorm{\Phi}_1 :=\norm{\Phi\otimes\I_{\Lin(\X)}}_1$.

Now consider an alternate characterization of the completely bounded trace norm:
\[
\triplenorm{\Phi}_1=\max\{\norm{(\I_{\Lin(\Y)}\otimes\sqrt{\rho_0})J(\Phi)(\I_{\Lin(\Y)}\otimes\sqrt{\rho_1})}_1 : \rho_0,\rho_1\in\Density(\X)\}.
\]
Since the trace norm satisfies $\norm{ABC}_1\leq\norm{A}_\infty\norm{B}_1\norm{C}_\infty$ for all operators $A$, $B$, and $C$, it holds that
\[
\norm{(\I_{\Lin(\Y)}\otimes\sqrt{\rho_0})J(\Phi)(\I_{\Lin(\Y)}\otimes\sqrt{\rho_1})}_1\leq\norm{\I_{\Lin(\Y)}\otimes\sqrt{\rho_0}}_\infty\norm{J(\Phi)}_1\norm{\I_{\Lin(\Y)}\otimes\sqrt{\rho_1}}_\infty.
\]
The spectral norm $\norm{A}_\infty$ of an operator $A$ is given by the largest singular value of $A$, so $\norm{\I_{\Lin(\Y)}\otimes\sqrt{\rho_a}}_\infty\leq1$ for $a\in\{0,1\}$, with equality holding in the case where $\rho_a$ is a pure state. Therefore,
\[
\norm{(\I_{\Lin(\Y)}\otimes\sqrt{\rho_0})J(\Phi)(\I_{\Lin(\Y)}\otimes\sqrt{\rho_1})}_1\leq\norm{J(\Phi)}_1
\]
for every choice of $\rho_0,\rho_1\in\Density(\X)$, implying that
\[
\triplenorm{\Phi}_1=\max\{\norm{(\I_{\Lin(\Y)}\otimes\sqrt{\rho_0})J(\Phi)(\I_{\Lin(\Y)}\otimes\sqrt{\rho_1})}_1 : \rho_0,\rho_1\in\Density(\X)\}\leq \norm{J(\Phi)}_1.
\]
Putting these two bounds together then gives
\[
\triplenorm{\Phi}_1 \leq \norm{J(\Phi)}_1 \leq n
\triplenorm{\Phi}_1.
\]
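
Both inequalities can be attained, as two quick examples (added here as illustrations, not part of the original argument) suggest. For the identity channel $\I_{\Lin(\X)}$, the Choi representation is $\vec(\I_\X)\vec(\I_\X)^\ast$, so
\[
\norm{J(\I_{\Lin(\X)})}_1 = n = n\,\triplenorm{\I_{\Lin(\X)}}_1,
\]
which saturates the upper bound. For the lower bound, fix a density operator $\sigma\in\Density(\Y)$ and a standard basis vector $e_1\in\X$, write $E_{1,1}=e_1e_1^\ast$, and consider the map $\Phi(X)=\ip{E_{1,1}}{X}\sigma$, which is not a channel but is an element of $\Trans(\X,\Y)$. Its Choi representation is $J(\Phi)=\sigma\otimes E_{1,1}$, and
\[
\triplenorm{\Phi}_1 = 1 = \norm{\sigma\otimes E_{1,1}}_1 = \norm{J(\Phi)}_1.
\]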

Separable channels decrease the entanglement of formation

2014-01-31T21:08:00.000-08:00

The entanglement of formation of a density operator $\rho\in\Density(\X^{A}\otimes\X^{B})$ is defined as
\[
E_{f}(\rho) = \inf\Biggl\{\sum_{a\in\Sigma} p(a) E(u_a u_a^{\ast})
\,:\, \rho = \sum_{a\in\Sigma} p(a) u_a u_a^{\ast} \Biggr\},
\]
where $E(u u^{\ast}) = S(\tr_{\X^{B}}(u u^{\ast}))$ denotes the entanglement entropy of the pure state $u u^{\ast}$ and the infimum is over all expressions of $\rho$ of the given form, where $\Sigma$ is any alphabet, $p\in\P(\Sigma)$ is a probability vector, and $\{u_a\,:\,a\in\Sigma\} \subset \X^{A}\otimes\X^{B}$ is a collection of unit vectors.

Theorem:

For every choice of complex Euclidean spaces $\X^{A}$, $\X^{B}$, $\Y^{A}$, and $\Y^{B}$, every density operator $\rho\in\Density(\X^{A}\otimes\X^{B})$, and every separable channel $\Phi\in\SepC(\X^{A},\Y^{A}: \X^{B},\Y^{B})$, it holds that
\[
E_{f}(\Phi(\rho)) \leq E_{f}(\rho).
\]

Proof:

Assuming that $\Phi\in\SepC(\X^{A},\Y^{A}: \X^{B},\Y^{B})$ allows $\Phi$ to be expressed as
\[
\Phi(X)=\sum_{b\in\Gamma}(A_b\otimes B_b)X(A_b^\ast\otimes B_b^\ast),
\]
where $\Gamma$ is some alphabet and $\{A_b : b\in \Gamma\}\subset \Lin(\X^A,\Y^A)$ and $\{B_b : b\in \Gamma\}\subset \Lin(\X^B,\Y^B)$. For $\rho = \sum_{a\in\Sigma} p(a) u_a u_a^{\ast}$, the action of $\Phi$ on $\rho$ is determined by linearity from its action on each $u_au_a^\ast$:
\[
\Phi(\rho)=\sum_{a\in\Sigma} p(a) \Phi(u_a u_a^{\ast})=\sum_{a\in\Sigma} p(a)\sum_{b\in\Gamma}(A_b\otimes B_b)u_au_a^*(A_b^\ast\otimes B_b^\ast).
\]
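Therefore, represent $\Phi(u_au_a^\ast)$ as
\[
\Phi(u_au_a^\ast)=\sum_{b\in\Gamma}(A_b\otimes B_b)u_au_a^\ast(A_b^\ast\otimes B_b^\ast)=\sum_{b\in\Gamma}q_a(b) v_{ab} v_{ab}^{\ast},
\]
where $q_a(b)=\norm{(A_b\otimes B_b)u_a}^2$ and $v_{ab}$ is the unit vector satisfying $(A_b\otimes B_b)u_a=\sqrt{q_a(b)}v_{ab}$; terms with $q_a(b)=0$ may be dropped. Since $\Phi$ preserves trace, $q_a$ is a probability vector for each $a\in\Sigma$. Now let
\[
C_b=\frac{1}{\sqrt{q_a(b)}}(A_b\otimes B_b),
\]
so that $C_bu_au_a^\ast C_b^\ast=v_{ab}v_{ab}^\ast$.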

Consider the channel $\Psi_{ab}\in\SepC(\X^{A},\Y^{A}: \X^{B},\Y^{B})$ defined by
\[
\Psi_{ab}(X)=C_bXC_b^\ast+(\tr(X)-\tr(C_bXC_b^\ast))\sigma,
\]
for some arbitrary $\sigma\in\Density(\Y^A\otimes\Y^B)$. Then $\Psi_{ab}$ is indeed a channel since it is completely positive because it is defined in terms of $C_b$ and $\Phi$ is assumed to be completely positive. Likewise, $\Psi_{ab}$ is separable since $\Phi$ is separable. Moreover, $\Psi_{ab}$ is trace preserving since
\[\begin{align*}
\tr(\Psi_{ab}(X))&=\tr(C_bXC_b^\ast)+(\tr(X)-\tr(C_bXC_b^\ast))\tr(\sigma) \\
&=\tr(C_bXC_b^\ast)+\tr(X)-\tr(C_bXC_b^\ast) \\
&=\tr(X).
\end{align*}\]
By construction $\Psi_{ab}(u_au_a^\ast)=v_{ab}v_{ab}^\ast$.
Therefore, by a corollary (6.36) to Nielsen's theorem it follows that for every $a\in\Sigma$, with $\rho_a^A=\tr_{\X^B}(u_au_a^\ast)$, $\sigma_a^A=\tr_{\Y^B}(v_{ab}v_{ab}^\ast)$, and $r=\min\{\rank(\rho_a^A),\rank(\sigma^A_a)\}$, it holds that
\[
\lambda_1(\rho_a^A)+\dots+\lambda_m(\rho_a^A)\leq \lambda_1(\sigma_a^A)+\dots+\lambda_m(\sigma_a^A)
\]
for every $m\in\{1,\dots, r\}$.

Thus, by the Schur concavity of the entropy, the von Neumann entropy satisfies $S(\sigma_a^A)\leq S(\rho_a^A)$, which implies that the entanglement entropy also satisfies $E(v_{ab}v_{ab}^\ast)\leq E(u_au_a^\ast)$ for all $a\in\Sigma$ and $b\in\Gamma$. Averaging over $b$ with the weights $q_a(b)$, which sum to $1$ for each $a$, and then over $a$ with the weights $p(a)$ from the decomposition of the original state $\rho$, it follows that
\[\begin{align*}
\sum_{a\in\Sigma} p(a)\sum_{b\in\Gamma}q_a(b) E(v_{ab} v_{ab}^{\ast})
\leq \sum_{a\in\Sigma} p(a) E(u_a u_a^{\ast}).
\end{align*}\]

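The left-hand side is the average entanglement entropy of the decomposition $\Phi(\rho)=\sum_{a\in\Sigma}\sum_{b\in\Gamma}p(a)q_a(b)v_{ab}v_{ab}^\ast$, so it is at least $E_f(\Phi(\rho))$. Taking the infimum of the right-hand side over all decompositions of $\rho$ therefore gives $E_{f}(\Phi(\rho)) \leq E_{f}(\rho)$.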

The SWAP operator and separable measurements

2014-01-31T20:56:00.000-08:00

Let $\Sigma$ be an alphabet, let $n = \abs{\Sigma}$, and assume $n\geq 2$. Also let $\X^{A} = \mathbb{C}^{\Sigma}$ and $\X^{B} = \mathbb{C}^{\Sigma}$, and recall that the swap operator $W\in\Lin(\X^{A}\otimes\X^{B})$ may be defined as
\[
W = \sum_{a,b\in\Sigma} E_{a,b} \otimes E_{b,a}.
\]
Define $\Pi_0,\,\Pi_1\in\Proj(\X^{A}\otimes\X^{B})$ and $\sigma_0,\sigma_1\in\Density(\X^{A}\otimes\X^{B})$ as follows:
\[
\Pi_0 = \frac{1}{2} \I\otimes\I + \frac{1}{2} W,\qquad
\Pi_1 = \frac{1}{2} \I\otimes\I - \frac{1}{2} W,\qquad
\sigma_0 = \frac{1}{\binom{n+1}{2}}\Pi_0,\qquad
\sigma_1 = \frac{1}{\binom{n}{2}}\Pi_1.
\]

Theorem:

If $\mu:\{0,1\}\rightarrow\Pos(\X^{A}\otimes\X^{B})$ is a separable measurement, then
\[
\frac{1}{2} \ip{\mu(0)}{\sigma_0}
+ \frac{1}{2} \ip{\mu(1)}{\sigma_1}
\leq \frac{1}{2} + \frac{1}{n+1}.
\]

Proof:

Assuming that $\mu$ is a separable measurement allows $\mu(0)$ to be expressed as
\[
\mu(0)=\sum_{a\in\Gamma}P_a\otimes Q_a,
\]
where $\{P_a : a\in \Gamma\}\subset \Pos(\X^A)$ and $\{Q_a : a\in \Gamma\}\subset \Pos(\X^B)$. Moreover, since $\mu$ is a measurement it must satisfy the completeness condition that $\mu(0)+\mu(1)=\I\otimes\I$ implying that $\mu(1)$ can be expressed in terms of $\mu(0)$ as
\[
\mu(1)=\I\otimes\I-\mu(0)=\I\otimes\I-\sum_{a\in\Gamma}P_a\otimes Q_a.
\]
Write $\sigma_0$ and $\sigma_1$ more explicitly as
\[\begin{align*}
\sigma_0 &= \frac{1}{\binom{n+1}{2}}\Pi_0=\frac{1}{(n+1)n}(\I\otimes\I + W), \\
\sigma_1 &= \frac{1}{\binom{n}{2}}\Pi_1=\frac{1}{(n-1)n}(\I\otimes\I - W).
\end{align*}\]
Then,
\[\begin{align*}
\ip{\mu(0)}{\sigma_0}
+ \ip{\mu(1)}{\sigma_1}
&= \ip{\mu(0)}{\sigma_0}
+ \ip{\I\otimes\I-\mu(0)}{\sigma_1} \\
&=\frac{1}{(n+1)n}\ip{\mu(0)}{\I\otimes\I + W}
+ \frac{1}{(n-1)n}\ip{\I\otimes\I-\mu(0)}{\I\otimes\I - W} \\
&=\frac{1}{(n+1)n}(\ip{\mu(0)}{\I\otimes\I }+\ip{\mu(0)}{ W}) \\
& \ \ \ + \frac{1}{(n-1)n}(\ip{\I\otimes\I}{\I\otimes\I} -\ip{\I\otimes\I}{W} ) \\
& \ \ \ + \frac{1}{(n-1)n}(\ip{\mu(0)}{W}-\ip{\mu(0)}{\I\otimes\I} ).
\end{align*}\]
Now observe that
\[
\ip{\mu(0)}{\I\otimes\I }=\tr\left(\mu(0)^\ast\I\otimes\I\right)=\tr(\sum_{a\in\Gamma}(P_a\otimes Q_a)(\I\otimes\I))=\tr(\sum_{a\in\Gamma}P_a\otimes Q_a)=\sum_{a\in\Gamma}\tr(P_a)\tr(Q_a),
\]
\[
\ip{\I\otimes\I}{\I\otimes\I}=\tr((\I\otimes\I)(\I\otimes\I))=\tr(\I\otimes\I)=n^2,
\]
\[\begin{align*}
\ip{\mu(0)}{ W}=\tr(\sum_{a\in\Gamma}(P_a\otimes Q_a)W)&=\sum_{a\in\Gamma}\sum_{i,j\in\Sigma}(e_i^\ast\otimes e_j^\ast)(P_a\otimes Q_a)W(e_i\otimes e_j) \\
&=\sum_{a\in\Gamma}\sum_{i,j\in\Sigma}(e_i^\ast\otimes e_j^\ast)(P_a\otimes Q_a)(e_j\otimes e_i) \\
&=\sum_{a\in\Gamma}\sum_{i,j\in\Sigma}(e_i^\ast P_a e_j)\otimes(e_j^\ast Q_a e_i) \\
&=\sum_{a\in\Gamma}\sum_{i,j\in\Sigma}e_i^\ast P_a e_je_j^\ast Q_a e_i \\
&=\sum_{a\in\Gamma}\sum_{i\in\Sigma}e_i^\ast P_a Q_a e_i \\
&=\sum_{a\in\Gamma}\tr(P_a Q_a),
\end{align*}\]
and by similar arguments used in the previous calculation $\ip{\I\otimes\I}{W}=\tr(\I\I)=\tr(\I)=n$.
Therefore, the original expression of interest can be simplified as
\[\begin{align*}
\ip{\mu(0)}{\sigma_0}
+ \ip{\mu(1)}{\sigma_1}&=\frac{1}{(n+1)n}(\sum_{a\in\Gamma}\tr(P_a)\tr(Q_a)+\sum_{a\in\Gamma}\tr(P_a Q_a)) \\
& \ \ \ + \frac{1}{(n-1)n}(n^2 -n ) \\
& \ \ \ + \frac{1}{(n-1)n}(\sum_{a\in\Gamma}\tr(P_a Q_a)-\sum_{a\in\Gamma}\tr(P_a)\tr(Q_a) ) \\
&=1+\frac{2}{(n+1)(n-1)n}\left(n\sum_{a\in\Gamma}\tr(P_a Q_a)-\sum_{a\in\Gamma}\tr(P_a)\tr(Q_a)\right) \\
&\leq 1+\frac{2}{(n+1)(n-1)n}(n^2-n) \\
&=1+\frac{2}{(n+1)}.
\end{align*}\]
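Here, the inequality follows from two observations. First, $\tr(P_aQ_a)\leq\tr(P_a)\tr(Q_a)$ for positive semidefinite $P_a$ and $Q_a$, so that $n\sum_{a\in\Gamma}\tr(P_a Q_a)-\sum_{a\in\Gamma}\tr(P_a)\tr(Q_a)\leq(n-1)\sum_{a\in\Gamma}\tr(P_a Q_a)$. Second, $\mu(1)$ is also separable, so the same calculation as for $\mu(0)$ shows that $\ip{\mu(1)}{W}\geq 0$, and therefore $\sum_{a\in\Gamma}\tr(P_a Q_a)=\ip{\mu(0)}{W}=\ip{\I\otimes\I}{W}-\ip{\mu(1)}{W}\leq n$.
Together these give the bound $(n-1)n=n^2-n$. Equality holds, for example, when $\mu(0)=\sum_{a\in\Sigma}E_{a,a}\otimes E_{a,a}$.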

Therefore, dividing both sides of the inequality by $2$ gives
\[
\frac{1}{2} \ip{\mu(0)}{\sigma_0}
+ \frac{1}{2} \ip{\mu(1)}{\sigma_1}
\leq \frac{1}{2} + \frac{1}{n+1}.
\]
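
As a sanity check, the identity $\ip{P\otimes Q}{W}=\tr(PQ)$ and the tightness of the bound can be verified numerically. This is a minimal sketch assuming NumPy; the dimension $n=3$, the random operators, and the particular measurement (answer $0$ exactly when both registers hold the same standard basis state) are illustrative choices rather than anything prescribed by the problem.

```python
import numpy as np

n = 3
Id = np.eye(n * n)

def E(a, b):
    """Matrix unit E_{a,b} on C^n."""
    M = np.zeros((n, n))
    M[a, b] = 1.0
    return M

# Swap operator W = sum_{a,b} E_{a,b} (x) E_{b,a}.
W = sum(np.kron(E(a, b), E(b, a)) for a in range(n) for b in range(n))

sigma0 = (Id + W) / (n * (n + 1))
sigma1 = (Id - W) / (n * (n - 1))

# Identity from the proof: tr((P (x) Q) W) = tr(P Q), here for random PSD P, Q.
rng = np.random.default_rng(0)
A, B = rng.normal(size=(n, n)), rng.normal(size=(n, n))
P, Q = A @ A.T, B @ B.T
lhs = np.trace(np.kron(P, Q) @ W)
rhs = np.trace(P @ Q)

# A separable measurement: mu(0) = sum_a E_{a,a} (x) E_{a,a}, mu(1) = I - mu(0).
mu0 = sum(np.kron(E(a, a), E(a, a)) for a in range(n))
mu1 = Id - mu0
success = 0.5 * np.trace(mu0 @ sigma0) + 0.5 * np.trace(mu1 @ sigma1)

print(lhs - rhs, success, 0.5 + 1 / (n + 1))
```

The first printed value should vanish up to rounding, and the last two should agree, indicating that this measurement attains the bound of the theorem.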

Singular values before and after the action of a unital channel

2014-01-25T00:11:00.000-08:00

Let $\X$ be a complex Euclidean space having dimension $n$, let $\Phi\in\Channel(\X)$ be a unital channel, let $X\in\Lin(\X)$ be an operator, and let $Y = \Phi(X)$. Following our usual conventions, let $s_1(X) \geq \cdots \geq s_n(X)$ and $s_1(Y) \geq \cdots \geq s_n(Y)$ denote the singular values of $X$ and $Y$, respectively, ordered from largest to smallest, and where we take $s_k(X) = 0$ when $k > \rank(X)$ and $s_k(Y) = 0$ when $k > \rank(Y)$.

Theorem:

$s_1(X) + \cdots + s_m(X) \geq s_1(Y) + \cdots + s_m(Y)$ for every $m \in \{1,\ldots,n\}$.

Proof:

Consider the space $\X\oplus\X$ and let
\[
\overline{X}:=
\begin{pmatrix}
      0 & X\\
      X^{\ast} & 0
    \end{pmatrix}.
\]
Then it holds that $\overline{X}=\overline{X}^\ast$ so that $\overline{X}\in\Herm(\X\oplus\X)$. In addition, consider the channel $\overline{\Phi}\in\Channel(\X\oplus\X)$ defined as
\[
\overline{\Phi} \begin{pmatrix}
      A & B\\
      C & D
    \end{pmatrix}=
    \begin{pmatrix}
      \Phi(A) & \Phi(B)\\
      \Phi(C) & \Phi(D)
    \end{pmatrix}.
\]
Then it follows that
\[
\overline{\Phi}(\I_{\X\oplus\X}) =    \begin{pmatrix}
      \Phi(\I_{\X}) & 0\\
      0 & \Phi(\I_{\X})
    \end{pmatrix}=
        \begin{pmatrix}
      \I_{\X} & 0\\
      0 & \I_{\X}
    \end{pmatrix}=
    \I_{\X\oplus\X},
\]
which implies that $\overline{\Phi}$ is unital. Moreover, letting $\Phi(X)=Y$ with $\Phi(X^\ast)=Y^\ast$ yields
\[
\overline{\Phi}(\overline{X})=
\begin{pmatrix}
      0&\Phi(X)\\
      \Phi(X^\ast) & 0
    \end{pmatrix}=
        \begin{pmatrix}
      0 & Y\\
      Y^\ast & 0
    \end{pmatrix}=:\overline{Y}.
\]
It has now been shown that there exists a unital channel $\overline{\Phi}\in \Channel(\X\oplus\X)$ such that $\overline{\Phi}(\overline{X})=\overline{Y}$, where $\overline{X},\overline{Y}\in\Herm(\X\oplus\X)$. By Uhlmann's theorem, this is equivalent to the statement that $\lambda(\overline{Y}) \prec \lambda(\overline{X})$, where $\lambda(\overline{Y})$ and $\lambda(\overline{X})$ are the vectors of eigenvalues of $\overline{Y}$ and $\overline{X}$, respectively.

In order to determine the eigenvalues of $\overline{Y}$ and $\overline{X}$, consider the singular value decompositions of $Y$ and $X$:
\[
X=\sum_{k=1}^{r_X}s_k(X)x'_kx_k^\ast \ \ \ \text{and} \ \ \ Y=\sum_{k=1}^{r_Y}s_k(Y)y'_ky_k^\ast,
\]
where $r_X=\rank(X)$, $r_Y=\rank(Y)$, $s(X)=(s_1(X),\dots,s_{r_X}(X))$ and $s(Y)=(s_1(Y),\dots,s_{r_Y}(Y))$ are the vectors of the non-zero singular values of $X$ and $Y$ (assumed to be written in decreasing order as the index increases), and
\[
\{x_1,\dots,x_{r_X}\}, \{x'_1,\dots,x'_{r_X}\}\subseteq\X \ \ \ \text{and} \ \ \ \{y_1,\dots, y_{r_Y}\}, \{y'_1,\dots, y'_{r_Y}\}\subseteq\X
\]
are orthonormal sets of vectors in their respective spaces.

The block operator $\overline{X}$ can be diagonalized using these vectors: since $Xx_k=s_k(X)x'_k$ and $X^\ast x'_k=s_k(X)x_k$,
\[
\overline{X}
\begin{pmatrix}
      x'_k\\
      \pm x_k
    \end{pmatrix}=
\begin{pmatrix}
      \pm Xx_k\\
      X^{\ast}x'_k
    \end{pmatrix}=
\pm s_k(X)\begin{pmatrix}
      x'_k\\
      \pm x_k
    \end{pmatrix}
\]
for each $k\in\{1,\dots,r_X\}$. These $2r_X$ eigenvectors are mutually orthogonal, and $\overline{X}$ has rank $2r_X$, so the remaining $2n-2r_X$ eigenvalues are zero. The eigenvalues of $\overline{X}$ are therefore given by
\[
\lambda(\overline{X})=(s_1(X),\dots,s_{r_X}(X),0,\dots,0,-s_{r_X}(X),\dots,-s_1(X)),
\]
where they have been arranged in decreasing order. An equivalent argument shows that the eigenvalues of $\overline{Y}$ are similarly given by
\[
\lambda(\overline{Y})=(s_1(Y),\dots,s_{r_Y}(Y),0,\dots,0,-s_{r_Y}(Y),\dots,-s_1(Y)).
\]

Since $r_X,r_Y\leq n$, for every $m\in\{1,\dots,n\}$ the $m$ largest eigenvalues of $\overline{X}$ are $s_1(X),\dots,s_m(X)$ and the $m$ largest eigenvalues of $\overline{Y}$ are $s_1(Y),\dots,s_m(Y)$, where $s_{j}(X)$ and $s_{k}(Y)$ for $j> r_X$ and $k> r_Y$ are zero by the convention described in the problem statement. The majorization relation $\lambda(\overline{Y}) \prec \lambda(\overline{X})$ states that the sum of the $m$ largest eigenvalues of $\overline{Y}$ is at most the sum of the $m$ largest eigenvalues of $\overline{X}$, for every $m$.

Then it follows that for all $m\in\{1,\dots, n\}$,
\[
s_1(X)+\dots+s_{m}(X)\geq s_1(Y)+\dots+s_{m}(Y).
\]
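
Both ingredients of the argument can be checked numerically. The sketch below assumes NumPy; the random operator $X$ and the mixed-unitary channel (a convex combination of unitary conjugations, which is one convenient example of a unital channel) are illustrative assumptions, not the general case treated in the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Block operator [[0, X], [X*, 0]]: its eigenvalues are +/- the singular values of X.
Xbar = np.block([[np.zeros((n, n)), X], [X.conj().T, np.zeros((n, n))]])
eigs = np.sort(np.linalg.eigvalsh(Xbar))[::-1]      # decreasing order
s = np.linalg.svd(X, compute_uv=False)              # decreasing order
expected = np.concatenate([s, -s[::-1]])

def random_unitary(d):
    """Unitary factor of the QR decomposition of a random complex matrix."""
    Z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Qm, _ = np.linalg.qr(Z)
    return Qm

# A unital channel: Y = sum_k p_k U_k X U_k^*.
p = [0.5, 0.3, 0.2]
U = [random_unitary(n) for _ in p]
Y = sum(pk * Uk @ X @ Uk.conj().T for pk, Uk in zip(p, U))
sY = np.linalg.svd(Y, compute_uv=False)

# Partial sums s_1 + ... + s_m for m = 1, ..., n.
partial_X, partial_Y = np.cumsum(s), np.cumsum(sY)
print(np.allclose(eigs, expected), np.all(partial_X >= partial_Y - 1e-9))
```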

When the Choi representation of a channel is separable

2014-01-25T00:01:00.000-08:00

Let $\X$ and $\Y$ be complex Euclidean spaces, and let $\Phi\in\Channel(\X,\Y)$ be a channel. A positive operator $P\in \Pos(\X\otimes\Y)$ is separable if and only if there exist a positive integer $m$ and positive semidefinite operators
\[
Q_1,Q_2, \dots, Q_m\in\Pos(\X) \ \ \text{and} \ \ R_1,R_2, \dots, R_m\in\Pos(\Y)
\]
such that
\[
P=\sum_{j=1}^{m}Q_j\otimes R_j.
\]
Denote by $\Sep(\X : \Y)$ the collection of all such separable operators.

Theorem:

The following two properties are equivalent:

1. For every complex Euclidean space $\Z$ and every density operator $\rho\in\Density(\X\otimes\Z)$, it holds that $\bigl(\Phi\otimes \I_{\Lin(\Z)}\bigr)(\rho) \in \Sep(\Y:\Z)$.
2. $J(\Phi) \in \Sep(\Y:\X)$.

Proof:

Recall that the Choi representation $J(\Phi)$ can be expressed as
\[
J(\Phi)=(\Phi\otimes\I_{\Lin(\X)})(\vec(\I_{\X})\vec(\I_{\X})^\ast).
\]
Now first assume that property 1 holds, and let $\Z=\X=\mathbb{C}^\Sigma$ and consider the density operator $\rho\in\Density(\X\otimes\X)$ given by
\[
\rho=\frac{1}{|\Sigma|}(\vec(\I_\X)\vec(\I_\X)^\ast).
\]
Since multiplying by the positive scalar $|\Sigma|$ preserves separability, the assumption of property 1 then gives
\[
(\Phi\otimes\I_{\Lin(\X)})(\vec(\I_\X)\vec(\I_\X)^\ast)\in\Sep(\Y:\X),
\]
implying that $J(\Phi)\in\Sep(\Y : \X)$, which is the claim of property 2.

Conversely, assume that property 2 holds, so that $J(\Phi) \in \Sep(\Y:\X)$. By the Woronowicz-Horodecki criterion this statement is equivalent to one where for every complex Euclidean space $\W$ and every positive map $\Xi\in\Trans(\Y,\W)$
\[
(\Xi\otimes\I_{\Lin(\X)})(J(\Phi))\in\Pos(\W\otimes\X).
\]

Substituting the expression recalled above for $J(\Phi)$ then gives
\[\begin{align*}
(\Xi\otimes\I_{\Lin(\X)})(J(\Phi))&=(\Xi\otimes\I_{\Lin(\X)})(\Phi\otimes\I_{\Lin(\X)})(\vec(\I_{\X})\vec(\I_{\X})^\ast) \\
&=(\Xi\Phi\otimes\I_{\Lin(\X)})(\vec(\I_{\X})\vec(\I_{\X})^\ast)\\
&=J(\Xi\Phi),
\end{align*}\]
where $\Xi\Phi$ denotes the composition of $\Xi$ with $\Phi$.

Hence, $J(\Xi\Phi)\in\Pos(\W\otimes\X)$, which holds if and only if $\Xi\Phi$ is completely positive. Now consider any complex Euclidean space $\Z$ and any $\rho\in\Density(\X\otimes\Z)$. Complete positivity of $\Xi\Phi$ gives
\[
(\Xi\otimes\I_{\Lin(\Z)})\bigl((\Phi\otimes\I_{\Lin(\Z)})(\rho)\bigr)=(\Xi\Phi\otimes\I_{\Lin(\Z)})(\rho)\in\Pos(\W\otimes\Z)
\]
for every positive map $\Xi\in\Trans(\Y,\W)$, which implies that $(\Phi\otimes\I_{\Lin(\Z)})(\rho)\in\Sep(\Y:\Z)$ by the Woronowicz-Horodecki criterion. This is the claim of property 1.

Some more facts concerning the von Neumann entropy

2014-01-24T23:25:00.000-08:00

Let $\reg{X}$, $\reg{Y}$, and $\reg{Z}$ be registers, assume that the classical state set of $\reg{X}$ is $\Sigma$, and let $n = \abs{\Sigma}$.

Theorem:
For every state $\rho\in\Density(\X\otimes\Y\otimes\Z)$ of $(\reg{X},\reg{Y},\reg{Z})$ it holds that
\[
      S(\reg{X},\reg{Y} : \reg{Z})
      \leq S(\reg{Y}:\reg{X},\reg{Z}) + 2\log(n).
\]


Proof:

    From the result proved in a previous post, it holds that for every choice of registers $\reg{X}$ and $\reg{Z}$, and for any state of $\Density(\X\otimes\Z)$, $S(\reg{Z})\leq S(\reg{X})+S(\reg{X}, \reg{Z})$, or equivalently that
\[
     0\leq S(\reg{X})+S(\reg{X}, \reg{Z})-S(\reg{Z}).
\]
Also, by sub-additivity $S(\reg{X},\reg{Y})\leq S(\reg{X})+S(\reg{Y})$, or equivalently
\[
0\leq S(\reg{X})+S(\reg{Y})-S(\reg{X},\reg{Y}).
\]
Then by adding these two inequalities, it must also hold that
\[
0\leq S(\reg{X})+S(\reg{X}, \reg{Z})-S(\reg{Z})+S(\reg{X})+S(\reg{Y})-S(\reg{X},\reg{Y}),
\]
and since in general $S(\reg{X})\leq \log(n)$ or $2S(\reg{X})\leq 2\log(n)$,
\[
   0\leq S(\reg{X}, \reg{Z})-S(\reg{Z})+S(\reg{Y})-S(\reg{X},\reg{Y})+2\log(n).
\]
Therefore,
\[
S(\reg{Z})+S(\reg{X},\reg{Y})\leq S(\reg{X}, \reg{Z})+S(\reg{Y})+2\log(n).
\]
Adding $-S(\reg{X},\reg{Y}, \reg{Z})$ to both sides of this inequality yields
\[
S(\reg{Z})+S(\reg{X},\reg{Y})-S(\reg{X},\reg{Y}, \reg{Z})\leq S(\reg{X}, \reg{Z})+S(\reg{Y})-S(\reg{X},\reg{Y}, \reg{Z})+2\log(n),
\]
or equivalently
\[
S(\reg{X},\reg{Y} : \reg{Z})\leq S(\reg{Y}:\reg{X},\reg{Z}) + 2\log(n).
\]

Here is an example, for $\Sigma = \{0,1\}$, of a state $\rho$ for which this inequality becomes an equality.

Consider the three qubit pure state
\[
\left|\psi\right>_{\reg{X},\reg{Y},\reg{Z}}=\frac{1}{\sqrt{2}}(\left|0\right>_{\reg{X}}\left|0\right>_{\reg{Y}}\left|0\right>_{\reg{Z}}+\left|1\right>_{\reg{X}}\left|0\right>_{\reg{Y}}\left|1\right>_{\reg{Z}}).
\]
Then the states of the following subsystems are also pure:
\[\begin{align*}
\left|\psi\right>_{\reg{Y}}&=\left|0\right>_{\reg{Y}}\\
\left|\psi\right>_{\reg{X},\reg{Z}}&=\frac{1}{\sqrt{2}}(\left|0\right>_{\reg{X}}\left|0\right>_{\reg{Z}}+\left|1\right>_{\reg{X}}\left|1\right>_{\reg{Z}}).
\end{align*}\]

However, the following subsystems are in the maximally mixed state:
\[\begin{align*}
\rho_{\reg{X}}&=\frac{1}{2}(\left|0\right>\left<0\right|+\left|1\right>\left<1\right|) \\
\rho_{\reg{Z}}&=\frac{1}{2}(\left|0\right>\left<0\right|+\left|1\right>\left<1\right|).
\end{align*}\]
Moreover, the state of the subsystem $\reg{X},\reg{Y}$ is in the tensor product state
\[\begin{align*}
\rho_{\reg{X},\reg{Y}}&=\rho_{\reg{X}}\otimes \rho_{\reg{Y}}\\
&=\frac{1}{2}\bigl(\left|0\right>\left<0\right|+\left|1\right>\left<1\right|\bigr)\otimes \left|0\right>\left<0\right|
\end{align*}\]

The entropy of a pure state is zero and the entropy of a maximally mixed state in this case is $\log(n)=\log(2)$. Then the entropies of the states listed above are
\[\begin{align*}
S(\reg{Y})=S(\reg{X},\reg{Z})&=0 \\
S(\reg{X})=S(\reg{Z})&=\log(2) \\
S(\reg{X},\reg{Y})=S(\reg{X})+S(\reg{Y})&=\log(2).
\end{align*} \]

Therefore,
\[\begin{align*}
S(\reg{X},\reg{Y} : \reg{Z}) - S(\reg{Y}:\reg{X},\reg{Z})&=S(\reg{X},\reg{Y})-S(\reg{Y})-S(\reg{X},\reg{Z})+S(\reg{Z})+\left(S(\reg{X},\reg{Y},\reg{Z})-S(\reg{X},\reg{Y},\reg{Z})\right) \\
&=S(\reg{X},\reg{Y})-S(\reg{Y})-S(\reg{X},\reg{Z})+S(\reg{Z}) \\
&=S(\reg{X})+S(\reg{Y})-S(\reg{Y})-S(\reg{X},\reg{Z})+S(\reg{Z}) \\
&=S(\reg{X})-S(\reg{X},\reg{Z})+S(\reg{Z}) \\
&=\log(2)-0+\log(2) \\
&=2\log(2),
\end{align*}\]
which implies that
\[
S(\reg{X},\reg{Y} : \reg{Z}) = S(\reg{Y}:\reg{X},\reg{Z})+2\log(2).
\]
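
The entropies in this example can be recomputed numerically. The following is a small sketch assuming NumPy; the helper names `entropy` and `reduced` are hypothetical, and logarithms are taken base 2 (an assumption, since the post does not fix the base) so that $\log(2)=1$.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

def reduced(psi, keep):
    """Reduced state of a 3-qubit pure state on the qubits in `keep` (0=X, 1=Y, 2=Z)."""
    T = psi.reshape(2, 2, 2)
    drop = [q for q in range(3) if q not in keep]
    M = np.transpose(T, list(keep) + drop).reshape(2 ** len(keep), -1)
    return M @ M.conj().T

# |psi> = (|000> + |101>) / sqrt(2), with qubit order (X, Y, Z).
psi = np.zeros(8)
psi[0b000] = psi[0b101] = 1 / np.sqrt(2)

S_XYZ = entropy(reduced(psi, [0, 1, 2]))
I_XY_Z = entropy(reduced(psi, [0, 1])) + entropy(reduced(psi, [2])) - S_XYZ
I_Y_XZ = entropy(reduced(psi, [1])) + entropy(reduced(psi, [0, 2])) - S_XYZ

print(I_XY_Z, I_Y_XZ, I_XY_Z - I_Y_XZ)
```

With base-2 logarithms the difference of the two mutual informations should come out as $2\log(2)=2$, matching the equality case above.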

Theorem:

Let $p\in\P(\Sigma)$ be a probability vector, let $\{\sigma_a\,:\,a\in\Sigma\} \subset \Density(\Y\otimes\Z)$ be a collection of density operators, and let
\[
      \rho = \sum_{a\in \Sigma} p(a) E_{a,a}\otimes \sigma_a.
\]
In other words, $\rho$ is a state of $(\reg{X},\reg{Y},\reg{Z})$ in which we view $\reg{X}$ as a classical register. With respect to the state $\rho$, it holds that
\[
      S(\reg{X},\reg{Y} : \reg{Z})
      \leq S(\reg{Y}:\reg{X},\reg{Z}) + \log(n).
\]


Proof:

First, observe that
\[\begin{align*}
S(\reg{X}|\reg{Y})-S(\reg{X}|\reg{Z})&=S(\reg{X},\reg{Y})-S(\reg{Y})-S(\reg{X},\reg{Z})+S(\reg{Z})\\
&=S(\reg{X},\reg{Y})-S(\reg{Y})-S(\reg{X},\reg{Z})+S(\reg{Z})+\left(S(\reg{X},\reg{Y},\reg{Z})-S(\reg{X},\reg{Y},\reg{Z})\right) \\
&=S(\reg{X},\reg{Y} : \reg{Z}) - S(\reg{Y}:\reg{X},\reg{Z}).
\end{align*}\]

Now bound the quantities $S(\reg{X}|\reg{Y})$ and $S(\reg{X}|\reg{Z})$ individually in order to bound their difference. By sub-additivity, $S(\reg{X}|\reg{Y})=S(\reg{X},\reg{Y})-S(\reg{Y})\leq S(\reg{X})\leq \log(n)$. For a general state the conditional entropy $S(\reg{X}|\reg{Z})$ can be negative in the presence of entanglement, but here $\reg{X}$ is a classical register: the reduced state of $(\reg{X},\reg{Z})$ is $\sum_{a\in\Sigma}p(a)E_{a,a}\otimes\tr_{\Y}(\sigma_a)$, so that
\[
S(\reg{X},\reg{Z})=H(p)+\sum_{a\in\Sigma}p(a)S(\tr_{\Y}(\sigma_a))\geq S\Bigl(\sum_{a\in\Sigma}p(a)\tr_{\Y}(\sigma_a)\Bigr)=S(\reg{Z}),
\]
and hence $S(\reg{X}|\reg{Z})\geq 0$. Therefore $S(\reg{X}|\reg{Y})-S(\reg{X}|\reg{Z})\leq \log(n)$, or equivalently $S(\reg{X},\reg{Y} : \reg{Z})-S(\reg{Y}:\reg{X},\reg{Z}) \leq \log(n)$, implying that $S(\reg{X},\reg{Y} : \reg{Z}) \leq S(\reg{Y}:\reg{X},\reg{Z}) + \log(n)$.

Some facts concerning the von Neumann entropy and quantum mutual information

2014-01-24T23:06:00.000-08:00

Here we'll prove some facts concerning the von Neumann entropy and quantum mutual information.

Let $\X$ be an $n$-dimensional complex Euclidean space, and let $\rho\in\Density(\X)$ be a density operator. Recall that the von Neumann entropy of $\rho$ is defined as
\[
S(\rho):=-\tr(\rho \ \text{log}(\rho)),
\]
or equivalently as
\[
S(\rho):=H(\lambda(\rho)),
\]
where $\lambda(\rho)=(\lambda_1(\rho),\lambda_2(\rho),\dots,\lambda_n(\rho))$ is the vector of eigenvalues of $\rho$, and
\[
H(p):=\sum_{a\in\Sigma}-p(a)\log(p(a)),
\]
is the classical Shannon entropy of a vector $p\in\mathbb{R}^{\Sigma}$ over some alphabet $\Sigma$.

Theorem:

For every choice of complex Euclidean spaces $\X$ and $\Y$, and every vector $u \in \X\otimes\Y$, it holds that $S(\tr_{\X}(u u^{\ast})) = S(\tr_{\Y}(u u^{\ast}))$.

Proof:

The vector $u\in\X\otimes\Y$ can be expressed in its Schmidt decomposition, after making the identification $u=\vec(A)$, as
\[
u=\sum_{k=1}^{r}s_kx_k\otimes y_k,
\]
where $r=\rank(A)$, $s_1,\dots, s_r>0$ are the singular values of $A$, and $\{x_1,\dots,x_r\}\subset\X$ and $\{y_1,\dots, y_r\}\subset\Y$ are orthonormal sets. Then
\[
uu^\ast=\sum_{j,k=1}^{r}s_js_kx_jx_k^\ast\otimes y_jy_k^\ast,
\]
and therefore
\[
\tr_{\X}(uu^\ast)=\sum_{k=1}^{r}s_k^2y_ky_k^\ast \ \ \ \ \text{and} \ \ \ \tr_{\Y}(uu^\ast)=\sum_{k=1}^{r}s_k^2x_kx_k^\ast.
\]
Now let $\lambda=(s_1^2,\dots,s_r^2)$, and observe that $\lambda$ is the vector of non-zero eigenvalues of both $\tr_{\X}(uu^\ast)$ and $\tr_{\Y}(uu^\ast)$, since the expressions above are spectral decompositions of these operators.

Hence, (by definition) the von Neumann entropy of each is
\[
S(\tr_{\X}(u u^{\ast})) =H(\lambda) = S(\tr_{\Y}(u u^{\ast})).
\]
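
A quick numerical illustration of this fact, assuming NumPy: with $u=\vec(A)$ for a random $2\times 3$ matrix $A$ (an arbitrary choice of dimensions), the two reduced states are $AA^\ast$ and $A^{T}\overline{A}$, and their entropies should coincide. The helper name `entropy` is hypothetical.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy (natural log), ignoring numerically zero eigenvalues."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log(ev)).sum())

rng = np.random.default_rng(2)
dX, dY = 2, 3
A = rng.normal(size=(dX, dY)) + 1j * rng.normal(size=(dX, dY))
A /= np.linalg.norm(A)            # Frobenius norm 1, so u = vec(A) is a unit vector

rho_X = A @ A.conj().T            # tr_Y(u u^*), an operator on X
rho_Y = A.T @ A.conj()            # tr_X(u u^*), an operator on Y

print(entropy(rho_X), entropy(rho_Y))
```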

Theorem:

For every choice of registers $\reg{X}$ and $\reg{Y}$, and for every state $\rho\in\Density(\X\otimes\Y)$ of these registers, it holds that $S(\reg{X}) \leq S(\reg{Y}) + S(\reg{X},\reg{Y})$.

Proof:

Choose a complex Euclidean space $\Z$ such that $\dim(\Z)\geq\rank(\rho)$ so that there exists a purification $\rho'=uu^\ast\in \Density(\X\otimes\Y\otimes\Z)$ of $\rho$, and let $\rho'$ be the joint state of the registers $\reg{X},\reg{Y},\reg{Z}$. Since $\rho'$ is a pure state, $S(\reg{X},\reg{Y}, \reg{Z})=0$. Moreover, $\rho'[\reg{X},\reg{Z}]=\tr_{\Y}(\rho')$ and $\rho'[\reg{Y}]=\tr_{\X\otimes\Z}(\rho')$, and since $\rho'=uu^\ast$ is a pure state the previous theorem implies that $S(\tr_{\Y}(\rho'))=S(\tr_{\X\otimes\Z}(\rho'))$, or equivalently that $S(\reg{Y}) = S(\reg{X},\reg{Z})$.

By strong sub-additivity, for any possible state of the registers $\reg{X},\reg{Y}, \reg{Z}$,
\[
S(\reg{X},\reg{Y}, \reg{Z})+S(\reg{X})\leq S(\reg{X}, \reg{Z})+S(\reg{X}, \reg{Y}).
\]
However, by previous considerations we have that $S(\reg{X},\reg{Y}, \reg{Z})=0$ and $S(\reg{Y}) = S(\reg{X},\reg{Z})$, which after substituting implies that
\[
S(\reg{X})\leq S(\reg{Y})+S(\reg{X}, \reg{Y}).
\]

Theorem:

Let $\reg{X}$ and $\reg{Y}$ be registers, let $\Sigma$ be an alphabet, let $p\in\P(\Sigma)$ be a probability vector, and let $\{\sigma_a\,:\,a\in\Sigma\}\subset\Density(\X)$ and $\{\xi_a\,:\,a\in\Sigma\}\subset\Density(\Y)$ be arbitrary collections of density operators. For $(\reg{X},\reg{Y})$ being in the state
\[
      \rho = \sum_{a\in\Sigma} \, p(a) \sigma_a\otimes\xi_a,
\]
it holds that $S(\reg{X} : \reg{Y}) \leq H(p)$.

Proof:

In this case, the reduced states of the two registers are given by
\[
\rho[\reg{X}]=\tr_\Y(\rho)=\sum_{a\in\Sigma}p(a) \sigma_a \ \ \ \text{and} \ \ \ \rho[\reg{Y}]=\tr_\X(\rho)=\sum_{a\in\Sigma}p(a) \xi_a,
\]
so that
\[\begin{align*}
\rho[\reg{X}]\otimes\rho[\reg{Y}]&=\left(\sum_{a\in\Sigma}p(a) \sigma_a\right)\otimes\left(\sum_{b\in\Sigma}p(b) \xi_b\right) \\
&=\sum_{a\in\Sigma}\sum_{b\in\Sigma}p(a)p(b) \sigma_a\otimes \xi_b.
\end{align*}\]

Then the mutual information $S(\reg{X} : \reg{Y})$ can be expressed as
\[\begin{align*}
S(\reg{X} : \reg{Y})&=S(\rho||\rho[\reg{X}]\otimes\rho[\reg{Y}]) \\
&=S\left( \sum_{a\in\Sigma} \, p(a) \sigma_a\otimes\xi_a || \sum_{a\in\Sigma}\sum_{b\in\Sigma}p(a)p(b) \sigma_a\otimes \xi_b\right) \\
&\leq \sum_{a\in\Sigma}S\left( p(a) \sigma_a\otimes\xi_a || \sum_{b\in\Sigma}p(a)p(b) \sigma_a\otimes \xi_b\right) \\
&=\sum_{a\in\Sigma}S\left( p(a) \sigma_a\otimes\xi_a || \, p(a) \sigma_a\otimes \sum_{b\in\Sigma}p(b)\xi_b\right) \\
&=\sum_{a\in\Sigma}\left(\tr(\xi_a)S(p(a)\sigma_a || p(a)\sigma_a) + \tr(p(a)\sigma_a)S(\xi_a || \sum_{b\in\Sigma}p(b)\xi_b) \right),
\end{align*}\]
but since $S(p(a)\sigma_a || p(a)\sigma_a)=0$ and for $\sigma_a\in\Density(\X)$ it is always the case that $\tr(\sigma_a)=1$, it follows that
\[\begin{align*}
S(\reg{X} : \reg{Y})\leq &\sum_{a\in\Sigma} p(a)S\left(\xi_a || \sum_{b\in\Sigma}p(b)\xi_b\right) \\
=&\sum_{a\in\Sigma} p(a)S\left(\frac{p(a)}{p(a)}\xi_a || \sum_{b\in\Sigma}p(b)\xi_b\right) \\
=&\sum_{a\in\Sigma}p(a)\tr\left( \xi_a\log\left(\frac{p(a)}{p(a)}\xi_a\right) - \xi_a\log\left(\sum_{b\in\Sigma}p(b)\xi_b\right)\right) \\
=&\sum_{a\in\Sigma}p(a)\tr\left(-\xi_a\log(p(a))+ \xi_a\log\left(p(a)\xi_a\right) - \xi_a\log\left(\sum_{b\in\Sigma}p(b)\xi_b\right)\right) \\
=&\sum_{a\in\Sigma}\Bigl[p(a)\tr(-\xi_a\log(p(a)))+p(a)\tr\left( \xi_a\log\left(p(a)\xi_a\right) - \xi_a\log\left(\sum_{b\in\Sigma}p(b)\xi_b\right)\right)\Bigr] \\
=&\sum_{a\in\Sigma}\Bigl[-p(a)\log(p(a))+p(a)\tr\left( \xi_a\log\left(p(a)\xi_a\right) - \xi_a\log\left(\sum_{b\in\Sigma}p(b)\xi_b\right)\right)\Bigr] \\
=&H(p)+c.
\end{align*}\]
Here, the Shannon entropy is by definition
\[
H(p)=\sum_{a\in\Sigma}-p(a)\log(p(a)),
\]
and the value $c$ has been introduced for convenience to represent the remaining quantity
\[
c:=\sum_{a\in\Sigma}p(a)\tr\left( \xi_a\log\left(p(a)\xi_a\right) - \xi_a\log\left(\sum_{b\in\Sigma}p(b)\xi_b\right)\right).
\]

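In general, the logarithm is operator monotone: for positive definite operators $P\leq Q$ it is the case that $\log(P)\leq\log(Q)$ (with the usual convention on supports when the operators are not invertible). Since $p(a)\xi_a\leq\sum_{b\in\Sigma}p(b)\xi_b$ for every $a\in\Sigma$, this implies that
\[
\tr\left( \xi_a\log\left(p(a)\xi_a\right)\right) \leq \tr\left(\xi_a\log\left(\sum_{b\in\Sigma}p(b)\xi_b\right)\right)
\]
for every $a\in\Sigma$, so that $c\leq 0$.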

Hence $S(\reg{X} : \reg{Y})\leq H(p)+c\leq H(p)$.
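
The bound can be spot-checked on a random instance. This is a minimal sketch assuming NumPy, with an arbitrary probability vector, randomly generated density operators, and base-2 logarithms; the helper names `random_density` and `entropy` are hypothetical, and a single random instance is of course no substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_density(d):
    """A random full-rank density operator on C^d."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    R = G @ G.conj().T
    return R / np.trace(R).real

def entropy(rho):
    """von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

p = np.array([0.5, 0.3, 0.2])
sigmas = [random_density(2) for _ in p]   # states of X
xis = [random_density(3) for _ in p]      # states of Y

rho = sum(pa * np.kron(s, x) for pa, s, x in zip(p, sigmas, xis))
rho_X = sum(pa * s for pa, s in zip(p, sigmas))
rho_Y = sum(pa * x for pa, x in zip(p, xis))

mutual_information = entropy(rho_X) + entropy(rho_Y) - entropy(rho)
H_p = float(-(p * np.log2(p)).sum())

print(mutual_information, H_p)
```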

Bounding the quantum relative entropy in terms of the classical relative entropy

2014-01-03T22:30:00.000-08:00

Theorem:

Let $\X$ be a complex Euclidean space, let $\Sigma$ be an alphabet, let $p,q\in\P(\Sigma)$ be probability vectors, and let $\{\rho_a\,:\,a\in\Sigma\}\subset\Density(\X)$ and $\{\sigma_a\,:\,a\in\Sigma\}\subset\Density(\X)$ be collections of density operators indexed by $\Sigma$. Assume that $\im(\rho_a)\subseteq\im(\sigma_a)$, $p(a)>0$, and $q(a) > 0$ for all $a\in\Sigma$. For two positive semidefinite operators $P$ and $Q$ acting on $\X$ with $\im(P)\subseteq\im(Q)$, denote the quantum relative entropy as
\[
S(P || Q )=\tr(P \ \text{log}(P))-\tr(P \ \text{log}(Q)).
\]

Then
\[
    S\Biggl(\sum_{a\in\Sigma} p(a) \rho_a \Bigg\|
    \sum_{a\in\Sigma} q(a) \sigma_a \Biggr)
    \leq \sum_{a\in\Sigma} p(a) S(\rho_a \| \sigma_a) + D(p \| q),\]
where
\[
D(p \| q):=\sum_{a\in\Sigma}p(a)\text{log}\Bigl(\frac{p(a)}{q(a)}\Bigr)
\]
is the classical relative entropy of two probability vectors $p,q\in\P(\Sigma)$.

Proof:

Consider the following fact, which states that for a complex Euclidean space $\X$ and operators $P_0,P_1,Q_0,Q_1\in \Pos(\X)$,
\[
S(P_0+P_1\| Q_0+Q_1)\leq S(P_0 \|Q_0)+S( P_1 \| Q_1).
\]
Therefore,
\[
    S\Biggl(\sum_{a\in\Sigma} p(a) \rho_a \Bigg\|
    \sum_{a\in\Sigma} q(a) \sigma_a \Biggr)
    \leq \sum_{a\in\Sigma}S\left( p(a) \rho_a \|
q(a) \sigma_a \right).
\]
Moreover, for $P,Q\in\Pos(\X)$ and scalars $\alpha,\beta\in(0,\infty)$, a direct calculation from the definition gives
\[
S(\alpha P \| \beta Q)=\alpha S(P\|Q)+\alpha \text{log}(\alpha/\beta)\tr(P).
\]
Thus,
\[ \begin{align*}
    S\Biggl(\sum_{a\in\Sigma} p(a) \rho_a \Bigg\|
    \sum_{a\in\Sigma} q(a) \sigma_a \Biggr)
   & \leq \sum_{a\in\Sigma}S\left( p(a) \rho_a \|
q(a) \sigma_a \right) \\
&= \sum_{a\in\Sigma} \Bigl(p(a)S(\rho_a \| \sigma_a)+p(a)\text{log}\Bigl(\frac{p(a)}{q(a)}\Bigr)\tr(\rho_a) \Bigr) \\
&=\sum_{a\in\Sigma} \Bigl(p(a)S(\rho_a \| \sigma_a)\Bigr)+\sum_{a\in\Sigma}\Bigl(p(a)\text{log}\Bigl(\frac{p(a)}{q(a)}\Bigr) \Bigr),
\end{align*}\]
since $\rho_a\in\Density(\X)$ implies that $\tr(\rho_a)=1$. Also, by definition of the relative entropy of two probability vectors $p,q\in\P(\Sigma)$,
\[
D(p \| q):=\sum_{a\in\Sigma}p(a)\text{log}\Bigl(\frac{p(a)}{q(a)}\Bigr).
\]
Hence,
\[
\sum_{a\in\Sigma} \Bigl(p(a)S(\rho_a \| \sigma_a)\Bigr)+\sum_{a\in\Sigma}\Bigl(p(a)\text{log}\Bigl(\frac{p(a)}{q(a)}\Bigr) \Bigr)=\sum_{a\in\Sigma} \Bigl(p(a)S(\rho_a \| \sigma_a)\Bigr)+   D(p \| q),
\]
which implies
\[
    S\Biggl(\sum_{a\in\Sigma} p(a) \rho_a \Bigg\|
    \sum_{a\in\Sigma} q(a) \sigma_a \Biggr)
    \leq \sum_{a\in\Sigma} p(a) S(\rho_a \| \sigma_a) + D(p \| q).
\]
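
As a numerical sanity check of the inequality just proved, the following minimal Python sketch evaluates both sides in the special case where every $\rho_a$ and $\sigma_a$ is diagonal in a common basis, so that each quantum relative entropy reduces to a classical one. The function names `rel_entropy` and `mix` and the particular numbers are illustrative assumptions, and natural logarithms are used throughout.

```python
from math import log


def rel_entropy(p, q):
    """Classical relative entropy D(p || q) with natural logarithms.

    For diagonal density operators this equals the quantum relative entropy.
    Terms with p[i] == 0 contribute zero by the usual convention.
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mix(weights, states):
    """Diagonal of the weighted sum of diagonal operators."""
    dim = len(states[0])
    return [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]


# Probability vectors p and q over a two-letter alphabet (illustrative values).
p = [0.3, 0.7]
q = [0.6, 0.4]

# Diagonals of the density operators rho_a and sigma_a (illustrative values).
rho = [[0.9, 0.1], [0.2, 0.8]]
sigma = [[0.5, 0.5], [0.25, 0.75]]

# Left-hand side: relative entropy of the two averaged states.
lhs = rel_entropy(mix(p, rho), mix(q, sigma))
# Right-hand side: averaged relative entropies plus D(p || q).
rhs = sum(pa * rel_entropy(r, s) for pa, r, s in zip(p, rho, sigma)) + rel_entropy(p, q)

print(lhs, rhs)
```

This only exercises the commuting case; checking non-commuting operators would require a matrix logarithm from a linear algebra library.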

When a channel is "optimal"

2014-01-03T22:04:00.000-08:00

Let $\X$ and $\Y$ be complex Euclidean spaces and let $H\in\Herm(\Y\otimes\X)$ be an arbitrary Hermitian operator. Consider the problem of maximizing the value
\[
    \ip{H}{J(\Phi)}
\]
over all choices of a channel $\Phi\in\Channel(\X,\Y)$.

One may observe that there must always exist at least one choice of a channel $\Psi\in\Channel(\X,\Y)$ such that
\[
    \ip{H}{J(\Psi)} = \sup\{\ip{H}{J(\Phi)}\,:\,\Phi\in\Channel(\X,\Y)\},
\]
by virtue of the fact that $\Channel(\X,\Y)$ is a compact set and $\Phi\mapsto\ip{H}{J(\Phi)}$ is a continuous function. For any channel $\Psi\in\Channel(\X,\Y)$ satisfying the identity above, let us say that $\Psi$ is optimal with respect to $H$.

Theorem:

$\Phi\in\Channel(\X,\Y)$ is optimal with respect to $H$ if and only if
\[
    \I_{\Y} \otimes \tr_{\Y} ( H J(\Phi)) - H \in \Pos(\Y\otimes\X).
\]

Proof:

Let $\Z=\Y\otimes\X$. If $\Phi\in\Channel(\X,\Y)$, then the Choi representation $J (\Phi)$ satisfies
\[
J(\Phi)\in\Pos(\Z) \ \text{and} \ \tr_{\Y}(J(\Phi))=\I_{\X}.
\]
Consider the semidefinite program defined by the triple $(\Omega, H, \I_{\X})$, where $H\in\Herm(\Z)$, $\I_{\X}\in\Herm(\X)$, and $\Omega\in\Channel(\Z,\X)$ is the partial trace $\Omega(Z)=\tr_{\Y}(Z)$, whose adjoint map $\Omega^*$ takes operators on $\X$ to operators on $\Z$ and is given by $\Omega^*(X)=\I_{\Y}\otimes X$. Then the primal and dual problems can be expressed as
\[ \begin{align*}
\text{Primal} & & & &\text{Dual}& \\
&\max\ip{H}{J(\Phi)} & & & &\min\ip{\I_{\X}}{X} \\
\text{subject to:} \ & \tr_{\Y}(J(\Phi))=\I_{\X}, & & &\text{subject to:} \ & \I_{\Y}\otimes X \geq H, \\
& J(\Phi)\in\Pos(\Z) &&&& X\in\Herm(\X)
\end{align*}\]
Define the primal and dual feasible sets $\mathcal{A}$ and $\mathcal{B}$, respectively, as
\[
\mathcal{A}:=\{Z\in\Pos(\Z) : \Omega(Z)=\I_{\X}\} \ \text{and} \ \mathcal{B}:=\{ X\in\Herm(\X) : \Omega^*(X)\geq H\}.
\]
Also define the optimal values associated to the primal and dual problems as
\[
\alpha:=\sup\{\ip{H}{Z} : Z\in\mathcal{A}\} \ \text{and} \ \beta:=\inf\{\ip{\I_{\X}}{X} : X\in\mathcal{B}\}.
\]

Since $\Channel(\X,\Y)$ is nonempty and compact, as observed above, the primal feasible set $\mathcal{A}$ is nonempty and $\alpha$ is finite. Now let $\lambda$ be any real number strictly larger than the largest eigenvalue of $H$, and consider the operator $\lambda\I_{\X}\in \Herm(\X)$. Then $\Omega^*(\lambda\I_{\X})=\I_{\Y}\otimes\lambda\I_{\X}>H$, so the dual problem is strictly feasible. Therefore, strong duality holds by Slater's theorem (Theorem 1.11). This implies that $\alpha=\beta$ and there exists $Z\in\mathcal{A}$ such that $\ip{H}{Z}=\alpha$. The primal problem is strictly feasible as well, since $\I_{\Y}\otimes\I_{\X}/\dim(\Y)$ is positive definite and belongs to $\mathcal{A}$, so Slater's theorem also guarantees that there exists $X\in\mathcal{B}$ such that $\ip{\I_{\X}}{X}=\beta$. Then by complementary slackness (Proposition 1.12), if $Z\in\mathcal{A}$ and $X\in\mathcal{B}$ satisfy $\ip{H}{Z}=\ip{\I_{\X}}{X}$, it holds that $\Omega^*(X)Z=HZ$.

Now suppose that $\Phi\in\Channel(\X,\Y)$ is optimal with respect to $H$ so that $\ip{H}{J(\Phi)}=\alpha$, and that $\ip{H}{J(\Phi)}=\ip{\I_{\X}}{X}$ for some $X\in\mathcal{B}$. Then by complementary slackness it follows that
\[ \begin{align*}
\Omega^*(X)J(\Phi)&=HJ(\Phi) \\
(\I_{\Y}\otimes X)J(\Phi)&=HJ(\Phi) \\
\tr_{\Y}((\I_{\Y}\otimes X)J(\Phi))&=\tr_{\Y}(HJ(\Phi)) \\
X&=\tr_{\Y}(HJ(\Phi)),
\end{align*}\]
since $\tr_{\Y}(J(\Phi))=\I_{\X}$. Because $X\in\mathcal{B}$, it satisfies $X\in\Herm(\X)$ and $\Omega^*(X)\geq H$. This implies that $\Omega^*(\tr_{\Y}(HJ(\Phi)))=\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))\geq H$. That is, $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))-H\geq 0$, or in other words $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))-H\in\Pos(\Z)=\Pos(\Y\otimes\X)$.

Suppose instead that $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))-H\in\Pos(\Z)=\Pos(\Y\otimes\X)$ holds for some $\Phi\in\Channel(\X,\Y)$. This is equivalent to writing $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))\geq H$ or $\Omega^*(\tr_{\Y}(HJ(\Phi)))\geq H$. Moreover, since $H\in\Herm(\Z)$ and the difference $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))-H$ is positive semidefinite, the operator $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))$ is Hermitian, which is possible only if $\tr_{\Y}(HJ(\Phi))\in\Herm(\X)$. Hence, $\tr_{\Y}(HJ(\Phi))\in\mathcal{B}$ as it satisfies the conditions for being dual feasible. By weak duality, the quantity $\ip{\I_{\X}}{\tr_{\Y}(HJ(\Phi))}$ therefore places an upper bound on the possible values of $\ip{H}{Z}$ for any primal feasible $Z\in\mathcal{A}$. Thus, $\ip{H}{Z}\leq\ip{\I_{\X}}{\tr_{\Y}(HJ(\Phi))}$. However, observe that
\[ \begin{align*}
\ip{\I_{\X}}{\tr_{\Y}(HJ(\Phi))}=\tr_{\X}(\tr_{\Y}(HJ(\Phi)))=\tr_{\Y\otimes\X}(HJ(\Phi))=\ip{H}{J(\Phi)},
\end{align*}\]
which shows that the upper bound is attained by the primal feasible operator $Z=J(\Phi)$, that is, $\ip{H}{J(\Phi)}=\ip{\I_{\X}}{\tr_{\Y}(HJ(\Phi))}$. Hence, it must be the case that $\tr_{\Y}(HJ(\Phi))$ is a solution to the dual problem, and that $J(\Phi)$ is a solution to the primal problem, where $J(\Phi)\in\mathcal{A}$ by virtue of $\Phi\in\Channel(\X,\Y)$. Thus, $\Phi\in\Channel(\X,\Y)$ is optimal with respect to $H$.

It has now been shown that $\Phi\in\Channel(\X,\Y)$ is optimal with respect to $H$ if and only if $\I_{\Y}\otimes \tr_{\Y}(HJ(\Phi))-H\in\Pos(\Y\otimes\X)$.
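
As a sanity check of the criterion, here is a small worked example. The specific choice of $H$ and the symbols $d$, $u$, and $\Delta$ are assumptions made only for illustration. Take $\X=\Y=\mathbb{C}^d$ with $d\geq 2$, let $u=\sum_{i=1}^{d}e_i\otimes e_i\in\Y\otimes\X$ for the standard basis $e_1,\dots,e_d$, and let $H=uu^*$, which is the Choi representation of the identity channel. If $\Phi$ is the identity channel, then $J(\Phi)=uu^*$, and using $u^*u=d$ and $\tr_{\Y}(uu^*)=\I_{\X}$,
\[
HJ(\Phi)=d\,uu^*, \quad \tr_{\Y}(HJ(\Phi))=d\,\I_{\X}, \quad \I_{\Y}\otimes\tr_{\Y}(HJ(\Phi))-H=d\,\I_{\Y}\otimes\I_{\X}-uu^*.
\]
The only nonzero eigenvalue of $uu^*$ is $u^*u=d$, so this operator is positive semidefinite and the identity channel is optimal, with value $\ip{H}{J(\Phi)}=d^2$. In contrast, for the completely depolarizing channel $\Delta(X)=\tr(X)\I_{\Y}/d$ one has $J(\Delta)=\I_{\Y}\otimes\I_{\X}/d$, so that
\[
\I_{\Y}\otimes\tr_{\Y}(HJ(\Delta))-H=\frac{1}{d}\,\I_{\Y}\otimes\I_{\X}-uu^*,
\]
which has the negative eigenvalue $1/d-d$ on the vector $u$. The criterion fails, consistent with $\ip{H}{J(\Delta)}=1<d^2$.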

A lower bound on the trace distance of tensor copies of states

2014-01-03T19:28:00.000-08:00

Theorem:
Let $\X$ be a complex Euclidean space, let $\rho_0,\rho_1\in\Density(\X)$ be density operators satisfying
\[
    \bignorm{\rho_0 - \rho_1}_1 \geq \varepsilon
\]
for $\varepsilon > 0$, and let $n$ be an arbitrary positive integer.
Then
\[
    \Bignorm{\rho_0^{\otimes n} - \rho_1^{\otimes n}}_1
    \geq 2 - 2 \exp\biggl(-\frac{n\varepsilon^2}{8}\biggr).
\]
(The notation $\rho^{\otimes n}$ means $\rho$ tensored with itself $n$ times. For example, $\rho^{\otimes 4} = \rho\otimes\rho\otimes\rho\otimes\rho$.)

Proof:
By the Fuchs-van de Graaf inequalities (Theorem 3.34) we have that the following two statements are equivalent:
\[
1-\frac{1}{2}\bignorm{\rho_0-\rho_1}_1\leq\fid(\rho_0,\rho_1)\leq\sqrt{1-\frac{1}{4}\bignorm{\rho_0-\rho_1}_1^2},
\]
\[
2-2\fid(\rho_0,\rho_1)\leq\bignorm{\rho_0-\rho_1}_1\leq 2\sqrt{1-\fid(\rho_0,\rho_1)^2}.
\]
Also, by Proposition 3.16, it follows that
\[
\fid(\rho_0^{\otimes n},\rho_1^{\otimes n})=\fid(\rho_0,\rho_1)^{n}.
\]
Therefore,
\[
\fid(\rho_0^{\otimes n},\rho_1^{\otimes n})^2=\fid(\rho_0,\rho_1)^{2n}\leq\left(1-\frac{1}{4}\bignorm{\rho_0-\rho_1}_1^2\right)^n\leq\left(1-\frac{1}{4}\varepsilon^2\right)^n,
\]
since $ \varepsilon\leq\bignorm{\rho_0 - \rho_1}_1$ by assumption.

For $x\in\mathbb{R}$ such that $0\leq x\leq1$, and any positive integer $n$, it holds that
\[
(1-x)^n\leq \exp(-nx),
\]
so for $0\leq\varepsilon\leq2$,
\[
\left(1-\frac{1}{4}\varepsilon^2\right)^n\leq \exp\biggl(-\frac{n\varepsilon^2}{4}\biggr).
\]
This implies that
\[
\fid(\rho_0^{\otimes n},\rho_1^{\otimes n})\leq\left(1-\frac{1}{4}\varepsilon^2\right)^{n/2}\leq\exp\biggl(-\frac{n\varepsilon^2}{8}\biggr).
\]
However, by the Fuchs-van de Graaf inequalities, since
\[
2-2\fid(\rho_0^{\otimes n},\rho_1^{\otimes n})\leq\bignorm{\rho_0^{\otimes n}-\rho_1^{\otimes n}}_1,
\]
the previous result implies
\[
2-2\exp\biggl(-\frac{n\varepsilon^2}{8}\biggr)\leq2-2\fid(\rho_0^{\otimes n},\rho_1^{\otimes n})\leq\bignorm{\rho_0^{\otimes n}-\rho_1^{\otimes n}}_1,
\]
which completes the proof.
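
The bound can be illustrated numerically. The sketch below is a minimal check restricted to commuting qubit states $\rho_0=\mathrm{diag}(p,1-p)$ and $\rho_1=\mathrm{diag}(q,1-q)$, an assumption that keeps every tensor power diagonal so that the trace norm can be computed with binomial coefficients alone. The function name `trace_norm_copies` and the values of `p` and `q` are hypothetical.

```python
from math import comb, exp


def trace_norm_copies(p, q, n):
    """Trace norm of rho0^{(x)n} - rho1^{(x)n} for diagonal qubit states.

    rho0 = diag(p, 1-p) and rho1 = diag(q, 1-q). Both tensor powers are
    diagonal, and the diagonal entry containing k factors of the first
    eigenvalue occurs comb(n, k) times.
    """
    return sum(
        comb(n, k) * abs(p**k * (1 - p) ** (n - k) - q**k * (1 - q) ** (n - k))
        for k in range(n + 1)
    )


p, q = 0.5, 0.6
# Single-copy trace distance, used as epsilon in the theorem.
eps = trace_norm_copies(p, q, 1)

for n in (1, 10, 50):
    lower = 2 - 2 * exp(-n * eps**2 / 8)
    # Prints n, the exact trace norm for n copies, and the lower bound.
    print(n, trace_norm_copies(p, q, n), lower)
```

Taking $\varepsilon$ equal to the single-copy trace norm is the tightest choice the theorem allows; any smaller positive $\varepsilon$ gives a weaker but still valid bound.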

The Toric Code

2013-12-23T13:50:00.000-08:00

Abstract

The toric code is one of the first examples of a surface code \cite{Kitaev}, where a lattice consisting of qubits is deployed for the purposes of error correction. The toric code can be seen as a tool for realizing a quantum memory by exploiting certain topological properties that result from embedding the lattice on the surface of a torus. The model is best understood through the stabilizer formalism for quantum error correction. This post discusses the general properties of the toric code in this regard, and explains how Pauli errors on the encoded state can be corrected. Errors manifest themselves in the toric code as anyonic excitations. Although the theory of anyons provides an insightful interpretation to the dynamics of the toric code, the model will be presented here with minimal emphasis on the anyonic properties at play without sacrificing the essence of the toric code.

Preliminary remarks

Consider a $k \times k$ square lattice consisting of $k^2$ square faces, $k^2$ vertices, and $2k^2$ edges. By identifying the top of the lattice with the bottom, and the left side with the right side, the lattice can be thought of as being embedded on the surface of a genus-$1$ torus (hence the name 'toric' code). Alternatively, the $k \times k$ lattice can be thought of as having periodic boundary conditions, where a path leaving the lattice from the top side returns to the lattice from the corresponding point on the bottom side of the lattice. Likewise, a path leaving the lattice from either the left or right side would return from the opposite side. Instead of visualizing the lattice on the surface of the torus, this latter picture will be used throughout for convenience.

It is worth mentioning that the choice of a $k\times k$ square lattice is somewhat arbitrary in the sense that most of the essential features exhibited by the toric code also hold for $k\times k'$ square lattices with $k\neq k'$. The consequences of this choice for error correction will be discussed later. Moreover, the choice of using a square lattice can also be modified to include lattice configurations of arbitrary shape. In such cases, however, the operators that act on the lattice may also need modification. For the purposes of simplicity, this post will work with a $k\times k$ square lattice in order to exemplify the properties of the toric code in a more accessible manner.

In the toric code, a qubit is placed on each edge of the lattice so that there are $N:=2k^2$ qubits for a $k \times k$ lattice. Thus, the Hilbert space of the system under consideration is of the form
\[
\Hil^N=\TENSOR{j=1}{N}\Hil^2_j,
\]
where $\Hil^2_j$ is the two-dimensional Hilbert space of qubit $j$. For notational purposes, when some single qubit unitary operator $U$ is applied to only qubit $j$, write $U_j$ in order to specify the appropriate Hilbert space $\Hil^2_j$ that $U$ is meant to act on. That is,
\[
U_j=I\otimes\dots\otimes I\otimes U\otimes I\otimes\dots\otimes I
\]
will denote the unitary operator of the whole system $\Hil^N$ that only applies the single qubit unitary $U$ to the qubit of $\Hil^2_j$ and acts trivially on all other qubits (here, $I$ denotes the single qubit identity operator on $\Hil^2$). In this way, the action of the unitary $U$ on the tensor factor $\Hil^2_j$ of a single qubit is extended to an operation that acts on the whole space $\Hil^N$. Therefore, if two single qubit unitaries $U$ and $V$ are to be applied to qubits $j_1$ and $j_2$, respectively, it is well defined to simply write this operation as the product $U_{j_1}V_{j_2}$.
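
The embedding $U\mapsto U_j$ can be sketched in a few lines of plain Python. This is only an illustration: the helper names `kron`, `matmul`, and `embed` are hypothetical, matrices are stored as lists of rows, and qubits are numbered from $0$ rather than from $1$ as in the text.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


I2 = [[1, 0], [0, 1]]   # single qubit identity
X = [[0, 1], [1, 0]]    # sigma^x
Z = [[1, 0], [0, -1]]   # sigma^z


def embed(U, j, n):
    """Return U_j: U acting on qubit j (0-based) of n qubits, identity elsewhere."""
    out = [[1]]
    for qubit in range(n):
        out = kron(out, U if qubit == j else I2)
    return out


n = 3
# Operators on different qubits commute, so the product U_{j1} V_{j2} is unambiguous.
left = matmul(embed(X, 0, n), embed(Z, 2, n))
right = matmul(embed(Z, 2, n), embed(X, 0, n))
print(left == right)
```

On the same qubit the two Pauli operators anti-commute instead, which is the fact used later when comparing vertex and face operators.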

The Stabilizer Group

For the toric code, there are two basic types of operators that act on the qubits of the lattice. These operators will be used to construct an algebraic set of operators important for the purposes of error correction. Define the following two operators on $\Hil^N$ for each vertex $v$ and face $f$ of the lattice, where $\sigma^x$ and $\sigma^z$ are the standard single qubit Pauli operators,
\[
A_v=\PROD{j\in star(v)}{}\sigma^x_j \ \ \ \text{and} \ \ \ B_f=\PROD{j\in boundary(f)}{}\sigma^z_j .
\]
Here, $star(v)$ represents the set of $4$ edges that meet at vertex $v$ so that $A_v$ is the operator that applies $\sigma^x$ to each of the $4$ qubits adjacent to vertex $v$, and $boundary(f)$ represents the set of $4$ edges that border the face $f$ so that $B_f$ is the operator that applies $\sigma^z$ to each of the $4$ qubits that border the particular face $f$. These operators are illustrated in the figure below. Note that, since both $\sigma^x$ and $\sigma^z$ are Hermitian operators, so are the $A_v$ and $B_f$. Moreover, both $A_v$ and $B_f$ have eigenvalues $+1$ and $-1$.

The lattice in the toric code has qubits placed on its edges depicted as black dots. The vertex operators $A_v$ apply $\px$ operators to the four edges around vertex $v$, and the face operators $B_f$ apply $\pz$ operators to the four edges bordering a face $f$.

The set $S$ consisting of all possible products of the operators $A_v$ and $B_f$ forms an abelian subgroup of the Pauli group $P_N$, which makes $S$ a stabilizer group (see Appendix). For this to be the case each of the $A_v$ and $B_f$ must commute with one another. Since $A_v$ and $B_f$ are either products of only $\sigma^x$s or only $\sigma^z$s it follows that $A_v$ and $A_{v'}$ commute for any vertices $v$ and $v'$, and that $B_f$ and $B_{f'}$ commute for any faces $f$ and $f'$, because $\sigma^x$ and $\sigma^z$ each commute with themselves. However, even though $\sigma^x$ and $\sigma^z$ anti-commute with one another, $A_v$ and $B_f$ do indeed commute for any vertex $v$ and face $f$. To see this, there are two cases to consider. First, suppose $v$ and $f$ are sufficiently far apart so that there are no qubits in common that are acted on by the operators $A_v$ and $B_f$. In this case, $A_v$ and $B_f$ trivially commute since the operators $\sigma^x_j$ and $\sigma^z_{j'}$, with $j\in star(v)$ and $j'\in boundary(f)$, act on different qubits. The only other possibility to consider is when the vertex $v$ happens to be one of the corners of the face $f$. In this case, there are two distinct qubits in the intersection of $star(v)$ and $boundary(f)$ as shown in the figure below:

Adjacent vertex and face operators commute because they have two edges in common.

Each of these two qubits is acted on by $\sigma^x$ and $\sigma^z$, but since $\sigma^x\sigma^z=-\sigma^z\sigma^x$ there are two minus signs that result from the action of $A_v$ and $B_f$ on each of the two qubits which then cancel implying that $A_vB_f=B_fA_v$. Thus, it has been shown that each of the $A_v$ and $B_f$ commute with one another so that the set $S$ generated by their products is indeed a stabilizer group by definition.
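
The two cases above can be checked exhaustively on a small lattice. In the minimal sketch below the edge numbering (functions `h` and `v`) and the labelling of a face by its top-left corner vertex are assumptions introduced for illustration. Only the supports $star(v)$ and $boundary(f)$ are needed, since $A_v$ and $B_f$ commute exactly when they share an even number of edges.

```python
def h(r, c, k):
    """Index of the horizontal edge from vertex (r, c) to vertex (r, c + 1)."""
    return (r % k) * k + (c % k)


def v(r, c, k):
    """Index of the vertical edge from vertex (r, c) to vertex (r + 1, c)."""
    return k * k + (r % k) * k + (c % k)


def star(r, c, k):
    """Edges meeting at vertex (r, c): the support of A_v."""
    return {h(r, c, k), h(r, c - 1, k), v(r, c, k), v(r - 1, c, k)}


def boundary(r, c, k):
    """Edges bordering the face whose top-left corner is vertex (r, c): the support of B_f."""
    return {h(r, c, k), h(r + 1, c, k), v(r, c, k), v(r, c + 1, k)}


k = 3
cells = [(r, c) for r in range(k) for c in range(k)]

# Sizes of star(v) & boundary(f) over all vertex/face pairs of the periodic lattice.
overlaps = {len(star(*vtx, k) & boundary(*f, k)) for vtx in cells for f in cells}
print(overlaps)
```

Every overlap is either $0$ (far apart) or $2$ (the vertex is a corner of the face), so the minus signs always cancel in pairs.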

It is worthwhile to calculate the number of independent generators of $S$. A generating set of $S$ is a collection of elements of $S$ such that each element of $S$ can be expressed as some product of elements from the generating set. In addition, it is required that the elements of the generating set be independent, meaning that no element of the generating set can be expressed as a product of the other elements of the generating set. Since the elements of $S$ are all expressible by products of the operators $A_v$ and $B_f$, finding a minimal generating set of $S$ comes down to determining if any of the $A_v$s or $B_f$s can be expressed in terms of others.

In fact, it will now be shown that the following two relationships hold:
\[
\PROD{v}{}A_v=I \ \ \text{and} \ \ \PROD{f}{}B_f=I,
\]
where the products range over all vertices $v$ and faces $f$ of the lattice and $I$ is the identity operator on the whole space $\Hil^N$. This is easily seen by noting that every edge of the lattice belongs to the star of exactly two vertices (the two endpoints of the edge) and to the boundary of exactly two faces (the two faces on either side of the edge). Equivalently, for any operator $A_v$ (or $B_f$) there are four adjacent operators $A_{v_i}$ (or $B_{f_i}$), each of which has one edge in common with $A_v$ (or $B_f$). The two $\sigma^x$ (or $\sigma^z$) operations acting on each common edge cancel since $\sigma^x\sigma^x=I=\sigma^z\sigma^z$, which cancels the action of the original $A_v$ (or $B_f$). Continuing in this way over the whole lattice, the simultaneous action of every $A_v$ (or $B_f$) leaves each edge acted on by exactly two $\sigma^x$ (or $\sigma^z$) operations, resulting in the trivial action $I$. This pattern is illustrated in the following two figures:

The product of all face operators gives the identity $I$ since each edge of the lattice is acted on by two $\pz$ operators and ${\pz}^2=I$.

The product of all vertex operators gives the identity $I$ since each edge of the lattice is acted on by two $\px$ operators and ${\px}^2=I$.

The relationship derived above implies that any single $A_{v'}$ (or $B_{f'}$) can be expressed as the product of all other $A_{v}$ (or $B_{f}$) with $v\neq v'$ (or $f\neq f'$). Indeed, since $A_v^2=I$ and $B_f^2=I$, each $A_v$ and $B_f$ is its own inverse. It then follows that
\[
\PROD{v\neq v'}{}A_v=A_{v'} \ \ \text{and} \ \ \PROD{f\neq f'}{}B_f=B_{f'}.
\]
Hence, since there are $k^2$ many $A_v$ operators and also $k^2$ many $B_f$ operators defined on the $k\times k$ lattice, there are only $k^2-1$ independent operators of each variety. This shows that a minimal generating set for the stabilizer group $S$ consists of $G:=2(k^2-1)$ independent elements.
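
The count of independent generators can be reproduced by linear algebra over the two-element field. The sketch below encodes each $A_v$ and $B_f$ as a bit mask over the edges and computes ranks by Gaussian elimination. The edge numbering and the helper names `mask` and `gf2_rank` are assumptions for illustration. The vertex and face operators are counted separately because a product of $\sigma^x$s can never equal a nontrivial product of $\sigma^z$s.

```python
from functools import reduce
from operator import xor


def h(r, c, k):
    """Index of the horizontal edge from vertex (r, c) to vertex (r, c + 1)."""
    return (r % k) * k + (c % k)


def v(r, c, k):
    """Index of the vertical edge from vertex (r, c) to vertex (r + 1, c)."""
    return k * k + (r % k) * k + (c % k)


def star(r, c, k):
    """Edges meeting at vertex (r, c): the support of A_v."""
    return {h(r, c, k), h(r, c - 1, k), v(r, c, k), v(r - 1, c, k)}


def boundary(r, c, k):
    """Edges bordering the face whose top-left corner is vertex (r, c)."""
    return {h(r, c, k), h(r + 1, c, k), v(r, c, k), v(r, c + 1, k)}


def mask(edges):
    """Encode a set of edge indices as an integer bit mask."""
    return sum(1 << e for e in edges)


def gf2_rank(rows):
    """Rank over GF(2) of a list of integers viewed as bit vectors."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


k = 3
cells = [(r, c) for r in range(k) for c in range(k)]
A = [mask(star(r, c, k)) for r, c in cells]
B = [mask(boundary(r, c, k)) for r, c in cells]

# XOR of all supports is the empty set: the product of all A_v (or B_f) is I.
prod_A, prod_B = reduce(xor, A), reduce(xor, B)
G = gf2_rank(A) + gf2_rank(B)
print(prod_A, prod_B, gf2_rank(A), gf2_rank(B), G)
```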

The size of the generating set of $S$ will be relevant when analyzing the code space of the toric code to be presented in the next section. The elements of $S$, and in particular the operators forming the generating set of $S$, will play a crucial role in the correction of errors by serving as check operators. Measuring the generators on the qubits of the lattice will (in the ideal case) yield information on whether or not errors have been inflicted on the system, and also serve as a means to correct the possible errors.

The Code Space

The stabilizer group $S$ generated by the operators $A_v$ and $B_f$ consists of operators that all commute with each other. It is a general result in the theory of linear algebra that such a collection of operators can all be simultaneously diagonalized. This means that there exists a simultaneous eigenspace which is a subspace of $\Hil^N$ consisting of eigenvectors of every element of $S$. Thus, consider the space
\[
\Hil_S:=\{\ket{\psi}\in\Hil^N : A_v\ket{\psi}=\ket{\psi}, B_f\ket{\psi}=\ket{\psi} \ \text{for all} \ v \ \text{and} \ f\},
\]
that is the simultaneous eigenspace of eigenvectors of each $A_v$ and $B_f$ having eigenvalue $+1$. Recall that such considerations are possible since the $A_v$ and $B_f$ have eigenvalues $+1$ and $-1$. The subspace $\Hil_S\subset\Hil^N$ is called the code space of the stabilizer group $S$. This space is to be thought of as encoding information that is to remain protected from errors. Each of the states $\ket{\psi}\in\Hil_S$ is a state of all $N$ physical qubits of the system, but these states only effectively encode some number $N_{c}<N$ of logical qubits, where $N_c=\log_2(\dim(\Hil_S))$ is the base-$2$ logarithm of the dimension of $\Hil_S$. That is, the states $\ket{\psi}$ in $\Hil_S$ are $N$ qubit states constructed to protect the information contained in the state of only $N_c$ effective qubits through the means of redundancy. In the conventional error correcting nomenclature, the states $\ket{\psi}\in\Hil_S$ are often called code words.

Of course, now the natural question to ask is how many qubits $N_c$ are encoded by $\Hil_S$, or equivalently what is the dimension $\dim(\Hil_S)=2^{N_c}$ of the space. As a consequence of the stabilizer formalism, the number of logical qubits encoded by the space $\Hil_S$ in general is given by $N_c=N-G$, so that $\dim(\Hil_S)=2^{N-G}$, where $N$ is the total number of qubits under consideration and $G$ is the number of independent stabilizer operators that generate $S$. In the toric code defined on a $k \times k$ lattice, $N=2k^2$ and $G=2(k^2-1)$ as calculated in the previous section. Then the number of encoded qubits is $N_c=N-2(k^2-1)=2$ and the dimension of the code space $\Hil_S$ is $2^{N_c}=2^2$. Hence $\dim(\Hil_S)=4$ so that the code space $\Hil_S$ only encodes $2$ logical qubits. Intuitively, this result holds because each independent generator of $S$ can be thought of as halving the dimension of the global space $\Hil^N$. Note that this number $N_c$ does not depend on the characteristic length scale $k$ of the lattice. This implies that no matter how large the lattice is made in the toric code, the number of encoded qubits always remains the same. This will have interesting consequences when considering the error correcting abilities of the toric code for different lattice sizes $k$.
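
For the smallest lattice, $k=2$ with $N=8$ qubits, the dimension of the code space can also be counted by brute force without invoking the formula $2^{N-G}$. The sketch below is one way to do this under the same assumed edge numbering as before: it keeps the computational basis strings fixed by every $B_f$, and then groups them into orbits under the bit flips performed by the $A_v$. Each orbit contributes exactly one independent state (the uniform superposition over the orbit), so the number of orbits equals $\dim(\Hil_S)$.

```python
def h(r, c, k):
    """Index of the horizontal edge from vertex (r, c) to vertex (r, c + 1)."""
    return (r % k) * k + (c % k)


def v(r, c, k):
    """Index of the vertical edge from vertex (r, c) to vertex (r + 1, c)."""
    return k * k + (r % k) * k + (c % k)


def star(r, c, k):
    """Edges meeting at vertex (r, c): the support of A_v."""
    return {h(r, c, k), h(r, c - 1, k), v(r, c, k), v(r - 1, c, k)}


def boundary(r, c, k):
    """Edges bordering the face whose top-left corner is vertex (r, c)."""
    return {h(r, c, k), h(r + 1, c, k), v(r, c, k), v(r, c + 1, k)}


def mask(edges):
    """Encode a set of edge indices as an integer bit mask."""
    return sum(1 << e for e in edges)


def parity(x):
    """Parity of the number of set bits of x."""
    return bin(x).count("1") % 2


k = 2
n_qubits = 2 * k * k
cells = [(r, c) for r in range(k) for c in range(k)]
stars = [mask(star(r, c, k)) for r, c in cells]
faces = [mask(boundary(r, c, k)) for r, c in cells]

# Basis strings x with B_f|x> = +|x> for every face f (even parity on each boundary).
allowed = [x for x in range(2**n_qubits) if all(parity(x & f) == 0 for f in faces)]

# Count orbits of the allowed strings under the flips x -> x XOR star(v).
seen = set()
orbits = 0
for x in allowed:
    if x in seen:
        continue
    orbits += 1
    seen.add(x)
    stack = [x]
    while stack:
        y = stack.pop()
        for s in stars:
            if y ^ s not in seen:
                seen.add(y ^ s)
                stack.append(y ^ s)

print(len(allowed), orbits)
```

The brute-force enumeration grows as $2^N$, so it is only practical for very small $k$; it is meant as a cross-check of the counting argument, not as a method.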

After describing the properties of errors in the toric code and how the errors are to be corrected in the following sections, it will be interesting to revisit and analyze the code space $\Hil_S$ in order to understand the structure of the space more explicitly in a topological context. This will be done in a later section.

Errors and Anyons

Assume now that the qubits of the system $\Hil^N$ are prepared in an encoded state of $\Hil_S$ and no errors are present in the system. Then by construction, measuring the $A_v$ and $B_f$ operators will all yield eigenvalue $+1$ since $A_v\ket{\psi}=\ket{\psi}$ and $B_f\ket{\psi}=\ket{\psi}$ for $\ket{\psi}\in\Hil_S$. The violation of any of these conditions signals a possible error having occurred on the encoded state $\cw$. Thus, errors will be detected by performing syndrome measurements using the stabilizer generators $A_v$ and $B_f$.

In what follows, the schemes for detecting possible errors occurring on the encoded state $\ket{\psi}\in\Hil_S$ will initially be described on a case-by-case basis until enough intuition is gathered to describe a more general error correcting scheme. In particular, first we will only analyze $\pz$ errors, and then use this understanding to analogously reason about $\px$ errors. Further, some additional terminology will be introduced to ease the analysis and motivate further topological implications of the toric code. The new entities here will be quasiparticles that are commonly referred to in the literature as anyons. In essence, errors in the toric code manifest themselves as pairs of anyon excitations. Despite having a rich theory in themselves, the relevant details regarding anyon theory will not be presented here. The interested reader is invited to consult the following sources: \cite{Kitaev}, \cite{Rao}. Fortunately, because of the way the toric code model will be presented here, the unacquainted reader need not suffer from their ignorance.

The case of a single $\pz$ error

Suppose now that some $\sigma^z_j$ error occurs on qubit $j$ of the lattice (but the location is not known) so that the encoded state $\cw$ is transformed into the erred state $\ket{\xi}=\pz_j\cw$. In this case, any $B_f$ commutes with $\pz_j$ as $B_f$ is also a product of $\pz$ operators. Therefore, measuring any of the $B_f$ operators will also return a $+1$ eigenvalue since
\[
B_f\ket{\xi}=B_f\pz_j\cw=\pz_j B_f\cw=\pz_j\cw=\ket{\xi}.
\]

Any $A_v$ such that $v$ is not one of the two vertices at the ends of edge $j$ will trivially commute with $\pz_j$ by the mere fact that these operators act on different qubits of the lattice. However, there exist precisely two $A_v$ operators that will anti-commute with $\pz_j$, namely the two $A_v$ that correspond to the two vertices at the ends of edge $j$. These anti-commute because the $A_v$ operators are products of $\px$ operators, and $\px$ and $\pz$ anti-commute. Hence, when the two $A_v$ operators that act on the qubit on edge $j$ are measured they will give an eigenvalue $-1$:
\[
A_v\ket{\xi}=A_v\pz_j\cw=-\pz_j A_v\cw=-\pz_j\cw=-\ket{\xi}.
\]
The result of measuring these two $A_v$ operators gives information on exactly which qubit $j$ was inflicted with the $\pz$ error. Moreover, the error can be corrected by applying $\pz_j$ to the erred state which returns it back to the original state $\cw\in\Hil_S$.

The pair of vertices at the ends of the edge $j$ corresponding to the location of the $\pz_j$ error can be thought of as the locations of a pair of anyon excitations. We will now introduce the notation $z$ to represent a $z$-type anyon, and place two $z$ anyons at these two vertex locations. In this way, the syndrome measurements given by the $A_v$ operators can be thought of as detecting the presence of a $z$ anyon at a particular vertex, as shown in the figure below:

A pair of $z$-anyons are present on the two vertices that uniquely determine the location of the $\pz$ error.

The case of a single $\px$ error

Instead, suppose now that a single $\px_j$ error is inflicted on the qubit located on edge $j$, but the precise location of $j$ is not known. The detection and correction of such an error proceeds in an analogous manner to the $\pz_j$ error described in the previous section. This time, the encoded state is transformed into the erred state of the form $\ket{\xi}=\px_j\cw$. Since all the $A_v$ operators are products of $\px$ operators as well, they all commute with $\px_j$. Thus measuring each $A_v$ on the lattice gives an eigenvalue $+1$, yielding no information pertaining to the location of the error since
\[
A_v\ket{\xi}=A_v\px_j\cw=\px_j A_v\cw=\px_j\cw=\ket{\xi}.
\]
On the contrary, every $B_f$ operator will commute with the $\px_j$ error with the exception of two $B_f$ operators. In this case, the two anti-commuting operators will correspond to the two adjacent faces of the lattice that share the common edge $j$. Measuring these two operators leads to the detection of the $\px_j$ error due to a $-1$ eigenvalue:
\[
B_f\ket{\xi}=B_f\px_j\cw=-\px_j B_f\cw=-\px_j\cw=-\ket{\xi}.
\]
Since these two $B_f$ operators correspond to the only adjacent faces next to the error, they uniquely determine the error's location and the error can be corrected by simply applying another $\px_j$ operator to bring the erred state $\ket{\xi}$ back to an encoded state $\cw\in\Hil_S$. To signify the presence of this error, an $x$-type anyon denoted by $x$ will be introduced. This time, however, two $x$ anyons will be placed on the two faces adjacent to the location of the $\px_j$ error as shown in the figure below. Similar to how the $A_v$ operators are able to detect the presence of $z$ anyons representing $\pz$ errors, the presence of the two $x$ anyons representing $\px$ errors is detected instead by the $B_f$ operators.

A pair of $x$-anyons are present on the two faces that uniquely determine the location of the $\px$ error.
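
The two single-error cases can be sketched together in code. The example below computes which vertex checks flag a $\pz$ error and which face checks flag a $\px$ error on the same edge, and then recovers the edge from the vertex syndrome. The edge numbering, the face labelling by top-left corner, and the function names `z_syndrome` and `x_syndrome` are assumptions for illustration.

```python
def h(r, c, k):
    """Index of the horizontal edge from vertex (r, c) to vertex (r, c + 1)."""
    return (r % k) * k + (c % k)


def v(r, c, k):
    """Index of the vertical edge from vertex (r, c) to vertex (r + 1, c)."""
    return k * k + (r % k) * k + (c % k)


def star(r, c, k):
    """Edges meeting at vertex (r, c): the support of A_v."""
    return {h(r, c, k), h(r, c - 1, k), v(r, c, k), v(r - 1, c, k)}


def boundary(r, c, k):
    """Edges bordering the face whose top-left corner is vertex (r, c)."""
    return {h(r, c, k), h(r + 1, c, k), v(r, c, k), v(r, c + 1, k)}


k = 3
cells = [(r, c) for r in range(k) for c in range(k)]


def z_syndrome(error_edges):
    """Vertices whose A_v anti-commutes with sigma^z errors on the given edges."""
    return [vtx for vtx in cells if len(star(*vtx, k) & error_edges) % 2 == 1]


def x_syndrome(error_edges):
    """Faces whose B_f anti-commutes with sigma^x errors on the given edges."""
    return [f for f in cells if len(boundary(*f, k) & error_edges) % 2 == 1]


j = h(1, 1, k)  # the horizontal edge between vertices (1, 1) and (1, 2)

z_anyons = z_syndrome({j})  # the two vertices at the ends of edge j
x_anyons = x_syndrome({j})  # the two faces that share edge j

# Decode: the edges whose single-error syndrome matches the observed one.
candidates = [e for e in range(2 * k * k) if z_syndrome({e}) == z_anyons]
print(z_anyons, x_anyons, candidates == [j])
```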

Strings of multiple $\pz$ errors

In the previous sections, it was shown how a single $\pz$ and a single $\px$ error can be detected and corrected. This strategy will also work for correcting multiple $\pz$ and $\px$ errors provided that the errors act on qubits (edges) that are not adjacent. If there happen to be, say, multiple $\pz$ errors such that the locations of the errors form a chain as depicted in the figure shown below, then the error syndrome will be inherently ambiguous and more care must be taken in attempting to correct the errors.

In order to better reason about a chain of adjacent errors given by the path $\overline{p}$ on the lattice, introduce the following string operator
\[
S^z(\overline{p})=\PROD{j\in\overline{p}}{}\pz_j,
\]
which applies a $\pz_j$ along each edge $j$ that is a part of the path $\overline{p}$. The string $\St{z}{p}$ then represents the case where a sequence of adjacent $\pz$ errors has afflicted the qubits along the path $\overline{p}$ on the lattice as shown in the figure below. The case of a $\pz$ error occurring on a single qubit, as discussed in the previous section, is a special case of a string operator $\St{z}{p}$ where the path $\overline{p}$ simply consists of a single edge.

An error $\St{z}{p}$ represents $\pz$ errors applied to all edges along the path $\overline{p}$.

When the string $\St{z}{p}$ affects an encoded state $\cw\in\Hil_S$ the erred state is of the form $\ket{\xi}=\St{z}{p}\cw$. It will now be shown that detecting the exact locations of all the errors induced by the string $\St{z}{p}$ using the stabilizer operators $A_v$ and $B_f$ is inherently ambiguous. Naturally, all $B_f$ commute with $\St{z}{p}$ since both consist only of $\pz$ operators, and so the $B_f$ will not yield any useful syndrome information. One may be tempted to think that any $A_v$ operator whose vertex $v$ lies on any part of the path $\overline{p}$ will anti-commute with $\St{z}{p}$ thereby detecting the presence of all the $\pz$ errors, but this is not the case. Actually, any $A_v$ whose vertex $v$ lies on the path $\overline{p}$, with the exception of the two vertices at the endpoints of the path $\overline{p}$, will also commute with the string $\St{z}{p}$, because the path $\overline{p}$ will always pass through two edges in $star(v)$. For each of these two edges the $\px$ from $A_v$ anti-commutes with the error $\pz$ on that edge producing a $-1$, but since the other $\px$ and $\pz$ acting on the other common edge also produce a $-1$ the effect of the two will cancel, yielding a trivial syndrome measurement. The only two $A_v$ that manage to detect an error produced by $\St{z}{p}$ via a $-1$ syndrome are the two $A_v$ corresponding to the endpoints of the path $\overline{p}$. These two $A_v$ anti-commute with $\St{z}{p}$ because $A_v$ and $\St{z}{p}$ only act on a single common edge in this case. Hence, there are two $z$ anyons that reside at the endpoints of $\overline{p}$ as shown in the figure below. This exemplifies the important property of the $z$ anyons that they will always appear as pairs whenever $\pz$ errors are present.

Two $z$-anyons are placed at the endpoints of the error string $\St{z}{p}$ to signify the vertex locations with nontrivial syndrome measurements.
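
The ambiguity of the syndrome for an error string can be seen concretely. In the sketch below, two different paths with the same endpoints are built on a $4\times 4$ lattice, and both produce the same pair of flagged vertices. The edge numbering, the chosen paths, and the name `z_syndrome` are illustrative assumptions.

```python
def h(r, c, k):
    """Index of the horizontal edge from vertex (r, c) to vertex (r, c + 1)."""
    return (r % k) * k + (c % k)


def v(r, c, k):
    """Index of the vertical edge from vertex (r, c) to vertex (r + 1, c)."""
    return k * k + (r % k) * k + (c % k)


def star(r, c, k):
    """Edges meeting at vertex (r, c): the support of A_v."""
    return {h(r, c, k), h(r, c - 1, k), v(r, c, k), v(r - 1, c, k)}


k = 4
cells = [(r, c) for r in range(k) for c in range(k)]


def z_syndrome(error_edges):
    """Vertices whose A_v anti-commutes with the sigma^z string on the given edges."""
    return [vtx for vtx in cells if len(star(*vtx, k) & error_edges) % 2 == 1]


# Path p: (1,0) -> (1,1) -> (1,2) -> (2,2).
path_p = {h(1, 0, k), h(1, 1, k), v(1, 2, k)}
# A different path with the same endpoints: (1,0) -> (2,0) -> (2,1) -> (2,2).
path_q = {v(1, 0, k), h(2, 0, k), h(2, 1, k)}

syndrome_p = z_syndrome(path_p)
syndrome_q = z_syndrome(path_q)
# Interior vertices touch the path on two edges and stay silent; only endpoints fire.
print(syndrome_p, syndrome_q)
```

Since both strings give the same syndrome, the measurement outcomes alone cannot distinguish between them.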

Since the syndrome measurements in the case of a string $\St{z}{p}$ of $\pz$ errors only gives information specifying the endpoints of the error string $\St{z}{p}$, how then are all the errors to be corrected? It would be ideal if the exact path defining the string $\St{z}{p}$ was known, because then the errors can be corrected by simply applying all the $\pz$ operators comprising the string. To understand how to overcome this obstacle consider the following.

Since the exact form of the error string $\St{z}{p}$ is unknown, and only the endpoints of the string can be detected, the best one could do is to guess a path $\overline{p_g}$ that has the same endpoints as the actual error path $\overline{p}$. Thus, consider some string operator $\St{z}{p_g}$, where the path $\overline{p_g}$ has the same endpoints as the actual error path $\overline{p}$. Recall that these endpoints are just the locations at which the pair of $z$ anyons reside. The union of these two paths $\overline{p}$ and $\overline{p_g}$ (denoted by the concatenation $\overline{pp_g}$) forms a closed loop $\overline{L}=\overline{pp_g}$ on the lattice. The product of the two strings can then be written as $\St{z}{L}=\St{z}{p_g}\St{z}{p}$. If the string $\St{z}{p_g}$ is applied to the erred state $\ket{\xi}=\St{z}{p}\cw$, it is transformed to the state
\[
\ket{\xi'}=\St{z}{p_g}\ket{\xi}=\St{z}{p_g}\St{z}{p}\cw=\St{z}{L}\cw.
\]

Depending on the structure of the loop $\overline{L}$ it may be the case that $\ket{\xi'}=\cw$, implying that the state is returned to its original encoded state. However, it can also be the case that $\ket{\xi'}=\ket{\psi'}\in\Hil_S$ where $\ket{\psi'}\neq\cw$. In this latter case, the erred state $\ket{\xi}$ is not returned to the original state $\cw\in\Hil_S$. Instead, it is transformed to some other encoded state $\ket{\psi'}\in\Hil_S$ and error recovery fails. When this occurs a logical operation has been performed on the encoded state, which is an undesired effect when the objective is to merely preserve the state $\cw$.

A loop detour on the torus

Whether this approach of guessing a string $\St{z}{p_g}$ in hopes of correcting some error string $\St{z}{p}$ by forming a loop $\overline{L}=\overline{pp_g}$ succeeds depends on the nature of the loop $\overline{L}$. Keeping in mind that the lattice under consideration is embedded on the surface of a torus, loops on the lattice come in two varieties. Whether or not the error is corrected depends on which type of loop is formed in $\St{z}{L}$.

In general, a simple closed loop in a plane always partitions the planar region into two disjoint parts: an inside and an outside. Such loops can always be contracted to a point on the surface. A loop of this variety will be referred to as a trivial loop. On the surface of a torus, this property of a loop being able to contract to a point does not always hold. For instance, consider a closed loop that wraps around the hole of the torus. It is impossible to contract such a loop to a point on the surface of the torus. A noncontractible loop on the torus does not partition the surface of the torus into two disjoint regions, and thus does not have a well defined 'inside' or 'outside'. A loop that cannot be contracted to a single point will be called a nontrivial loop. For a torus, there are three such classes of nontrivial loops: ones that loop around the hole, ones that loop around the 'equator' of the torus, and ones that loop around both the hole and the equator. Of course, nontrivial loops may loop multiple times around the torus in these various ways.

In the lattice picture with periodic boundary conditions representing the surface of a torus, the nontrivial loops are ones that pass through the periodic boundary on any of the sides. A path on this lattice will form a trivial loop if it never passes through a boundary. If a path does pass through a boundary, it can still form a trivial loop provided that it passes back through that same boundary before joining itself.

Recovering from $\pz$ errors
With regard to error recovery, consider some string loop $\St{z}{L}$ that has been constructed such that $\overline{L}$ is a trivial loop on the torus. In this case, the loop $\overline{L}$ forms the boundary of an inner region as shown in the figure below. In fact, this loop string $\St{z}{L}$ can be expressed completely in terms of certain $B_f$ operators as
\[
\St{z}{L}=\PROD{f\in inside(\overline{L})}{}B_f,
\]
where the set $inside(\overline{L})$ consists of all the faces inside of the loop $\overline{L}$. This is true because in such a product of $B_f$ operators, any edge inside of the loop is acted on by two adjacent $B_f$ operators so that the combined action of the two $\pz$ operators on that edge is the identity map. The only edges in this product of $B_f$ operators that are acted on nontrivially are precisely those that comprise the loop $\overline{L}$. Now, since every $B_f$ is an element of the stabilizer $S$, the action of $\St{z}{L}$ on any state $\cw\in\Hil_S$ is trivial. Then if some erred state is of the form $\ket{\xi}=\St{z}{p}\cw$ after the encoded state $\cw$ is afflicted by the string $\St{z}{p}$, and another guessed string $\St{z}{p_g}$ is applied so that $\St{z}{L}$ (where $\overline{L}=\overline{pp_g}$) forms a trivial loop, the erred state $\ket{\xi}$ is transformed as
\[
\St{z}{p_g}\ket{\xi}=\St{z}{p_g}\St{z}{p}\cw=\St{z}{L}\cw=\PROD{f\in inside(\overline{L})}{}B_f\cw=\cw.
\]
This shows that error recovery is successful if the error string is made into a trivial loop.

A trivial loop can be expressed as the product of all face operators $B_f$ corresponding to the faces contained in the loop $\overline{L}$.
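The cancellation argument above can be checked on a small lattice. The following is a minimal sketch, not part of the original report: the edge labelling (`"h"`/`"v"` plus the coordinates of the edge's upper-left vertex), the lattice size `K`, and the helper names are all assumptions made for illustration. Since $\pz$ squares to the identity, the support of a product of $B_f$ operators is the symmetric difference of their edge sets.

```python
from functools import reduce

K = 4  # assumed lattice size (K x K torus); any K >= 3 works for this example


def face_edges(i, j, k=K):
    """Edges acted on by B_f for the face whose upper-left corner is vertex (i, j).

    Hypothetical labelling: ("h", i, j) joins vertices (i, j) and (i, j+1),
    ("v", i, j) joins vertices (i, j) and (i+1, j); indices wrap around mod k.
    """
    return frozenset({
        ("h", i % k, j % k),        # top edge
        ("h", (i + 1) % k, j % k),  # bottom edge
        ("v", i % k, j % k),        # left edge
        ("v", i % k, (j + 1) % k),  # right edge
    })


def product_of_faces(faces, k=K):
    """Support of the product of B_f over the given faces.

    Edges hit by two face operators cancel, so the support is the
    symmetric difference (XOR) of the individual edge sets.
    """
    return reduce(lambda a, b: a ^ b,
                  (face_edges(i, j, k) for i, j in faces),
                  frozenset())


# A 2 x 2 block of faces: the four interior edges cancel pairwise,
# leaving only the eight edges on the boundary of the block.
region = [(0, 0), (0, 1), (1, 0), (1, 1)]
loop = product_of_faces(region)
print(sorted(loop))
```

The printed edges are exactly the perimeter of the block, which is the trivial loop $\overline{L}$ enclosing those four faces.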

On the other hand, in the presence of a $\St{z}{p}$ error string, suppose a string $\St{z}{p_g}$ was guessed so that the union of the two forms a loop string $\St{z}{L}$, such that $\overline{L}$ is a nontrivial loop. The loop $\overline{L}$ no longer partitions the surface into two disjoint regions. In this scenario, it is impossible to express $\St{z}{L}$ exclusively in terms of $B_f$ operators as done in the case of a trivial loop. This means that $\St{z}{L}\notin S$, since it cannot be generated by elements of $S$. Yet, $\St{z}{L}$ still commutes with every element of $S$, because the loop $\overline{L}$ has no endpoints by definition. More explicitly, for any vertex $v$ on the loop, the loop always passes through exactly two edges in $star(v)$, making $A_v$ commute with $\St{z}{L}$. Hence, no element of $S$ is able to detect any of the errors inflicted by $\St{z}{L}$. Therefore, despite attempting to correct the error on the state $\cw\in\Hil_S$, a logical operation is inadvertently applied to the state, transforming it to some other $\ket{\psi'}\in\Hil_S$, and error recovery fails.
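Both claims about a nontrivial loop (it commutes with every $A_v$, yet it is not a product of $B_f$ operators) can be verified numerically on a small torus. The sketch below is an illustration under assumed conventions: operators are stored as bitmasks over the $2K^2$ edges, the edge numbering and helper names are hypothetical, and membership in the group generated by the face operators is tested by Gaussian elimination over GF(2).

```python
K = 3  # assumed lattice size for this sketch


def edge_index(kind, i, j, k=K):
    """Number the 2*k*k edges: horizontal edges first, then vertical ones."""
    offset = 0 if kind == "h" else k * k
    return offset + (i % k) * k + (j % k)


def mask(edges, k=K):
    """Bitmask with one bit set per edge in the given collection."""
    m = 0
    for kind, i, j in edges:
        m ^= 1 << edge_index(kind, i, j, k)
    return m


def star(i, j, k=K):
    """Support of A_v: the four edges meeting at vertex (i, j)."""
    return mask([("h", i, j), ("h", i, j - 1), ("v", i, j), ("v", i - 1, j)], k)


def face(i, j, k=K):
    """Support of B_f: the four edges around the face with corner (i, j)."""
    return mask([("h", i, j), ("h", i + 1, j), ("v", i, j), ("v", i, j + 1)], k)


def commutes(z_mask, x_mask):
    """A Z-type and an X-type operator commute iff they share an even number of edges."""
    return bin(z_mask & x_mask).count("1") % 2 == 0


def in_span(target, generators):
    """GF(2) elimination: can target be written as a product of generators?"""
    basis = {}  # leading bit position -> basis vector

    def reduce_vec(v):
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                return v
            v ^= basis[lead]
        return 0

    for g in generators:
        r = reduce_vec(g)
        if r:
            basis[r.bit_length() - 1] = r
    return reduce_vec(target) == 0


sites = [(i, j) for i in range(K) for j in range(K)]
faces = [face(i, j) for i, j in sites]
stars = [star(i, j) for i, j in sites]

trivial_loop = face(0, 0) ^ face(0, 1)                    # boundary of two adjacent faces
nontrivial_loop = mask([("h", 0, j) for j in range(K)])   # winds once around the torus

print(all(commutes(nontrivial_loop, s) for s in stars))   # undetected by every A_v
print(in_span(trivial_loop, faces))                       # generated by the B_f
print(in_span(nontrivial_loop, faces))                    # not generated by the B_f
```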

An elegant interpretation of the error recovery procedure is provided in terms of anyons. When the qubits of the lattice encode some state $\cw\in\Hil_S$ which is error free, no anyons are present. As mentioned previously, the existence of an open error string $\St{z}{p}$ results in a pair of $z$ anyons at the string's endpoints. In creating a loop $\overline{L}=\overline{pp_g}$ by applying another string $\St{z}{p_g}$ that starts at one of the endpoints (where one of the $z$ anyons is located) and then joining the path to the other endpoint (where the other $z$ anyon is located), the anyon pair can be thought of as fusing or annihilating one another. In a sense, $\pz$ errors manifest themselves as pairs of $z$ anyons that reside on the lattice's vertices, and can be made to move around the lattice by applying strings of $\pz$ operators. The objective of successful error recovery is not only to bring these pairs of $z$ anyons together so they annihilate and disappear, but to do so in such a way that no anyon has traversed a nontrivial loop around the torus by the time they fuse, so that no logical operation is applied to the encoded state.

The Dual Lattice

Up until now, the discussion has been mostly focused on $\pz$ errors and how to correct them. The focus will now shift to correcting $\px$ errors. Fortunately, the understanding and intuition developed in the previous sections naturally extend over to this case. This transition will be assisted by considering the dual lattice as an abstract aid to reason about $\px$ errors.

Relative to the actual square lattice under consideration, the dual lattice is the lattice that has vertices at the center of the faces of the main lattice, and has the center of its faces at the locations of the vertices of the main lattice. In the dual lattice, the qubits still reside on the edges and remain in the same location as the main lattice. Naturally, the dual of the dual lattice is just the main lattice again. The main lattice (in solid lines) and the dual lattice (in dashed lines) are depicted together in the figures shown below.

The utility in considering the dual lattice comes from being able to reason about $\px$ errors analogously to the way $\pz$ errors were analyzed. Just as paths were considered on the main lattice, paths $\overline{p'}$ will be considered on the dual lattice and will be referred to as co-paths. When displaying figures in the rest of this paper the dual lattice will not be depicted, and co-paths will be drawn as dashed lines as shown in the figure above. One useful consequence of considering the dual lattice is that the vertex operators $A_v$ represented in terms of $star(v)$ on the main lattice can now be perceived as face operators on the dual lattice as shown in the figure:

The main lattice and the dual lattice are depicted with the dual lattice as dashed lines. A co-path $\overline{p'}$ of the dual lattice as shown. The vertex operators $A_v$ can be thought of as a face operator on the dual lattice.

A somewhat pointless, yet kind of neat, interpretation of how the lattice relates to its dual is the following. By rotating each edge about its center by an angle of $\pi/2$, the lattice can be transformed into its dual. With this in mind it becomes clearer how a vertex operator on the main lattice becomes a face operator on the dual lattice, as animated in the figure below:

The square lattice can be transformed into its dual by rotating each edge about its center by an angle of $\pi/2$. This turns vertex operators on the main lattice into face operators on the dual lattice.

Correcting $\px$ Errors
To represent multiple $\px$ errors that are adjacent to each other, define the string operator
\[
\St{x}{p'}=\PROD{j\in \overline{p'}}{}\px_j,
\]
where the product ranges over the edges $j$ of the main lattice that are crossed by the co-path $\overline{p'}$ of the dual lattice, and $\px_j$ denotes $\px$ acting on the qubit on edge $j$. For an open co-path $\overline{p'}$, a string $\St{x}{p'}$ manifests two $x$ anyons at its endpoints, which lie on the faces of the main lattice. Similar to the correction of $\pz$ errors via the fusion of $z$ anyons, the objective of error recovery for $\px$ errors will be to fuse pairs of $x$ anyons together so that they only form trivial co-loops on the dual lattice. If two $x$ anyons are fused after a nontrivial co-loop has been traversed, a logical operation will be performed on the encoded state instead.
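To see how an open string of $\px$ errors produces a pair of $x$ anyons on faces, one can compute which $B_f$ operators anticommute with it. This is a minimal sketch under assumed conventions (the edge labelling, lattice size `K`, and function names are hypothetical): a face hosts an $x$ anyon exactly when it shares an odd number of edges with the error string.

```python
K = 4  # assumed lattice size (K x K torus)


def face_edges(i, j, k=K):
    """Edges around the face with upper-left corner (i, j), indices mod k.

    Hypothetical labelling: ("h", i, j) joins vertices (i, j) and (i, j+1),
    ("v", i, j) joins vertices (i, j) and (i+1, j).
    """
    return {
        ("h", i % k, j % k),
        ("h", (i + 1) % k, j % k),
        ("v", i % k, j % k),
        ("v", i % k, (j + 1) % k),
    }


def x_anyons(x_error_edges, k=K):
    """Faces whose B_f anticommutes with the X string (odd overlap of supports)."""
    errors = set(x_error_edges)
    return sorted(
        (i, j)
        for i in range(k)
        for j in range(k)
        if len(face_edges(i, j, k) & errors) % 2 == 1
    )


# Open co-path from face (0, 0) to face (0, 2): it crosses two vertical edges.
open_string = [("v", 0, 1), ("v", 0, 2)]
print(x_anyons(open_string))   # the two endpoint faces

# Closed trivial co-loop around vertex (1, 1): the support of A_v itself.
closed_coloop = [("h", 1, 1), ("h", 1, 0), ("v", 1, 1), ("v", 0, 1)]
print(x_anyons(closed_coloop))  # no anyons remain
```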

Any trivial co-loop $\overline{L'}$ can be expressed in terms of $A_v$ operators as
\[
\St{x}{L'}=\PROD{v\in inside(\overline{L'})}{}A_v,
\]
where now the product ranges over all vertices of the main lattice contained inside of the co-loop $\overline{L'}$. Thus, $\St{x}{L'}\in S$ and so acts trivially on any encoded state in $\Hil_S$. In contrast, for a nontrivial co-loop $\overline{L'}$ the operator $\St{x}{L'}$ lies outside of the stabilizer, since it cannot be expressed in terms of the operators in $S$. However, $\St{x}{L'}$ still commutes with every element of the stabilizer $S$. Hence, an error of this form cannot be detected by any syndrome measurement. If a nontrivial co-loop of $\px$ operators afflicts an encoded state that is meant to be protected, a logical operation will inevitably be applied and error recovery fails. The reasons follow from arguments analogous to those given in the previous section.

General Errors

In the previous sections, $\pz$ and $\px$ errors were analyzed separately in special cases where only a single string of errors of either type was described. There, we saw that strings of errors manifest themselves as pairs of anyons: pairs of $z$ anyons on the vertices of the lattice for $\pz$ errors and pairs of $x$ anyons on the faces of the lattice for $\px$ errors. In the most general setting, it is expected that multiple strings of errors of both types can occur on the lattice at any time. Moreover, a particular qubit of the lattice may suffer from both a $\pz$ and a $\px$ error, in which case a $\py$ error is produced. In this case, a $y$ anyon will be introduced and should be thought of as the quasiparticle associated to the pair of $z$ and $x$ anyons at that site.

When multiple strings of errors of the same type are present on the lattice, there is an inherent ambiguity in how the pairs of anyons should be fused in order to correct the errors, since the only information that is provided by a syndrome measurement is the locations of the anyons. Any such matching of the pairs will result in loops or co-loops (perhaps multiple) on the lattice or dual lattice. Actually, the choice made in this fusion process is somewhat arbitrary. For proper error recovery to take place all that is necessary is for none of the anyons to traverse a nontrivial loop. However, ensuring that error recovery proceeds in this way is still difficult and there is no sure way to guarantee that all loops formed are trivial. A configuration of errors illustrating this general setting is shown in the figure below. In addition, two further figures are shown where guesses are made in an attempt to correct the errors. The lighter paths denote the actual strings of errors that were originally present on the lattice. The darker paths represent the paths that were guessed. In both cases, it can be seen that some of the loops formed are trivial loops, in which case those particular errors will be corrected, but there are also some nontrivial loops present which result in logical operations being applied to the encoded state.

In the general case of multiple errors, all that is revealed from the syndrome measurements is the locations of anyon pairs.

A possible correction attempt, where the lighter paths are the actual error strings, and the darker paths are the guessed paths. All errors in this case are corrected, except for the leftmost string of $\pz$ errors where a nontrivial loop has been created.

Another possible correction attempt, where the lighter paths are the actual error strings, and the darker paths are the guessed paths. All $\px$ errors in this case are corrected, but the $\pz$ error strings have been formed into one nontrivial loop.

The problem of being able to correct multiple errors in the toric code thus reduces to another problem: that of matching anyon pairs so as to form only trivial loops. This is essentially a problem that requires additional post-processing using the information given by the syndrome measurements. The most natural strategy is to guess strings of minimal total length when fusing the anyon pairs. In the literature, such a strategy is referred to as minimum-weight matching \cite{Dennis}. One reason why this strategy is justified is that if the probability of a single error occurring on a qubit is small, then it is unlikely that the error strings will be very long. It is more likely that, say, $m$ isolated errors occur at different locations of the lattice than it is for all $m$ errors to occur along a single string. If this is the case, then anyon pairs will tend to stay close to one another. Correcting the errors may then be achieved by fusing anyon pairs that are closest to each other.
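The matching strategy described here can be illustrated with a brute-force search. This is only a sketch under stated assumptions, not the decoder used in \cite{Dennis}: anyon positions are given as `(row, column)` pairs on a $k\times k$ torus, the cost of joining two anyons is taken to be the shortest lattice distance with wrap-around, and all pairings are enumerated recursively, which is only practical for a handful of anyons. The function names are hypothetical.

```python
def torus_distance(a, b, k):
    """Shortest lattice distance between two sites on a k x k torus."""
    di = abs(a[0] - b[0])
    dj = abs(a[1] - b[1])
    return min(di, k - di) + min(dj, k - dj)


def min_weight_matching(anyons, k):
    """Return (total_cost, pairs) for a minimum-weight pairing of the anyons.

    Brute force: pair the first anyon with each candidate partner and
    recurse on the rest. Anyons of one type always come in pairs, so an
    odd count is rejected.
    """
    if len(anyons) % 2 != 0:
        raise ValueError("anyons of a given type must come in pairs")
    if not anyons:
        return 0, []
    first, rest = anyons[0], anyons[1:]
    best = None
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        cost, pairs = min_weight_matching(remaining, k)
        cost += torus_distance(first, partner, k)
        if best is None or cost < best[0]:
            best = (cost, [(first, partner)] + pairs)
    return best


# Two short error strings far apart on an 8 x 8 torus.
anyons = [(0, 0), (0, 1), (5, 5), (6, 5)]
cost, pairs = min_weight_matching(anyons, 8)
print(cost, pairs)
```

Here the cheapest pairing joins each anyon to its nearest neighbour, which is the intuition given in the text for why short guessed strings are the natural choice when single-qubit errors are rare.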

For some string of either $\pz$ or $\px$ errors, define the length of the string to be the number of edges in the path defining the string. Then for a $k\times k$ lattice, if some string were to form a nontrivial loop its length must be at least $k$. This number $k$ is the code distance of the toric code. This means that, at least in principle, any error string of length less than $k$ can be corrected provided that error recovery proceeds in an appropriate fashion. However, for errors of length $k$ or greater it may not be possible to recover from the error, and a logical operation being performed on the encoded state may be inevitable. Let $\lfloor c \rfloor$ denote the floor function, defined as the largest integer less than or equal to $c$. If some error string has length at most $\lfloor \frac{k-1}{2}\rfloor$, then the error can always be corrected by joining the anyon pair through a minimal length path.

Despite the $k \times k$ toric code only ever being able to encode $2$ logical qubits in the code space regardless of the size $k$ of the lattice, the benefit of using a larger lattice is apparent. If the probability for an error happening on a single qubit is $p$, and different errors remain uncorrelated, then the probability that every qubit along a given string of $k$ edges is in error is $p^k$, which decreases exponentially in $k$. Thus, the larger $k$ is, the more unlikely it is for an error string to form a nontrivial loop.
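As a rough numerical illustration of this exponential suppression, the sketch below evaluates the probability that every qubit on one fixed string of a given length is in error, assuming independent errors of probability $p$ per qubit. The function name and the value $p=0.01$ are assumptions chosen for illustration; the sketch does not count how many distinct strings exist, so it is not an estimate of the logical error rate.

```python
def string_error_probability(p, length):
    """Probability that all qubits on one fixed string of the given length
    suffer an error, assuming independent errors with probability p each."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability")
    return p ** length


# The probability drops by a factor of p for every extra edge in the string.
for k in (3, 5, 7):
    print(k, string_error_probability(0.01, k))
```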

Logical Operations

In general error correcting schemes, a state $\cw\in\Hil_S$ of the code space is prepared and the main objective is to keep the state invariant. In the presence of errors the state can be mapped outside of the encoded space. This may not always be the case, however. It is possible that the original state gets mapped to another state $\ket{\psi'}\in\Hil_S$ such that $\cw\neq\ket{\psi'}$. In such an occurrence, it was said that the encoded state $\cw$ experiences a logical operation. In the toric code, these logical operations occurred when a nontrivial loop was executed on the lattice.

To understand what kind of operations these logical operations on the encoded state correspond to, recall that there were two basic types of nontrivial loops on the surface of the lattice. One of the nontrivial loops corresponded to a loop that wraps around the hole of the torus, and the other corresponded to a loop that traverses the `equator' of the torus. In the lattice picture, one of these loops passes through the `north-south' boundary, and the other through the `east-west' boundary. Likewise, there were two analogous nontrivial co-loops for the dual lattice.

By representing the encoded state as $\cw=\ket{\overline{00}}\in\Hil_S$, where the label $\overline{00}$ has been used to emphasize that the state encodes two logical qubits, the nontrivial loops can be interpreted as logical $\pz$ and $\px$ operations. For a nontrivial string of $\pz$ operators looping in the `north-south' direction, write $\overline{Z_1}$ to represent a logical $\pz$ operation applied to the first logical qubit of $\cw$. The nontrivial loop in the `east-west' direction then corresponds to a logical $\pz$ operation applied to the second logical qubit of $\cw$, denoted by $\overline{Z_2}$. Likewise, the two nontrivial loops of $\px$ errors on the dual lattice correspond to logical $\px$ operations denoted by $\overline{X_1}$ and $\overline{X_2}$. It does not matter how any of these nontrivial loops are formed, in the sense that the exact path described by a particular string operator forming the nontrivial loop is irrelevant. All that matters is which topological class the loop falls in. If a certain loop is traversed $n$ times, then the logical operation executed is one of $\overline{Z_i}^n$ or $\overline{X_i}^n$, depending on the direction and on whether the loop was on the lattice or the dual lattice, respectively. If a loop traverses multiple topologically distinct nontrivial loops, then the corresponding logical operation that is performed is the composition of the respective logical operations from each loop in the order that the loops were traversed.
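That these loops behave like Pauli operators on two separate logical qubits can be checked from their supports alone: a $\pz$ loop and a $\px$ co-loop anticommute exactly when they share an odd number of edges. The sketch below uses an assumed edge labelling and an assumed assignment of directions to the labels `Z1`, `Z2`, `X1`, `X2`; which direction is called `north-south' is a convention, not something fixed by the original text.

```python
K = 4  # assumed lattice size (K x K torus)

# Hypothetical labelling: ("h", i, j) is the horizontal edge leaving vertex (i, j),
# ("v", i, j) is the vertical edge leaving vertex (i, j).
Z1 = {("v", i, 0) for i in range(K)}  # Z loop running along column 0
Z2 = {("h", 0, j) for j in range(K)}  # Z loop running along row 0
X1 = {("v", 0, j) for j in range(K)}  # X co-loop crossing every vertical edge of row 0
X2 = {("h", i, 0) for i in range(K)}  # X co-loop crossing every horizontal edge of column 0


def anticommute(z_support, x_support):
    """A Z string and an X string anticommute iff they overlap on an odd number of edges."""
    return len(z_support & x_support) % 2 == 1


print(anticommute(Z1, X1), anticommute(Z2, X2))  # each pair acts on the same logical qubit
print(anticommute(Z1, X2), anticommute(Z2, X1))  # operators on different logical qubits
```

Each logical $\pz$ loop crosses its partner co-loop on exactly one edge and misses the other co-loop entirely, which reproduces the commutation relations of Pauli operators on two independent qubits.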

In terms of logical operations on the encoded qubits, all that can be performed are logical Pauli operations on the two encoded qubits. Hence, universal quantum computation cannot be performed using only these limited operations, since arbitrary two qubit gates cannot be realized. In particular, no two qubit entangling gate can be performed on the logical qubits, since only tensor products of two Pauli gates can be executed by the logical operations $\overline{Z_i}$ and $\overline{X_i}$.

Generalizations and Conclusion
In summary, the $k\times k$ toric code uses $2k^2$ total qubits placed on the edges of the square lattice to effectively encode $2$ logical qubits. Its main function is to serve as a quantum memory that protects the state of two qubits from Pauli errors. Errors on the lattice manifest themselves as pairs of anyon excitations. The objective of error correction is to fuse the anyons together while forming only trivial loops in the process. If a nontrivial loop is formed, then a logical operation is performed on the encoded state, corresponding to logical $\px$ and $\pz$ operations or combinations thereof. The distance of the code was shown to be $k$, which corresponds to the shortest length path needed to form a nontrivial loop on the lattice. The larger the lattice dimension $k$, the more unlikely it is to form a nontrivial loop, and thus the chances of proper error recovery increase.

It is not necessary that the dimensions of the square lattice be symmetrical. In the case of a $k\times k'$ lattice, with $k\neq k'$, the code distance will correspond to the minimum value of $k$ and $k'$, since this is the shortest possible length an error string needs to have before it can form a nontrivial loop. The limitation of only being able to encode two logical qubits can also be lifted if the lattice is embedded on a higher genus surface. For a surface of genus $g$ the number of encoded qubits is $2g$, so that the code space has dimension $4^g$, since each additional hole adds two independent classes of nontrivial loops (together with two classes of nontrivial co-loops on the dual lattice). It is worth mentioning that it is not absolutely essential to embed the lattice on a non-planar surface with nonzero genus. Remaining on the plane, the toric code can be generalized by redefining the boundaries to have appropriate properties. In this scenario, more logical qubits can be encoded by adding ``holes'' to the planar region by removing qubits and stabilizers from the region. The more holes are present, the more logical qubits can be encoded. Such generalizations are considered in \cite{Dennis}.
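The parameters quoted in this summary can be collected in a small helper. This is a sketch with a hypothetical function name; it assumes the $k\times k'$ lattice on a torus described above, with one qubit per edge, and returns the number of physical qubits, the number of logical qubits, the distance, and the string length $\lfloor (d-1)/2\rfloor$ that is always correctable by a minimal length path.

```python
def toric_code_parameters(k, k_prime=None):
    """Return (n, logical_qubits, distance, guaranteed_correctable_length)
    for a k x k' toric code; k' defaults to k for the square lattice."""
    if k_prime is None:
        k_prime = k
    n = 2 * k * k_prime          # one qubit per edge, two edges per vertex
    logical_qubits = 2           # fixed by the topology of the torus
    distance = min(k, k_prime)   # shortest nontrivial loop
    correctable = (distance - 1) // 2
    return n, logical_qubits, distance, correctable


print(toric_code_parameters(5))     # square 5 x 5 lattice
print(toric_code_parameters(4, 6))  # rectangular 4 x 6 lattice
```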

In the toric code model introduced here, the permissible logical operations are not rich enough to allow for universal quantum computation. The toric code model is captured more generally in the quantum double models defined for any finite group $G$. Abstractly, this uses the mathematical theory concerning group representations of quasi-triangular Hopf algebras to define more general anyon dynamics. The toric code model presented here is just the special case of a quantum double over the cyclic group of order $2$. Generalizations in this regard have been made in the field, and can be found in \cite{Shor}. In order to achieve universal quantum computation the group under consideration must be non-abelian and non-solvable \cite{Mochon}. Even then the complexity of the scenario is increased significantly, since the smallest group satisfying this property is the alternating group $A_5$, consisting of all even permutations on five letters, which has order $60$.

The Channel Capacity of the Quantum Erasure Channel

2013-12-15T20:00:00.000-08:00

Abstract

The quantum channel capacity $Q(\Phi)$ of a quantum channel $\Phi$ is defined as the maximum asymptotic rate at which information can be reliably sent through the channel. In this post, the channel capacity $Q(\Er_p)$ of the erasure channel $\Er_p$ is derived, where the erasure channel $\Er_p$ is defined as the quantum operation on a qubit which erases the state with probability $p$ and leaves it unchanged otherwise. It is proved that $Q(\Er_p)=1-2p$ for $0\leq p\leq 1/2$ and that $Q(\Er_p)=0$ for $1/2\leq p\leq 1$. The latter case is proved via a contradiction with the no-cloning theorem, and the former is proved through the existence of a quantum error correcting code in the stabilizer formalism which achieves the optimum rate.

Introduction

The theory of quantum information is ultimately concerned with defining and understanding the properties and essence of quantum systems, how they transform, and how they can be used as a resource in various contexts. Moreover, it is of great theoretical interest and practical importance to obtain quantitative measures of what can or cannot be accomplished in the quantum realm. Perhaps the most prevalent feature of quantum systems is their inherent fragility and high sensitivity to unwanted noise. Unlike the seemingly robust nature of the classical world, quantum systems are vulnerable to phenomena which seem to ``destroy'' their very own quantum mechanical nature.

With this in mind, consider the task of trying to send a quantum state to another party through a quantum channel capable of transmitting quantum information. Ideally, we would hope for perfect transmitting capabilities of such a channel, but it is more realistic to acknowledge some likelihood of error in which the quantum system is subjected to noise (unwanted transformations of its state) before being received at the other end. Equivalently, we may be interested in merely preserving the state of a quantum system over time as opposed to spatially relocating the system. Even then it is possible that the system may experience undesired errors to its state. Thus, it is worthwhile to understand the precise conditions under which a quantum state could be reasonably recovered when transformed under a noisy quantum channel. The channel capacity of a quantum channel gives precise bounds on the rate at which quantum information can be reliably transferred through the channel and still be recovered with sufficiently high fidelity. In order to successfully transfer information through a noisy quantum channel it is beneficial and often necessary to encode the desired state into another quantum system, which is then subjected to errors and subsequently decoded back to the original state that was intended to be sent. This latter scheme is the domain of quantum error correcting codes. This reveals an intimate connection between channel capacities and error correction: the channel capacity sets fundamental limits on the existence of successful error correcting codes which attempt to correct errors induced by that channel.

In what follows, the notion of the channel capacity of a channel will be made more precise and formal. With this framework the channel capacity of the erasure channel will be rigorously calculated. In short, the erasure channel can simply be thought of as some channel which ``erases'' or ``destroys'' a quantum bit that passes through the channel with some probability $p$, and leaves the quantum bit unchanged otherwise. The erasure channel, although relatively primitive and crude in its action, characterizes the natural phenomenon of information being lost in some process. This differs from the related depolarizing channel, which instead of completely destroying the state effectively randomizes it. One crucial difference worth mentioning is that in the context of the depolarizing channel, the sender or receiver may not know whether the system being transmitted has been randomized. In the case of the erasure channel, we will assume that at least the receiver of the system is aware of the existence of an erasure when it occurs.

The proof for obtaining the channel capacity will proceed in three steps. For an erasure channel with a probability of erasure given by $p\geq 1/2$, it will be shown that the channel capacity must be $Q=0$ in order to avoid contradiction with a fundamental result of quantum theory concerning the inability to clone arbitrary quantum states, the no-cloning theorem. Then in the case of $p\leq 1/2$, it will first be shown that the channel capacity must be bounded above by $Q\leq1-2p$ due to a property of channel capacities that prevents the total number of bits transmitted from being super-additive when a noisy and noiseless channel are considered together \cite{Ben1}. Finally, it will be argued through the construction of random stabilizer codes \cite{Got} that error correcting codes do indeed exist that achieve the channel capacity $Q=1-2p$ for the case of $p\leq 1/2$. Interestingly, our ability to obtain the exact capacity of the erasure channel is in contrast to the situation for the depolarizing channel (and most other channels), for which only upper and lower bounds on the capacity are known.

As a note on the required background knowledge needed to follow this proof, it will be assumed that the reader is already aware of some basic concepts and axioms in the theory of quantum computation and information \cite{NC} and possesses a fair degree of mathematical maturity---namely, a working knowledge of linear algebra together with the basics of finite group theory will be assumed. Prior knowledge on the theory of error correction and the stabilizer formalism will be beneficial.

Notational Remarks

For some Hilbert space $\Hil$, let $D(\Hil)$ denote the space of linear operators on $\Hil$ that are valid quantum density states (positive operators with unit trace). Moreover, let $C(\Hil_1,\Hil_2)$ denote the space of valid quantum channels (completely-positive and trace preserving maps) that transform states $\rho_1\in D(\Hil_1)$ to states $\rho_2\in D(\Hil_2)$. For convenience, define $C(\Hil):=C(\Hil,\Hil)$ for the case where the quantum channels map states from a particular space $\Hil$ to the same space $\Hil$.

Quantum Error Correcting Codes

To formally define the channel capacity $Q(\Phi)$ of a quantum channel $\Phi$, we first need to formalize the notion of a quantum error correcting code. The objective of quantum error correction is to encode a quantum state $\rho$ into a particular subspace $\Hil_{code}\subset\Hil$ of a larger Hilbert space $\Hil$ by means of some quantum operation $U_c: \rho\mapsto \rho_c$, where $\rho_c\in D(\Hil_{code})$. The operation $U_c$ is called the \emph{encoding operation}, the state $U_c(\rho)=\rho_c$ is called the \emph{encoded state} of $\rho$, and the space $\Hil_{code}$ is often referred to as the \emph{code space} of a quantum error correcting code. States of $\Hil_{code}$ should have the desired property that they can be recovered when subjected to errors. More formally, let $\Phi\in C(\Hil)$ be some quantum channel of interest that represents the action of a possible error affecting the state $\rho_c$, and let $\Phi(\rho_c)=\rho_e$. Along with the existence of an encoding operation $U_c$, an error correcting code also has a \emph{decoding operation} given by $U_d$. This operation serves the purpose of decoding the possibly erred state $\rho_e$, and should ideally satisfy $U_d(\rho_e)=\rho$. An error correcting protocol as described here is summarized schematically in the figure below.

A circuit representing the general error correction scheme.

In a less idealistic scenario, when $U_d(\rho_e)=\rho'\neq\rho$, we may be satisfied with having the resulting state $\rho' $ being approximately equal to $\rho$ through some measure of closeness. One such useful measure is given by the \emph{fidelity} $F(\rho,\rho')$ of two states $\rho,\rho'\in D(\Hil)$ of some Hilbert space $\Hil$, where
\[
F(\rho,\rho')=Tr(\sqrt{\rho^{\frac{1}{2}}\rho'\rho^{\frac{1}{2}}}),
\]
and $Tr(.)$ denotes the trace operation. The fidelity of two states satisfies $0\leq F(\rho,\rho')\leq1$, with larger values of $F(\rho,\rho')$ indicating that the states $\rho$ and $\rho'$ are closer.
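When the source state is pure, $\rho=\ket{\psi}\bra{\psi}$, the fidelity formula simplifies: since $\rho^{\frac{1}{2}}=\rho$, one finds $F(\rho,\rho')=\sqrt{\bra{\psi}\rho'\ket{\psi}}$. The sketch below evaluates only this special case, in plain Python with nested lists as matrices; the function name is hypothetical and the general mixed-state formula is not implemented.

```python
import math


def fidelity_pure(psi, rho_prime):
    """F(|psi><psi|, rho') = sqrt(<psi| rho' |psi>) for a normalized vector psi.

    psi is a list of amplitudes, rho_prime a square nested list (density matrix).
    Only valid when the first state is pure.
    """
    n = len(psi)
    overlap = sum(
        psi[a].conjugate() * rho_prime[a][b] * psi[b]
        for a in range(n)
        for b in range(n)
    )
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(overlap.real, 0.0))


zero = [1, 0]                                   # the state |0>
print(fidelity_pure(zero, [[1, 0], [0, 0]]))     # identical states
print(fidelity_pure(zero, [[0.5, 0], [0, 0.5]])) # maximally mixed state
print(fidelity_pure(zero, [[0, 0], [0, 1]]))     # orthogonal state |1>
```

The three cases span the range of the fidelity: identical states give the maximum value, orthogonal states give the minimum, and the maximally mixed state lies in between.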

Noisy Channel Coding and Channel Capacity

Let $\Phi\in C(\Hil)$ be some quantum channel, and consider its $n$-fold extension
\[
\Phi^{\otimes n}=\Phi\otimes\dots\otimes\Phi \in C(\Hil^{\otimes n})=C(\Hil)\otimes\dots\otimes C(\Hil),
\]
which essentially represents the channel $\Phi$ acting individually on each of the $n$ copies of $\Hil$. Now, suppose there is some error correcting code with code space $\Hil_{code}\subset\Hil$. For noisy channel coding \cite{Nielsen}, we consider the $n$-fold code space $\Hil_{code}^{\otimes n}\subset \Hil^{\otimes n}$, and are interested in source states $\rho\in D(\Hil)$ such that there exists an encoding operation $\overline{U}_c$ that takes a source state and maps it to an encoded state $\overline{U}_c(\rho)=\rho_c\in D(\Hil_{code}^{\otimes n})$, which is then acted on by $\Phi^{\otimes n}$ and subsequently decoded by a decoding operation $\overline{U}_d$ yielding a state $\rho'\in D(\Hil)$ close to the original source state $\rho$ in regards to the fidelity $F(\rho,\rho')$.

Suppose that the dimension of the space $\Hil$ is $2^m$ so that $\rho\in D(\Hil)$ is a state consisting of $m$ qubits. In this context, the error $\Phi$ can be thought of as operating on $n$ blocks, each consisting of $m$ qubits, as illustrated in the circuit below. Thus, this coding model can be thought of as using $n$ applications of the channel $\Phi$ to successfully send $m$ qubits through the channel, provided that encoding and decoding operations $\overline{U}_c$ and $\overline{U}_d$ exist which preserve the original source state $\rho$.

A circuit representing a particular noisy channel coding scheme, where an input state $\rho$ of some number $m$ of qubits is acted on by $4$ blocks consisting of $m$ qubits each.

The quantity $R=m/n$ is defined as the rate of the code. In this way, the channel capacity $Q(\Phi)$ is defined as the maximum asymptotic rate at which information can be reliably sent through the channel $\Phi$. More precisely, the channel capacity is the largest value $Q(\Phi)$ such that for any $R<Q(\Phi)$ and $\epsilon>0$, there exists an error correcting code $\Hil_{code}^{\otimes n}$ with rate at least $R$ where for any $\rho$ with $\overline{U}_c(\rho)=\rho_c\in D(\Hil_{code}^{\otimes n})$, the resulting recovered state $\rho'$ satisfies $F(\rho,\rho')\geq 1-\epsilon$.

Modelling the Erasure Channel

Let $\ket{0}, \ket{1} \in \Hil^2$ denote the standard computational basis states of a Hilbert space $\Hil^2$ of dimension $2$ representing a qubit. To model an erasure we will embed the states $\ket{0}$ and $ \ket{1}$ into a higher dimensional space and introduce a third basis state $\ket{2}\in\Hil^3$ that is orthogonal to both $\ket{0}$ and $ \ket{1}$ so that the three states together span $\Hil^3$. This third state $\ket{2}$ then serves the purpose of representing an erasure of either of the states $\ket{0}$ and $ \ket{1}$.

Now consider appending to the state of a qubit an ancillary register belonging to some Hilbert space $\Hil_E$ that represents the ``environment'' of some appropriate dimension (which may depend on the context). Then the action of the erasure channel, denoted by $\mathcal{E}_p$, on the basis states of the qubit is defined as
\begin{align*}
\ket{0}\otimes\ket{0}_E& \ \ \overset{\mathcal{E}_p}{\mapsto} \ \ \sqrt{1-p}\ket{0}\otimes\ket{0}_E+\sqrt{p}\ket{2}\otimes\ket{1}_E, \\
\ket{1}\otimes\ket{0}_E& \ \ \overset{\mathcal{E}_p}{\mapsto} \ \ \sqrt{1-p}\ket{1}\otimes\ket{0}_E+\sqrt{p}\ket{2}\otimes\ket{2}_E,
\end{align*}
where the states $\ket{0}_E,\ket{1}_E,\ket{2}_E\in\Hil_E$ are orthogonal states of the environment, and the map $\mathcal{E}_p$ has been parametrized by some real number $0\leq p\leq1$ that gives the probability that the qubit will be erased.
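Tracing out the environment in the definition above gives a channel from a qubit to a three-level system. One way to write it, sketched here in plain Python, is with three Kraus operators: one for ``no erasure'' and one for each environment state that accompanies $\ket{2}$. The function names and the nested-list matrix representation are assumptions made for illustration.

```python
import math


def kraus_operators(p):
    """3 x 2 Kraus operators of the erasure channel with erasure probability p."""
    a, b = math.sqrt(1 - p), math.sqrt(p)
    k0 = [[a, 0], [0, a], [0, 0]]  # environment stays in |0>_E: qubit untouched
    k1 = [[0, 0], [0, 0], [b, 0]]  # environment |1>_E: |0> is replaced by |2>
    k2 = [[0, 0], [0, 0], [0, b]]  # environment |2>_E: |1> is replaced by |2>
    return [k0, k1, k2]


def apply_erasure(rho, p):
    """Return sum_i K_i rho K_i^dagger as a 3 x 3 nested list."""
    out = [[0j] * 3 for _ in range(3)]
    for k in kraus_operators(p):
        for r in range(3):
            for c in range(3):
                out[r][c] += sum(
                    k[r][a] * rho[a][b] * complex(k[c][b]).conjugate()
                    for a in range(2)
                    for b in range(2)
                )
    return out


# The state |+><+| sent through the channel with p = 1/4.
plus = [[0.5, 0.5], [0.5, 0.5]]
out = apply_erasure(plus, 0.25)
for row in out:
    print([round(abs(x), 3) for x in row])
```

The output has the block form $(1-p)\rho\oplus p\ket{2}\bra{2}$: the qubit block is the input scaled by $1-p$, and the weight $p$ sits on the erasure flag $\ket{2}$ with no coherence between the two, so the receiver can tell whether an erasure occurred.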

The Channel Capacity of the Erasure Channel

To calculate the channel capacity $Q(\mathcal{E}_p)$ of the erasure channel, we must consider the scenario described in the previous section on noisy channel coding, where now the channel of interest is $\Phi=\mathcal{E}_p$. In what follows we will prove the following theorem:

Theorem: The channel capacity of the erasure channel.
For the erasure channel $\mathcal{E}_p$, which erases the qubit state with probability $p$ and leaves the state intact with probability $1-p$, the channel capacity $Q(\mathcal{E}_p)$ is given by

$$ Q(\mathcal{E}_p)=\left\{
\begin{array}{rl}
1-2p &, 0\leq p\leq\frac{1}{2}\\
0 &, \frac{1}{2}\leq p\leq 1
\end{array}\right. $$

The proof of this theorem will proceed in three parts. First, for the case $1/2\leq p \leq 1$, we will appeal to the no-cloning theorem \cite{NC} to show that no error correcting procedure can exist which recovers states sent through the channel $\Er_p$ with arbitrarily high fidelity. Then for the case $0\leq p \leq 1/2$, the bound $Q(\Er_p)\leq 1-2p$ will be established by arguing that the number of qubits transmitted jointly over a noisy channel and a perfectly noiseless channel is sub-additive \cite{Ben1}. Lastly, using the stabilizer formalism of quantum error correcting codes \cite{Got}, it will be shown that an error correcting code must exist which can actually achieve the rate $1-2p$.

The $p\geq 1/2$ Case

The fact that $Q(\Er_p)=0$ for $p\geq 1/2$ means that it is impossible to reliably transmit any quantum information through the erasure channel $\Er_p$. In other words, there cannot exist any error correcting procedure that recovers the input state $\rho$ with a fidelity arbitrarily close to $1$. To see why this is the case, we will appeal to the \emph{no-cloning} theorem, which states that no quantum operation can ``clone'', or make copies of, arbitrary quantum states \cite{NC}. We will argue that if there did exist an error correcting procedure with valid encoding and decoding operations which effectively preserved the initial state $\rho$, then such a procedure would also provide a way of cloning these quantum states and therefore contradict the no-cloning theorem.

For the sake of argument, it is worth anthropomorphizing the situation where quantum states pass through an erasure channel. Consider three different parties: Alice, Bob, and Charlie. The erasure channel can be effectively realized through a particular interaction of these parties as follows. Suppose Alice has $n$ qubits and would like to send these qubits to Bob. Moreover, suppose Charlie has the ability to intercept the qubits Alice is trying to send to Bob, and also has the ability to send erased states $\ket{2}$ to Bob. Suppose Charlie intercepts each of Alice's qubits with probability $p$, in which case he sends Bob the erased state, and with probability $1-p$ allows the qubit to pass on to Bob. Then for large $n$ Bob will have received approximately $(1-p)n$ qubits and Charlie will be in possession of approximately $pn$ qubits. For $p\geq 1/2$, Charlie will end up with at least as many qubits as Bob, and if it were possible for Bob to recover the initial states of the qubits from some error correcting procedure then surely Charlie could as well using the same procedure. Their recovered states taken together would then constitute two copies of the original state. However, by the no-cloning theorem, such a procedure cannot exist. Hence, no recovery procedure can exist in this case, which implies that the channel capacity $Q(\Er_p)$ for $p\geq 1/2$ must be $0$.

Upper bounding $Q(\Er_p)$ for $0\leq p \leq1/2$

Consider some imperfect channel $\Phi$ with channel capacity $0<Q(\Phi)<1$ and a related perfect channel $\Phi_I$ with channel capacity $Q(\Phi_I)=1$. If the perfect channel is used $a$ times, $aQ(\Phi_I)=a$ qubits can be transmitted as expected, whereas if the imperfect channel is used $b$ times the number of qubits that can be transmitted is $bQ(\Phi)<b$. Now let $c$ denote the number of qubits that can be transmitted when the two channels are used jointly, with the perfect and imperfect channels used $a$ and $b$ times, respectively. The number of qubits transmitted by this joint channel is said to be additive if $c=a+bQ(\Phi)$, sub-additive if $c\leq a+bQ(\Phi)$, and super-additive if $c\geq a+bQ(\Phi)$.

It will now be shown that the number of qubits that can be jointly transmitted by a perfect and an imperfect channel, as described above, must be sub-additive. The proof proceeds in a manner similar to that presented in \cite{Ben1}. For the sake of contradiction, suppose that sub-additivity fails, so that $c>a+bQ(\Phi)$ when the perfect channel is used $a$ times and the imperfect channel is used $b$ times. Observe that $a$ uses of the perfect channel $\Phi_I$ can be simulated by its imperfect counterpart $\Phi$, with channel capacity $Q(\Phi)$, by using the imperfect channel $d$ times such that $dQ(\Phi)=a$. Thus, letting $Q'(\Phi)$ denote the rate achieved by these $d+b$ uses of $\Phi$ alone, it follows that
\[
Q'(\Phi)=\frac{c}{d+b}>\frac{dQ(\Phi)+bQ(\Phi)}{d+b}=Q(\Phi),
\]
where the inequality follows from the assumption $c>a+bQ(\Phi)=dQ(\Phi)+bQ(\Phi)$. However, only the original channel $\Phi$ was used here, so the achieved rate cannot exceed its capacity; that is, $Q'(\Phi)>Q(\Phi)$ is impossible. Therefore the number of qubits jointly transmitted over the perfect and noisy channels must be sub-additive, so that $c\leq a +bQ(\Phi)$.

With this result in mind, consider $p_2\leq p_1$ and suppose $n$ qubits are to be sent, where it is known ahead of time that $n(1-p_2/p_1)$ of them will arrive intact and each of the remaining $n(p_2/p_1)$ qubits will be erased with probability $p_1$. This scenario can be thought of as a perfect channel which transmits the $n(1-p_2/p_1)$ qubits, together with an imperfect erasure channel which is used $n(p_2/p_1)$ times with erasure probability $p_1$ and corresponding channel capacity $Q(\Er_{p_1})$. The joint channel will then effectively transmit $c$ qubits, where
\[
c\leq n\left(1-\frac{p_2}{p_1}\right)+n\left(\frac{p_2}{p_1}\right)Q(\Er_{p_1}),
\]
due to the sub-additivity result just shown. Then the rate $R$ of this channel is given by
\[
R=\frac{c}{n}\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1}).
\]

Now for sufficiently large $n$, the number of qubits transmitted intact will be approximately $n(1-p_2/p_1)+n(p_2/p_1)(1-p_1)=n(1-p_2)$, and the number of qubits erased will be approximately $n(p_2/p_1)p_1=np_2$. This is exactly the behaviour of the erasure channel $\Er_{p_2}$ with erasure probability $p_2$, except that the positions of the qubits guaranteed to arrive intact are known in advance, which can only help. Using the inequality just derived, it follows that the channel capacity $Q(\Er_{p_2})$ must satisfy
\[
Q(\Er_{p_2})\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1}),
\]
which holds for all $p_2\leq p_1$. In the previous section it was shown that $Q(\Er_{1/2})=0$ for the erasure channel with erasure probability $1/2$. Furthermore, it trivially holds that $Q(\Er_{0})=1$ since this corresponds to the special case of a perfect channel where no erasures take place. Thus, setting $p_1=1/2$, for all $p\leq 1/2$ we arrive at
\[
Q(\Er_{p})\leq 1-\frac{p}{1/2}+\frac{p}{1/2}Q(\Er_{1/2})=1-2p,
\]
giving an upper bound on the channel capacity $Q(\Er_{p})$ in the case where $p\leq 1/2$.
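
As a sanity check on the interpolation inequality, the following Python sketch evaluates the capacity formula stated in the theorem and confirms, on a grid of rational erasure probabilities, that it is consistent with the bound $Q(\Er_{p_2})\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1})$. The function names and the grid are assumptions made for illustration; this only tests consistency of the formulas and is not a proof.

```python
from fractions import Fraction

def erasure_capacity(p):
    """Capacity claimed by the theorem: 1 - 2p for p <= 1/2, and 0 otherwise."""
    return max(Fraction(0), 1 - 2 * Fraction(p))

def interpolation_bound(p2, p1, q_p1):
    """Right-hand side 1 - p2/p1 + (p2/p1) * Q(E_{p1}), valid for p2 <= p1."""
    ratio = Fraction(p2) / Fraction(p1)
    return 1 - ratio + ratio * q_p1

# Exact rational grid of erasure probabilities 0, 1/20, ..., 1.
grid = [Fraction(i, 20) for i in range(21)]
consistent = all(
    erasure_capacity(p2) <= interpolation_bound(p2, p1, erasure_capacity(p1))
    for p1 in grid if p1 > 0
    for p2 in grid if p2 <= p1
)
print(consistent)
# Choosing p1 = 1/2, where the capacity vanishes, reproduces 1 - 2p.
print(interpolation_bound(Fraction(1, 4), Fraction(1, 2), Fraction(0)))
```

Exact fractions are used instead of floating point so that the cases where the bound holds with equality are not disturbed by rounding.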

The existence of a code achieving the upper bound on $Q(\Er_p)$

It has now been shown that the quantum channel capacity of the erasure channel satisfies $Q(\Er_p)\leq 1-2p$ for erasure probability $0\leq p \leq 1/2$. Therefore, if it can also be shown that error correcting codes exist which allow for successful recovery of the initial source state at any rate below $1-2p$, it must necessarily follow that $Q(\Er_p)= 1-2p$ in the case where $0\leq p \leq 1/2$. Although an explicit code will not be constructed, it will now be shown, through counting arguments that exploit the properties of the stabilizer formalism, that a stabilizer code able to recover the state does indeed exist. The existence proof constructed here borrows ideas deployed by Gottesman for a proof of the same claim \cite{Got}. The unacquainted reader may feel free to consult this post for relevant properties pertaining to stabilizer codes needed for this analysis.

Suppose the error correcting protocol in this case takes place over a Hilbert space of $n$ qubits, which encodes $k$ qubits. Consider a random stabilizer group $S\subset P_n$, a subgroup of the Pauli group on $n$ qubits, where $S$ has $n-k$ generators. The operators contained in the stabilizer $S$ then preserve the code space $\Hil_{code}$ encoding $k$ qubits. These $n-k$ generators of $S$ are chosen at random from the $4^n$ possible choices (up to sign) in $P_n$ for stabilizer generators, with the condition that each chosen generator commute with the previous choices and be independent of them, so as to form a minimal generating set.

Now, consider some state of $\Hil_{code}\subset \Hil^{2^n}$ consisting of $n$ qubits which is sent through the erasure channel with probability of erasure $p$. Then for large $n$, the approximate number of qubits that will be erased is $np$. In order to attempt to correct these $np$ qubits, a syndrome measurement must be done by measuring the $n-k$ generators of $S$. Ideally, this syndrome will yield enough information to uniquely determine the operator $E\in P_n$ that represents the error inflicted on the qubits. By applying $E^\dagger\in P_n$, the error can be corrected, returning the corrupted state to its original state with high fidelity.

However, error recovery may fail if two distinct errors $E_1$ and $E_2$ have the same syndrome measurement. It is important to consider the likelihood $P_{fail}$ of failing to recover the state in this case. In general, for some fixed error $E\in P_n$, a randomly chosen stabilizer generator either commutes or anti-commutes with $E$, and does so with a probability of $1/2$ in each case. Since there are $n-k$ generators, the probability of two given operators in $P_n$ having the same syndrome is $(1/2)^{n-k}$. Moreover, since there are $4^{pn}$ elements of $P_n$ (up to sign) that have support on the $pn$ erased qubits, the probability that two distinct errors have the same syndrome measurement must be bounded by
\[
P_{fail}\leq4^{pn}\left(\frac{1}{2}\right)^{n-k}=2^{-n(1-2p-R)},
\]
where $R=k/n$ is the rate. Provided that the rate satisfies $R<1-2p$, the right-hand-side will converge to $0$ as $n$ gets arbitrarily large. This implies that the probability of failing to recover the state approaches zero in this limit, so that such random stabilizer codes succeed in correcting the states for any $0\leq p\leq 1/2$. Hence, since this rate of $1-2p$ coincides with the previously derived upper bound on the channel capacity, it must be the case that $Q(\Er_p)=1-2p$.
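
To get a feel for how quickly this bound decays, the following Python sketch evaluates $\log_2$ of the right-hand side, $2pn-(n-k)=-n(1-2p-R)$, for a few block lengths. The function name and the particular values of $n$, $k$, and $p$ are illustrative assumptions only.

```python
def log2_failure_bound(n, k, p):
    """log2 of the bound 4^(p n) * (1/2)^(n - k) = 2^(-n (1 - 2p - k/n))."""
    return 2 * p * n - (n - k)

# Rate R = 0.7 just below 1 - 2p = 0.8 for p = 0.1: the bound shrinks with n.
for n in (100, 1000, 10000):
    k = int(0.7 * n)
    print(n, log2_failure_bound(n, k, 0.1))

# Rate R = 0.9 above 1 - 2p: the exponent is positive, so the bound exceeds 1
# and says nothing about the failure probability.
print(log2_failure_bound(1000, 900, 0.1))
```

Working with the exponent rather than the bound itself avoids floating point underflow for large $n$.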

Conclusion

In this post it was proved that the quantum channel capacity $Q(\Er_p)$ for the erasure channel $\Er_p$ satisfies $Q(\Er_p)=1-2p$ for $0\leq p\leq 1/2$ and $Q(\Er_p)=0$ for $1/2\leq p \leq 1$. It was argued in the latter case that the channel capacity must be $0$ in order to prevent a violation of the no-cloning theorem. The former case first involved finding an upper bound of $1-2p$ on the channel capacity, and then it was argued through the existence of random stabilizer codes that it is always possible to recover the state with high fidelity at any rate below this bound, which implies that the upper bound on the rate of the channel can actually be achieved. The erasure channel provides a rare instance where the quantum channel capacity can be computed exactly. It seems that for most other channels of interest only upper and lower bounds have been found. Open questions pertaining to these matters have been, and continue to be, the subject of much research, with many interesting qualitative and quantitative results.

The Stabilizer Formalism

2013-12-14T19:36:00.000-08:00

Consider the single qubit unitary matrices commonly referred to as the Pauli matrices defined as:

$$I=\begin{pmatrix}1&0 \\ 0 &1\end{pmatrix},
X=\begin{pmatrix}0&1 \\ 1&0 \end{pmatrix},
Y=\begin{pmatrix}0&-1 \\ 1&0 \end{pmatrix},
Z=\begin{pmatrix}1&0 \\ 0&-1 \end{pmatrix}.$$

It is worth noting that the operator $Y$ is commonly defined instead as the operator $\sigma_Y=iY$, but the definition of $Y$ introduced here will be convenient for our purposes. These operators satisfy the following properties:
\begin{align*}
X^2&=Z^2=-Y^2=I, \\
XY&=-YX=Z, \\
YZ &=-ZY=X, \\
XZ &=-ZX=Y.
\end{align*}

These properties give the set $P=\{\pm I,\pm X, \pm Y, \pm Z\}$ a group structure under the usual matrix multiplication. Define $P_n:=\{U_1\otimes\dots\otimes U_n \ | \ U_j\in P, 1\leq j\leq n\}$ as the set of $n$-fold tensor products of Pauli operators from $P$. The set $P_n$ also forms a group under the natural multiplication and is called the Pauli group, with order $|P_n|=2^{2n+1}$.

An important property of the Pauli operators is that they span the space of linear operators acting on a single qubit. In particular, any single qubit unitary $U$ can be expressed as
\[
U=c_II+c_XX+c_YY+c_ZZ,
\]
where the vector $(c_I,c_X,c_Y,c_Z)$ consists of complex numbers and is of unit norm. Similarly, any unitary operator acting on an $n$-qubit Hilbert space can be expressed as a linear combination of elements of the Pauli group $P_n$.

Moreover, the Pauli Group $P_n$ also satisfies the following properties:

Every $M\in P_n$ is unitary: $M^\dagger=M^{-1}$.
Every $M\in P_n$ satisfies $M^2=\pm I^{\otimes n}$.
If $M^2=I^{\otimes n}$, then $M=M^\dagger$; if $M^2=-I^{\otimes n}$, then $M=-M^\dagger$.
For any $M, N \in P_n$, either $MN=NM$ (they commute) or $MN=-NM$ (they anti-commute).
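
These properties are easy to verify by brute force for small $n$. The following Python sketch generates the single-qubit group from $X$ and $Z$ alone (so it does not depend on the sign convention chosen for $Y$), forms all two-qubit tensor products, and checks the group orders $2^{2n+1}$ together with the squaring and commutation properties. The helper names are assumptions made for this illustration.

```python
from itertools import product

X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))

def matmul(a, b):
    """Product of two square integer matrices stored as tuples of tuples."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def neg(a):
    return tuple(tuple(-v for v in row) for row in a)

def identity(n):
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return tuple(tuple(a[i][j] * b[k][l]
                       for j in range(len(a)) for l in range(len(b)))
                 for i in range(len(a)) for k in range(len(b)))

def generate_group(generators):
    """Close a set of matrices under multiplication (finite groups only)."""
    start = identity(len(generators[0]))
    group, frontier = {start}, [start]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = matmul(g, h)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def squares_to_plus_minus_identity(group):
    one = identity(len(next(iter(group))))
    return all(matmul(m, m) in (one, neg(one)) for m in group)

def commute_or_anticommute(group):
    return all(matmul(m, n) in (matmul(n, m), neg(matmul(n, m)))
               for m, n in product(group, repeat=2))

P1 = generate_group([X, Z])                          # single-qubit group P
P2 = {kron(a, b) for a, b in product(P1, repeat=2)}  # two-qubit group P_2
print(len(P1), len(P2))
print(squares_to_plus_minus_identity(P2), commute_or_anticommute(P2))
```

Integer matrices are used so that group elements can be compared exactly and stored in a set.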

Consider some abelian subgroup $S\subset P_n$ with $-I^{\otimes n}\notin S$, consisting of elements that all commute with one another. Then all elements of $S$ can be simultaneously diagonalized. The subspace $\Hil_S\subset\Hil^{2^n}$ defined as
$$\Hil_S:=\{\ket{\psi}\in\Hil^{2^n} \ | \ M\ket{\psi}=\ket{\psi} \text{ for all } M\in S \}$$
consists of the simultaneous eigenspace with eigenvalue $+1$ of elements of $S$. The space $\Hil_S$ is called the \emph{stabilizer code} associated with $S$, and $S$ is called the stabilizer of the code.

A generating set of $S$ is a collection of elements of $S$ such that each element of $S$ can be expressed as some product of elements from the generating set. In addition, it is required that the elements of the generating set be independent, meaning that no element of the generating set can be expressed as a product of the other elements of the generating set. It can be shown \cite{Got} that if $S$ has $n-k$ generators, then the code space $\Hil_S$ has dimension $2^k$, implying that it can effectively encode $k$ qubits.

Index the elements of a generating set of a stabilizer $S$ as $\{M_1,\dots,M_{n-k}\}$. The utility of the stabilizer formalism for quantum error correction comes from the fact that the elements of $S$ serve as operators for diagnosing possible errors that may occur to an encoded state $\ket{\psi}\in\Hil_S$. In general, an error can be represented in terms of elements $E_a\in P_n$. Since every $E_a$ either commutes or anti-commutes with each generator $M_j\in S$, the following two cases may occur.

If $E_a$ anti-commutes with some $M_j$, then for $\ket{\psi}\in\Hil_S$,
\[ M_jE_a\ket{\psi}=-E_aM_j\ket{\psi}=-E_a\ket{\psi},\] which implies that the error can be detected if the erred state $E_a\ket{\psi}$ is acted on by $M_j$.

If $E_a$ commutes with some $M_j$, then for $\ket{\psi}\in\Hil_S$,
\[ M_jE_a\ket{\psi}=E_aM_j\ket{\psi}=E_a\ket{\psi}, \] and the error may go undetected when the erred state $E_a\ket{\psi}$ is acted on by $M_j$.

A more thorough error syndrome can be provided by measuring each of the $n-k$ stabilizer generators. That is for a particular error $E_a$, consider the set of values $\{s_{a,j} \}$, where each $s_{a,j}\in\{0,1\}$ satisfies
\[
M_jE_a=(-1)^{s_{a,j}}E_aM_j.
\]
If for every pair of distinct errors $E_a$ and $E_b$ with $a\neq b$ there is some $j$ such that $s_{a,j}\neq s_{b,j}$, then the code is said to be non-degenerate, and there is no ambiguity in which error occurred, allowing the error to be corrected after measuring the $n-k$ generators of $S$.
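
As a concrete illustration of the syndrome values $s_{a,j}$, the following Python sketch tabulates the syndromes of all single-qubit errors for the well-known five-qubit code with generators $XZZXI$, $IXZZX$, $XIXZZ$, $ZXIXZ$. This code is not discussed in the text above and is brought in here purely as an example; the function names are likewise assumptions. Pauli operators are written as strings, and two such strings anti-commute exactly when they differ on an odd number of positions at which both are non-identity.

```python
GENERATORS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]  # five-qubit code (example)

def anticommutes(p, q):
    """True if the Pauli strings p and q anti-commute."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1

def syndrome(error, generators=GENERATORS):
    """Tuple of bits s_j with M_j E = (-1)^(s_j) E M_j."""
    return tuple(int(anticommutes(g, error)) for g in generators)

def single_qubit_errors(n):
    """Yield every X, Y, or Z error acting on exactly one of n qubits."""
    for pos in range(n):
        for pauli in "XYZ":
            yield "I" * pos + pauli + "I" * (n - pos - 1)

table = {e: syndrome(e) for e in single_qubit_errors(5)}
for error, bits in table.items():
    print(error, bits)
print(len(set(table.values())), "distinct syndromes for", len(table), "errors")
```

For this example all $15$ single-qubit errors receive distinct, nonzero syndromes, so each of them can be identified unambiguously from the four measurement outcomes.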

Another condition which must be satisfied by the stabilizer $S$ in order to ensure complete error recovery due to arbitrary errors is that, for each possible error $E_a, E_b$ and any $\ket{\psi}\in\Hil_S$,
\[
\bra{\psi}E_a^\dagger E_b\ket{\psi}=C_{ab},
\]
such that the constants $C_{ab}$ are independent of $\ket{\psi}$. This condition can be equivalently shown to hold if one of the following holds for each possible pair of errors $E_a$ and $E_b$:

$E_a^\dagger E_b\in S$,
There exists an $M\in S$ that anti-commutes with $E_a^\dagger E_b$.

In this way, error recovery may fail if both conditions are violated, that is, if there exists some $E_a^\dagger E_b$ that commutes with every element of $S$ and yet $E_a^\dagger E_b\not\in S$. In this circumstance, the operator $E_a^\dagger E_b$ preserves the code space $\Hil_S$ but still acts on it in a nontrivial way, implying that encoded information may be transformed. In addition, both $E_a$ and $E_b$ will have the same syndrome, leaving an inherent ambiguity in how either error should be corrected, and any mistake in diagnosis can apply a nontrivial transformation to the encoded space.

An impossible operation: the transpose

2013-12-10T17:40:00.000-08:00

Question: Is the transpose a valid quantum operation?

To make things a little more rigorous, let an operation $\Lambda$ on qubits be defined as $\Lambda(\rho)=\rho^T$, where $\rho^T$ denotes the transpose of $\rho$.

Consider the one qubit state $\ket{\psi_+}=\frac{1}{\sqrt{2}}(\ket{0}+i\ket{1})$ so that
\[ \begin{align*}
\ket{\psi_+}\bra{\psi_+}&=\frac{1}{2}(\ket{0}+i\ket{1})(\bra{0}-i\bra{1}) \\
&=\frac{1}{2}(\ket{0}\bra{0}-i\ket{0}\bra{1}+i\ket{1}\bra{0}+\ket{1}\bra{1}) \\
&=\frac{1}{2}\begin{pmatrix} 1&-i \\ i&1\end{pmatrix}.
\end{align*}\]
Then
\[ \begin{align*}
\Lambda(\ket{\psi_+}\bra{\psi_+})=\frac{1}{2}\begin{pmatrix} 1&-i \\ i&1\end{pmatrix}^T
&=\frac{1}{2}\begin{pmatrix} 1&i \\ -i&1\end{pmatrix} \\
&=\frac{1}{2}(\ket{0}\bra{0}+i\ket{0}\bra{1}-i\ket{1}\bra{0}+\ket{1}\bra{1}) \\
&=\frac{1}{2}(\ket{0}-i\ket{1})(\bra{0}+i\bra{1}) \\
&=\ket{\psi_-}\bra{\psi_-},
\end{align*}\]
where $\ket{\psi_-}=\frac{1}{\sqrt{2}}(\ket{0}-i\ket{1})$. Then the inner product of $\ket{\psi_-}$ and $\ket{\psi_+}$ is
\[ \begin{align*}
\ip{\psi_-}{\psi_+}&=\frac{1}{2}(\bra{0}+i\bra{1})(\ket{0}+i\ket{1}) \\
&=\frac{1}{2}(\ip{0}{0}+i\ip{0}{1}+i\ip{1}{0}-\ip{1}{1}) \\
&=\frac{1}{2}(1+0+0-1) \\
&=0.
\end{align*}\]
Thus, $\ket{\psi_+}$ and $\ket{\psi_-}$ are orthogonal pure states such that $\Lambda(\ket{\psi_+}\bra{\psi_+})=\ket{\psi_-}\bra{\psi_-}$.

Now, it will be proven that there does not exist a unitary operation $U$ such that $\Lambda(\rho)=U\rho U^\dagger$ for all $\rho$.

It suffices to show that such a unitary does not exist in the single qubit (two dimensional) case. For the sake of contradiction suppose that such a unitary $U$ does exist. Moreover, consider the two states $\ket{0}$ and $\ket{+}=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$, whose density operators are given by
\[
\ket{0}\bra{0}=\begin{pmatrix} 1&0 \\0&0 \end{pmatrix} \ \ \ \text{and} \ \ \ \ket{+}\bra{+}=\frac{1}{2}\begin{pmatrix} 1&1\\1&1 \end{pmatrix}.
\]
Then
\[ \begin{align*}
\Lambda(\ket{0}\bra{0})&=\begin{pmatrix} 1&0 \\0&0 \end{pmatrix}^T=\begin{pmatrix} 1&0 \\0&0 \end{pmatrix}=\ket{0}\bra{0} \\
\Lambda(\ket{+}\bra{+})&=\frac{1}{2}\begin{pmatrix} 1&1\\1&1 \end{pmatrix}^T=\frac{1}{2}\begin{pmatrix} 1&1\\1&1 \end{pmatrix}=\ket{+}\bra{+},
\end{align*}\]
which shows that these two states remain the same under the transposition operation $\Lambda$. Therefore, these two states must also be left unchanged by the action of the unitary $U$. That is, it must be the case that
\[\begin{align*}
U\ket{0}\bra{0}U^\dagger&=\ket{0}\bra{0},\\
U\ket{+}\bra{+}U^\dagger&=\ket{+}\bra{+},
\end{align*}\]
or equivalently that $U\ket{0}=\ket{0}$ and $U\ket{+}=\ket{+}$ up to phase factors. By thinking of these states and the action of $U$ as a rotation on the Bloch sphere, this implies that both of these states remain fixed and must therefore lie on the axis of rotation of $U$. More explicitly, recall that a one-qubit unitary can be represented as
\[
U=e^{i\alpha}\cos(\theta/2)I-ie^{i\alpha}\sin(\theta/2)\left(c_X X +c_Y Y + c_Z Z\right),
\]
where $\alpha$ is just a phase factor, $\theta$ gives the angle of rotation, and $(c_X,c_Y,c_Z)$ is a real unit vector giving the axis of rotation.

Since the state $\ket{0}$ lies on the $z$-axis of the Bloch sphere, a nontrivial rotation $U$ must have the $z$-axis as its axis of rotation in order to keep $\ket{0}$ fixed, which implies that $c_X=c_Y=0$. Furthermore, since the state $\ket{+}$ lies on the $x$-axis, a nontrivial rotation $U$ must rotate about the $x$-axis in order to keep $\ket{+}$ fixed, which implies that $c_Y=c_Z=0$. These two requirements cannot both be met by a unit vector $(c_X,c_Y,c_Z)$, so the rotation must be trivial: $\sin(\theta/2)=0$ and $U=\pm e^{i\alpha}I$, which acts as the identity on every density operator. However, as seen above, the transpose operation $\Lambda$ does not act as the identity on all states, since it maps $\ket{\psi_+}\bra{\psi_+}$ to the orthogonal state $\ket{\psi_-}\bra{\psi_-}$. This is a contradiction, and therefore there cannot exist a unitary $U$ satisfying the desired conditions.

Another, perhaps more general, way to see why such a unitary cannot exist is to note that any operation of the form $\Phi(\rho)=U\rho U^\dagger$ defined with some unitary $U$ always yields a valid quantum operation. This means that $\Phi$ is both trace preserving and completely positive. On the other hand, although the transpose operation $\Lambda$ is trace preserving and positive, it is not completely positive, and thus cannot define a valid quantum operation.
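
The failure of complete positivity can be made concrete with a standard example that is not worked out in the text above: apply the transpose to only the second qubit of the maximally entangled state $\frac{1}{\sqrt{2}}(\ket{00}+\ket{11})$. The following Python sketch (helper names are assumptions for illustration) shows that the resulting matrix still has unit trace but a negative expectation value on the vector $\frac{1}{\sqrt{2}}(\ket{01}-\ket{10})$, so it is not a valid density operator.

```python
import math
from itertools import product

def partial_transpose_second_qubit(rho):
    """Transpose only the second qubit of a two-qubit density matrix.

    Basis ordering is |00>, |01>, |10>, |11>, so the index is 2*a + b."""
    pt = [[0.0] * 4 for _ in range(4)]
    for a, b, a2, b2 in product(range(2), repeat=4):
        # Swap the second-qubit row and column labels b <-> b2.
        pt[2 * a + b][2 * a2 + b2] = rho[2 * a + b2][2 * a2 + b]
    return pt

def expectation(matrix, vec):
    """<vec|matrix|vec> for a real matrix and a real vector."""
    return sum(vec[i] * matrix[i][j] * vec[j]
               for i in range(4) for j in range(4))

s = 1 / math.sqrt(2)
bell = [s, 0.0, 0.0, s]                       # (|00> + |11>)/sqrt(2)
rho = [[x * y for y in bell] for x in bell]   # its density matrix
singlet = [0.0, s, -s, 0.0]                   # (|01> - |10>)/sqrt(2)

rho_pt = partial_transpose_second_qubit(rho)
trace_pt = sum(rho_pt[i][i] for i in range(4))
witness = expectation(rho_pt, singlet)
print(trace_pt, witness)
```

A negative expectation value means the partially transposed matrix has a negative eigenvalue, which is exactly what complete positivity forbids.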

An impossible operation: mapping every state to an orthogonal state

2013-12-10T17:34:00.000-08:00

Here, we'll question the existence of a quantum operation that maps every quantum state to a state orthogonal to its input. More specifically: is there a one-qubit unitary operation $U$ that maps each pure state $\ket{\psi}$ to some state $U\ket{\psi}=\ket{\psi'}$ such that $\ip{\psi}{\psi'}=0$?

I claim that there does not exist such a unitary!

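Suppose, for the sake of contradiction, that such a unitary $U$ did exist. Consider an arbitrary one-qubit state $\ket{\psi}$. Let $U\ket{\psi}=\ket{\psi'}$ so that $\ip{\psi}{\psi'}=0$. Moreover, let $U\ket{\psi'}=\ket{\psi''}$ so that $\ip{\psi'}{\psi''}=0$ as well by the assumption of the existence of such a unitary. Since the one-qubit Hilbert space is two dimensional and both $\ket{\psi}$ and $\ket{\psi''}$ are unit vectors orthogonal to $\ket{\psi'}$, it follows that $\ket{\psi''}=e^{i\gamma}\ket{\psi}$ for some phase $\gamma$. Now, consider the state $\ket{\phi}=\frac{1}{\sqrt{2}}(\ket{\psi}+\ket{\psi'})$, and let $U\ket{\phi}=\ket{\phi'}$ so that $\ip{\phi}{\phi'}=0$. By linearity,
\[
U\ket{\phi}=\frac{1}{\sqrt{2}}(U\ket{\psi}+U\ket{\psi'})=\frac{1}{\sqrt{2}}(\ket{\psi'}+\ket{\psi''})=\ket{\phi'}.
\]
The inner product of $\ket{\phi}$ and $\ket{\phi'}$ is therefore
\[\begin{align*}
\ip{\phi}{\phi'}&=\frac{1}{2}(\bra{\psi}+\bra{\psi'})(\ket{\psi'}+\ket{\psi''}) \\
&=\frac{1}{2}(\ip{\psi}{\psi'}+\ip{\psi}{\psi''}+\ip{\psi'}{\psi'}+\ip{\psi'}{\psi''}) \\
&=\frac{1}{2}(0+e^{i\gamma}+1+0) \\
&=\frac{1}{2}(1+e^{i\gamma}),
\end{align*}\]
so the requirement $\ip{\phi}{\phi'}=0$ forces $e^{i\gamma}=-1$. Repeating the calculation for the state $\ket{\chi}=\frac{1}{\sqrt{2}}(\ket{\psi}+i\ket{\psi'})$, for which $U\ket{\chi}=\frac{1}{\sqrt{2}}(\ket{\psi'}+i\ket{\psi''})$, gives
\[\begin{align*}
\bra{\chi}U\ket{\chi}&=\frac{1}{2}(\bra{\psi}-i\bra{\psi'})(\ket{\psi'}+i\ket{\psi''}) \\
&=\frac{1}{2}(\ip{\psi}{\psi'}+i\ip{\psi}{\psi''}-i\ip{\psi'}{\psi'}+\ip{\psi'}{\psi''}) \\
&=\frac{i}{2}(e^{i\gamma}-1),
\end{align*}\]
which vanishes only if $e^{i\gamma}=1$. Both conditions cannot hold at once, which is a contradiction. Therefore, no such unitary can exist.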
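
A small numerical experiment shows why complex superpositions matter for this claim. The following Python sketch uses a hypothetical candidate unitary, $U=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$, chosen here purely for illustration: it sends every real-amplitude state to an orthogonal state, yet it maps the state $\frac{1}{\sqrt{2}}(\ket{0}+i\ket{1})$ to a multiple of itself. The helper names are assumptions.

```python
import math

U = ((0, -1), (1, 0))  # hypothetical candidate, chosen for illustration

def apply(unitary, psi):
    """Matrix-vector product for a 2x2 matrix and a 2-component state."""
    return tuple(sum(unitary[i][j] * psi[j] for j in range(2)) for i in range(2))

def overlap(phi, psi):
    """Inner product <phi|psi>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

# Real-amplitude states cos(t)|0> + sin(t)|1> are all sent to orthogonal states.
real_overlaps = []
for step in range(12):
    t = step * math.pi / 6
    psi = (math.cos(t), math.sin(t))
    real_overlaps.append(abs(overlap(psi, apply(U, psi))))
print(max(real_overlaps))

# The state (|0> + i|1>)/sqrt(2) is an eigenvector of U, so it is not.
s = 1 / math.sqrt(2)
eigenstate = (s, 1j * s)
eigen_overlap = overlap(eigenstate, apply(U, eigenstate))
print(abs(eigen_overlap))
```

This is consistent with a more general observation: every unitary has at least one eigenvector, and an eigenvector is mapped to a multiple of itself rather than to an orthogonal state.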

A particular instance of Grover's search

2013-12-10T17:30:00.000-08:00

Here, we will analyze how Grover's search algorithm can be used in the particular cases when the density of marked items is $1/4$ and $1/2$.

Consider a function $f:\{0,1\}^n \to \{0,1\}$, define the sets
\[\begin{align*}
A&=\{x\in\{0,1\}^n : f(x)=1\} \\
B&=\{x\in\{0,1\}^n : f(x)=0\},
\end{align*} \]
and let $a=|A|$ and $b=|B|$.

The equally weighted superposition of all basis states in a $2^n=N$ dimensional Hilbert space can be expressed as
\[
\ket{\psi_0}=\frac{1}{\sqrt{N}}\SUM{x\in\{0,1\}^n}{}\ket{x}=\sqrt{\frac{a}{N}}\ket{A}+\sqrt{\frac{b}{N}}\ket{B},
\]
where $\ket{A}=\frac{1}{\sqrt{a}}\SUM{f(x)=1}{}\ket{x}$ and $\ket{B}=\frac{1}{\sqrt{b}}\SUM{f(x)=0}{}\ket{x}$ are the normalized uniform superpositions over the marked and unmarked items (assuming $0<a<N$). Assuming that $a=|A|$ is known, choose $\theta$ such that $\sin(\theta)=\sqrt{\frac{a}{N}}$, so that $\cos(\theta)=\sqrt{\frac{b}{N}}$ since $a+b=N$. Then the superposition $\ket{\psi_0}$ can be equivalently written as
\[
\ket{\psi_0}=\sin(\theta)\ket{A}+\cos(\theta)\ket{B}.
\]
In Grover's search algorithm, $\ket{\psi_0}$ is prepared as an initial state, and then a sequence of \emph{Grover} iterations is applied, which after $k$ iterations results in the state
\[
\ket{\psi_k}=\sin((2k+1)\theta)\ket{A}+\cos((2k+1)\theta)\ket{B}.
\]
The objective is to choose the number of iterations $k$ such that $\sin((2k+1)\theta)\approx 1$, so that some state $\ket{x}$ with $x\in A$ is measured with high probability. For this to be the case, it suffices that the following conditions hold:
\[\begin{align*}
\sin((2k+1)\theta)&\approx 1 \\
(2k+1)\theta&\approx \frac{\pi}{2} \\
k&\approx \frac{\pi}{4\theta}-\frac{1}{2} \\
\end{align*}\]

Consider the case when $a=\frac{1}{4}2^n=\frac{1}{4}N$ so that the initial angle $\theta$ is chosen to satisfy $\sin(\theta)=\sqrt{\frac{a}{N}}=\sqrt{\frac{N}{4N}}=\frac{1}{2}$ implying that $\theta=\frac{\pi}{6}$. In this case, the number of desired iterations that need to be performed is actually given by
\[
k= \frac{\pi}{4\theta}-\frac{1}{2}=\frac{6\pi}{4\pi}-\frac{1}{2}=1,
\]
and thus Grover's algorithm is guaranteed to find an $x$ such that $x\in A$ in just a single iteration.

Now consider the case where $a=\frac{1}{2}2^n=\frac{1}{2}N$ so that the initial angle $\theta$ is chosen to satisfy $\sin(\theta)=\sqrt{\frac{a}{N}}=\sqrt{\frac{N}{2N}}=\frac{1}{\sqrt{2}}$ implying that $\theta=\frac{\pi}{4}$. In this case, the number of desired iterations that need to be performed is approximately
\[
k\approx \frac{\pi}{4\theta}-\frac{1}{2}=\frac{4\pi}{4\pi}-\frac{1}{2}=\frac{1}{2}.
\]
However, since $k$ is not an integer and is equally close to the integers $0$ and $1$, a state $\ket{x}$ with $x\in A$ is not guaranteed to be found if the state $\ket{\psi_k}$ is measured. In this instance, Grover's search algorithm provides no benefit, in the sense that observing a state $\ket{x}$ with $x\in A$ is equally probable, with probability $\sin^2(\pi/4)=\sin^2(3\pi/4)=1/2$, whether $1$ iteration or $0$ iterations are applied to the initial state.
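
Both cases can be checked by directly simulating the amplitudes. The following Python sketch is a minimal simulation, assuming the usual form of a Grover iteration (an oracle sign flip on marked items followed by inversion about the mean); the function names, the choice of $n=3$ qubits, and the particular marked sets are illustrative assumptions. It compares the simulated success probability with $\sin^2((2k+1)\theta)$.

```python
import math

def grover_success_probability(n_qubits, marked, iterations):
    """Simulate Grover iterations on a real amplitude vector and return the
    probability of measuring one of the marked items."""
    N = 2 ** n_qubits
    amps = [1 / math.sqrt(N)] * N  # uniform superposition |psi_0>
    for _ in range(iterations):
        # Oracle: flip the sign of every marked amplitude.
        amps = [-amp if x in marked else amp for x, amp in enumerate(amps)]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / N
        amps = [2 * mean - amp for amp in amps]
    return sum(amps[x] ** 2 for x in marked)

def predicted_probability(a, N, k):
    """sin^2((2k+1) theta) with sin(theta) = sqrt(a/N)."""
    theta = math.asin(math.sqrt(a / N))
    return math.sin((2 * k + 1) * theta) ** 2

# Density 1/4: two marked items out of eight, one iteration.
print(grover_success_probability(3, {0, 1}, 1), predicted_probability(2, 8, 1))
# Density 1/2: four marked items out of eight, zero and one iteration.
for k in (0, 1):
    print(k, grover_success_probability(3, {0, 1, 2, 3}, k),
          predicted_probability(4, 8, k))
```

For density $1/4$ a single iteration moves all of the amplitude onto the marked items, while for density $1/2$ the success probability stays at $1/2$ before and after the iteration.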