Abstract
The quantum channel capacity $Q(\Phi)$ of a quantum channel $\Phi$ is defined as the maximum asymptotic rate at which quantum information can be reliably sent through the channel. In this post, the channel capacity $Q(\Er_p)$ of the erasure channel $\Er_p$ is derived, where $\Er_p$ is defined as the quantum operation on a qubit which erases the state with probability $p$ and leaves it unchanged otherwise. It is proved that $Q(\Er_p)=1-2p$ for $0\leq p\leq 1/2$ and that $Q(\Er_p)=0$ for $1/2\leq p\leq 1$. The latter case is proved via a contradiction with the no-cloning theorem, and the former is proved by combining an upper bound on the capacity with the existence of a quantum error correcting code in the stabilizer formalism which achieves that bound.
Introduction
The theory of quantum information is ultimately concerned with defining and understanding the properties of quantum systems, how they transform, and how they can be used as a resource in various contexts. Moreover, it is of great theoretical interest and practical importance to obtain quantitative measures of what can or cannot be accomplished in the quantum realm. Perhaps the most prevalent feature of quantum systems is their inherent fragility and high sensitivity to unwanted noise. Unlike the seemingly robust nature of the classical world, quantum systems are vulnerable to phenomena that seem to "destroy" their quantum mechanical nature.
With this in mind, consider the task of trying to send a quantum state to another party through a quantum channel capable of transmitting quantum information. Ideally, we would hope for perfect transmission through such a channel, but it is more realistic to acknowledge some likelihood of error, in which the quantum system is subjected to noise (unwanted transformations of its state) before being received at the other end. Equivalently, we may be interested in merely preserving the state of a quantum system over time as opposed to spatially relocating the system. Even then it is possible that the system may experience undesired errors to its state. Thus, it is worthwhile to understand the precise conditions under which a quantum state can be recovered after being transformed by a noisy quantum channel. The channel capacity of a quantum channel gives a precise bound on the rate at which quantum information can be reliably transferred through the channel and still be recovered with sufficiently high fidelity. In order to successfully transfer information through a noisy quantum channel it is beneficial and often necessary to encode the desired state into another quantum system, which is then subjected to errors and subsequently decoded back to the original state that was intended to be sent. This scheme is the domain of quantum error correcting codes. This reveals an intimate connection between channel capacities and error correction: the channel capacity sets fundamental limits on the existence of successful error correcting codes which attempt to correct errors induced by that channel.
In what follows, the notion of the channel capacity of a channel will be made more precise and formal. With this framework the channel capacity of the erasure channel will be rigorously calculated. In short, the erasure channel can simply be thought of as a channel which "erases" or "destroys" a quantum bit that passes through it with some probability $p$, and leaves the quantum bit unchanged otherwise. The erasure channel, although relatively primitive and crude in its action, characterizes the natural phenomenon of information being lost in some process. This differs from the related depolarizing channel, which instead of completely destroying the state effectively randomizes it. One crucial difference worth mentioning is that in the context of the depolarizing channel, the sender or receiver may not know whether the system being transmitted has been randomized. In the case of the erasure channel, we will assume that at least the receiver of the system is aware of an erasure when it occurs.
The proof for obtaining the channel capacity will proceed in three steps. For an erasure channel with erasure probability $p\geq 1/2$, it will be shown that the channel capacity must be $Q=0$ in order to avoid contradiction with the no-cloning theorem, a fundamental result of quantum theory concerning the inability to clone arbitrary quantum states. Then in the case $p\leq 1/2$, it will first be shown that the channel capacity must be bounded above by $Q\leq1-2p$ due to a property of channel capacities that prevents the total number of qubits transmitted from being super-additive when a noisy and a noiseless channel are considered together \cite{Ben1}. Finally, it will be argued through the construction of random stabilizer codes that error correcting codes do indeed exist \cite{Got} that achieve the channel capacity $Q=1-2p$ for the case of $p\leq 1/2$. Interestingly, our ability to obtain the exact capacity of the erasure channel is in contrast to the situation for the depolarizing channel (and most other channels), for which only upper and lower bounds are known.
As a note on the background needed to follow this proof, it will be assumed that the reader is already aware of some basic concepts and axioms in the theory of quantum computation and information \cite{NC} and possesses a fair degree of mathematical maturity; namely, a working knowledge of linear algebra together with the basics of finite group theory will be assumed. Prior knowledge of the theory of error correction and the stabilizer formalism will be beneficial.
Notational Remarks
For some Hilbert space $\Hil$, let $D(\Hil)$ denote the set of linear operators on $\Hil$ that are valid density operators (positive semidefinite operators with unit trace). Moreover, let $C(\Hil_1,\Hil_2)$ denote the set of valid quantum channels (completely positive and trace-preserving maps) that transform states $\rho_1\in D(\Hil_1)$ to states $\rho_2\in D(\Hil_2)$. For convenience, define $C(\Hil):=C(\Hil,\Hil)$ for the case where the quantum channels map states from a particular space $\Hil$ to the same space $\Hil$.
Quantum Error Correcting Codes
To formally define the channel capacity $Q(\Phi)$ of a quantum channel $\Phi$, we first need to formalize the notion of a quantum error correcting code. The objective of quantum error correction is to encode a quantum state $\rho$ into a particular subspace $\Hil_{code}\subset\Hil$ of a larger Hilbert space $\Hil$ by means of some quantum operation $U_c: \rho\mapsto \rho_c$, where $\rho_c\in D(\Hil_{code})$. The operation $U_c$ is called the \emph{encoding operation}, the state $U_c(\rho)=\rho_c$ is called the \emph{encoded state} of $\rho$, and the space $\Hil_{code}$ is often referred to as the \emph{code space} of a quantum error correcting code. States of $\Hil_{code}$ should have the desired property that they can be recovered when subjected to errors. More formally, let $\Phi\in C(\Hil)$ be some quantum channel of interest that represents the action of a possible error affecting the state $\rho_c$, and let $\Phi(\rho_c)=\rho_e$. Along with an encoding operation $U_c$, an error correcting code also has a \emph{decoding operation} given by $U_d$. This operation serves the purpose of decoding the possibly corrupted state $\rho_e$, and should ideally satisfy $U_d(\rho_e)=\rho$. An error correcting protocol as described here is summarized schematically in the figure below.
A circuit representing the general error correction scheme.
In a less idealistic scenario, when $U_d(\rho_e)=\rho'\neq\rho$, we may be satisfied with the resulting state $\rho'$ being approximately equal to $\rho$ according to some measure of closeness. One such useful measure is given by the \emph{fidelity} $F(\rho,\rho')$ of two states $\rho,\rho'\in D(\Hil)$ of some Hilbert space $\Hil$, where
\[
F(\rho,\rho')=\mathrm{Tr}\left(\sqrt{\rho^{\frac{1}{2}}\rho'\rho^{\frac{1}{2}}}\right),
\]
and $\mathrm{Tr}(\cdot)$ denotes the trace operation. The fidelity of two states satisfies $0\leq F(\rho,\rho')\leq1$, with $F(\rho,\rho')=1$ if and only if $\rho=\rho'$; larger values of $F(\rho,\rho')$ indicate that the states $\rho$ and $\rho'$ are closer.
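As a quick sanity check of the fidelity formula, consider the special case where $\rho=|\psi\rangle\langle\psi|$ is a pure state (a standard simplification, added here for illustration and not used elsewhere in the argument). Then $\rho^{\frac{1}{2}}=\rho$, and the formula collapses to an overlap:
\[
\rho^{\frac{1}{2}}\rho'\rho^{\frac{1}{2}}=\langle\psi|\rho'|\psi\rangle\,|\psi\rangle\langle\psi|
\quad\Longrightarrow\quad
F(\rho,\rho')=\sqrt{\langle\psi|\rho'|\psi\rangle}.
\]
In particular, $F=1$ when $\rho'=\rho$, and $F=0$ when $\rho'$ is supported entirely on states orthogonal to $|\psi\rangle$.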
Noisy Channel Coding and Channel Capacity
Let $\Phi\in C(\Hil)$ be some quantum channel, and consider its $n$-fold extension
\[
\Phi^{\otimes n}=\Phi\otimes\dots\otimes\Phi \in C(\Hil^{\otimes n}),
\]
which represents the channel $\Phi$ acting independently on each of the $n$ copies of $\Hil$. Now, suppose there is some error correcting code with code space $\Hil_{code}\subset\Hil$. For noisy channel coding \cite{Nielsen}, we consider the $n$-fold code space $\Hil_{code}^{\otimes n}\subset \Hil^{\otimes n}$, and are interested in source states $\rho\in D(\Hil)$ such that there exists an encoding operation $\overline{U}_c$ that takes a source state and maps it to an encoded state $\overline{U}_c(\rho)=\rho_c\in D(\Hil_{code}^{\otimes n})$, which is then acted on by $\Phi^{\otimes n}$ and subsequently decoded by a decoding operation $\overline{U}_d$, yielding a state $\rho'\in D(\Hil)$ close to the original source state $\rho$ with respect to the fidelity $F(\rho,\rho')$.
Suppose that the dimension of the space $\Hil$ is $2^m$ so that $\rho\in D(\Hil)$ is a state consisting of $m$ qubits. In this context, the channel $\Phi$ can be thought of as operating on $n$ blocks, each consisting of $m$ qubits, as illustrated in the circuit below. Thus, this coding model can be thought of as using $n$ applications of the channel $\Phi$ to successfully send $m$ qubits through the channel, provided that encoding and decoding operations $\overline{U}_c$ and $\overline{U}_d$ exist which preserve the original source state $\rho$.
A circuit representing a particular noisy channel coding scheme, where an input state $\rho$ of some number $m$ of qubits is acted on by $4$ blocks consisting of $m$ qubits each.
The quantity $R=m/n$ is defined as the rate of the code. In this way, the channel capacity $Q(\Phi)$ is defined as the maximum asymptotic rate at which information can be reliably sent through the channel $\Phi$. More precisely, the channel capacity is the largest value $Q(\Phi)$ such that for any $R<Q(\Phi)$ and $\epsilon>0$, there exists, for sufficiently large $n$, an error correcting code $\Hil_{code}^{\otimes n}$ with rate at least $R$ where for any $\rho$ with $\overline{U}_c(\rho)=\rho_c\in D(\Hil_{code}^{\otimes n})$, the resulting recovered state $\rho'$ satisfies $F(\rho,\rho')\geq 1-\epsilon$.
Modelling the Erasure Channel
Let $\ket{0}, \ket{1} \in \Hil^2$ denote the standard computational basis states of a Hilbert space $\Hil^2$ of dimension $2$ representing a qubit. To model an erasure we will embed the states $\ket{0}$ and $\ket{1}$ into a higher dimensional space and introduce a third basis state $\ket{2}\in\Hil^3$ that is orthogonal to both $\ket{0}$ and $\ket{1}$, so that the three states together span $\Hil^3$. This third state $\ket{2}$ then serves the purpose of representing an erasure of either of the states $\ket{0}$ and $\ket{1}$.
Now consider appending to the state of a qubit an ancillary register belonging to some Hilbert space $\Hil_E$ that represents the ``environment'' and has some appropriate dimension (which may depend on the context). Then the action of the erasure channel, denoted by $\mathcal{E}_p$, on the basis states of the qubit is defined as
\begin{align*}
\ket{0}\otimes\ket{0}_E& \ \ \overset{\mathcal{E}_p}\mapsto \ \ \sqrt{1-p}\ket{0}\otimes\ket{0}_E+\sqrt{p}\ket{2}\otimes\ket{1}_E \\
\ket{1}\otimes\ket{0}_E& \ \ \overset{\mathcal{E}_p}\mapsto \ \ \sqrt{1-p}\ket{1}\otimes\ket{0}_E+\sqrt{p}\ket{2}\otimes\ket{2}_E,
\end{align*}
where the states $\ket{0}_E,\ket{1}_E,\ket{2}_E\in\Hil_E$ are orthogonal states of the environment, and the map $\mathcal{E}_p$ has been parametrized by some real number $0\leq p\leq1$ that gives the probability that the qubit will be erased.
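Tracing out the environment in the map above gives the channel's action on a density matrix: $\rho\mapsto(1-p)\rho+p\,\mathrm{Tr}(\rho)\ket{2}\!\bra{2}$, since the environment states $\ket{1}_E,\ket{2}_E$ are orthogonal to $\ket{0}_E$ and the cross terms vanish. The following is a minimal Python sketch of that reduced action using plain nested lists; the function names `erasure_channel` and `trace` are hypothetical helpers introduced here for illustration only.

```python
def erasure_channel(rho, p):
    """Apply the qubit erasure channel to a 2x2 density matrix.

    Returns a 3x3 matrix on span{|0>, |1>, |2>}: the input block scaled
    by (1 - p), plus weight p * Tr(rho) on the erasure flag |2><2|.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    out = [[0j] * 3 for _ in range(3)]
    for i in range(2):
        for j in range(2):
            out[i][j] = (1 - p) * rho[i][j]
    out[2][2] = p * (rho[0][0] + rho[1][1])
    return out


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


# Input state |+><+| (assumed example, not from the text above)
plus = [[0.5, 0.5], [0.5, 0.5]]
sigma = erasure_channel(plus, 0.25)
```

The output keeps unit trace, and the coherences of the surviving block are merely scaled by $1-p$, which reflects the fact that an unerased qubit arrives unchanged.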
The Channel Capacity of the Erasure Channel
To calculate the channel capacity $Q(\mathcal{E}_p)$ of the erasure channel, we must consider the scenario described in the previous section on noisy channel coding, where now the channel of interest is $\Phi=\mathcal{E}_p$. In what follows we will prove the following theorem:
Theorem: The channel capacity of the erasure channel.
For the erasure channel $\mathcal{E}_p$, which erases the qubit state with probability $p$ and leaves the state intact with probability $1-p$, the channel capacity $Q(\mathcal{E}_p)$ is given by
$$ Q(\mathcal{E}_p)=\left\{
\begin{array}{rl}
1-2p &, 0\leq p\leq\frac{1}{2}\\
0 &, \frac{1}{2}\leq p\leq 1
\end{array}\right. $$
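For readers who want to tabulate the claimed capacity, the piecewise formula can be written as a one-line function. This is only a restatement of the theorem in Python; the name `erasure_capacity` is a hypothetical label chosen for this sketch.

```python
def erasure_capacity(p):
    """Quantum capacity of the qubit erasure channel as stated in the theorem."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # 1 - 2p on [0, 1/2], and 0 on [1/2, 1]; both branches agree at p = 1/2
    return max(0.0, 1.0 - 2.0 * p)


for p in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"p = {p:.2f}  ->  Q = {erasure_capacity(p):.2f}")
```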
The proof of this theorem will proceed in three parts. First, for the case $1/2\leq p \leq 1$, we will appeal to the no-cloning theorem \cite{NC} to show that no error correcting procedure can exist which allows recovery of states sent through the channel $\Er_p$ with arbitrarily high fidelity. Then for the case $0\leq p \leq 1/2$, the bound $Q(\Er_p)\leq 1-2p$ will be established by arguing that the number of qubits transmitted jointly over a noisy and a perfectly noiseless channel is sub-additive \cite{Ben1}. Lastly, using the stabilizer formalism of quantum error correcting codes \cite{Got}, it will be shown that error correcting codes must exist which achieve rates arbitrarily close to $1-2p$.
The $p\geq 1/2$ Case
The fact that $Q(\Er_p)=0$ for $p\geq 1/2$ means that it is impossible to reliably transmit any quantum information through the erasure channel $\Er_p$. That is, there cannot exist any error correcting procedure that recovers the input state $\rho$ with a fidelity arbitrarily close to $1$. To see why this is the case, we will appeal to the \emph{no-cloning} theorem, which states that it is impossible to have a general quantum operation which has the ability to ``clone'', or make copies of, arbitrary quantum states \cite{NC}. Thus, we will argue that if there did exist an error correcting procedure with valid encoding and decoding operations which effectively preserved the initial state $\rho$, then such a procedure would also provide a way of cloning these quantum states and therefore contradict the no-cloning theorem.
For the sake of argument, it is worth anthropomorphizing the situation where quantum states pass through an erasure channel. Consider three different parties Alice, Bob, and Charlie. The erasure channel can be effectively realized through a particular interaction of these parties as follows. Suppose Alice has $n$ qubits and would like to send these qubits to Bob. Moreover, suppose Charlie has the ability to intercept the qubits Alice is trying to send to Bob, and also has the ability to send erased states $\ket{2}$ to Bob. If Charlie intercepts each of Alice's qubits with probability $p$, in which case he sends Bob the erased state, and with probability $1-p$ allows the qubit to pass on to Bob, then for large $n$ Bob will have received approximately $(1-p)n$ qubits and Charlie will be in possession of approximately $pn$ qubits. For $p\geq 1/2$, Charlie will end up with at least as many qubits as Bob; from Charlie's point of view the qubits have passed through an erasure channel with erasure probability $1-p\leq p$. Hence, if it were possible for Bob to recover the initial state from some error correcting procedure, then surely Charlie could as well using the same procedure. Their recovered states, taken together, would constitute two copies of the original state. However, by the no-cloning theorem, such a procedure cannot exist. Hence, no recovery procedure can exist in this case, which implies that the channel capacity $Q(\Er_p)$ for $p\geq 1/2$ must be $0$.
Upper bounding $Q(\Er_p)$ for $0\leq p \leq1/2$
Consider some imperfect channel $\Phi$ with channel capacity $0<Q(\Phi)<1$ and a related perfect channel $\Phi_I$ with channel capacity $Q(\Phi_I)=1$. Then if the perfect channel is used $a$ times, $aQ(\Phi_I)=a$ qubits can be transmitted as expected, whereas if the imperfect channel is used $b$ times the number of qubits that can be transmitted is given by $bQ(\Phi)<b$. When the two are considered jointly, with the perfect and imperfect channels used $a$ and $b$ times, respectively, let $c$ denote the number of qubits that can be transferred when both channels are used. The number of qubits transmitted by this joint channel is said to be additive if $c=a+bQ(\Phi)$, sub-additive if $c\leq a+bQ(\Phi)$, and super-additive if $c> a+bQ(\Phi)$.
It will now be shown that the number of qubits that can be jointly transmitted over a perfect and an imperfect channel, as described above, must be sub-additive. The proof proceeds in a manner similar to that presented in \cite{Ben1}. For the sake of contradiction, suppose that the number of jointly transmitted qubits is super-additive, so that $c>a+bQ(\Phi)$ when the perfect channel is used $a$ times and the imperfect channel is used $b$ times. Observe that $a$ uses of the perfect channel $\Phi_I$ can be simulated (asymptotically) by its imperfect counterpart $\Phi$ with channel capacity $Q(\Phi)$ by using the imperfect channel $d$ times such that $dQ(\Phi)=a$. Thus, letting $Q'(\Phi)$ denote the rate achieved by this combined scheme, which uses only $\Phi$ a total of $d+b$ times, it follows that
\[
Q'(\Phi)=\frac{c}{d+b}>\frac{dQ(\Phi)+bQ(\Phi)}{d+b}=Q(\Phi),
\]
where the inequality follows from the assumption $c>a+bQ(\Phi)=dQ(\Phi)+bQ(\Phi)$. However, here only the original channel $\Phi$ was used, so the achieved rate cannot exceed its capacity; that is, $Q'(\Phi)>Q(\Phi)$ cannot hold. Therefore the number of qubits jointly transmitted over the perfect and noisy channels must be sub-additive, that is, $c\leq a +bQ(\Phi)$.
With this result in mind, consider $p_2\leq p_1$ and suppose $n$ qubits are to be sent, where it is known ahead of time that $n(1-p_2/p_1)$ of the qubits will arrive intact and each of the remaining $n(p_2/p_1)$ qubits will be erased with probability $p_1$. This scenario can be thought of as a perfect channel which transmits the $n(1-p_2/p_1)$ qubits and an imperfect erasure channel which is used $n(p_2/p_1)$ times with erasure probability $p_1$ and corresponding channel capacity $Q(\Er_{p_1})$. Then the joint channel will effectively transmit $c$ qubits, where
\[
c\leq n\left(1-\frac{p_2}{p_1}\right)+n\left(\frac{p_2}{p_1}\right)Q(\Er_{p_1}),
\]
due to the sub-additivity result just shown. Then the rate $R$ of this channel is given by
\[
R=\frac{c}{n}\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1}).
\]
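Now for sufficiently large $n$, the number of qubits erased in this scenario is approximately $n(p_2/p_1)p_1=np_2$, and the number of qubits transmitted intact is approximately $n(1-p_2/p_1)+n(p_2/p_1)(1-p_1)=n(1-p_2)$. These are the same erasure statistics as those of the erasure channel $\Er_{p_2}$ with erasure probability $p_2$, except that the sender additionally knows in advance which qubits are safe. Since this extra knowledge can only help, any rate achievable over $\Er_{p_2}$ is also achievable in this scenario. Using the inequality just derived, it follows that the channel capacity $Q(\Er_{p_2})$ must satisfy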
\[
Q(\Er_{p_2})\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1}),
\]
which holds for all $p_2\leq p_1$. In the previous section it was shown that $Q(\Er_{1/2})=0$ for the erasure channel with erasure probability $1/2$. Furthermore, it trivially holds that $Q(\Er_{0})=1$, since this corresponds to the special case of a perfect channel where no erasures take place. Thus, setting $p_2=p$ and $p_1=1/2$, for all $p\leq 1/2$ we arrive at
\[
Q(\Er_{p})\leq 1-\frac{p}{1/2}+\frac{p}{1/2}Q(\Er_{1/2})=1-2p,
\]
giving an upper bound on the channel capacity $Q(\Er_{p})$ in the case where $p\leq 1/2$.
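The interpolation step can be checked numerically. The sketch below evaluates the right-hand side of the inequality $Q(\Er_{p_2})\leq 1-p_2/p_1+(p_2/p_1)Q(\Er_{p_1})$ and the expected number of erasures in the mixed scenario; the function names are hypothetical and chosen only for this illustration.

```python
def interpolation_bound(p2, p1, q1):
    """Right-hand side of Q(E_{p2}) <= 1 - p2/p1 + (p2/p1) * Q(E_{p1})."""
    if not 0.0 <= p2 <= p1 <= 1.0 or p1 == 0.0:
        raise ValueError("need 0 <= p2 <= p1 <= 1 and p1 > 0")
    frac = p2 / p1  # fraction of qubits sent over the noisy channel
    return (1.0 - frac) + frac * q1


def expected_erasures(n, p2, p1):
    """Expected erasures when n*(p2/p1) qubits each face erasure probability p1."""
    return n * (p2 / p1) * p1


# With p1 = 1/2 and Q(E_{1/2}) = 0 the bound reduces to 1 - 2p.
for p in (0.0, 0.1, 0.2, 0.4, 0.5):
    print(p, interpolation_bound(p, 0.5, 0.0), 1 - 2 * p)
```

Choosing the anchor point $p_1=1/2$ is what makes the bound tight: it is the smallest erasure probability at which the capacity is already known to vanish.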
  
The existence of a code achieving the upper bound on $Q(\Er_p)$
It has now been shown that the quantum channel capacity of the erasure channel satisfies $Q(\Er_p)\leq 1-2p$ for erasure probability $0\leq p \leq 1/2$. Therefore, if it can also be shown that error correcting codes exist that allow for successful recovery of the initial source state at rates arbitrarily close to $1-2p$, it must necessarily follow that $Q(\Er_p)= 1-2p$ in the case where $0\leq p \leq 1/2$. Although an explicit code will not be constructed, it will now be shown, through counting arguments that exploit the properties of the stabilizer formalism, that a stabilizer code does indeed exist which is able to recover the state. The existence proof constructed here borrows ideas deployed by Gottesman for a proof of the same claim \cite{Got}. The unacquainted reader may consult \cite{Got} for the properties of stabilizer codes needed for this analysis.
Suppose the error correcting protocol in this case takes place over a Hilbert space of $n$ qubits, which encodes $k$ qubits. Thus, consider a random stabilizer group $S\subset P_n$, where $P_n$ is the Pauli group on $n$ qubits and $S$ is generated by $n-k$ generators. The operators contained in the stabilizer $S$ then fix every state of the code space $\Hil_{code}$ encoding $k$ qubits. These $n-k$ generators of $S$ are chosen at random from the $4^n$ possible choices in $P_n$ for stabilizer generators, with the condition that each chosen generator commutes with the previous choices and is independent of them, so that the generators form a minimal generating set.
Now, consider some state of $\Hil_{code}\subset \Hil^{2^n}$ consisting of $n$ qubits which is sent through the erasure channel with probability of erasure $p$. Then for large $n$, the approximate number of qubits that will be erased is given by $np$. Therefore, in order to attempt correcting these $np$ qubits, a syndrome measurement must be done by measuring the $n-k$ generators of $S$. Ideally, this syndrome will yield enough information to uniquely determine the operator $E\in P_n$ that represents the error that was inflicted on the qubits. By applying $E^\dagger\in P_n$, the error can be corrected, returning the corrupted state to its original state with high fidelity.
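To make the syndrome measurement concrete, here is a small Python sketch that represents Pauli operators as strings over `IXYZ` (ignoring phases) and computes one syndrome bit per generator. As an illustrative assumption that is not part of the random-code argument, it uses the two generators $XXXX$ and $ZZZZ$ of the well-known four-qubit $[[4,2,2]]$ code and a single erased qubit at a known position; the helper names `commutes` and `syndrome` are hypothetical.

```python
def commutes(a, b):
    """True if Pauli strings a and b (over 'IXYZ') commute.

    Two Pauli strings commute exactly when they differ on an even number
    of positions where both are non-identity.
    """
    if len(a) != len(b):
        raise ValueError("Pauli strings must have equal length")
    clashes = sum(1 for x, y in zip(a, b) if x != "I" and y != "I" and x != y)
    return clashes % 2 == 0


def syndrome(generators, error):
    """One bit per generator: 0 if it commutes with the error, 1 otherwise."""
    return tuple(0 if commutes(g, error) else 1 for g in generators)


# Illustrative choice: stabilizer generators of the [[4,2,2]] code
generators = ["XXXX", "ZZZZ"]
assert commutes(*generators)  # stabilizer generators must commute

# All Pauli errors supported on a single erased qubit (position 0)
errors = ["IIII", "XIII", "YIII", "ZIII"]
table = {e: syndrome(generators, e) for e in errors}
for e, s in table.items():
    print(e, s)
```

Because the four candidate errors on the erased qubit all produce different syndromes, the receiver can identify and undo whichever one occurred. The counting argument that follows asks how likely it is that a random stabilizer code retains this property when $np$ qubits are erased.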
However, error recovery may fail if two distinct errors $E_1$ and $E_2$ have the same syndrome. It is important to consider the likelihood $P_{fail}$ of failing to recover the state in this case. In general, for some non-identity error $E\in P_n$, a randomly chosen stabilizer generator either commutes or anti-commutes with $E$, each with probability $1/2$. Since there are $n-k$ generators, the probability of two distinct operators in $P_n$ having the same syndrome is given by $(1/2)^{n-k}$. Moreover, since there are $4^{pn}$ elements of $P_n$ (up to phase) that have support on the $pn$ erased qubits, the probability that two distinct errors have the same syndrome must be bounded by
\[
P_{fail}\leq4^{pn}\left(\frac{1}{2}\right)^{n-k}=2^{-n(1-2p-R)},
\]
where $R=k/n$ is the rate. Provided that the rate satisfies $R<1-2p$, the right-hand side converges to $0$ as $n$ gets arbitrarily large. This implies that the probability of failing to recover the state approaches zero in this limit, so that such random stabilizer codes succeed in correcting the states at any rate below $1-2p$ for any $0\leq p\leq 1/2$. Hence, since rates arbitrarily close to $1-2p$ are achievable and $1-2p$ coincides with the previously derived upper bound on the channel capacity, it must be the case that $Q(\Er_p)=1-2p$.
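The decay of the failure bound is easy to see numerically. The following sketch simply evaluates $2^{-n(1-2p-R)}$ for a few block lengths; the function name `failure_bound` and the sample values of $p$ and $R$ are assumptions made for illustration.

```python
def failure_bound(n, p, rate):
    """Union bound 4^(p n) * 2^-(n - k) = 2^(-n (1 - 2p - R)) with R = k/n."""
    return 2.0 ** (-n * (1.0 - 2.0 * p - rate))


p, rate = 0.1, 0.7  # rate chosen below 1 - 2p = 0.8
for n in (10, 100, 1000):
    print(n, failure_bound(n, p, rate))
```

For a rate above $1-2p$ the same expression grows with $n$ and exceeds $1$, so the bound becomes vacuous; this matches the role of $1-2p$ as the threshold rate.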
Conclusion
In this post it was proved that the quantum channel capacity $Q(\Er_p)$ of the erasure channel $\Er_p$ satisfies $Q(\Er_p)=1-2p$ for $0\leq p\leq 1/2$ and $Q(\Er_p)=0$ for $1/2\leq p \leq 1$. It was argued in the latter case that the channel capacity must be $0$ in order to prevent a violation of the no-cloning theorem. The former case first involved finding an upper bound of $1-2p$ on the channel capacity, and then it was argued through the existence of random stabilizer codes that it is always possible to recover the state with high fidelity at any rate below this bound, which implies that the upper bound can actually be achieved. The erasure channel provides a rare instance where the quantum channel capacity can be computed exactly. For most other channels of interest, only upper and lower bounds have been found, and open questions pertaining to these matters remain the subject of ongoing research.
The quantum channel capacity $Q(\Phi)$ of a quantum channel $\Phi$ is defined as the maximum asymptotic rate at which information can be reliably sent through the channel. In this post, the channel capacity $Q(\Er_p)$ of the erasure channel $\Er_p$ is proved, where the erasure channel $\Er_p$ is defined as the quantum operation on a qubit which erases the state with probability $p$ and leaves it unchanged otherwise. It is proved that $Q(\Er_p)=1-2p$ for $0\leq p\leq 1/2$ and that $Q(\Er_p)=0$ for $1/2\leq p\leq 1$. The latter case is proved via a contradiction with the no-cloning theorem, and the former is proved through the existence of a quantum error correcting code in the stabilizer formalism which achieves the optimum rate.
Introduction
The theory of quantum information is ultimately concerned with defining and understanding the properties and essence of quantum systems, how they transform, and how they can be used as resource in various contexts. Moreover, it is of great theoretical interest and practical importance to obtain quantitative measures of what can or cannot be accomplished in the quantum realm. Perhaps the most prevalent feature of quantum systems is their inherent fragility and high sensitivity to unwanted noise. Unlike the seemingly robust nature of the classical world, quantum systems are vulnerable to phenomenon which seem to "destroy" their very own quantum mechanical nature.
With this is mind, consider the task of trying to send quantum state to another party through a quantum channel capable of transmitting quantum information. Ideally, we would hope for perfect transmitting capabilities of such a channel, but it is more realistic to acknowledge some likelihood of error in which the quantum system is subjected to noise--or unwanted transformations of its state---before being received at the other end. Equivalently, we may be interested in merely preserving the state of a quantum system over time as opposed to spatially relocating the system. Even then it is possible that the system may experience undesired errors to its state. Thus, it is worthwhile to understand the precise conditions under which a quantum state could be reasonably recovered when transformed under a noisy quantum channel. The channel capacity of a quantum channel gives precise bounds on the rate at which quantum information can be reliably transferred through the channel and still be recovered with sufficiently high fidelity. In order to successfully transfer information through a noisy quantum channel it is beneficial and often necessary to encode the desired state into another quantum system, which is then subjected to errors and subsequently decoded back to the original state that was intended to be sent. This latter scheme is the domain of quantum error correcting codes. This reveals an intimate connection between channel capacities and error correction, that is, the channel capacity sets fundamental limits on the existence of successful error correcting codes which attempt to correct errors induced by that channel.
In what follows, the notion of the channel capacity of a channel will be made more precise and formal. With this framework the channel capacity of the erasure channel will be rigorously calculated. In short, the erasure channel can be simply be thought of as some channel which "erases" or "destroys'' some quantum bit that passes through the channel with some probability $p$, and leaves the quantum bit unchanged otherwise. The erasure channel, although relatively primitive and crude in its action, characterizes the natural phenomenon of information being lost in some process. This differs from the related depolarizing channel, which instead of completely destroying the state effectively randomizes it. One crucial difference worth mentioning is that in the context of the depolarizing channel, the sender or receiver may not know whether the system being transmitted has been randomized. In the case of the erasure channel, we will assume that at least the receiver of the system is aware of the existence of an erasure when it occurs.
The proof for obtaining the channel capacity will proceed in three steps. For an erasure channel with a probability of erasure occurring given by $p>1/2$, it will be shown that the channel capacity must be $Q=0$ in order to avoid contradiction with a fundamental result of quantum theory concerning the inability to clone arbitrary quantum states---the no-cloning theorem. Then in the case for $p\leq 1/2$, it will first be shown that the channel capacity must be bounded above by $Q\leq1-2p$ due to a property of channel capacities that prevents the total number of bits transmitted from being super-additive when a noisy and noiseless channel are considered together.\cite{Ben1} Finally, it will be argued through the construction of random stabilizer codes that error correcting codes do indeed exist \cite{Got} that achieve the channel capacity $Q=1-2p$ for the case of $p\leq 1/2$. Interestingly, our ability to obtain exact bounds for the erasure channel is in contrast for the channel capacity of the depolarizing channel (and most other channels), for which only upper and lower bounds are known.
As a note on the required background knowledge needed to follow this proof, it will be assumed that the reader is already aware of some basic concepts and axioms in the theory of quantum computation and information \cite{NC} and possesses a fair degree of mathematical maturity---namely, a working knowledge of linear algebra together with the basics of finite group theory will be assumed. Prior knowledge on the theory of error correction and the stabilizer formalism will be beneficial.
Notational Remarks
For some Hilbert space $\Hil$, let $D(\Hil)$ denote the space of linear operators on $\Hil$ that are valid quantum density states (positive operators with unit trace). Moreover, let $C(\Hil_1,\Hil_2)$ denote the space of valid quantum channels (completely-positive and trace preserving maps) that transform states $\rho_1\in D(\Hil_1)$ to states $\rho_2\in D(\Hil_2)$. For convenience, define $C(\Hil):=C(\Hil,\Hil)$ for the case where the quantum channels map states from a particular space $\Hil$ to the same space $\Hil$.
Quantum Error Correcting Codes
To formally define the channel capacity $Q(\Phi)$ of a quantum channel $\Phi$, we first need to formalize the notion of a quantum error correcting code. The objective of quantum error correction is to encode a quantum state $\rho$ into a particular subspace $\Hil_{code}\subset\Hil$ of a larger Hilbert space $\Hil$ by the means of some quantum operation $U_c: \rho\mapsto \rho_c$, where $\rho_c\in\Hil_{code}$. The operation $U_c$ is called the \emph{encoding operation}, the state $U_c(\rho)=\rho_c$ is called the \emph{encoded state} of $\rho$, and the space $\Hil_{code}$ is often referred to as the \emph{code space} of a quantum error correcting code. States of $\Hil_{code}$ should have the desired property that they can be recovered when subjected to errors. More formally, let $\Phi\in C(\Hil)$ be some quantum channel of interest that represents the action of a possible error effecting the state $\rho_c$, and let $\Phi(\rho_c)=\rho_e$. Along with the existence of an encoding operation $U_c$, an error correcting code also has a \emph{decoding operation} given by $U_d$. This operation serves the purpose of decoding the possibly erred state $\rho_e$, and should ideally satisfy $U_d(\rho_e)=\rho$. An error correcting protocol as described here is summarized schematically in the figure below.
A circuit representing the general error error correction scheme. 
In a less idealistic scenario, when $U_d(\rho_e)=\rho'\neq\rho$, we may be satisfied with having the resulting state $\rho' $ being approximately equal to $\rho$ through some measure of closeness. One such useful measure is given by the \emph{fidelity} $F(\rho,\rho')$ of two states $\rho,\rho'\in D(\Hil)$ of some Hilbert space $\Hil$, where
\[
F(\rho,\rho')=Tr(\sqrt{\rho^{\frac{1}{2}}\rho'\rho^{\frac{1}{2}}}),
\]
and $Tr(.)$ denotes the trace operation. The fidelity of two states satisfies $0\leq F(\rho,\rho')\leq1$, where increasing values of $F(\rho,\rho')$ implying that the states $\rho$ and $\rho'$ are closer.
Noisy Channel Coding and Channel Capacity
Let $\Phi\in C(\Hil)$ be some quantum channel, and consider its $n$-fold extension
\[
\Phi^{\otimes n}=\Phi\otimes\dots\otimes\Phi \in C(\Hil^{\otimes n})=C(\Hil)\otimes\dots\otimes C(\Hil),
\]
which essentially represents the channel $\Phi$ acting individually on each of the $n$ copies of $\Hil$. Now, suppose there is some error correcting code with code space $\Hil_{code}\subset\Hil$. For noisy channel coding \cite{Nielsen}, we consider the $n$-fold code space $\Hil_{code}^{\otimes n}\subset \Hil^{\otimes n}$, and are interested in source states $\rho\in D(\Hil)$ for which there exists an encoding operation $\overline{U}_c$ that maps the source state to an encoded state $\overline{U}_c(\rho)=\rho_c\in D(\Hil_{code}^{\otimes n})$. The encoded state is then acted on by $\Phi^{\otimes n}$ and subsequently decoded by a decoding operation $\overline{U}_d$, yielding a state $\rho'\in D(\Hil)$ that is close to the original source state $\rho$ with respect to the fidelity $F(\rho,\rho')$.
Suppose that the dimension of the space $\Hil$ is $2^m$ so that $\rho\in D(\Hil)$ is a state consisting of $m$ qubits. In this context, the error $\Phi$ can be thought of as operating on $n$ blocks, each consisting of $m$ qubits, as illustrated in the circuit below. Thus, this coding model can be thought of as using $n$ applications of the channel $\Phi$ to successfully send $m$ qubits through the channel, provided that encoding and decoding operations $\overline{U}_c$ and $\overline{U}_d$ exist which preserve the original source state $\rho$.
A circuit representing a particular noisy channel coding scheme, where 
an input state $\rho$ of some number $m$ of qubits is acted on by $4$ 
blocks consisting of $m$ qubits each.
The quantity $R=m/n$ is defined as the rate of this coding scheme over the channel $\Phi$. In this way, the channel capacity $Q(\Phi)$ is defined as the maximum asymptotic rate at which information can be reliably sent through the channel $\Phi$. More precisely, the channel capacity is the largest value $Q(\Phi)$ such that for any $R< Q(\Phi)$ and $\epsilon>0$, there exists, for sufficiently large $n$, an error correcting code $\Hil_{code}^{\otimes n}$ with rate at least $R$ where for any $\rho$ with $\overline{U}_c(\rho)=\rho_c\in\Hil_{code}^{\otimes n}$, the resulting recovered state $\rho'$ satisfies $F(\rho,\rho')\geq 1-\epsilon$.
Modelling the Erasure Channel
Let $\ket{0}, \ket{1} \in \Hil^2$ denote the standard computational basis states of a Hilbert space $\Hil^2$ of dimension $2$ representing a qubit. To model an erasure we will embed the states $\ket{0}$ and $\ket{1}$ into a higher dimensional space and introduce a third basis state $\ket{2}\in\Hil^3$ that is orthogonal to both $\ket{0}$ and $\ket{1}$, so that the three states together span $\Hil^3$. This third state $\ket{2}$ then serves the purpose of representing an erasure of either of the states $\ket{0}$ and $\ket{1}$.
Now consider appending to the state of a qubit an ancillary register belonging to some Hilbert space $\Hil_E$ that represents the ``environment'' and has some appropriate dimension (which may depend on the context). Then the action of the erasure channel, denoted by $\mathcal{E}_p$, on the basis states of the qubit is defined as
\begin{align*}
\ket{0}\otimes\ket{0}_E& \ \ \overset{\mathcal{E}_p}\mapsto \ \ \sqrt{1-p}\ket{0}\otimes\ket{0}_E+\sqrt{p}\ket{2}\otimes\ket{1}_E \\
\ket{1}\otimes\ket{0}_E& \ \ \overset{\mathcal{E}_p}\mapsto \ \ \sqrt{1-p}\ket{1}\otimes\ket{0}_E+\sqrt{p}\ket{2}\otimes\ket{2}_E, \\
\end{align*}
where the states $\ket{0}_E,\ket{1}_E,\ket{2}_E\in\Hil_E$ are orthogonal states of the environment, and the map $\mathcal{E}_p$ has been parametrized by some real number $0\leq p\leq1$ that gives the probability that the qubit will be erased.
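Tracing out the environment in the map above gives an equivalent description of the channel in terms of Kraus operators. The following is a minimal sketch of that description; the function names `erasure_kraus` and `apply_erasure` are illustrative choices, and the three operators are obtained by projecting the environment onto $\ket{0}_E$, $\ket{1}_E$, and $\ket{2}_E$ respectively.

```python
import numpy as np

def erasure_kraus(p):
    """Kraus operators (3x2 matrices) of the qubit erasure channel with erasure probability p."""
    embed = np.zeros((3, 2))
    embed[0, 0] = 1.0  # |0> -> |0>
    embed[1, 1] = 1.0  # |1> -> |1>
    k0 = np.sqrt(1.0 - p) * embed   # environment found in |0>_E: qubit untouched
    k1 = np.zeros((3, 2))
    k1[2, 0] = np.sqrt(p)           # environment found in |1>_E: |0> was erased to |2>
    k2 = np.zeros((3, 2))
    k2[2, 1] = np.sqrt(p)           # environment found in |2>_E: |1> was erased to |2>
    return [k0, k1, k2]

def apply_erasure(rho, p):
    """Apply the erasure channel to a 2x2 density matrix, returning a 3x3 density matrix."""
    return sum(k @ rho @ k.conj().T for k in erasure_kraus(p))

plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)
out = apply_erasure(rho, 0.25)
print(np.round(out, 3))
```

The output is block diagonal: the upper-left $2\times 2$ block is $(1-p)\rho$ and the lower-right entry is $p$, which matches the description of the channel as erasing the qubit with probability $p$ and leaving it unchanged otherwise.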
The Channel Capacity of the Erasure Channel
To calculate the channel capacity $Q(\mathcal{E}_p)$ of the erasure channel, we must consider the scenario described in the previous section on noisy channel coding, where now the channel of interest is $\Phi=\mathcal{E}_p$. In what follows we will prove the following theorem:
Theorem: The channel capacity of the erasure channel.
For the erasure channel $\mathcal{E}_p$, which erases the qubit state with probability $p$ and leaves the state intact with probability $1-p$, the channel capacity $Q(\mathcal{E}_p)$ is given by
$$ Q(\mathcal{E}_p)=\left\{
\begin{array}{rl}
1-2p &, 0\leq p\leq\frac{1}{2}\\
0 &, \frac{1}{2}\leq p\leq 1
\end{array}\right. $$
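For quick reference, the statement of the theorem can be written as a one-line function. This is only a transcription of the claimed formula, not a proof of it, and the name `erasure_capacity` is an illustrative choice.

```python
def erasure_capacity(p):
    """Quantum capacity of the qubit erasure channel as stated in the theorem."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("erasure probability must lie in [0, 1]")
    return max(0.0, 1.0 - 2.0 * p)

for p in (0.0, 0.25, 0.5, 0.8):
    print(p, erasure_capacity(p))
```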
The proof of this theorem will proceed in three parts. First, for the case $1/2\leq p \leq 1$, we will appeal to the no-cloning theorem \cite{NC} to show that no error correcting procedure can exist which allows recovery of states with arbitrarily high fidelity through the channel $\Er_p$. Then, for the case $0\leq p \leq 1/2$, the bound $Q(\Er_p)\leq 1-2p$ will be established by arguing that the number of qubits transmitted jointly over a noisy channel and a perfectly noiseless channel is sub-additive \cite{Ben1}. Lastly, using the stabilizer formalism of quantum error correcting codes \cite{Got}, it will be shown that error correcting codes must exist which can actually achieve the rate $1-2p$.
The $p\geq 1/2$ Case
The fact that $Q(\Er_p)=0$ for $p\geq 1/2$ means that it is impossible to reliably transmit any quantum information through the erasure channel $\Er_p$. This implies that there cannot exist any error correcting procedure that can effectively recover the input state $\rho$ with a fidelity arbitrarily close to $1$. To see why this is the case, we will appeal to the \emph{no-cloning} theorem, which states that it is impossible to have a general quantum operation which has the ability to ``clone'', or make copies of, arbitrary quantum states \cite{NC}. Thus, we will argue that if there did exist an error correcting procedure with valid encoding and decoding operations which effectively preserved the initial state $\rho$, then such a procedure would also provide a way of cloning these quantum states and therefore contradict the no-cloning theorem.
For the sake of argument, it is worth anthropomorphizing the situation where quantum states pass through an erasure channel. Consider three different parties: Alice, Bob, and Charlie. The erasure channel can be effectively realized through a particular interaction of these parties as follows. Suppose Alice has $n$ qubits and would like to send these qubits to Bob. Moreover, suppose Charlie has the ability to intercept the qubits Alice is trying to send to Bob, and also has the ability to send erased states $\ket{2}$ to Bob. Suppose Charlie intercepts each of Alice's qubits with probability $p$, in which case he sends Bob the erased state, and with probability $1-p$ allows Alice's qubit to pass on to Bob. Then for large $n$ Bob will have received approximately $(1-p)n$ qubits and Charlie will be in possession of approximately $pn$ qubits. For $p\geq 1/2$, Charlie will end up with at least as many qubits as Bob, and if it were possible for Bob to recover the initial states of the qubits from some error correcting procedure then surely Charlie could as well using the same procedure. Therefore, their recovered states taken together would constitute two copies of the original state. However, by the no-cloning theorem, such a procedure cannot exist. Hence, no recovery procedure can exist in this case, which implies that the channel capacity $Q(\Er_p)$ for $p\geq 1/2$ must be $0$.
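The counting in this argument can be illustrated with a small classical simulation. It only tracks who ends up holding each qubit, not any quantum state, and the function name `simulate_interception` and the fixed seed are illustrative assumptions.

```python
import random

def simulate_interception(n, p, seed=0):
    """Count how many of Alice's n qubits reach Bob and how many Charlie intercepts.

    Each qubit is intercepted independently with probability p.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    charlie = sum(1 for _ in range(n) if rng.random() < p)
    bob = n - charlie
    return bob, charlie

n = 100_000
bob, charlie = simulate_interception(n, p=0.6)
print(bob / n, charlie / n)
```

With $p=0.6$ the two fractions should be close to $0.4$ and $0.6$, so Charlie holds at least as many of Alice's qubits as Bob, which is the situation the no-cloning argument exploits.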
Upper bounding $Q(\Er_p)$ for $0\leq p \leq1/2$
Consider some imperfect channel $\Phi$ with channel capacity $0<Q(\Phi)<1$ and a related perfect channel $\Phi_I$ with channel capacity $Q(\Phi_I)=1$. If the perfect channel is used $a$ times, $aQ(\Phi_I)=a$ qubits can be transmitted, as expected, whereas if the imperfect channel is used $b$ times the number of qubits that can be transmitted is given by $bQ(\Phi)<b$. Now suppose the perfect and imperfect channels are used jointly, $a$ and $b$ times respectively, and let $c$ denote the number of qubits that can be transferred when both channels are used. The number of qubits transmitted by this joint channel is said to be additive if $c=a+bQ(\Phi)$, sub-additive if $c\leq a+bQ(\Phi)$, and super-additive if $c> a+bQ(\Phi)$.
It will now be shown that the number of qubits that can be jointly transmitted over a perfect and an imperfect channel, as described above, must be sub-additive. The proof proceeds in a manner similar to that presented in \cite{Ben1}. For the sake of contradiction, suppose that the number of jointly transmitted qubits is super-additive, so that $c>a+bQ(\Phi)$ when the perfect channel is used $a$ times and the imperfect channel is used $b$ times. Observe that $a$ uses of the perfect channel $\Phi_I$ can be simulated by its imperfect counterpart $\Phi$ with channel capacity $Q(\Phi)$ by using the imperfect channel $d$ times, where $dQ(\Phi)=a$. Thus, letting $Q'(\Phi)$ denote the rate achieved in this way using only the channel $\Phi$, a total of $d+b$ times, it follows that
\[
Q'(\Phi)=\frac{c}{d+b}>\frac{dQ(\Phi)+bQ(\Phi)}{d+b}=Q(\Phi),
\]
where the inequality follows from the assumption $c>a+bQ(\Phi)=dQ(\Phi)+bQ(\Phi)$. However, here only the original channel $\Phi$ was used, so the rate $Q'(\Phi)$ cannot exceed the capacity $Q(\Phi)$; that is, $Q'(\Phi)> Q(\Phi)$ is impossible. Therefore the number of qubits jointly transmitted over the perfect and noisy channels must be sub-additive, that is, $c\leq a +bQ(\Phi)$.
With this result in mind, consider $p_2\leq p_1$ and suppose $n$ qubits are to be sent, where it is known ahead of time that $n(1-p_2/p_1)$ qubits will arrive intact and that each of the remaining $n(p_2/p_1)$ qubits will be erased with probability $p_1$. This scenario can be thought of as a perfect channel which transmits the $n(1-p_2/p_1)$ qubits together with an imperfect erasure channel which is used $n(p_2/p_1)$ times with erasure probability $p_1$ and corresponding channel capacity $Q(\Er_{p_1})$. Then the joint channel will effectively transmit $c$ qubits, where
\[
c\leq n\left(1-\frac{p_2}{p_1}\right)+n\left(\frac{p_2}{p_1}\right)Q(\Er_{p_1}),
\]
due to the sub-additivity result just shown. Then the rate $R$ of this channel is given by
\[
R=\frac{c}{n}\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1}).
\]
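Now, for sufficiently large $n$, the total number of qubits transmitted intact will be approximately $n(1-p_2/p_1)+n(p_2/p_1)(1-p_1)=n(1-p_2)$, and the number of qubits erased will be approximately $n(p_2/p_1)p_1=np_2$, which matches the behaviour of the erasure channel $\Er_{p_2}$ with erasure probability $p_2$. Since knowing in advance which qubits are guaranteed to arrive intact can only help the sender, the capacity of $\Er_{p_2}$ cannot exceed the rate of this joint channel. Using the inequality just derived, it follows that the channel capacity $Q(\Er_{p_2})$ must satisfy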
\[
Q(\Er_{p_2})\leq 1-\frac{p_2}{p_1}+\frac{p_2}{p_1}Q(\Er_{p_1}),
\]
which holds for all $p_2\leq p_1$. In the previous section we have already shown that $Q(\Er_{1/2})=0$ for the erasure channel with erasure probability $1/2$. Furthermore, it trivially holds that $Q(\Er_{0})=1$ since this corresponds to the special case of a perfect channel where no erasures take place. Thus, setting $p_2=p$ and $p_1=1/2$, for all $p\leq 1/2$ we arrive at
\[
Q(\Er_{p})\leq 1-\frac{p}{1/2}+\frac{p}{1/2}Q(\Er_{1/2})=1-2p,
\]
giving an upper bound on the channel capacity $Q(\Er_{p})$ in the case where $p\leq 1/2$.
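The interpolation step can be checked numerically. The sketch below uses two hypothetical helper functions: `capacity_upper_bound` evaluates the right-hand side of the inequality for given $p_2\leq p_1$ and a known value of $Q(\Er_{p_1})$, and `erased_fraction` computes the overall fraction of erased qubits when a fraction $p_2/p_1$ of the qubits pass through $\Er_{p_1}$ and the rest through a perfect channel.

```python
def capacity_upper_bound(p2, p1, q_p1):
    """Right-hand side 1 - p2/p1 + (p2/p1) * Q(E_{p1}) of the interpolation inequality."""
    if not 0.0 <= p2 <= p1 <= 1.0 or p1 == 0.0:
        raise ValueError("need 0 <= p2 <= p1 <= 1 and p1 > 0")
    frac = p2 / p1  # fraction of qubits sent through the noisy channel
    return 1.0 - frac + frac * q_p1

def erased_fraction(p2, p1):
    """Overall erasure fraction when a share p2/p1 of the qubits is erased with probability p1."""
    return (p2 / p1) * p1

# With p1 = 1/2 and Q(E_{1/2}) = 0, the bound reduces to 1 - 2p.
for p in (0.0, 0.1, 0.25, 0.4, 0.5):
    print(p, capacity_upper_bound(p, 0.5, 0.0), erased_fraction(p, 0.5))
```

Each printed bound should equal $1-2p$ up to floating point error, and the erased fraction should equal $p$, in line with the identification of the joint channel with $\Er_{p}$.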
The existence of a code achieving the upper bound on $Q(\Er_p)$
It has now been shown that the quantum channel capacity of the erasure channel satisfies $Q(\Er_p)\leq 1-2p$ for erasure probability $0\leq p \leq 1/2$. Therefore, if it can also be shown that error correcting codes exist which allow for successful recovery of the initial source state at rates arbitrarily close to $1-2p$, it must necessarily follow that $Q(\Er_p)= 1-2p$ in the case where $0\leq p \leq 1/2$. Although an explicit code will not be constructed, it will now be shown, through counting arguments that exploit the properties of the stabilizer formalism, that a stabilizer code does indeed exist which is able to recover the state. The existence proof constructed here borrows ideas deployed by Gottesman for a proof of the same claim \cite{Got}. The unacquainted reader may feel free to consult this post for relevant properties pertaining to stabilizer codes needed for this analysis.
Suppose the error correcting protocol in this case takes place over a Hilbert space of $n$ qubits, which encodes $k$ qubits. Thus, consider a random stabilizer group $S\subset P_n$, a subgroup of the Pauli group $P_n$ on $n$ qubits, where $S$ is generated by $n-k$ generators. The operators contained in the stabilizer $S$ then preserve the code space $\Hil_{code}$ encoding $k$ qubits. These $n-k$ generators of $S$ are chosen at random from the $4^n$ possible choices in $P_n$ for stabilizer generators, with the condition that each chosen generator commutes with the previous choices and is independent of the others, in order to form a minimal generating set.
Now, consider some state of $\Hil_{code}\subset \Hil^{2^n}$ consisting of $n$ qubits which is sent through the erasure channel with probability of erasure $p$. Then for large $n$, the approximate number of qubits that will be erased is given by $np$. Therefore, in order to attempt correcting these $np$ qubits, a syndrome measurement must be done by measuring the $n-k$ generators of $S$. Ideally, this syndrome will yield enough information to uniquely determine the operator $E\in P_n$ that represents the error that was inflicted on the qubits. By applying $E^\dagger\in P_n$, the error can be corrected, returning the corrupted state to its original state with high fidelity.
However, error recovery may fail if two distinct errors $E_1$ and $E_2$ have the same syndrome measurement. It is important to consider the likelihood $P_{fail}$ of failing to recover the state in this case. In general, for some non-identity error $E\in P_n$, a randomly chosen stabilizer generator either commutes or anti-commutes with $E$, and does so with a probability of $1/2$ in each case. Since there are $n-k$ generators, the probability of two given distinct errors in $P_n$ having the same syndrome is given by $(1/2)^{n-k}$. Moreover, since there are $4^{pn}$ elements of $P_n$ that have support on the $pn$ erased qubits, the probability that some two distinct errors have the same syndrome measurement must be bounded by
\[
P_{fail}\leq4^{pn}\left(\frac{1}{2}\right)^{n-k}=2^{-n(1-2p-R)},
\]
where $R=k/n$ is the rate. Provided that the rate satisfies $R<1-2p$, the right-hand side will converge to $0$ as $n$ gets arbitrarily large. This implies that the probability of failing to recover the state approaches zero in this limit, so that such random stabilizer codes succeed in correcting the erasures at any rate $R<1-2p$ for $0\leq p< 1/2$. Hence, since rates arbitrarily close to $1-2p$ are achievable and this value coincides with the previously derived upper bound on the channel capacity, it must be the case that $Q(\Er_p)=1-2p$.
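Two ingredients of this counting argument can be checked directly. The sketch below represents Pauli operators, up to phase, as pairs of bit strings $(x,z)$, which is a standard convention but an assumption relative to the text above; the function names are illustrative. It verifies by enumeration that exactly half of all Pauli operators commute with a fixed non-identity error, and it evaluates the bound $4^{pn}(1/2)^{n-k}$ for a few block lengths. Note that the enumeration is over unconstrained Pauli operators, so it illustrates the probability $1/2$ heuristically rather than accounting for the commutation constraints among generators.

```python
from itertools import product

def commutes(g, e):
    """Paulis as (x, z) bit tuples commute iff their symplectic product is 0 mod 2."""
    (x1, z1), (x2, z2) = g, e
    s = sum(a * b for a, b in zip(x1, z2)) + sum(a * b for a, b in zip(z1, x2))
    return s % 2 == 0

def commuting_fraction(error, n):
    """Fraction of the 4^n Pauli operators (up to phase) that commute with `error`."""
    bits = list(product((0, 1), repeat=n))
    paulis = [(x, z) for x in bits for z in bits]
    return sum(commutes(g, error) for g in paulis) / len(paulis)

def failure_bound(n, k, p):
    """Bound 4^{pn} * (1/2)^{n-k} = 2^{-n(1 - 2p - k/n)} on the failure probability."""
    return 2.0 ** (2.0 * p * n - (n - k))

x_on_first_qubit = ((1, 0), (0, 0))  # the error X tensor I on two qubits
print(commuting_fraction(x_on_first_qubit, 2))

# Rate R = 0.2 < 1 - 2p = 0.5: the bound shrinks as n grows.
for n in (100, 200, 400):
    print(n, failure_bound(n, k=n // 5, p=0.25))
```

For a rate above $1-2p$ the same function returns values greater than $1$, in which case the bound is vacuous and says nothing about recoverability.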
Conclusion
In this post it was proved that the quantum channel capacity $Q(\Er_p)$ of the erasure channel $\Er_p$ satisfies $Q(\Er_p)=1-2p$ for $0\leq p\leq 1/2$ and $Q(\Er_p)=0$ for $1/2\leq p \leq 1$. It was argued in the latter case that the channel capacity must be $0$ in order to prevent a violation of the no-cloning theorem. The former case first involved finding an upper bound of $1-2p$ on the channel capacity, and then it was argued through the existence of random stabilizer codes that it is always possible to recover the state with high fidelity at rates below this bound, which implies that the upper bound on the rate of the channel can actually be achieved. The erasure channel provides a rare instance where the quantum channel capacity can be computed exactly. It seems that for most other channels of interest only upper and lower bounds have been found. There has been much research, both past and ongoing, on open questions pertaining to these matters, with many interesting qualitative and quantitative results.

