### The Factoring Problem: Sampling Estimates to an Integer Multiple of k/r

In this section, it will be shown that the problem of finding the order $r$ satisfying $x^r=1(mod\hspace{1mm} N)$ for some coprime integers $x$ and $N$, can be solved through classical means provided with random samples of integer multiples $k/r$ of $1/r$.  Furthermore, a quantum algorithm will be constructed based off of the eigenvalue estimation algorithm which can efficiently sample estimates of these random integer multiples of $1/r$. The results obtained from these sub-routines can then be used to factor $N$ via the reductions described in the previous section.

Let $r$ be some positive integer and consider the set $\{k/r\hspace{1mm}|\hspace{1mm}k\in\mathbb{Z}\}$ consisting of integer multiples of $1/r$. Now let $k_1/r$ and $k_2/r$ be two random elements from this set, and suppose the integer $r$ is unknown. Given the values of the $k_i/r$ it may not be possible to determine the value of $r$ because the fractions may or may not be simplified into lower terms leaving a certain ambiguity in determining $r$. However, if two other fractions can be found of the form $c_1/r_1$ and $c_2/r_2$ such that $gcm(c_1,r_1)=gcm(c_2,r_2)=1$ with $c_1/r_1=k_1/r$ and $c_2/r_2=k_2/r$, then it is possible to determine the value of $r$. With these conditions, the fractions $c_1/r_1$ and $c_2/r_2$ are both simplified to lowest terms so $r_1$ and $r_2$ can be uniquely determined. This would in turn determine $r$ since $lcm(r_1,r_2)=r$, where $lcm(r_1,r_2)$ denotes the least common multiple of $r_1$ and $r_2$.

The classical continued fractions algorithm \cite{hardy} (although not discussed here for the sake of brevity), can be used to find fractions of the type $c_1/r_1$ and $c_2/r_2$ provided with some random multiples $k_1/r$ and $k_2/r$, and can do so efficiently with a time complexity in $O(log^2r)$. Making use of this technique will be essential in the post-processing required to determine orders after the quantum part of the algorithm is applied.

The problem remains to have the ability to randomly sample integer multiples $k/r$ of $1/r$. Actually, as a consequence of the continued fractions algorithm it will suffice to obtain an estimate of $k/r$ to a certain degree of precision in order to arrive at the fraction $c_1/r_1$. This motivates the following problem.

Sampling Estimates to random integer multiples of 1/r

Input: Integers $x$ and $N$ such that $gcd(x,N)=1$

Problem: Output a number $y\in\{0,1,\dots,2^{n}-1\}$ such that for each $k\in\{0,1,\dots,r-1\}$, the probability $p$ such that $\left| \frac{y}{2^n}- \frac{k}{r}\right|\leq \frac{1}{2r^2}$ is $p\geq\frac{c}{r}$ for some constant $c>0$, where $r$ denotes the unknown order of $x$.

The distance constraint required in the problem statement ensures that the estimate meets a certain degree of precision in order to yield appropriate results. In addition, there is  a probability constrained present to make sure such a sampling is faithful in practice.

To aid in solving the sampling problem, define the operator given by
$U_x: \left|s\right>\mapsto \left|sx (mod\hspace{1mm}N)\right>.$
with the assumption that $x$ is coprime to $N$ such a map always has a well-defined inverse so that $U_x$ in indeed a unitary operator. The order $r$ of $x$ is reflected in the  action of the operator $U_x$ since $x^r=1(mod\hspace{1mm} N)$ and
$U_x^r\left|s\right>=\left|sx^r (mod\hspace{1mm}N)\right>=\left|s\right>.$
This implies that the eigenvalues of $\left|s\right>$ will actually be $r^{th}$ roots of unity of the form $e^{i2\pi k/r}$ for integers $k\in\{0,1,\dots, r-1\}$.

Now, consider the state
$\left|u_k\right>=\displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{s=0}^{r-1}e^{-i2\pi \frac{k}{r} s}\left|x^s (mod\hspace{1mm}N)\right>.$
Then observe that
$\begin{array}{r l} U_x\left|u_k\right>=&\displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{s=0}^{r-1}e^{-i2\pi \frac{k}{r} s}U_x\left|x^s (mod\hspace{1mm}N)\right> \\ =& \displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{s=0}^{r-1}e^{-i2\pi \frac{k}{r} s}\left|x^{s+1} (mod\hspace{1mm}N)\right> \\ =& \displaystyle\frac{e^{i2\pi \frac{k}{r}}}{\sqrt{r}}\displaystyle\sum\limits_{s=0}^{r-1}e^{-i2\pi \frac{k}{r} (s+1)}\left|x^{s+1} (mod\hspace{1mm}N)\right>\\ =&e^{i2\pi \frac{k}{r}}\left|u_k\right>, \end{array}$
where the last identity follows from $e^{i2\pi \frac{k}{r}r}\left|x^r (mod\hspace{1mm}N)\right>=e^{i2\pi \frac{k}{r}0}\left|x^0 (mod\hspace{1mm}N)\right>=\left|1 (mod\hspace{1mm}N)\right>.$ Hence, the $\left|u_k\right>$ is an eigenstate of $U_x$ with eigenvalue $e^{i2\pi \frac{k}{r}}$.

If such a state $\left|u_k\right>$ was provided, then the eigenvalue estimation algorithm can be implemented on the input state $\left|0\right>\left|u_k\right>$ to yield the output $\left|\tilde{k/r}\right>\left|u_k\right>$. Then upon measuring the first register an estimate $\tilde{k/r}$ of some integer multiple of $1/r$ could be determined.  The issue here is that, since $r$ is originally unknown, the state $\left|u_k\right>$ cannot be prepared directly because it is defined in terms of $r$. Fortunately,  such a state does not need to be constructed directly in order to effectively sample estimate to an integer multiple of $1/r$. For consider the following equally weighted superposition of all $r$ states $\left|u_k\right>$
$\begin{array}{r l} \displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{k=0}^{r-1}\left|u_k\right>=& \displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{k=0}^{r-1}\displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{s=0}^{r-1}e^{-i2\pi \frac{k}{r} s}\left|x^s (mod\hspace{1mm}N)\right> \\ =& \displaystyle\frac{1}{r}\displaystyle\sum\limits_{s=0}^{r-1}\left(\displaystyle\sum\limits_{k=0}^{r-1}e^{-i2\pi \frac{k}{r} s}\right)\left|x^s (mod\hspace{1mm}N)\right> \end{array}$
The amplitude $\alpha_1$ of the state $\left|1\right>=\left|x^0 (mod\hspace{1mm}N)\right>$ in the above superposition. Notice this occurs when $s=0$ since $x^0=1$. So the amplitude is therefore
$\alpha_1=\displaystyle\frac{1}{r}\displaystyle\sum\limits_{k=0}^{r-1}e^{-i2\pi \frac{k}{r} 0}=\displaystyle\frac{1}{r}\displaystyle\sum\limits_{k=0}^{r-1}1=1.$
Since the superposition satisfies the normalization constraint, the amplitudes of all other states $\alpha_s$ for $s \neq 0$ must be zero. Hence,
$\displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{k=0}^{r-1}\left|u_k\right>=\left|1\right>$  and this superposition is really just the computational basis state $\left|1\right>$ in disguise.

If instead this superposition was input into the eigenvalue estimation algorithm, then some superposition of states corresponding to estimates of integer multiples of $1/r$ will be output. This is ultimately a consequence of the linearity of the operator implementing the eigenvalue estimation algorithm.  More explicitly, the action of the eigenvalue algorithm when the target register is prepared in the superposition consisting of  equally weighted eigenstates, outputs an equally weighted superposition of estimates of integer multiples of $1/r$:
$\left|0\right>\left|1\right>=\displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{k=0}^{r-1}\left|0\right>\left|u_k\right> \mapsto \displaystyle\frac{1}{\sqrt{r}}\displaystyle\sum\limits_{k=0}^{r-1}\left|\tilde{k/r}\right>\left|u_k\right>.$
By measuring the control register some state $\left|\tilde{k/r}\right>$ corresponding to an integer $x$ will be observed with equal probability such that $x/2^n$ is an estimate of an integer multiple $\frac{k}{r}$. Therefore, a uniformly random integer multiple of $1/r$ can be sampled. A circuit depicting the algorithm uses to sample estimates of a random integer multiple of $1/r$ is shown in the figure below. Having found such samples, allows the order $r$ to be determined using the continued fractions algorithm. Then, this order $r$ can be used to find a non-trivial factor of the integer $N$ as described through the classical methods involved in the reduction of the problem. (A circuit for the problem of sampling random integer multiples of $1/r$. This circuit is analogous to the general circuit for the eigenvalue estimation problem shown here, except here the controlled $-U_{x}^{x}$ operation is defined differently. In this circuit, the input in the target register is prepared to be in the state $\left|1\right>$ instead of an eigenstate of $U_x$. Consequently the output in the control register is a superposition of estimates $\left|\tilde{k/r}\right>$ of integer multiples of $1/r$.)

This circuit is nearly identical to the circuit presented for the general eigenvalue estimation algorithm. Here, the controlled$-U_{x}^{x}$ gate is supposed to signify an analogous sequence of controlled $-U_{x}^{2^s}$ operations, where $s$ ranges from $0\leq s \leq r-1$. Since the action of the $U_x$ gate is defined to multiply a basis state by $x(mod\hspace{1mm}N)$, a natural way to compute the $c-U_x^{2^s}$ gate for some $s$ would be to simply apply the $c-U_x$ gate $2^s$ times. This, however, would not be efficient as it requires an exponential number of operations to implement. Instead, observe that by applying a similarly defined gate $c-U_{x^{2^s}}$ accomplishes an equivalent operation and only needs to be applied a single time. Moreover, calculating the value of $x^{2^s}$ by repeatedly squaring $x^2$ modulo $N$ can be done classically with only $s$-many multiplications.

To be more rigorous, it can be shown that the $c-U_{x}^{x}$ operation can be computed with time in $O((logN)^2loglogNlogloglogN)$.\cite{mosca} Recall that the $QFT$ and $QFT^{-1}$ can each be computed in time $O((logN)^2)$. Therefore, the whole sampling problem can be implemented efficiently in this case with time $O((logN)^2loglogNlogloglogN)$. When compared to the classical complexity of the same problem the best known bounds are in $e^{O((logN)^{1/2}(loglogN)^{1/2})}$. Thus, the quantum algorithm can perform exponentially faster than the classical when trying to factor integers.

It is worth mentioning that the approach taken by Shor in his factoring algorithm \cite{shor} involved a slightly different analysis. However, when comparing his method to the one outlined here it can be seen that the main different results from analyzing the problem in a different basis.

### The Factoring Problem: The Reduction of Factoring to Order-finding

For any positive integer $N$, the fundamental theorem of arithmetic says that $N$ can be expressed as a product of prime numbers alone. That is, there exist  distinct prime numbers $p_i$ and positive integers $q_i$ such that
$N=p_1^{q_1}p_2^{q_2}\dots p_m^{q_m}.$
The factoring problem can thus be phrased as follows.

The Factoring Problem

Input: A positive integer $N$.

Problem: Determine distinct prime numbers $p_i$ and integers $q_i$ such that  $N=p_1^{q_1}p_2^{q_2}\dots p_m^{q_m}.$

The factoring problem becomes more or less difficult depending on the nature of the integer $N$ to be factored. If $N$ is even, then $2$ will always be a factor of $N$. Moreover, some power of $2$  may be a factor of $N$, and each such power of $2$ will also be a factor of $N$. Therefore, assume $N$ is odd so that $2$ is not a factor of $N$.  A slightly harder case is if $N=p^q$ for some odd prime $p$ and integer $q$. However, even this scenario is easy to solve classically because it must always be the case that $q\leq log_2(N)$. So by simply computing the first $log_2(N)$ roots of $N$, say $N^{1/k}$ for some $k\leq log_2(N)$, and then checking to see if $N^{1/k}$ is a prime number determines a $p$ such that $p^k=N$. This test would need to be repeated at most $log_2(N)$ times, and since the problem of determining whether some integer is prime is in $\mathbf{P}$ the problem of finding a prime factor $p$ such that $p^q=N$ will have a time complexity corresponding to some polynomial in $log_2(N)$.

The remaining case to consider in factoring $N$ is when $N$ is an odd, non-prime power--that is, $N$ has at least two distinct  odd prime factors. In this case the integer can be expressed as $N=N_1N_2$ for two positive integers $N_1$ and $N_2$ both less than $N$. Each $N_i$ is a factor of $N$, which can in turn be factored further if the $N_i$ is not itself a prime integer. Hence, the problem of factoring $N$  in this case can be reduced the problem of splitting an odd, non-prime power $N$.

Splitting an Odd, Non-prime Power

Input: An odd positive integer $N$ such that $N$ is not a power of some prime.

Problem: Determine positive integer factors $N_1$ and $N_2$ less than N such that $N=N_1N_2$.

For any factorization of $N$ as $N=N_1N_2$, it must be the case that both $N_i$ are such that $N_i\leq \sqrt{N}$. Therefore, the problem of completely factoring an integer $N$ can be reduced to splitting some odd, non-prime integer $N$ at most $log_2(N)$ times. This reduction can be done classically and deterministically placing the complexity of such a reduction in the class $\mathbf{P}$.

Finally, suppose we are in the scenario where $N$ is an odd, non-prime power. Denote by $gcd(x,y)$ the greatest common divisor of two integers $x$ and $y$, and call two numbers coprime if and only if $gcd(x,y)=1$. Let $N$  be some integer to be factored, and $x$ be an integer such that $0<x<N$. If $N$ and $x$ are not coprime, then $gcd(x,N)>1$ will be a nontrivial factor of $N$. Moreover, $gcd(x,N)$ can be computed efficiently through classical means such as the extended Euclidean algorithm in time $O(log_2(N))$.\cite{knuth}

Instead, suppose that some integer $0<x<N$ is chosen such that $x$ and $N$ are coprime. Then since $gcd(x, N)=1$, a non-trivial factor cannot be found by simply evaluating $gcd(x,N)$. Then for $x$ coprime to $N$, consider the following sequence of numbers
$1=x^0 (mod\hspace{2mm}N),\hspace{2mm} x^1 (mod\hspace{2mm}N),\hspace{2mm} x^2 (mod\hspace{2mm}N), ...$
This sequence will necessarily begin to repeat itself after finitely many terms. That is, the function
$f_{x,N} : \{0,1,2,...\} \rightarrow \{0,1,2,...\},$ sending $a\mapsto x^a (mod\hspace{2mm}N),$
has a period $r$, which is the smallest positive integer $r$ that satisfies $f_{x,N}(a+r)=f_{x,N}(a)$ for all $a$. Such an $r$ also satisfies $x^r=1 (mod\hspace{2mm}N)$, and will be called the order of $x$.

Suppose that the period $r$ of the function $f_{x,N}(a)=x^a (mod\hspace{2mm}N)$ is given. In addition, suppose that $r$ is an even number. The it follows that
$x^r =1 (mod\hspace{2mm}N) \Longleftrightarrow (x^{ \frac{r}{2}})^2 = 1 (mod\hspace{2mm}N) \Longleftrightarrow (x^{ \frac{r}{2}})^2-1 = 0 (mod\hspace{2mm}N) \Longleftrightarrow$
$(x^{\frac{r}{2}} +1)(x^{ \frac{r}{2}}-1)= 0 (mod\hspace{2mm}N) \Longleftrightarrow (x^{ \frac{r}{2}} +1)(x^{ \frac{r}{2}}-1) = kN$ for some $k$.

Since $x>1$, this implies both $x^{ \frac{r}{2}} +1>1$ and $x^{ \frac{r}{2}}-1>1$, so that $k>0$. Therefore, $x^{ \frac{r}{2}} +1$ and $x^{ \frac{r}{2}}-1$ must share a factor with $N$. Now suppose that $(x^{ \frac{r}{2}} +1)$ and $(x^{ \frac{r}{2}}-1)$ are not multiples of $N$. Then the common factor shared with $N$ must be less than $N$. Calculating $gcd(x^{ \frac{r}{2}} +1, N)$ and $gcd(x^{ \frac{r}{2}}-1, N)$ would then yield two nontrivial factors of $N$. It has been shown that if we know $x$ and $r$ meeting the criterion above, then we could effectively determine nontrivial factors of $N$. Therefore, the problem of splitting an odd, non-prime integer $N$ can be reduced to the problem of finding the order $r$ of a random element $x$ $mod\hspace{2mm}N$.

The Order Finding Problem

Input:
Integers $x$ and $N$ such that $gcd(x,N)=1$

Problem: Find the order $r$ of $x$ $mod\hspace{2mm}N$, where $x^r=1 (mod\hspace{2mm}N)$.

In this reduction it was required that $r$ be an even number so that $r/2$ is also an integer. Since $x$ such that $0<x<N$ is initially chosen at random, it is possible to choose a $x$ where $r$ is not even or that at least one of $(x^{ \frac{r}{2}} +1)$ and $(x^{ \frac{r}{2}}-1)$ is a multiple of $N$. Such an $x$ would not yield a factor using the method described above. In this case, to successfully factor $N$ by reducing the problem to order finding, another $x$ would have to be chosen that does meet the necessary criterion. However, it can be shown that with probability greater than $1/2$, an $x$ can be chosen such that $r$ is even and $(x^{ \frac{r}{2}} +1)$ and $(x^{ \frac{r}{2}}-1)$ are not multiples of $N$. This reduction of factoring to order finding has a probabilistic classical algorithm placing the problem corresponding to this reduction in the complexity class $\mathbf{BPP}$.\cite{mosca}

The previous arguments only showed that an integer $N$ can be factored if the order $r$ of some integer $x$ coprime to $N$ is given, but did not offer any means of actually finding such an $r$. If an algorithm exists for finding these orders, and if it can accomplish the task efficiently, then an efficient algorithm for the factoring problem would follow through the series of reductions since each reduction cold be done efficiently as well. Its important to note that all these previous reductions are classical, and either deterministic or probabilistic. That is, no quantum algorithm has been used thus far in attempting to solve the factoring problem.  The next section will explicitly show how to find orders using an efficient quantum algorithm making use of the eigenvalue estimation algorithm, which will in turn solve the factoring problem efficiently.

### The Factoring Problem

Given some positive integer $N$ consider the problem of determining  its factors. Factors of $N$ are just the integers that evenly divide $N$. For a number that is not too large this calculation seems reasonable. The factoring problem can naturally be phrased as a decision problem, in which case the question is: Does $N$ have any nontrivial factors  less than some other integer $p$?'. As a decision problem, this problem can always be decided in the sense that there always exists a way in principle to answer this question with a yes' or `no' answer. For instance, some algorithm can simply perform an exhaustive test to check whether or not each integer less than $N$ divides $N$ evenly. Suppose now that the number N is some large 500 digit number, or even larger still! Unless this algorithm was able to make some lucky guesses at what the factors may be, this task would take an algorithm like this an impractical amount of time to accomplish. Without prior knowledge to any properties the number N may have, factoring a large N using a computer takes a long time since the computer has to test many potential factors to actually find them.

On the other hand, the task of simply multiplying two numbers together is a task that computers can efficiently solve. The difference in these two computational tasks, which is the ease in multiplying and the difficulty in factoring, forms the basis of a widely used cryptographic scheme known as  the RSA protocol.\cite{rivest} The RSA protocol is used mainly for purposes of digital security such as banking and other internet transactions. However, for this security, the implementors of RSA are relying on the assumption that no one has an algorithm or  computer fast enough to successfully factor N. For all practical purposes this assumption is reasonable since no classical algorithm is known which enables  efficient factoring. The current best being the number field sieve.\cite{lenstra}

In 1994, Peter Shor \cite{shor} constructed an algorithm to be implemented on a quantum computer that can factor a large integer exponentially faster than the best classical algorithms. This is interesting because Shor's algorithm provides evidence that quantum computers may have greater capabilities than traditional classical computers. One of these being the ability for quantum computers to render the RSA protocol obsolete. The running time of the algorithm is bounded by a polynomial in $n$, where $n\approx log(N)$ is the number of qubits used, thus placing the problem of factoring in the complexity class $\mathbf{BQP}$. Although this problem is in $\mathbf{NP}$ since its easy to verify that a factor is indeed a factor, it is not known if it is also in $\mathbf{BPP}$ or even $\mathbf{P}$. In other words, Shor's quantum algorithm is exponentially faster than any classical algorithm that is known at present.

The essence of Shor's algorithm lies in the fact that the problem of factoring an integer can be reduced to the problem of finding the period of a certain function, which is a problem that can be  efficiently solved on a quantum computer by making use of the eigenvalue estimation algorithm as it will be shown. This reduction can be done classically, meaning that there already exists known efficient and classical algorithmic means for factoring an integer provided with the period of a certain function. To understand why these reductions work requires some number theoretic results. The rest of this section will be invested on explaining the details of this reduction and how a quantum algorithm enables an efficient solution.

### Eigenvalue Estimation

This section will combine the technique deployed in the phase kick-back algorithms, and the method of phase estimation to solve a problem of a similar nature. Here, the objective will be to determine the eigenvalue of the form $e^{i2\pi\omega}$ of a eigenvector $\left|\phi\right>$ associated to a particular unitary operator $U$. In the eigenvalue estimation problem, the details of the unitary operator $U$ and eigenvector $\left|\phi\right>$ are unknown and not of main concern. It is assumed that such a state is already provided as well as the means to operate on the state with a circuit that implements $U$.

Eigenvalue Estimation Problem

Input: A quantum circuit implementing a unitary operator $U$ with an eigenvector $\left|\phi\right>$ and associated eigenvalue $e^{i2\pi\omega}$.

Problem: Determine an estimate for the unknown parameter $\omega$ which determines the eigenvalue.

Since $\left|\phi\right>$ is an eigenvector of $U$ with eigenvalue $e^{i2\pi\omega}$, then by definition $U\left|\phi\right>=e^{i2\pi\omega}\left|\phi\right>$. Define a controlled$-U$ operation, $c-U$ where a single qubit is used as the control register.  Then if the target register is prepared in the state $\left|\phi\right>$, the action of $c-U$ is given by
$\begin{array}{r l} c-U\left|0\right>\left|\phi\right>=&\left|0\right>\left|\phi\right>, \\ c-U\left|1\right>\left|\phi\right>=&e^{i2\pi\omega}\left|1\right>\left|\phi\right>. \end{array}.$
Thus, the phase factor $e^{i2\pi\omega}$ can be associating to the control qubit as in the phase kick-back algorithms. Now, denote the action of applying $c-U$ $m$ times as $c-U^m$, and observe that $c-U^m$ acts as follows
$\begin{array}{r l} c-U^m\left|0\right>\left|\phi\right>=&\left|0\right>\left|\phi\right>, \\ c-U^m\left|1\right>\left|\phi\right>=&e^{i2\pi\omega m}\left|1\right>\left|\phi\right>. \end{array}.$
This follows straight from the definition $\left|\psi\right>$ being an eigenvector of $U$ with eigenvalue $e^{i2\pi\omega}$, and since a product of exponentials is equivalent to a single exponential whose argument is the sum of the arguments of the exponentials.

The ability to perform the action of $c-U^m$ for various values of $m$ is helpful in solving the eigenvalue estimation problem, because this will allow the value of $\omega$ to be encoded into the relative phases of states. If the state
$\left|\psi\right>=\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{i2\pi \omega j}\left|j\right>,$
can be prepared, then as done in the phase estimation algorithms the value of $\omega$ can be estimated by applying the $QFT^{-1}$ to the state $\left|\psi\right>$. However, since the value of $\omega$ is unknown this state cannot be prepared directly. Fortunately, the ability to perform $c-U^m$ operations on the state $\left|\phi\right>$, which is given, can be used to construct the state $\left|\psi\right>$. Therefore, once this state $\left|\psi\right>$ has been constructed and the phase estimation algorithm is performed by making use of $QFT^{-1}$, information about $\omega$ will be obtained enabling an estimate of the eigenvalue.

A circuit can be constructed using $c-U^m$ gates that takes some basis state, say $\left|\mathbf{0}\right>\in\mathcal{H}^{2^n}$, and produces the state
$\begin{array}{r l} \left|\psi\right>=&\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{i2\pi \omega j}\left|j\right> \\ = &\left( \displaystyle\frac{\left|0\right>+e^{i2\pi \omega2^{n-1}}\left|1\right>}{\sqrt{2}}\right)\otimes\left( \displaystyle\frac{\left|0\right>+e^{i2\pi \omega2^{n-2}}\left|1\right>}{\sqrt{2}}\right)\otimes\dots\otimes\left( \displaystyle\frac{\left|0\right>+e^{i2\pi \omega2^{0}}\left|1\right>}{\sqrt{2}}\right) \\ =&\left|\psi_1\right>\otimes\left|\psi_2\right>\otimes\dots\otimes\left|\psi_n\right>. \end{array}.$
To accomplish this, first observe the action of $c-U^m$ on a qubit in an equally weighted superposition in the control register, and the eigenstate $\left|\phi\right>$ in the target register
$c-U^m\left(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\right)\left|\phi\right>=\left(\displaystyle\frac{\left|0\right>+e^{i2\pi\omega m}\left|1\right>}{\sqrt{2}}\right)\left|\phi\right>.$
Then by setting $m=2^n-k$ for integers $1\leq n\leq n$, it follows that
$c-U^{2^{n-k}}\left(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\right)\left|\phi\right>=\left(\displaystyle\frac{\left|0\right>+e^{i2\pi\omega 2^{n-k}}\left|1\right>}{\sqrt{2}}\right)\left|\phi\right>=\left|\psi_k\right>,$
which is the state of the $k^{th}$ qubit in the register of the state $\left|\psi\right>$.

Therefore, first take a control register in the basis state $\left|\mathbf{0}\right>\in\mathcal{H}^{2^n}$ and apply the Hadamard gate to each qubit putting putting the register in an equally weighted superposition
$H^{\otimes n}\left|\mathbf{0}\right>=\displaystyle\bigotimes\limits_{k=1}^{n}\left(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\right).$ Then by applying a $c-U^{2^{n-k}}$ multiple times, for $1\leq k\leq n$, each time controlled by the $k^{th}$ qubit in the control register the state $\left|\psi\right>$ is produced:
$\displaystyle\bigotimes\limits_{k=1}^{n}\left[\left(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\right)\left|\phi\right>\right]=\displaystyle\bigotimes\limits_{k=1}^{n}\left|\psi_k\right>\left|\phi\right>=\left|\psi\right>.$ (The circuit on the left implements the series of $c-U^{2^{n-k}}$ gates controlled by the $n$ qubits in the control register, where each is in the superposition $\frac{1}{\sqrt{2}}\left(\left|0\right>+\left|1\right>\right)$ and the target register in the eigenstate $\left|\phi\right>$, producing the state $\left|\psi\right>$ as output in the control register. The controlled gate labelled by $U^x$ in the circuit shown below is the shorthand symbol that represents the circuit above.) Before giving a complete construction of a circuit that solves the eigenvalue estimation problem, consider the circuit shown that represents just the sequence of $c-U^m$ gates acting on the states described above. As a notational convenience, this entire circuit will be represented in shorthand by the controlled$-U^x$ gate also shown above. Then the output $\left|\psi\right>$ provided in the control register after the controlled$-U^x$ is applied to $H^{\otimes n}\left|\mathbf{0}\right>\left|\phi\right>$ can be used as the input to the phase estimation algorithm. This is done by applying $QFT^{-1}$ to the control register, which produces some state $\left|x\right>$ such that $x/2^m=\tilde{\omega}$ is an estimate of the phase parameter  $\omega$ in the state $\left|\psi\right>$. This estimate can then be used to determine an estimate $e^{i2\pi\tilde{\omega}}$ of the eigenvalue. The circuit solving the eigenvalue estimation problem is shown in the figure below. Notice that the first gate is given by a $QFT$ acting on the control register, since this gate's action is equivalent to the action of $H^{\otimes n}$ on a control register where every qubit is in the state $\left|0\right>$. By doing this, a certain symmetry in the circuit and algorithm is highlighted that is similar to many of the previously introduced algorithms. In this case, it is seen that the $QFT$ plays an analogous role of the Hadamard gate $H^{\otimes n}$ in its use to encode and decode information after some controlled gate is applied.

(A circuit for the eigenvalue estimation algorithm. The $n$ qubit control register begins in the state $\left|\mathbf{0}\right>$ and the target register in the eigenstate $\left|\phi\right>$. First, the $QFT$ is applied to the control register, followed by the controlled$-U^x$ gate, and then the $QFT^{-1}$ is applied to the control register. This produces some basis state $\left|x\right>$ in the control register, which can then be used to estimate the eigenvalue through the phase parameter $\tilde{\omega}=x/2^n$.)

Since the $QFT$ and $QFT^{-1}$ can be implemented efficiently on a quantum computer, then provided with an eigenstate $\left|\phi\right>$ and an efficient means to implement the controlled$-U^x$, the eigenvalue estimation algorithm can solve the problem of estimating an eigenvalue efficiently.  The eigenvalue estimation problem is of a very general nature. After all, every unitary operator  acting on qubits has at least two eigenvalue/eigenvector pairs. Although this problem may still seem  of a somewhat contrived variety, the next section will reveal a remarkable application of the eigenvalue estimation algorithm to the problem of factoring integers efficiently.

### Phase Estimation

The utility of the quantum Fourier transform $QFT$ may not be immediately obvious. In this section, it will be shown that the $QFT$ can be applied to some state containing unknown relative phase factors in order to determine information about the phase.  Let $\omega=x/2^n$ for some positive integer $x<2^n$ so that $\omega=0.x_1x_2\dots x_n$, and suppose some state of the form
$\left|\psi\right>=\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{i2\pi \omega j}\left|j\right>=\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{\frac{i2\pi xj}{2^n}}\left|j\right>$
is provided. Since the action of the $QFT$ on the basis state $\left|x\right>=\left|x_1x_2\dots x_n\right>\in\mathcal{H}^{2^n}$ is
$QFT\left|x\right>=\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{\frac{i2\pi xj}{2^n}}\left|j\right>=\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{i2\pi \omega j}\left|j\right>=\left|\psi\right>,$
then the inverse $QFT^{-1}$ will map the state $\left|\psi\right>$ to the basis state $\left|x\right>$:
$QFT^{-1}\left|\psi\right>=QFT\left(\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{i2\pi \omega j}\left|j\right>\right) =\left|x\right>.$
If the value of $\omega=x/2^n$ in the relative phases of $\left|\psi\right>$ was unknown, then by simply applying the $QFT$ to $\left|\psi\right>$ and measuring yields the state $\left|x\right>$, which can then be used to determine $\omega$. This consequence motivates the phase estimation problem.

Phase Estimation Problem

Input: A state $\left|\psi\right>=\displaystyle \frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{j=0}^{2^n-1}e^{i2\pi \omega j}\left|j\right>\in\mathcal{H}^{2^n}$, where the phase parameter $\omega \in [0,1)\in\mathbb{R}$ is unknown.

Problem:  Determine an estimate for the unknown  phase parameter $\omega$.

The reason why an estimate the phase parameter $\omega$ is desired in the phase estimation problem is because $\omega$ may not be a rational number. In the case where $\omega$ is an irrational number, the binary expansion $\omega=x_1x_2x_3\dots$ is infinite. To determine $\omega$ exactly would then require infinitely many bits of information. On the other hand, if $\omega$ is a rational number then the binary expansion $\omega=x_1x_2\dots x_n$ will eventually terminate after $n$ digits. Even in this case the number of digits $n$ in the expansion can be an arbitrary large number. Therefore, an estimate $\tilde{\omega}=x_1x_2\dots x_k$ of $\omega$ might still be necessary. In the phase estimation problem, how close an estimate $\tilde{\omega}$ is to the parameter $\omega$ is determined by the number of qubits used to express the state $\left|\psi\right>$. The more qubits used, the closer the estimate will be to $\omega$.

As previously shown, if $\omega=x/2^n$ for some integer $x<2^n$, then the inverse quantum Fourier transform $QFT^{-1}$ can be used to determine $x$ exactly. To solve the phase estimation problem for general $\omega$,  some number of qubits $n$ is chosen in advance that will determine the degree of accuracy in the estimate. Then when $QFT^{-1}$ acting on $n$ qubits is applied to the provided state $\left|\psi \right>$, a state $\left|x\right>$ is produced such that $x/2^n=\tilde{\omega}$  yields an estimate of $\omega$. Therefore, the circuit that implements $QFT^{-1}$ can be used to solve the phase estimation problem, which is shown in the figure below. Recall that the inverse of the circuit implementing some unitary operator like $QFT$ is simply the same circuit with the order of the gates reversed, and  each gate is replaced with its inverse operation. (The circuit for the phase estimation algorithm. This circuit is the $QFT^{-1}$ circuit, which takes as input some state state $\left|\psi\right>$ with a phase factor $\omega$, and transforms it to the basis state $\left|x\right>\in\mathcal{H}^{2^n}$ such that $x/2^n=\tilde{\omega}$ is an estimate for the phase $\omega$ up to $n$ bits of accuracy. The $QFT^{-1}$ circuit is equivalent to the circuit  that results from reversing the order of the gates in the $QFT$ circuit, shown here, and replacing each gate with its inverse.)\label{fig:qftinverse}
\end{figure}

Since it, was already argued that the $QFT$ can be implemented efficiently with a number of gates in $O(n^2)$, and the $QFT^{-1}$ requires the exact same number of gates, it follows that the size or time complexity of implementing the $QFT^{-1}$ is also in $O(n^2)$. Thus, the phase estimation algorithm just described is a quantum algorithm that solves the phase estimation problem in polynomial time, placing the problem into the complexity class $\mathbf{BQP}$. Moreover, it can be justified further that the phase estimation algorithm will output $x$ corresponding to the state $\left|x\right>$ such that $x/2^n=\tilde{\omega}$ is the closest estimate to $\omega$ with a probability greater than $4/\pi^2$.\cite{mosca} This estimate is optimal for a given number $n$ of qubits used. That is $\tilde{\omega}=x/2^n$ will be the closest multiple of $1/2^n$ to $\omega$.

The phase estimation algorithm is an example of a problem that a quantum computer can solve more efficiently than any known classical algorithm. Recall, that it takes $O(n2^n)$ steps to implement the classical discrete Fourier transform. Therefore any classical algorithm that were to solve the phase estimation problem using the Fourier transform would take exponentially longer. The quantum phase estimation algorithm  can be said to provide  a superpolynomial speed-up when compared to the classical context.

### The Quantum Fourier Transform Circuit

The quantum Fourier transform $QFT$ was defined to map an arbitrary  computational basis state $\left|j\right>\in\mathcal{H}^{2^n}$ to the state
$\left|\psi\right>=QFT\left|j\right>=\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{k=0}^{2^n-1}e^{\frac{i2\pi jk}{2^n}}\left|k\right>.$
In this section, it will be shown that this state $\left|\psi\right>$ can be expressed as the tensor product of $n$ qubits $\left|\psi_i\right>$ so that
$\left|\psi\right>=\left|\psi_1\right>\otimes\left|\psi_2\right>\otimes\dots\otimes\left|\psi_n\right>.$
Such a construction will provide insight on how an explicit circuit can be constructed that implements the $QFT$.

Before beginning, recall the equivalent means of representing the computational basis states of $\bigotimes_{i=1}^{n}\mathcal{H}^2$ in the binary representation as $\left|b_1b_2\dots b_n\right>$, for $b_i\in\{0,1\}$, and  the decimal representation as $\left|j\right>$, for an integer $j\in{0,1,\dots,2^n-1}$, given by the map
$\displaystyle\sum\limits_{i=1}^{n}b_i2^{n-i}=b_12^{n-1}+b_{2}2^{n-2}+...+b_{n-1}2+b_n=j.$
For some fraction $\omega\in[0,1)\subset\mathbb{R}$, a binary representation can be defined for this decimal as $\omega=0.x_1x_2\dots x_n$ for some $n$ as the map
$\omega=\displaystyle\sum\limits_{i=1}^{n}x_i2^{-i}=x_12^{-1}+x_22^{-2}+\dots+x_n2^{-n}.$ Instead the decimal $\omega$ may be given by bitstrings that begin labelled with a higher index such as $\omega=x_lx_{l+1}\dots x_n$ for some $l>1$. In this case
$\omega=\displaystyle\sum\limits_{i=l}^{n}x_i2^{-i+l-1}=x_l2^{-1}+x_{l+2}2^{-2}+...+x_n2^{-n+l-1}.$
If $\omega$ is a rational number, then its decimal representation is of finite length and the number of binary digits $n$ of $\omega=0.x_1x_2\dots x_n$ is determined by the number of bits needed to completely specify $\omega$. Such a $\omega$ always corresponds to the fraction $\omega=x/2^n$ for some integer $x\in\{0,1,2,\dots, 2^{n}-1\}$. On the other hand, if $\omega$ is an irrational number then the binary representation $\omega=0.x_1x_2x_3\dots$ will contain infinitely many bits $x_i$ in the expansion. However, irrational numbers $\omega= 0.x_1x_2x_3\dots$ can be approximated by a rational number $\tilde{\omega}=0.x_1x_2\dots x_k$ for some $k$, meaning the $x_i$ of $\tilde{\omega}$ coincide with the $x_i$ of $\omega$ up to the first $k$ bits. In this context, $\tilde{\omega}$ is called a rational approximation of $\omega$ to $k$ bits of accuracy.

Let $k$ be some integer and consider the product of $2^k$ with $\omega=0.x_1x_2\dots x_n$
$2^k\omega=2^k\displaystyle\sum\limits_{i=1}^{n}x_i2^{-i}=x_12^{k-1}+x_{2}2^{k-2}+...+x_n2^{k-n}.$
Suppose now that $k$ is some positive integer such that $1\leq k<n$. Then this summation can be regarded as two different sums, one over non-negative powers of $2$ and the other over negative powers of $2$:
$\begin{array}{r l} 2^k\omega=2^k\displaystyle\sum\limits_{i=1}^{n}x_i2^{-i}= & \displaystyle\sum\limits_{i=1}^{k}x_i2^{k-i}+\displaystyle\sum\limits_{i=k+1}^{n}x_i2^{k-i} \\ = & \left(x_12^{k-1}+x_{2}2^{k-2}+...+x_k2^{0}\right)+\left(x_{k+1}2^{-1}+x_{k+2}2^{-2}+...+x_n2^{-n}\right) \\ \equiv & x_1x_2\dots x_{k} + 0.x_{k+1}x_{k+2}\dots x_{n} \\ = & x_1x_2\dots x_{k}.x_{k+1}x_{k+2}\dots x_{n}. \end{array}$
Here, $x_1x_2\dots x_{k}$ can be thought of as the integer part of the number $2^k\omega$, and $0.x_{k+1}x_{k+2}\dots x_{n}$ can be thought of as the fractional part of $2^k\omega$.

This representation will be useful in the analysis to come, because of the consequences this has when considering the number $e^{i2\pi(2^k\omega)}$ with $\omega=0.x_1x_2\dots x_n\in[0,1)$ and $k\geq 1$. Since $e^{i2\pi z}=1$ for any integer $z$, it follows that
$\begin{array}{r l} e^{i2\pi(2^k\omega)}=&e^{i2\pi( x_1x_2\dots x_{k}.x_{k+1}x_{k+2}\dots x_{n})} \\ &=e^{i2\pi(x_1x_2\dots x_{k}) }e^{i2\pi(0.x_{k+1}x_{k+2}\dots x_{n})} \\ &=e^{i2\pi z}e^{i2\pi(0.x_{k+1}x_{k+2}\dots x_{n})} \\ &=e^{i2\pi(0.x_{k+1}x_{k+2}\dots x_{n})}, \end{array}$where $z=x_1x_2\dots x_{k}$ is just some integer.

With these results in mind, the following identity will be proved which expresses the action of the $QFT$ on a basis state $\left|j\right>\in\mathcal{H}^{2^n}$ as the tensor product state of $n$ qubits, where $\left|j\right>=\left|j_1j_2\dots j_n\right>$ when represented in the binary representation. That is, if $\left|\psi\right>=QFT\left|j\right>$, it will be shown that $\left|\psi\right>=\left|\psi_1\right>\otimes\left|\psi_2\right>\otimes\dots\otimes\left|\psi_n\right>$ for some states $\left|\psi_i\right>\in\mathcal{H}^2$.
In this tensor decomposition the state of each qubit is in some superposition and has a relative phase factor in terms of the bit string $j_1j_2\dots j_n$. Showing this will ultimately be an  algebraic exercise in translating between the decimal and binary representation of basis states. Proceed as follows:
$\begin{array}{r l} \left|\psi\right>=QFT\left|j\right>=&\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{k=0}^{2^n-1}e^{\frac{i2\pi jk}{2^n}}\left|k\right> \\ =& \displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{k_1=0}^{1}\cdots \displaystyle\sum\limits_{k_n=0}^{1}e^{i2\pi j(\sum_{l=1}^nk_l2^{-l})}\left|k_1k_2\dots k_n\right> \\ =& \displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{k_1=0}^{1}\cdots \displaystyle\sum\limits_{k_n=0}^{1}\displaystyle\bigotimes\limits_{l=1}^{n}e^{i2\pi j k_l2^{-l}}\left|k_l\right> \\ =& \displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\bigotimes\limits_{l=1}^{n}\left[\displaystyle\sum\limits_{k_l=0}^{1}e^{i2\pi j k_l2^{-l}}\left|k_l\right>\right] \\ =& \displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\bigotimes\limits_{l=1}^{n}\left[\left|0\right>+e^{i2\pi j 2^{-l}}\left|k_l\right>\right] \\ =& \left( \displaystyle\frac{\left|0\right>+e^{i2\pi j2^{-1}}\left|1\right>}{\sqrt{2}}\right)\otimes\left( \displaystyle\frac{\left|0\right>+e^{i2\pi j2^{-2}}\left|1\right>}{\sqrt{2}}\right)\otimes\dots\otimes\left( \displaystyle\frac{\left|0\right>+e^{i2\pi j2^{-n}}\left|1\right>}{\sqrt{2}}\right) \\ =& \left( \displaystyle\frac{\left|0\right>+e^{i2\pi0.j_n}\left|1\right>}{\sqrt{2}}\right)\otimes\left( \displaystyle\frac{\left|0\right>+e^{i2\pi0.j_{n-1}j_n}\left|1\right>}{\sqrt{2}}\right)\otimes\dots\otimes\left( \displaystyle\frac{\left|0\right>+e^{i2\pi0.j_1j_2\dots j_n}\left|1\right>}{\sqrt{2}}\right). \end{array}$

This decomposition of $\left|\psi\right>= \bigotimes_{i=1}^n\left|\psi_i\right>$ into the tensor product of the states $\left|\psi_i\right>$ is useful because it shows that if its possible to construct a unitary operators $U_i$ on the state space such that $U_i\left|j_1j_2\dots j_n\right>=\left|j_1\right>\dots\left|j_{i-1}\right>\left|\psi_i\right>\left|j_{i+1}\right>\dots\left|j_n\right>$ for each $i\in\{1,2,\dots n\}$, then the composition $U_nU_{n-1}\dots U_1$ must be the unitary operator that performs the quantum Fourier transform $QFT$:
$QFT\left|j_1j_2\dots j_n\right>=U_{n}U_{n-1}\dots U_1\left|j_1j_2\dots j_n\right>.$
If a circuit can be explicitly constructed that implements each $U_i$ in terms  of elementary gates, then the composition of these circuits can be used to construct a circuit for the quantum Fourier transform.

In order to construct a circuit for each $U_i$ define the single qubit gate $R_k$ parameterized by an integer $k$ and  defined by mapping the basis states as $R_k\left|0\right>=\left|0\right>$ and $R_k\left|1\right>=e^{i2\pi/2^k}\left|1\right>$. The matrix for $R_k$ in the computational basis is given as
$R_k=\begin{pmatrix}1 & 0 \\ 0 & e^{i2\pi/2^k} \end{pmatrix},$
and it can be seen as adding a relative phase factor to a qubit  in some superposition
$R_k\frac{1}{\sqrt{2}}\left(\left|0\right>+\left|1\right>\right)=\frac{1}{\sqrt{2}}(\left|0\right>+e^{i2\pi/2^k}\left|1\right>).$
Also, define a two qubit controlled$-R_k$ gate $c_r-R_k$ acting in an ambient $n$ qubit register, where the index $r$ signifies that the $r^{th}$ qubit in the register is to serve as the control qubit. The $c_r-R_k$ gate applies the $R_k$ gate to some qubit only if the qubit in the $r^{th}$ register is in the state $\left|1\right>$ and leaves the qubit unchanged if the control is in the state $\left|0\right>$.

Consider the first qubit $\left|j_1\right>$ in the $n$ qubit basis state $\left|j\right>=\bigotimes_{i=1}^{n}\left|j_i\right>$. Then applying the Hadamard gate to $\left|j_1\right>$ gives
$H\left|j_1\right>=\frac{1}{\sqrt{2}}(\left|0\right>+e^{i2\pi 0.j_1}\left|1\right>),$
since $e^{i2\pi 0.j_1}=1$ if $j_1=0$ and $e^{i2\pi 0.j_1}=-1$ if $j_1=1$. Now, if a $c_2-R_2$ gate is applied to $H\left|j_1\right>$ while using the state $\left|j_2\right>$ of the qubit in the $2^{nd}$ register as the control qubit, then
$\begin{array}{r l} c_2-R_kH\left|j_1\right>=&c_2-R_k\frac{1}{\sqrt{2}}(\left|0\right>+e^{i2\pi 0.j_1}\left|1\right>) \\ =&\frac{1}{\sqrt{2}}(\left|0\right>+e^{i2\pi /2^2}e^{i2\pi 0.j_1}\left|1\right>) \\ =&\frac{1}{\sqrt{2}}(\left|0\right>+e^{i2\pi (0.j_1+j_n2^{-2})}\left|1\right>) \\ =&\frac{1}{\sqrt{2}}\left(\left|0\right>+e^{i2\pi0.j_1j_2}\left|1\right>\right). \end{array}$
In a similar manner, if this time the gate $c_3-R_3$ is applied to this state while using the $3^{rd}$ qubit $\left|j_3\right>$ in the register as the control what results is the state
$c_3-R_3c_2-R_2H\left|j_1\right>=c_3-R_3\frac{1}{\sqrt{2}}\left(\left|0\right>+e^{i2\pi0.j_1j_2}\left|1\right>\right)=\frac{1}{\sqrt{2}}\left(\left|0\right>+e^{i2\pi0.j_1j_2j_3}\left|1\right>\right).$
Then continuing in this way by applying $c_r-R_r$ gates iteratively for $2\leq r \leq n$ yields
$c_n-R_nc_{n-1}-R_{n-1}\dots c_2-R_2H\left|j_1\right>=\frac{1}{\sqrt{2}}\left(\left|0\right>+e^{i2\pi0.j_1j_2\dots j_n}\left|1\right>\right)=\left|\psi_n\right>,$
which is the state $\left|\psi_n\right>$ of the $n^{th}$ qubit in the register of the tensor decomposition of the $QFT$ shown above.

A similar set of operations is applied to each qubit $\left|j_i\right>$ in the $n$ qubit register. For the first $n-1$ qubits $\left|j_i\right>$, with $1\leq i<n$,  apply the gates
$c_{n+1-i}-R_{n+1-i}c_{n-i}-R_{n-1}\dots c_2-R_2H\left|j_i\right>=\frac{1}{\sqrt{2}}\left(\left|0\right>+e^{i2\pi0.j_ij_{i+1}\dots j_n}\left|1\right>\right)=\left|\psi_{n+1-i}\right>.$

For the last qubit $\left|j_n\right>$ of the register only apply the Hadamard gate. Notice that these states also correspond accordingly to the states $\left|\psi_{n+1-i}\right>$ of the $QFT$ decomposition.
By letting $U'_i$ denote the unitary operator consisting of these gates that act on the state $\left|j_i\right>$ of the $i^{th}$  qubit in the $n$ qubit register, it follows that
$U'_n\cdots U'_2U'_1\left|j_1j_2\dots j_n\right>=\left|\psi_n\right>\otimes\left|\psi_{n-1}\right>\otimes\dots\otimes\left|\psi_1\right>.$
This is almost exactly the state
$\left|\psi\right>=QFT\left|j_1j_2\dots j_n\right>=\left|\psi_1\right>\otimes\left|\psi_{2}\right>\otimes\dots\otimes\left|\psi_n\right>,$
except that order of the qubits in the registers is merely reversed when compared to $U'_n\cdots U'_2U'_1\left|j_1j_2\dots j_n\right>$. (Part of the circuit used to implement the quantum Fourier transform QFT. Not shown in the circuit is the seqiences of $SWAP$ gates used to reverse the order of the $\left|\psi_i\right>$. This circuit begins with an $n$ qubit register is some computational basis state $\left|j_1j_2\dots j_n\right>$ and then applies the sequence of gates given by $U'_i$ to each $\left|j_i\right>$ consisting of a Hadamard gate followed by controlled$-R_k$ gates that use the states $\left|j_k\right>$, for $j>i$ in the register as control qubits.)

At this point, one might be satisfied as accepting the state produced by $U'_n\cdots U'_2U'_1\left|j_1j_2\dots j_n\right>$  as being close enough to actually implementing the $QFT$. After all, simply relabeling the qubits in the registers seems to be a logical triviality. However, there does exist a unitary operator that effectively interchanges the states of two qubits in different registers given by the $SWAP$ gate presented in the section on quantum gates.

By appropriately interchanging the states of the qubits in the register of the state $\left|\psi_n\right>\otimes\left|\psi_{n-1}\right>\otimes\dots\otimes\left|\psi_1\right>$, this state can be transformed into the state $\left|\psi_1\right>\otimes\left|\psi_{2}\right>\otimes\dots\otimes\left|\psi_n\right>$. Only $\frac{n}{2}$ $SWAP$ operations performed on the appropriate registers are needed to reverse the order of the qubits. By performing this reversal of states after the operation $U'_n\cdots U'_2U'_1\left|j_1j_2\dots j_n\right>$ is applied, the basis state $\left|j_1j_2\dots j_n\right>$ is finally transformed into the state given by $\left|\psi\right>=QFT\left|j_1j_2\dots j_n\right>$. The circuit produced by appending the appropriate $SWAP$ gates to the end of the circuit shown in the figure below implements the actual $QFT$, and can itself be represented by a quantum gate that signifies this circuit as shown. (The gate representing the quantum Fourier transform $QFT$. The $QFT$ takes as input the state $\left|j\right>$ and transforms it to the state $\left|\psi\right>=\left|\psi_1\right>\otimes\left|\psi_2\right>\otimes\dots\otimes\left|\psi_n\right>$. This gate is equivalent to the circuit shown in the previous figure with the necessary $SWAP$ gates appended to the end of the circuit.)

A more careful analysis of the implementing the quantum Fourier transform is  worthwhile in order to determine its computational complexity. In terms of elementary gates, it can be seen that $n+1-i$ gates are needed to perform the operation $U'_i$. Then since each $U_i$ must be performed, for $1\leq 1\leq n$, the total number of gates to implement $U'_n\cdots U'_2U'_1\left|j_1j_2\dots j_n\right>$ is
$n+(n-1)\dots+1=\displaystyle\sum\limits_{i=1}^{n}i= \frac{n(n+1)}{2}.$Therefore the  circuit size or time complexity of implementing $U'_n\cdots U'_2U'_1\left|j_1j_2\dots j_n\right>$ is in $O(n^2)$. In addition, the $\frac{n}{2}$ $SWAP$ operations must be accounted for in order to determine the true circuit size of implementing the $QFT$. Recall, that $3$ controlled$-NOT$ gates were able to simulate the effect of the $SWAP$ gate. Then the total number of elementary gates needed to perform the $\frac{n}{2}$ swap operations is just $\frac{3n}{2}$, which is in $O(n)$ but still bounded above by $O(n^2)$. Hence, the total number of gates needed to perform the quantum Fourier transform $QFT$ is $O(n^2)$.

It is important to keep in mind that the $QFT$ was defined in general on the state space $\mathcal{H}^{N}$ for an arbitrary positive integer $N$. The construction shown above is for the case when $N=2^n$ for some $n$. Although, this case will be the most relevant for quantum computing, the QFT can be defined for general $N$.\cite{coppersmith} Moreover, the family of circuits constructed for varying $N$ is also uniformly polynomial, which is a necessary property when considering the computational complexity of algorithms that make use of the $QFT$.

The quantum Fourier transform will prove to be a useful and essential tool in the quantum algorithms to follow. The $QFT$ allows special superpositions of states to be constructed and novel ways to encode information in the states involved and their relative phases, which can then be exploited to accomplish interesting tasks. An important property of the $QFT$ is that it can be efficiently implemented in polynomial time, whereas $\omega(n2^n)$ operations are needed to implement the classical discrete Fourier transform on $2^n$ complex numbers.\cite{winograd}

Hence, the quantum algorithm is exponentially faster than its classical counterpart. Therefore, if the $QFT$ is used in some algorithm it can be done so efficiently.

### The Quantum Fourier Transform

In the previously introduced algorithms, the Hadamard transform played an essential role in being able to created a certain superposition of states that was exploited in the various ways by the algorithms. Recall the action of the Hadamard transform $H^{\otimes n}$ on an arbitrary basis state $\left|\mathbf{x}\right>\in\mathcal{H}^{2^n}$:
$H^{\otimes n}\left|\mathbf{x}\right>=\displaystyle \frac{1}{\sqrt{2^{n}}}\displaystyle\sum\limits_{\mathbf{y}\in\{0,1\}^n}(-1)^{\mathbf{x}\cdot\mathbf{y}}\left|\mathbf{y}\right>,$
which can be thought of as effectively encoding the string $\mathbf{x}$ into the relative paste factors present in the amplitudes of the states in the superposition. Since $H^{\otimes n}$ is its own inverse appying applying $H^{\otimes n}$ to the state $H^{\otimes n}\left|\mathbf{x}\right>$ returns the state $\left|\mathbf{x}\right>$:
$(H^{\otimes n}\left(\displaystyle \frac{1}{\sqrt{2^{n}}}\displaystyle\sum\limits_{\mathbf{y}\in\{0,1\}^n}(-1)^{\mathbf{x}\cdot\mathbf{y}}\left|\mathbf{y}\right>\right) =H^{\otimes n}H^{\otimes n}\left|\mathbf{x}\right>=\left|\mathbf{x}\right>/$In this regard, the Hadamard transform can also be thought of as decoding the information of the string $\mathbf{x}$ in the phase factors into the state $\left|\mathbf{x}\right>$.

Notice that all of the amplitudes defining the superposition of states in the Hadamard transform are either $\pm 1$. The quantum Fourier Transform will generalize the Hadamard transform by constructing certain weighted superpositions of basis states that possess more general phase factors of the form $e^{i2\pi\omega}$, for some real number $\omega\in(0,1)\subset\mathbb{R}$. In the case of the Hadamard transform, $\omega=0$ and $\omega=1/2$ making $e^{i2\pi\omega}=\pm1$. The quantum Fourier Transform will allow certain specially weighted superposition of states to be constructed, which will then be used in algorithms that can perform faster than known classical approaches.

The Quantum Fourier Transform

Consider an $N$--tuple of $N$ complex numbers $x_j\in\mathbb{C}$, where $j\in\{0,1, \dots, N-1\}$. The discrete \emph{classical} Fourier transform is a function $\mathcal{F}$ that takes these $N$ numbers and constructs $N$ other complex numbers in terms of the $x_j$:
$\begin{array}{r c l} \mathcal{F}: \mathbb{C}^N &\rightarrow&\mathbb{C}^N \\ (x_0, x_1, \dots, x_{N-1})& \mapsto &(y_0, y_1, \dots, y_{N-1}), \end{array}$
where each of the $y_k$ complex numbers are given by
$y_k=\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{j=0}^{N-1}e^{\frac{i2\pi jk}{N}}x_j.$
The quantum Fourier transform has a related action on the $N$ computational basis states $\left|j\right>\in\mathcal{H}^{N}$. Here, the basis states are represented in the \emph{decimal representation} where each state is indexed by the integer $j\in\{0,1,\dots, N-1\}$. The quantum Fourier transform, denoted by $\mathbf{QFT}$, maps these computational basis states to another basis called the Fourier basis consisting of weighted superpositions of the computational basis states as
$\begin{array}{r c l} QFT: \mathcal{H}^N &\rightarrow&\mathcal{H}^N \\ \left|j\right>& \mapsto &\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi jk}{N}}\left|k\right>. \end{array}$
In terms of the discrete classcical Fourier transform, the action of the $QFT$ can be expressed more concisely as
$\left|j\right> \mapsto \displaystyle \frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}y_k\left|k\right>.$
Now, consider the action of the $QFT$ on the $n$ qubit basis state $\left|0\right>$ where $N=2^n$,
$QFT\left|0\right>=\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi k\cdot 0}{N}}\left|k\right>=\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}\left|k\right>=H^{\otimes n}\left|0\right>,$
which is an equally weighted superposition of all $N$ basis states and this precisely the same state produced  when Hadamard gates are applied to each qubit. Thus, in the case where $\left|j\right>=\left|0\right>$, $QFT\left|j\right>=H^{\otimes n}\left|j\right>$. In general though, $QFT\left|j\right>\neq H^{\otimes n}\left|j\right>$. In this regard, The quantum Fourier transform can be seen as a generalization of the Hadamard transform.

It remains to be checked that the $QFT$ is a unitary transformation, which will now be done by verifying that $QFT^\dagger=QFT^{-1}$. To see that this is indeed the case, consider the matrix representation of $QFT$ whose matrix coefficient $q_{jk}$ contained in the $j^{th}$ row and $k^{th}$ column is given by $q_{jk}= e^{ \frac{i2\pi jk}{N}}$. Then the conjugate transpose $QFT^\dagger$ of $QFT$ has  the corresponding matrix coefficient $q'_{jk}=\overline{q_{kj}}=e^{ \frac{-i2\pi kj}{N}}$
in its $j^{th}$ row and $k^{th}$ column.
Therefore, the action of $QFT^\dagger$ on some basis state $\left|k\right>\in\mathcal{H}^N$ is given as
$\begin{array}{r c l} QFT^\dagger: \mathcal{H}^N &\rightarrow&\mathcal{H}^N \\ \left|k\right>& \mapsto &\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{j=0}^{N-1}e^{\frac{-i2\pi kj}{N}}\left|j\right>. \end{array}$
Now consider what happens when the composition of $QFT$ with $QFT^\dagger$ acts on a basis state $\left|j\right>$
$\begin{array}{r l} QFT^\dagger QFT\left|j\right>=&QFT^\dagger\left(\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi jk}{N}}\left|k\right> \right) \\ =&\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi jk}{N}}QFT^\dagger\left|k\right> \\ =& \displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi jk}{N}}\left(\displaystyle\frac{1}{\sqrt{N}}\displaystyle\sum\limits_{l=0}^{N-1}e^{\frac{-i2\pi kl}{N}}\left|l\right>\right) \\ =&\displaystyle\frac{1}{N}\displaystyle\sum\limits_{l=0}^{N-1}\left(\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi jk}{N}}e^{\frac{-i2\pi kl}{N}}\right)\left|l\right> \\ &= \displaystyle\frac{1}{N}\displaystyle\sum\limits_{l=0}^{N-1}\left(\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi k(j-l)}{N}}\right)\left|l\right>. \end{array}$
Let $\alpha_s$ denote the amplitude of the basis state $\left|s\right>$ in this superposition.  Then assuming $s\neq j$,
$\alpha_s=\displaystyle\frac{1}{N}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi k(j-s)}{N}}=\displaystyle\frac{1}{N}\displaystyle\frac{e^{\frac{i2\pi k(j-s)}{N}N}-1}{e^{\frac{i2\pi k(j-s)}{N}}-1}=\displaystyle\frac{1}{N}\displaystyle\frac{(1-1)}{e^{\frac{i2\pi k(j-s)}{N}}-1}=0,$
where the formula for the geometric series
$\displaystyle\sum\limits_{k=0}^{N-1}z^{k}=\displaystyle \frac{z^{N}-1}{z-1}$
was used together with
$z^N=e^{ \frac{i2\pi (j-s)}{N}N}=e^{i2\pi (j-s)}=1$
since $e^{i2\pi m}=1$ for any integer $m$. This implies that all basis states $\left|s\right>\neq\left|j\right>$ have $\alpha_s=0$ amplitude and do not appear in the above superposition. Then examining the amplitude of the state $\left|j\right>$ shows that
$\alpha_j=\displaystyle\frac{1}{N}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi k(j-j)}{N}}=\displaystyle\frac{1}{N}\displaystyle\sum\limits_{k=0}^{N-1}e^{\frac{i2\pi k(0)}{N}}=\displaystyle\frac{1}{N}\displaystyle\sum\limits_{k=0}^{N-1}1=\displaystyle\frac{N}{N}=1,$
which shows that $QFT^\dagger QFT\left|j\right>=\left|j\right>$. A similar argument also shows that $QFT QFT^\dagger\left|j\right>=\left|j\right>$. This implies that $QFT^\dagger=QFT^{-1}$ proving that $QFT$ is indeed a unitary operator.

The proof just presented that the quantum Fourier transform $QFT$ is a unitary operator shows that the $QFT$ can be used to define transformations on quantum states. However, the proof does not provide a constructive means for actually implementing the $QFT$, since unitarity was shown directly using the definition of the $QFT$. In the next section, an explicit circuit using elementary quantum gates will be constructed. The existence of such a circuit will also provide proof that the $QFT$ is unitary provided that all gates used in this construction are themselves unitary.

### Simon's Problem and Algorithm

Let $X\subseteq \{0,1\}^n$ be some subset and consider a function $f: \{0,1\}^n\rightarrow X$ that satisfies the property that for all $\mathbf{x}, \mathbf{y}\in \{0,1\}^n$, $f(\mathbf{x})=f(\mathbf{y})$ if and only if $\mathbf{x}=\mathbf{y}$ or $\mathbf{x}=\mathbf{y}\oplus\mathbf{s}$, for some fixed $\mathbf{s}\in\{0,1\}^n$. Suppose such a function is provided, but the string $\mathbf{s}$ is unknown. Simon's problem, presented first in \cite{simon}, is concerned with determining the string $\mathbf{s}$ by making queries to a black-box that evaluates the function $f$.

Simon's Problem

Input:  A black-box that computes an unknown function $f:\{0,1\}^n\rightarrow X$, where $X\subseteq\{0,1\}$, such that there exists  some fixed $\mathbf{s}\in\{0,1\}^n$ that satisfies  $f(\mathbf{x})=f(\mathbf{y})$ if and only if $\mathbf{x}=\mathbf{y}$ or $\mathbf{x}=\mathbf{y}\oplus\mathbf{s}$.

Problem:  Determine the unknown string $\mathbf{s}\in\{0,1\}^n$ by making queries to the black-box that evaluates $f$.

A quantum algorithm for solving Simon's problem will be constructed in what follows, and it will be shown that this can be done by making exponentially fewer queries to $f$ than what is needed in the classical case. Before presenting this algorithm, it is worthwhile to invest some detail pertaining to some properties of bitstrings in $\{0,1\}^n$ and certain operations defined on them. The relevant details can be found in the previous post on orthogonal complements in the vector space $\mathbb{Z}_2^n$. These consequences will be essential in the analysis of Simon's algorithm.

Simon's Algorithm

For Simons problem we are provided with a black-box that computes an unknown function $f: \{0,1\}^n\rightarrow X$ such that there exists  some fixed $\mathbf{s}\in\{0,1\}^n$ that satisfies  $f(\mathbf{x})=f(\mathbf{y})$ if and only if $\mathbf{x}=\mathbf{y}$ or $\mathbf{x}=\mathbf{y}\oplus\mathbf{s}$. This black-box can be implemented as the controlled$-U_f$ gate $\begin{array}{r l} c-U_f: &\mathcal{H}^{2^n}\otimes\mathcal{H}^{2^n}\rightarrow \mathcal{H}^{2^n}\otimes \mathcal{H}^{2^n}, \\ & \left|\mathbf{x}\right>\left|\mathbf{y}\right>\mapsto \left|\mathbf{x}\right>\left|\mathbf{y}\oplus f(\mathbf{x})\right>, \end{array}$
where the first register serves as a $n$ qubit control register and the second register acts as a $n$ qubit target. (A circuit used to solve Simon's problem consisting of two $n$ qubit registers. Simon's algorithm runs the circuit multiple times making a single query to the black-box $c-U_f$ each time the circuit is applied. First, Hadamard gates are applied to the first $n$ qubits in the control register of the state $\left|\psi_0\right>$ creating an equally weighted superposition of basis states in the control register. Then by applying the $c-U_f$ gate to $\left|\psi_1\right>$ an entangled state $\left|\psi_2\right>$ is produced that contains the value of each $f(\mathbf{x})$ in the target register. Measuring the target register of $\left|\psi_2\right>$, leaves the control register in a state which produces a superposition $\left|\psi_4\right>$ of states in the orthogonal complement $\mathbf{s}^ {\bot}$ after the last set of Hadamard gates is applied to the control register. A final measurement of the control register then produces a state $\left|\psi_5\right>$ in $\mathbf{s}^ {\bot}$. After this procedure is iterated a sufficient amount of times the states obtained after the measurement can be used to determine $\mathbf{s}$.)

Simon's algorithm makes use of the circuit displayed in the figure above, and begins with the input state $\left|\psi_0\right>=\left|\mathbf{0}\right>\left|\mathbf{0}\right>$. First, a Hadamard gate is applied to every qubit in the control yielding
$\left|\psi_1\right>=H^{\otimes n}\left|\mathbf{0}\right>\left|\mathbf{0}\right>=\displaystyle \frac{1}{\sqrt{2^{n}}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}\left|\mathbf{x}\right>\left|\mathbf{0}\right>,$
which has an equally weighted superposition of all $2^n$ basis states in the control register. Next, by making a single query with $c-U_f$ to the state $\left|\psi_1\right>$, a superposition of states is formed that contains the value of $f(\mathbf{x})$ for every $\mathbf{x}\in\{0,1\}^n$ in the target register
$\begin{array}{r l} \left|\psi_2\right>=c-U_f\left|\psi_1\right> &=\displaystyle\frac{1}{\sqrt{2^{n}}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}c-U_f(\left|\mathbf{x}\right>\left|\mathbf{0}\right>) \\ &=\displaystyle \frac{1}{\sqrt{2^{n}}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}\left|\mathbf{x}\right>\left|f(\mathbf{x})\right>. \end{array}$
Notice that this state $\left|\psi_2\right>$ is actually an entangled state. If all the qubits in the target register were to be measured at this point, a particular computational basis state $\left|f(\mathbf{x})\right>\in\mathcal{H}^{2^n}$ will be observed.

To see what state the control register will be left in after a measurement is made to the target, recall the definition of the function $f$: f(\mathbf{x})=f(\mathbf{y})\) if and only if $\mathbf{x}=\mathbf{y}$ or $\mathbf{y}=\mathbf{x}\oplus\mathbf{s}$. Observe that if $\mathbf{s}\neq\mathbf{0}$ then the function $f$ is two-to-one. This means that for every value $f(\mathbf{x})$ there are precisely two bitstrings in $\{0,1\}^n$ that both share the same value $f(\mathbf{x})$---namely, the strings $\mathbf{x}$ and $\mathbf{y}=\mathbf{x}\oplus\mathbf{s}$. Therefore, consider a partition of the $2^n$ bitstrings of $\{0,1\}^n$ into $2^{n-1}$ disjoint sets  where each set is of the form $\{\mathbf{x},\mathbf{x}\oplus\mathbf{s}\}$. Moreover, let $I$ be the index set that contains $2^{N-1}$ bitstrings $\mathbf{x}$ that represent each of the sets $\{\mathbf{x},\mathbf{x}\oplus\mathbf{s}\}$. In this way, the  entangled state $\left|\psi_2\right>$ can be expressed as
$\left|\psi_2\right>=\displaystyle \frac{1}{\sqrt{2^{n}}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}\left|\mathbf{x}\right>\left|f(\mathbf{x})\right>=\displaystyle \frac{1}{\sqrt{2^{n-1}}}\displaystyle\sum\limits_{\mathbf{x}\in I}\left(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>\right)\left|f(\mathbf{x})\right>,$ where now the sum only runs over the indexed states of $I$.

When the target register of $\left|\psi_2\right>$ is measured, a state of the form
$\left|\psi_3\right>=\left(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>\right)\left|f(\mathbf{x})\right>$for a particular value of $f(\mathbf{x})$ is observed with equal probability. Thus, the target register becomes the state $\left|f(\mathbf{x})\right>$, and the control register is left in the state $\displaystyle\frac{1}{\sqrt{2}}\left(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>\right)$ after measurement. Then as it was shown in the previous section, where $\left|\psi\right>=\displaystyle\frac{1}{\sqrt{2}}\left(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>\right),$ applying Hadamard gates to the control register after a measurement on the target register has been made results in the state
$\left|\psi_4\right>=H^{\otimes n}\left|\psi_3\right>=H^{\otimes n}\left|\psi\right>\left|f(\mathbf{x})\right>=\displaystyle\frac{1}{\sqrt{2^{n-1}}}\displaystyle\sum\limits_{\mathbf{z}\in\mathbf{s}^\bot}(-1)^{\mathbf{x}\cdot\mathbf{z}}\left|\mathbf{z}\right>\left|f(\mathbf{x})\right>,$
which has the control register in an equally weighted superposition of all basis states belonging to the orthogonal complement $\mathbf{s}^ {\bot}$. Measuring the control register therefore yields some state $\left|\psi_5\right>=\left|\mathbf{z}\right>$, such that $\mathbf{z}\in\mathbf{s}^ {\bot}$, where each state is observed with equal probability.

Simon's algorithm solves the problem of determining $\mathbf{s}$ by repeating the operation just described by the circuit in the figure shown above. By observing the control register after the final measurement is made in each iteration of this circuit, elements of $\mathbf{s}^ {\bot}$ are effectively sampled uniformly at random. After a sufficient amount of samples are made from each iteration, the information obtained from the observed states $\left|z_i\right>$ can be used to construct the matrix $W$ whose rows consists of coefficients given by the strings $\mathbf{z_i}$. This allows the string $\mathbf{s}$ to be determined by finding the solutions to the matrix equation $W\mathbf{x}^T=\mathbf{0}^T$.

In regards to the query complexity of Simon's algorithm, it can be shown that the number of expected queries $Q(n)$ made to the black-box $c-U_f$, which is directly proportional to the number of iterations that need to be made to Simon's circuit, is less than $n$. In addition, the amount of other steps $S(n)$ needed to solve the matrix equation $W\mathbf{x}^T=\mathbf{0}^T$ is in $O(n^3)$.\cite{mosca}

It is presumed that this latter part of this algorithm of solving the matrix equation is performed through classical means. Therefore, the total time complexity of Simon's algorithm is $T(n)=Q(n)R(n)+S(n)=O(n)R(n)+O(n^3)$ for some function $R(n)$ representing the time complexity of implementing the black box $c-U_f$. Regardless, the total time complexity of Simon's algorithm is $\Omega(n^3)$. In the probabilistic  classical case, at least $\Omega(2^{n/3})$ queries must be made to some classical black box implementing the function $f$ in order to solve Simon's problem with a probability of success at least $2/3$. This requires resources exponentially larger than the quantum algorithm provided by Simon's. Thus, Simon's algorithm is our first example of an algorithm that can perform exponentially faster than any known classical algorithm.

### Orthogonal complements in the vector space $\mathbb{Z}_2^n$

This section will review some mathematical preliminaries that will be useful in understanding Simin's algorithm.

The set $\{0,1\}^n$ can be regarded as an $n$-dimensional vector space $\mathbb{Z}_2^n$ over the field $\mathbb{Z}_2$ where the addition of vectors, or bitstrings, $\mathbf{x}=x_1x_2\dots x_n$ and $\mathbf{y}=y_1y_2\dots y_n$ in $\{0,1\}^n$ is given by $\mathbf{x}\oplus\mathbf{y}=\mathbf{z}$. Let $\mathbf{z}=z_1z_2\dots z_n$. Then this operation is just bitwise addition modulo $2$ where $z_i=(x_i+y_i)mod2$ for each $i$. That is, $z_i=1$ if and only if $x_i\neq y_i$. An inner product on $\mathbb{Z}_2^n$ is also defined for $\mathbf{x}=x_1x_2\dots x_n$ and $\mathbf{y}=y_1y_2\dots y_n$ in $\mathbb{Z}_2^n$ as
$\mathbf{x}\cdot\mathbf{y}=(x_1y_1+x_2y_2+\dots +x_ny_n)mod2=\left(\displaystyle\sum\limits_{i=1}^n x_iy_i\right)mod2,$ and so the product of two bitstrings is either $0$ or $1$.

Consider some $\mathbf{s}\in\mathbb{Z}_2^n$ and define the orthogonal complement of $\mathbf{s}$ as the set $\mathbf{s}^{ \bot}=\{\mathbf{x}\in\mathbb{Z}_2^n | \mathbf{x}\cdot\mathbf{s}=0\}$ consisting of all strings orthogonal to $\mathbf{s}$. The orthogonal complement $\mathbf{s}^{ \bot}$ of $\mathbf{s}$ is actually a vector subspace of $\mathbb{Z}_2^n$ that is orthogonal to the $1$-dimensional vector subspace $\{\mathbf{0},\mathbf{s}\}$ provided that $\mathbf{s}\neq\mathbf{0}$. Then since $\mathbf{s}^{ \bot}\cup\{\mathbf{0},\mathbf{s}\}=\mathbb{Z}_2^n$, and $\mathbb{Z}_2^n$ has dimension $n$, $\mathbf{s}^{ \bot}$ must be a subspace of dimension $n-1$.

For some string $\mathbf{x}=x_1x_2\dots x_n\in\{0,1\}^n$, denote by $\mathbf{x}^T$ the column vector that has the coefficient $x_i$ in row $i$. Define a $m\times n$ matrix $W$ whose rows are given by $m$ bitstrings $\mathbf{w}_j\in\mathbf{s}^{ \bot}$ for some $\mathbf{0}\neq\mathbf{s}\in\{0,1\}^n$. If the $m$ bitstrings $\mathbf{w}_i$ are chosen so that they span $\mathbf{s}^{ \bot}$ entirely, then a nontrivial solution $\mathbf{x}^T$ to the matrix equation $W\mathbf{x}^T=\mathbf{0}^T$, where $\mathbf{0}\in\{0,1\}^m$, will be a the unique solution $\mathbf{x}^T=\mathbf{s}^T$ that determines exactly the bitstring $\mathbf{s}$.

Let $\mathbf{x}, \mathbf{y}\in\{0,1\}^n$ and $\mathbf{s}=\mathbf{x}\oplus\mathbf{y}$. Consider the quantum state $\left|\psi\right>=\displaystyle\frac{1}{\sqrt{2}}(\left|\mathbf{x}\right>+\left|\mathbf{y}\right>)\in\mathcal{H}^{2^n}.$ Note that $\mathbf{y}=\mathbf{x}\oplus\mathbf{s}$ implies $\left|\psi\right>=\displaystyle\frac{1}{\sqrt{2}}(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>).$ Applying the Hadamard transform $H^{\otimes n}$ to $\left|\psi\right>$ gives
$\begin{array}{r l} H^{\otimes n}\left|\psi\right> &=\displaystyle\frac{1}{\sqrt{2}}\left(H^{\otimes n}\left|\mathbf{x}\right>+H^{\otimes n}\left|\mathbf{x}\oplus\mathbf{s}\right>\right) \\ &= \displaystyle\frac{1}{\sqrt{2}}\left(\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{\mathbf{z}\in\{0,1\}^n}(-1)^{\mathbf{x}\cdot\mathbf{z}}\left|\mathbf{z}\right>+\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{\mathbf{z}\in\{0,1\}^n}(-1)^{(\mathbf{x}\oplus\mathbf{s})\cdot\mathbf{z}}\left|\mathbf{z}\right>\right) \\ &= \displaystyle\frac{1}{\sqrt{2^{n+1}}}\left(\displaystyle\sum\limits_{\mathbf{z}\in\{0,1\}^n}\left((-1)^{\mathbf{x}\cdot\mathbf{z}}+(-1)^{(\mathbf{x}\oplus\mathbf{s})\cdot\mathbf{z}}\right)\left|\mathbf{z}\right>\right) \end{array}.$
Now consider the amplitude $\alpha_z$ of the state $\left|\mathbf{z}\right>$ in this superposition:
$\begin{array}{r l} \alpha_z & =(-1)^{\mathbf{x}\cdot\mathbf{z}}+(-1)^{(\mathbf{x}\oplus\mathbf{s})\cdot\mathbf{z}} \\ & = (-1)^{\mathbf{x}\cdot\mathbf{z}}+(-1)^{\mathbf{x}\cdot\mathbf{z}\oplus\mathbf{s}\cdot\mathbf{z}} \\ & = (-1)^{\mathbf{x}\cdot\mathbf{z}}\left(1+(-1)^{\mathbf{s}\cdot\mathbf{z}}\right) \end{array}.$
If $\mathbf{z}\in\mathbf{s}^{ \bot}$, then $\mathbf{s}\cdot\mathbf{z}=0$ and so $(-1)^{\mathbf{s}\cdot\mathbf{z}}=1$ which implies that $\alpha_z=2(-1)^{\mathbf{x}\cdot\mathbf{z}}$. Thus, the state $\left|\mathbf{z}\right>$ remains with a nonzero amplitude $\alpha_z$ in the superposition $H^{\otimes n}\left|\psi\right>$. Instead, suppose $\mathbf{z}\notin\mathbf{s}^{ \bot}$ so that $\mathbf{s}\cdot\mathbf{z}=1$, which implies that the amplitude of the state $\left|\mathbf{z}\right>$ is $\alpha_z=(-1)^{\mathbf{x}\cdot\mathbf{z}}\left(1+(-1)^1\right)=0$. Therefore, the state $H^{\otimes n}\left|\psi\right>$ consists of a superposition of precisely those $2^{n-1}$ states $\left|\mathbf{z}\right>\in\{0,1\}^n$ such that $\mathbf{z}\in\mathbf{s}^ { \bot}$, allowing the sum to be written as
$H^{\otimes n}\left|\psi\right>=\displaystyle \frac{1}{\sqrt{2^{n-1}}}\displaystyle\sum\limits_{\mathbf{z}\in\mathbf{s}^{ \bot}}(-1)^{\mathbf{x}\cdot\mathbf{z}}\left|\mathbf{z}\right>.$
If a measurement were to be performed on the state $H^{\otimes n}\left|\psi\right>$, where $\left|\psi\right>=\displaystyle \frac{1}{\sqrt{2}}(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>),$  some computational basis state $\left|\mathbf{z}\right>$ such that  $\mathbf{z}\in\mathbf{s}^{ \bot}$ will be observed.

If multiple quantum states of the form $\left|\psi\right>=\displaystyle\frac{1}{\sqrt{2}}(\left|\mathbf{x}\right>+\left|\mathbf{x}\oplus\mathbf{s}\right>)$ can be constructed and then measured, the set of observed states can be used to construct the matrix $W$ whose rows consist of elements in $\mathbf{s}{ \bot}$. Furthermore, if these observed states $\left|\mathbf{w}_j\right>$ happen to correspond to elements that span $\mathbf{s}^ { \bot}$, then the string $\mathbf{s}$ can be uniquely determined by solving the system $W\mathbf{x}^T=\mathbf{0}^T$. Simon's algorithm provides a way to construct such a state by making queries to the a black-box.

### The Deutsch-Jozca Algorithm

The Deutsch problem concerned itself with a function $f:\{0,1\}\rightarrow \{0,1\}$ defined on a single bit. Instead, consider the function $f:\{0,1\}^n\rightarrow \{0,1\}$ defined on bitstrings of length $n$. Moreover, suppose this function is either constant if $f(\mathbf{x})=c$ for all $\mathbf{x}\in\{0,1\}^n$ and a fixed $c\in\{0,1\}$, or balanced if $f(\mathbf{x})=0$ for precisely half of the bitstrings $\mathbf{x}\in\{0,1\}^n$ and $f(\mathbf{x})=1$ for the remaining half of the bitstrings $\mathbf{x}\in\{0,1\}^n$. This time, the problem of determining whether or not $f$ is constant or balanced is tricker. In the classical context, suppose $2^{n-1}$ queries are made to $f$ for half of the possible bitstrings, and each of these queries returned the value $0$. Then without making at least one more query to $f$ it is impossible to determine with certainty whether or not $f$ is constant or balanced since this next query could return a value of either $0$ or $1$. Thus, in the worst case $2^{n-1}+1$ queries would be necessary to determine if $f$ is constant or balanced. In the quantum case, the Deutsch-Jozsa algorithm, originally presented in \cite{jozsa}, is able to determine with certainty whether or not $f$ is constant or balanced by making only a single query.

## The Deutsch-Jozsa Problem

Input: A black-box that computes an unknown function $f:\{0,1\}^n\rightarrow \{0,1\}$ which is guaranteed to be either constant or balanced.

Problem:  Determine wether or not the function is constant or balanced by making queries to the black-box.

Define a black-box encoding a function $f:\{0,1\}^n\rightarrow \{0,1\}$ as the following controlled gate that acts on a $n$ qubit control register and single qubit target register
$\begin{array}{r l} c-U_f: &\mathcal{H}^{2^n}\otimes\mathcal{H}^2\rightarrow \mathcal{H}^{2^n}\otimes \mathcal{H}^2, \\ & \left|\mathbf{x}\right>\left|y\right>\mapsto \left|\mathbf{x}\right>\left|y\oplus f(\mathbf{x})\right>. \end{array}$
Consider the circuit displayed in the figure below which takes as input the state $\left|\psi_0\right>=\left|\mathbf{0}\right>\big(\frac{\left|0\right>-\left|1\right>}{\sqrt{2}} \big)$, where the first register represents the $n$ qubit control and the second register represents the single qubit target. Notice that the overall action of this circuit is a generalization of the circuit for the Deutsch algorithm, where now the control register consists of $n$ qubits instead of just one. (A circuit solving the Deutsch-Jozsa problem by making a single query to the black-box $c-U_f$. Applying Hadamard gates to the first $n$ qubitis in the control register of the state
$\left|\psi_0\right>=\left|\mathbf{0}\right>\big(\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)$ yields the state $\left|\psi_2\right>$ which is in some superposition of  eigenstates of the $c-U_f$ gate. Applying the second set of Hadamard gates ensures that state $\left|\psi_4\right>=\left|\mathbf{0}\right>$ will only be observed upon measurement of the control register of $\left|\psi_3\right>$ if the function $f$ is constant. Otherwise some basis state $\left|\psi_4\right> \neq\left|\mathbf{0}\right>$ will be observed after measurement if the function $f$ is balanced.)

After the Hadamard gate is applied to each qubit in the control the state becomes
$\left|\psi_1\right>=(H^{\otimes n}\otimes I)\left|\psi_0\right>=H^{\otimes n}\left|\mathbf{0}\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)= \big(\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}\left|\mathbf{x}\right>\big)(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big).$
Then applying the $c-U_f$ gate to $\left|\psi_1\right>$ gives
$\left|\psi_2\right>=(c-U_f)\left|\psi_1\right>= \big(\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})}\left|\mathbf{x}\right>\big)(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big),$
and after the second set of Hadamard gates is applied to the control register the state becomes
$\begin{array}{r l} \left|\psi_3\right>=(H^{\otimes n}\otimes I)\left|\psi_2\right> &= \big(\displaystyle\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})}\big(\frac{1}{\sqrt{2^n}}\displaystyle\sum\limits_{\mathbf{z}\in\{0,1\}^n}(-1)^{\mathbf{x}\cdot\mathbf{z}}\left|\mathbf{z}\right>\big)\big)(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ &= \big(\displaystyle\frac{1}{2^n}\displaystyle\sum\limits_{\mathbf{z}\in\{0,1\}^n}\big( \displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})+\mathbf{x}\cdot\mathbf{z}}\big)\left|\mathbf{z}\right> \big)(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \end{array}.$
Let $\left|\psi_4\right>$ represent the state of the control register after a measurement is performed on the control register of $\left|\psi_3\right>$. In order analyze what state will be observed when the control register of $\left|\psi_3\right>$ is measured, consider the amplitude $\alpha_0$  of the state $\left|\mathbf{z}\right>=\left|\mathbf{0}\right>$ in the control register of $\left|\psi_3\right>$:
$\alpha_0= \displaystyle\frac{1}{2^n}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})+\mathbf{x}\cdot\mathbf{0}}=\displaystyle\frac{1}{2^n}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})}.$
Suppose the function $f$ is constant. Then there are two cases with either $f(\mathbf{x})=0$ for all $\mathbf{x}\in\{0,1\}^n$ or $f(\mathbf{x})=1$ for all $\mathbf{x}\in\{0,1\}^n$, but either way
$\alpha_0=\displaystyle\frac{1}{2^n}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})}=\displaystyle\frac{1}{2^n}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}\pm 1=\displaystyle\frac{\pm2^n}{2^n}=\pm 1.$
Therefore the state of the control register after the measurement will be in the state $\left|\psi_4\right>=\left|\mathbf{0}\right>$ with certainty, implying that the function $f$ is constant.

On the other hand suppose instead that the function $f$ is balanced so that $f(\mathbf{x})=0$ for exactly half of the states $\left|\mathbf{x}\right>$ for $\mathbf{x}\in\{0,1\}^n$ and $f(\mathbf{x})=1$ for the remaining half. It then follows that
$\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})}=0,$
since exactly half of the contributing terms in the sum will be $1$ while the other half will be $-1$ so that the total sums to $0$. Now recall the amplitude $\alpha_0$ of the state $\left|\mathbf{z}\right>=\left|0\right>$ in the control register of $\left|\psi_3\right>$ in this case:
$\alpha_0=\displaystyle \frac{1}{2^n}\displaystyle\sum\limits_{\mathbf{x}\in\{0,1\}^n}(-1)^{f(\mathbf{x})}=0$.
This implies that there is zero probability in observing the state $\left|0\right>$ when a measurement is performed on the control register of $\left|\psi_3\right>$. Instead, some computational basis state $\left|\psi_4\right>=\left|\mathbf{z}\right>$ with $\mathbf{z}\in\{0,1\}^n$ such that $\mathbf{z}\neq\mathbf{0}$ will be observed with nonzero probability.

The Deutsch-Jozsa algorithm shows that its possible to determine in a single query whether or not some function $f:\{0,1\}^n\rightarrow \{0,1\}$ is constant or balanced provided that the function $f$ is guaranteed to have one of these two properties. It was previously argued that the deterministic classical algorithm for solving the related problem must make at least $2^{n-1}+1$ queries in the worst case to establish with certainty wether or not the function $f$ is constant or balanced. Hence, the Deutsch-Jozsa algorithm gives an exponential speed up in the query complexity of the problem when compared to its deterministic classical analog.

However, it is worth mentioning that a probabilistic classical algorithm can actually solve the problem making only $2$ queries to some black-box  if allowed to error with a probability at most $p= \frac{1}{3}$. To decrease this probability of error to $p<\frac{1}{2^n}$ only $n+1$ queries need to be made. Therefore, when comparing the query complexity of the problem in the quantum case to the probabilistic classical case, the Deutsch-Jozsa algorithm only solves the problem by making one less query if the classical algorithm is allowed a constant probability of error, but the classical algorithm must make $O(n)$ queries if exponentially small error probabilities are desired.

### The Deutsch Problem and Algorithm

Let $f: \{0,1\}\rightarrow \{0,1\}$ be some function encoded in a black-box which takes as input some $x\in\{0,1\}$ and returns the value $f(x)\in\{0,1\}$. There are only $4$ different functions that can be defined in this case:
$\begin{array}{r l l} f_{00}:& 0 \mapsto 0,& 1\mapsto 0, \\ f_{11}:& 0 \mapsto 1,& 1\mapsto 1, \\ f_{01}:& 0 \mapsto 0,& 1\mapsto 1, \\ f_{10}:& 0 \mapsto 1,& 1\mapsto 0 .\end{array}$
Notice that two of these functions, $f_{00}$ and $f_{11}$ are constant, where $f(x)=c$ for some fixed $c\in\{0,1\}$. The other two functions, $f_{01}$ and $f_{10}$, are balanced since $f(x)=0$ for half of the possible inputs $x$ and $f(y)=1$ for the other half of remaining inputs $y$.

The Deutsch problem, originally formulated and solved in \cite{deutsch}, and is the problem of determining whether or not some unknown function $f: \{0,1\}\rightarrow \{0,1\}$ is constant or balanced by making queries to the function $f$. Observe that the property of being constant or balanced can be determined by simply evaluating the sum $f(0)\oplus f(1)$, since $f(0)\oplus f(1)=0$ if and only if $f$ is constant and $f(0)\oplus f(1)=1$ if and only if $f$ is balanced.

The Deutsch Problem

Input: A black-box that computes an unknown function $f:\{0,1\}\rightarrow \{0,1\}$

Problem: Determine whether or not the function is constant or balanced, or equivalently the value of $f(0)\oplus f(1)$, by making queries to the black-box.

Trivially, making two queries to the black-box for the values of $f(0)$ and $f(1)$ is enough to completely determine the function $f$. The question then becomes whether or not the Deutsch problem can be solved by making only a single query. Classically, only a single query will never be enough to completely determine whether or not the function in constant or balanced, because provided with only a single value of say $f(0)$ the function may still be any two of the four possible functions described above. One of these possibilities is a constant function and the other is a balanced function so the Deutsch problem remains undetermined with only a single query in the classical case. The quantum algorithm described below shows that its possible to solve the Deutsch problem by making only a single query.

To represent the quantum black-box that encodes some unknown function $f:\{0,1\}\rightarrow \{0,1\}$, define the following controlled gate acting on two qubits
$\begin{array}{r l} c-U_f: &\mathcal{H}^2\otimes\mathcal{H}^2\rightarrow \mathcal{H}^2\otimes \mathcal{H}^2, \\ & \left|x\right>\left|y\right>\mapsto \left|x\right>\left|y\oplus f(x)\right>. \end{array}$
Observe what happens when $c-U_f$ acts on the state $\left|x\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)$ in the two cases where $\left|x\right>=\left|0\right>$ or $\left|x\right>\left|1\right>$:
$\begin{array}{r l} c-U_f\left|x\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) & =c-U_f\big(\displaystyle\frac{\left|x\right>\left|0\right>}{\sqrt{2}}\big)-c-U_f\big(\displaystyle\frac{\left|x\right>\left|1\right>}{\sqrt{2}}\big) \\ & = \big(\displaystyle\frac{\left|x\right>\left|0\oplus f(x)\right>}{\sqrt{2}}\big)-\big(\displaystyle\frac{\left|x\right>\left|1\oplus f(x)\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(x)}\left|x\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big). \end{array}$
This we can consider the control register as picking up the phase factor $(-1)^{f(x)}$ which depends on the value of $f(x)$. Now consider the circuit shown in the figure below. (A circuit solving the Deutsch problem by making a single query to the black-box $c-U_f$. Applying a Hadamard gate to the first qubit in the state $\left|\psi_0\right>=\left|0\right>\big(\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)$ yields the state $\left|\psi_1\right>$ which is an equally weighted superposition of two eigenstates of the $c-U_f$ gate. After one query is made with $c-U_f$ the first qubit in the state  $\left|\psi_2\right>$ has a relative phase factor that gets encoded into the state $\left|\psi_3\right>$ after the Hadamard gate. The state then yields  $\left|\psi_4\right>=\left|f(0)\oplus f(1)\right>$ when measured  which determins wether $f$ is constant or balanced.)

The initial state of the input is $\left|\psi_0\right>=\left|0\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)$ and after the Hadamard gate is applied to the first qubit it is put into an equally weighted superposition
$\left|\psi_1\right>=(H\otimes I)\left|\psi_0\right>=H\left|0\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)=\big(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big).$
Then after making one query to the black-box by applying the $c-U_f$ gate to$\left|\psi_1\right>$ gives
$\begin{array}{r l} \left|\psi_2\right>=c-U_f\left|\psi_1\right> &=c-U_f\big(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ &= c-U_f\big(\displaystyle\frac{\left|0\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) +c-U_f\big(\displaystyle\frac{\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(0)}\displaystyle\frac{\left|0\right>}{\sqrt{2}}\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)+(-1)^{f(1)}\displaystyle\frac{\left|1\right>}{\sqrt{2}}\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = \big(\displaystyle\frac{(-1)^{f(0)}\left|0\right>+(-1)^{f(1)}\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(0)}\big(\displaystyle\frac{\left|0\right>+(-1)^{f(0)\oplus f(1)}\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big), \end{array}$
where the factor $(-1)^{f(0)}$ has been factored out using $(-1)^{f(0)}(-1)^{f(1)}=(-1)^{f(0)\oplus f(1)}$.  The first qubit in $\left|\psi_2\right>$ now has a relative phase factor of $(-1)^{f(0)\oplus f(1)}$, which contains the value of $f(0)\oplus f(1)$. Applying a second Hadamard gate to the first qubit effectively decodes the value of $f(0)\oplus f(1)$ after measurement. To see this, consider the state of the first qubit in  $\left|\psi_3\right>$after the Hadamard gate is applied in the two cases where $f(0)\oplus f(1)=0$ and $f(0)\oplus f(1)=1$:

If $f(0)\oplus f(1)=0$:
$\begin{array}{r l} \left|\psi_3\right> & = (-1)^{f(0)}H\big(\displaystyle\frac{\left|0\right>+(-1)^{f(0)\oplus f(1)}\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(0)}H\big(\displaystyle\frac{\left|0\right>+\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(0)}\left|0\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big), \end{array}$

If $f(0)\oplus f(1)=1$:
$\begin{array}{r l} \left|\psi_3\right> & = (-1)^{f(0)}H\big(\displaystyle\frac{\left|0\right>+(-1)^{f(0)\oplus f(1)}\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(0)}H\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big)\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big) \\ & = (-1)^{f(0)}\left|1\right>\big(\displaystyle\frac{\left|0\right>-\left|1\right>}{\sqrt{2}}\big). \end{array}$
Note that the first qubit is in the state $\left|0\right>$ if $f(0)\oplus f(1)=0$ and is in the state $\left|1\right>$ if $f(0)\oplus f(1)=1$. Therefore, measuring the first qubit will return the state $\left|\psi_4\right>=\left|0\right>$ with certainty if $f$ is constant, and will instead return the state  $\left|\psi_4\right>=\left|1\right>$ with certainty if $f$ is balanced.

The circuit constructed solves the Deutsch problem by making only one query to the black-box, whereas $2$ queries were necessary in the classical query complexity. Despite being a somewhat contrived example, this is the first example presented in the black-box model that exploits the phase kick-back technique where a quantum algorithm can solve the problem by making fewer queries than needed in the classical case.