Thursday, February 8, 2024

Shor's algorithm

Shor's algorithm is a quantum algorithm for factoring in polynomial time. For sources, I am using this Qiskit tutorial and Wikipedia.

I'm mostly interested in the quantum part of the algorithm, so instead I will take as a starting point the following problem. Let $N$ be some modulus, and $a \in (\mathbb{Z}/N\mathbb{Z})^\times$ (i.e. $a, N$ must be coprime). We wish to compute the (multiplicative) order of $a$ in this group, i.e. the smallest $r > 0$ such that $a^r \equiv 1 \pmod{N}$.

First, let's forget the fact that we have to use qubits and pretend we can produce an $N$-ary quantum bit, e.g. when $N = 2^n$ this is just $n$ qubits. That is, suppose we have a basis for our quantum system:

$\lvert 0 \rangle = \lvert 0^n \rangle, \; \lvert 1 \rangle = \lvert 0^{n-1} 1 \rangle, \ldots, \; \lvert N-1 \rangle = \lvert 1^n \rangle.$

Implicitly, $\lvert m \rangle$ is a representative for $m \in \mathbb{Z}/N\mathbb{Z}$. We define an operator $U_a$ on this system
$U_a\lvert m \rangle = \lvert am \rangle.$

This is a permutation matrix, so it is unitary. In practice we need to think about how to implement the set-up above, i.e. to represent $\mathbb{Z}/N\mathbb{Z}$ using qubits and the operator $U_a$ using quantum gates. This is a completely classical algorithm, and it is known that classical algorithms can be implemented using quantum gates, so we will not address this directly.

Let $r$ be the period of $a$; by construction, $U_a^r = \mathbb{I}$, so all eigenvalues are $r$th roots of unity. For convenience, let $\omega = e^{2 \pi i / r}$. Some interesting eigenvectors and eigenvalues for this operator are for various $s \in \mathbb{Z}/r\mathbb{Z}$:

$v_s := \displaystyle \frac{1}{\sqrt{r}}\sum_{k=0}^{r-1} \omega^{-ks} \lvert a^k \rangle, \;\;\;\;\;\;\;\;\;\; \lambda_s = \omega^s.$

The renormalized sum, over $s = 0, \ldots, r-1$, of the above eigenvectors is $\lvert 1 \rangle$, i.e. we have an eigenvector decomposition
$\lvert 1 \rangle = \displaystyle \frac{1}{\sqrt{r}}(v_0 + v_1 + \cdots + v_{r-1}).$

We note in passing that the rest of the eigenvectors and eigenvalues are shifted from this one by various other elements of $\mathbb{Z}/N\mathbb{Z}$; we don't care about them.

Recall that phase estimation allowed us, given a unitary operator $U$ and an eigenvector, to extract its eigenvalue. We do not have an eigenvector. However, we do have $\lvert 1 \rangle$, which is a sum of eigenvectors with equal weights. Running phase estimation on this superposition amounts to doing so on one of the eigenvectors chosen randomly, say with eigenvalue $s/r$, and after iterating phase estimation our estimate will converge to one of these with high probability. In fact, this estimation allows us to pinpoint the precise fraction. First, we have a bound on $r$, i.e. $r \leq N-1$. Roughly, there are finitely many fractions of the form $s/r$ where $r \leq N-1$, so they should be distinguishable by intervals of some length, which turns out to be (inverse) quadratic in $N$. This is polynomial, but for computing runtime in $n$ we take the logarithm, so it is linear in $n$, which is good enough. The algorithm for computing the $s/r$ is the continued fractions algorithm.

Now, knowing one of the fractions $s/r$ only reveals $r$ when $\gcd(s, r) = 1$; otherwise it is some factor of $r$. We can keep track of the denominators appearing and take their least common multiple, and this will yield $r$ with probability exponentially approaching $1$.

Finally, it should be remarked that the algorithm works in $O(n^3)$ time. An accounting:
  • We need to implement phase estimation circuit. This involves constructing the multiplication operators $U_a^{2^i}$ for $i = 0, 1, \ldots, n-1$, which is $O(n)$ operations of $O(n^2)$ and the QFT gate which is order $O(n^2)$. In total, we have $O(n^3)$.
  • We need to do continued fractions, which is $O(n^3)$.

Quantum (and discrete) Fourier transform

First, I want to note there is a nice discussion with pictures at the Qiskit tutorials. I won't try to replicate the pictures.
Recall the basic idea behind Pontryagin and Fourier duality. Given some abelian group $G$, we have the dual group $\hat{G} = \text{Hom}(G, \mathbb{C}^\times)$, the group of characters, i.e. continuous homomorphisms to the multiplicative group of complex numbers. For expository clarity, let us assume that $G$ is compact and that $\hat{G} = \text{Hom}(G, S^1)$ is discrete; in our only case of interest, both will be both.

There is a tautological character pairing $\hat{G} \times G \rightarrow \mathbb{C}^\times$. For example, taking $G = \mathbb{Z} /N\mathbb{Z} $, we have $\hat{G} = \mu_N \subset \mathbb{C}^\times$, the group of $N$th roots of unity, and the pairing is

$\mu_N \times \mathbb{Z}/N\mathbb{Z} \ni (z, m) \mapsto z^m.$

Given a function $f: G \rightarrow \mathbb{C}$, we have its Fourier transform:

$\hat{f}: \hat{G} \rightarrow \mathbb{C}, \;\;\;\;\;\;\; \hat{f}(\xi) = \displaystyle\frac{1}{|G|}\int_{x \in G} \xi^{-1}(x) f(x) \; dx.$

and given a function $g: \hat{G} \rightarrow \mathbb{C}$, we have its inverse Fourier transform:

$\hat{g}: G \rightarrow \mathbb{C}, \;\;\;\;\;\;\; \hat{g}(x) = \displaystyle\sum_{\xi \in \hat{G}} \xi(x) f(x).$

We chose $G$ to be compact so there is no quesiton of convergence of the integral, and $G$ to be discrete so that the integral on $\hat{G}$ is a (possibly infinite) sum for which we ignore all convergence issues. Aside from this, the Fourier transform and its inverse are not entirely symmetric; they differ by an inversion in the character pairing.

Moreover, Fourier inversion says that (pretend that, say, $G$ is compact and $\hat{G}$ is discrete; for us, they will be finite thus both):

$\hat{\hat{f}} = f, \;\;\;\;\;\;\; f(x) = \displaystyle\sum_{\xi \in \hat{G}} \hat{f}(\xi) \xi(x).$

Now, let us consider the case where $G = \mu_N$ and $\hat{G} = \mathbb{Z}/N\mathbb{Z}$. We represent the characteristic function on $m \in \mathbb{Z}/N\mathbb{Z}$ by $\lvert m \rangle$ (we will see why later). A function $f$ on $G$ is then a linear combination, i.e. given that $f(m) = a_m \lvert m \rangle$, we write

$f = \displaystyle\sum_{m \in \mathbb{Z}/N\mathbb{Z}} a_m \lvert m \rangle.$

Letting $\theta \in \displaystyle\frac{1}{N} \mathbb{Z}$, the Fourier transform takes the characteristic function at $e^{2 \pi i \theta}$ on $G = \mu_N$ to

$\lvert \phi_\theta \rangle := \displaystyle\frac{1}{N} \sum_{m \in \mathbb{Z}/N\mathbb{Z}} e^{2 \pi i \theta m} \lvert m \rangle.$

Note that in our convention, we will normalize using $1/\sqrt{N}$ rather than $\frac{1}{N}$. In this way, the inverse transform also picks up a normalization of $1/\sqrt{N}$, and both are unitary. This matrix performs a base change between the standard basis of characteristic functions on $\hat{G} = \mathbb{Z}/n\mathbb{Z}$ and the basis coming from applying the Fourier transform to charactersitic functions on $G = \mu_N$. We let $QFT_N$ denote the unitary transformation such that

$QFT_N \lvert \phi_{m/N} \rangle = \lvert m \rangle.$

But by Fourier inversion, we know that it is

$QFT_N = \displaystyle\frac{1}{\sqrt{N}}\sum_{x, y \in \mathbb{Z}/N\mathbb{Z}} e^{2 \pi ixy/N} \lvert x \rangle \langle y \rvert.$

Note that this matrix is symmetric, not conjugate symmetric. This is due to the difference between Fourier transform and inverse Fourier transform, i.e. the inversion in the character pairing. Its inverse is then the conjugate transpose

$QFT_N^{-1} = \displaystyle\frac{1}{\sqrt{N}}\sum_{x, y \in \mathbb{Z}/N\mathbb{Z}} e^{-2 \pi ixy/N} \lvert y \rangle \langle x \rvert.$


Now, recall we were trying to do phase estimation given a unitary operator $U$ and eigenvector $\lvert \psi \rangle$ and eigenvalue $\lambda = e^{2 \pi i \theta}$. The rightmost measurement on

$(\mathbb{I} \otimes H) \circ CU \circ (\mathbb{I} \otimes H)\lvert \psi 0 \rangle$

will, with certainty, measure $0$ if $\theta = 0$ and $1$ if $\theta = 1/2$. Otherwise, we get 0 or 1 according to some probability distribution. We can iterate this:

$(\mathbb{I} \otimes H) \circ CU^2 \circ (\mathbb{I} \otimes H)\lvert \psi 0 \rangle$

which will measure $0$ if $\theta = 0, 1/2$ (i.e. $4\theta \cong 0 \pmod{2}$) and $1$ if $\theta = 1/4, 3/4$ (i.e. $4\theta \cong 1 \pmod{2}$). More generally,

$(\mathbb{I} \otimes H) \circ CU^{n-1} \circ (\mathbb{I} \otimes H)\lvert \psi 0 \rangle$

will measure $0$ if $\theta = x/2^n$ for even $m$ (i.e. $2^k\theta \cong 0 \pmod{2}$) and $1$ if $\theta = x/2^k$ for odd $x$ (i.e. $2^n\theta \cong 1 \pmod{2}$).

Now, we want to put these together. Theoretically, doing the above should be able to identify with certainty eigenvalues of the form $e^{2 \pi i \theta}$ with $\theta = x/2^n$ for any $x \in \mathbb{Z}$. To begin with we follow our nose: instead of a single register on the right, we will have $n$ registers. We let $C_kU$ mean to apply the controlled $U$ gate with the control on the $k$th register from the right. We take the $n$ rightmost measurements on:

$(\mathbb{I} \otimes H^{\otimes n}) \circ (C_{n-1}U)^{2^{n-1}} \circ \cdots\circ (C_2U)^{2^1} \circ (C_1U)^{2^0} \circ (\mathbb{I} \otimes H^{\otimes n})\lvert \psi 0^n \rangle.$

Since $\psi$ is an eigenvector, this state is separable into the left part and right part, and we focus only on the right part. A calculation shows that, writing $x/2^n = 0.a_1a_2\ldots a_{n}$ and letting $\lvert x \rangle = \lvert a_n \ldots a_2 a_1 \rangle$ represent the corresponding state, we have that the resulting state is:

$\displaystyle\frac{1}{2^n}\sum_{x=0}^{2^n - 1} e^{2 \pi i \theta x} \lvert x \rangle.$

As discussed earlier, this is precisely the Fourier transform applied to the characteristic function on $e^{2 \pi i \theta}$, given that this is a $2^n$th root of unity. To recover $\theta$, one merely has to apply the inverse Fourier transform as defined above. So, assuming we can build it, the circuit becomes

$(\mathbb{I} \otimes QFT_{2^n}^{-1})\circ (\mathbb{I} \otimes H^{\otimes n}) \circ (C_nU)^n \circ \cdots\circ (C_2U)^2 \circ C_1U \circ (\mathbb{I} \otimes H^{\otimes n})\lvert \psi 0^n \rangle$

with the final state being

$QFT^{-1}_{2^n} \displaystyle\frac{1}{\sqrt{2^n}}\sum_{x=0}^{2^n - 1} e^{2 \pi i \theta x} \lvert x \rangle = \displaystyle \frac{1}{ \sqrt{2^n}\sqrt{2^n}} \sum_{x,t,y=0}^{2^n - 1} e^{-2 \pi i y t/2^n} \lvert y \rangle \langle t \rvert e^{2 \pi i \theta x} \lvert x \rangle = \displaystyle \frac{1}{2^n} \sum_{x,y=0}^{2^n - 1} e^{2 \pi i x (\theta - y/2^n)} \lvert y \rangle.$

Adding it and using standard formulas for geometric series, the coefficient to $\lvert y \rangle$ is:

$\displaystyle \frac{1 - e^{2^n 2 \pi i (\theta - y^/2^n)}}{1 - e^{2 \pi i (\theta - y/2^n)}} = \frac{1 - e^{2 \pi i (2^n\theta - y^/2^n)}}{1 - e^{2 \pi i (\theta - y/2^n)}}.$

giving the final formula

$\displaystyle \frac{1}{2^n} \sum_{y=0}^{2^n - 1} \frac{1 - e^{2 \pi i (2^n\theta - y^/2^n)}}{1 - e^{2 \pi i (\theta - y/2^n)}} \lvert y \rangle.$

Now, if $\theta \in \displaystyle \frac{1}{2^n}\mathbb{Z}$, then a measurement on the $n$ rightmost qubits will reveal the corresponding numerator with certainty. This leaves two remaining questions: what happens if the denominator is not a power of 2, and how to implement this.
Before answering those questions, we also need to discuss the case where the input $\lvert \psi\rangle$ is in fact not an eigenvector for $U$, but a sum of eigenvectors. In this case, the resulting quantum state from applying the above algorithm is no longer separable but entangled. To see this, assume that the $\theta_j$ is of the form $m_j/2^n$ as above. Suppose that
$\lvert \psi \rangle = \displaystyle\sum_{j=1}^\ell a_j \lvert \psi_j \rangle$

where $\lvert \psi_j \rangle$ are (unit) eigenvectors with eigenvalue $\lambda_j = e^{2 \pi i \theta_j}$, and $\sum |a_j|^2 = 1$. By linearity, running phase estimation on $\lvert \psi \rangle$ will return $m_j$ with probability $||a_j||$. Note it's possible that the $m_j$ coincide, in which case the probabilities add.

Intuitively, if all the eigenvalues are distinct and we measure $m$ by running the algorithm above, this means this measurement should project $\lvert \psi \rangle$ to $\lvert \psi_i \rangle$ where $m_i = m$. Mathematically, applying the above algorithm to $\lvert \psi \rangle$ gives
$\displaystyle\sum_{j=1}^\ell a_j \lvert \psi_j \rangle \lvert m_j \rangle$.

After measurement, we see that the only $\lvert \psi_j \rangle$ appearing are those where $m_j = m$, more or less by definition, i.e.
$\displaystyle\frac{1}{\sum_{m_j = m} ||a_j||} \displaystyle\sum_{m_j = m} a_j \lvert \psi_j \rangle \lvert m \rangle$

In particular, if we ran the algorithm again on the output, we would find the same eigenvalue $\lambda_i$ again.


Let's address the first question of what happens when $\theta \not\in \displaystyle \frac{1}{2^n} \mathbb{Z}$. I'll spoil the answer. Suppose we take a measurement and get a fraction $m/2^n$.
  • The probability that $m/2^n$ is the fraction in $\displaystyle\frac{1}{2^n}\mathbb{Z}$ which is closest (or tied to being closest) to $\theta$ is at least $4/\pi^2$, i.e. at least 40%.
  • The probability that $m/2^n$ is not the closest to $\theta$ is false is at most $1/4$, i.e. 25%.
Since these bounds are constant in $n$, this means that for any probability cutoff for having found the closest $m/2^n$ to $\theta$, the number of iterations we need to take to get within the cutoff is constant.

Let us briefly derive this. Recall that the coefficient to $\lvert y \rangle$ is:

$\displaystyle \frac{1}{2^n} \displaystyle \frac{1 - e^{2^n 2 \pi i (\theta - y^/2^n)}}{1 - e^{2 \pi i (\theta - y/2^n)}} = \displaystyle \frac{1}{2^n} \frac{1 - e^{2 \pi i (2^n\theta - y^/2^n)}}{1 - e^{2 \pi i (\theta - y/2^n)}}.$

We can visualize a circle with $2^n$ tick marks, one of which corresponds to $y/2^n$, while $\theta$ is some arbitrary point. The numerator says to take the points modulo the tick marks, then blow up the segment between ticks to take the whole circle, and then take the distance between them. The denominator says to simply take the distance between them. Using the fact that the cord length is at least the arc length, we can deduce the above inequalities.
Now, we return to the previous question of what happens when $\lvert \psi \rangle$ is not an eigenvector, but now we do not assume that $r = 2^n$. Suppose $\lambda_j = e^{2 \pi i \theta_j}$. By linearity, the state is

$\displaystyle\frac{1}{2^n}\sum_{y=0}^{2^n-1} \displaystyle\sum_{j=1}^\ell \displaystyle a_j \frac{1 - e^{2 \pi i (2^n\theta_j - y/2^n)}}{1 - e^{2 \pi i (\theta_j - y/2^n)}}\lvert \psi_j \rangle \lvert y \rangle .$

Unlike before, if we measure $m$ in the right registers, this doesn't cause all terms involving $\lvert \psi_j \rangle$ for $m_j \ne m$ to vanish. We still have contributions from all $\vert \psi_j \rangle$:

$\displaystyle\sum_{j=1}^\ell \displaystyle a_j \frac{1 - e^{2 \pi i (2^n\theta_j - m/2^n)}}{1 - e^{2 \pi i (\theta_j - m/2^n)}}\lvert \psi_j \rangle \lvert m \rangle .$

In the above formula, we did not renormalize. Roughly speaking, once we do, what happens is that the $\theta_j$ near $m$ gain weight, while those further from it lose weight. In particular, with enough iterations, the vector $\lvert \psi \rangle$ begins to converges exponentially to an eigenvector in the eigenspace for $\lambda = e^{2 \pi i m/2^n}$ with probability exponentially approaching $1$.
Now, to implement it. We ignore all normalizations. We denote our input as $\lvert a_{n-1} \cdots a_0 \rangle = \lvert m \rangle \lvert a_0 \rangle = \lvert 2m + a_0 \rangle$. Say we've already implemented the gate $QFT_{2^{n-1}}$. Then, applying it to $\lvert m \rangle$, we have:

$(QFT_{2^{n-1}} \otimes \mathbb{I}) \lvert m \rangle \lvert a_0 \rangle = \displaystyle\sum_{x=0}^{2^{n-1} -1} e^{2 \pi i \cdot m \cdot (x/2^{n-1})} \lvert x \rangle \lvert a_0 \rangle = \displaystyle\sum_{x=0}^{2^{n-1} -1} e^{2 \pi i \cdot 2m \cdot (x/2^{n})} \lvert 2x + a_0 \rangle$

In the next step, we want to tweak $m$ so that it becomes $2m + a_0$. To do this, for $x = 2^k$, i.e. the $k+1$th register, we need to add the phase $(x/2^n)a_0$. That is, we apply a controlled phase gate, with the rightmost register as the control, with phase $2 \pi i \cdot (2^{k-1}/2^n)$ to the $k$th register:

$\displaystyle\sum_{x=0}^{2^{n-1} -1} e^{2 \pi i \cdot (2m + a_0) \cdot (x/2^{n})} \lvert 2x + a_0 \rangle.$

We next apply the QFT to the remaining register, i.e. the Hadamard gate to the leftmost register:

$\displaystyle\sum_{x=0}^{2^{n-1} -1} e^{2 \pi i \cdot (2m + a_0) \cdot (x/2^{n})} (\lvert 2x \rangle + (-1)^{a_0} \lvert 2x + 1 \rangle) = \displaystyle\sum_{x=0}^{2^{n-1} -1} \sum_{y = 0}^1 e^{2 \pi i \cdot (2m + a_0) \cdot (x/2^{n}) + \pi i a_0 y} \lvert 2x + y\rangle.$

Intuitvely, this is applying a Fourier transform of degree $2$ to the rightmost bit. There are two ways to interpret $\lvert a_{n-1} \ldots a_0 \rangle$:
  • the integer $a = a_0 + 2a_1 + \cdots + 2^{n-1} a_{n-1}$, i.e. as the binary expression $a_{n-1} \cdots a_0$,
  • the root of unity $e^{2 \pi i \theta}$ where $\theta = a/2^n$, i.e. as the binary expression $0.a_{n-1} \cdots a_0$.
In particular, it is not the rightmost bit that should be $2$th root of unity but the leftmost bit. So we rewrite

$= \displaystyle\sum_{x=0}^{2^{n-1} -1} \sum_{y = 0}^1 e^{2 \pi i \cdot (2m + a_0) \cdot (x + 2^{n-1} y)/2^{n}} \lvert 2x + y\rangle.$

and rotate it there using swap gates:

$\displaystyle\sum_{x=0}^{2^{n-1} -1} \sum_{y = 0}^1 e^{2 \pi i \cdot (2m + a_0) \cdot (x + 2^{n-1} y)/2^{n}} \lvert 2^{n-1} y + x\rangle = \displaystyle\sum_{z=0}^{2^n-1} e^{2 \pi i \cdot (2m+a_0) z} \lvert z \rangle.$

This is the desired formula.

Tuesday, February 6, 2024

Some standard quantum gates

There appear to be some quantum states and gates that are more or less standard. I'm not sure what the theoretical or physical reason for this is, but I'll fill this post out as I learn more.

  • Plus and minus states for a qubit:
    $\lvert + \rangle = \displaystyle\frac{1}{\sqrt{2}}\left( \lvert 0 \rangle + \lvert 1 \rangle \right), \;\;\;\;\;\;\;\;\;\; \lvert-\rangle = \displaystyle\frac{1}{\sqrt{2}}\left( \lvert 0 \rangle - \lvert 1 \rangle \right).$
  • Bell states/basis on two qubits:
    $\lvert \phi^+ \rangle = \displaystyle\frac{1}{\sqrt{2}}\left( \lvert 00 \rangle + \lvert 11 \rangle \right)$
    $\lvert \phi^- \rangle = \displaystyle\frac{1}{\sqrt{2}}\left( \lvert 00 \rangle - \lvert 11 \rangle \right)$
    $\lvert \psi^+ \rangle = \displaystyle\frac{1}{\sqrt{2}}\left( \lvert 01 \rangle + \lvert 10 \rangle \right)$
    $\lvert \psi^- \rangle = \displaystyle\frac{1}{\sqrt{2}}\left( \lvert 01 \rangle - \lvert 10 \rangle \right)$
    These are essentially variants of entangled states. The Bell state $\lvert \phi^+ \rangle$ can be created from the qubit $\lvert 00 \rangle$ by applying the operator $XC \circ \mathbb{I}H$ (where $XC$ is the controlled NOT, with control on the right factor). The other bell states may be obtained by applying $X$ and $Z$ to this one.
  • The Hamamard gate, which acts on a single qubit
    $H = \displaystyle\frac{1}{\sqrt{2}}\begin{pmatrix}1 &1\\1&-1\end{pmatrix}$
    which transforms a qubit between the standard basis and the plus/minus basis. Note that $H^2 = \mathbb{I}$.
  • Phase operations:
    $P_\theta = \begin{pmatrix}1&0\\0&e^{i\theta}\end{pmatrix}, \;\;\;\;\; S = P_{\pi/2} = \begin{pmatrix}1&0\\0&i\end{pmatrix}, \;\;\;\;\; T = P_{\pi/4} = \begin{pmatrix}1&0\\0&(1+i)/\sqrt{2}\end{pmatrix}$
  • Pauli operations, which act on a single qubit
    $\mathbb{I} = \begin{pmatrix} 1&0\\0&1\end{pmatrix}, \;\;\;\;\;X = \begin{pmatrix} 0&1\\1&0\end{pmatrix}, \;\;\;\;\;Y = \begin{pmatrix} 0&-i\\i&0\end{pmatrix}, \;\;\;\;\;Z = \begin{pmatrix} 1&0\\0&-1\end{pmatrix}.$
    The operation $X$ is also called the NOT operation, and $Z = P_{\pi}$ is a phase operation. These have the relations
    $X^2 = Y^2 = Z^2 = \mathbb{I},$
    $XY = iZ, YZ = iX, ZX = iY,$
    $XZ = -iY, ZY = -iX, YX = -iZ.$
  • Rotations in the Bloch sphere. The Bloch sphere is the space of states for a qubit, modulo global phase, i.e. $S^3/S^1 \simeq S^2$. In spherical coordinates $\theta \in [0, \pi], \psi \in [0, 2\pi)$:
    $\cos(\theta/2) \lvert 0 \rangle + e^{i \psi} \sin(\theta/2) \lvert 1 \rangle.$
    That is, given a qubit $\alpha \lvert 0 \rangle + \beta \lvert 1 \rangle$, we can use the global phase change to make $\alpha$ a real positive number in $[0, 1]$, which we parameterize by $\cos(\theta/2)$. Then, $\beta$ has magnitude $\sin(\theta/2)$ and arbitrary phase. We can also embed the Bloch sphere in $\mathbb{R}^3$ using the usual spherical coordinates:
    $(\theta, \psi) \mapsto (\sin \theta \cos \phi, \sin\theta \sin \phi, \cos \theta).$
    The rotation about the three axes then can be interpreted as an action on qubits, giving rise to the rotation gates.
    • $R_x(\theta) = \begin{pmatrix} \cos(\theta/2) & -i \sin(\theta/2) \\ -i \sin(\theta/2) & \cos(\theta/2) \end{pmatrix}$
    • $R_y(\theta) = \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix}$
    • $R_z(\theta) = \begin{pmatrix} e^{-\theta/2} & 0 \\ 0 & e^{\theta/2} \end{pmatrix}$
    We also have
    $X = iR_x(\pi), Y = iR_y(\pi), Z = iR_z(\pi),$

    $R_z(\theta) \sim P_\theta.$
  • The SWAP gate, which swaps tensor factors:
    $\text{SWAP}\lvert x \rangle \lvert y \rangle = \lvert y \rangle \lvert x \rangle$
  • Controlled operations: given a unitary operation $U$ on any quantum system $R$, the associated controlled operation $CU$ acts on $Q \otimes R$ where $Q$ is a single qubit by
    $CU = \lvert 0 \rangle \langle 0 \rvert \otimes \mathbb{I}_R + \lvert 1 \rangle \langle 1 \rvert \otimes U,$
    i.e. the qubit $Q$ controls whether or not (or with what probability) the operation $U$ is performed.
    • The controlled NOT operation $CX$ sends the standard basis to the Bell basis (with a slight sign difference):
      $CU\lvert 00 \rangle = \lvert \phi^+\rangle, \;\;\;\;\; CU\lvert 01 \rangle = \lvert \phi^- \rangle, \;\;\;\;\; CU\lvert 10 \rangle = \lvert \psi^+ \rangle, \;\;\;\;\; CU\lvert 11 \rangle = -\lvert \psi^- \rangle$

Monday, February 5, 2024

Quantum mechanics vs. probability

A few similarities and differences between quantum systems and probabilistic systems.
  • A probability distribution on a set $X$ is a function $\rho: X \rightarrow \mathbb{R}$ such that $\int_X \rho(x) \; dx = 1$. A quantum state in a set $X$ is a norm 1 element of the free vector space $\mathbb{C}\{X\}$. An operator on a probability distribution is a stochastic matrix, i.e. a matrix of non-negative real entries whose columns sum to 1. An operator on a quantum system is a unitary matrix.
  • Given two probability distributions on $X$ and $Y$, one can form the product $\rho_X \times \rho_Y: X \times Y \rightarrow \mathbb{R}$, which has $\int_{X \times Y} \rho_X \rho_Y = 1$ by Fubini's theorem. Given two quantum states $v \in V$ and $w \in W$, one can form $v \otimes w \in V \otimes W$, which is also a unit vector.
  • Given a probability distribution on a product $X \times Y$, one can project the distribution to either $X$ or $Y$ by integration. However, given a tensor product of vector spaces $V \otimes W$, there is no way to project to $V$ or $W$.
  • There is a notion of conditional probability, corresponding to the notion of partial measurement of a quantum state. Note that we have to renormalize the vector to be a unit vector in this case.
  • The property of independence of distributions corresponds to the property of a quantum state being separable, i.e. a rank one tensor. The notion of having non-zero correlation corresponds to entanglement.
  • Stochastic matrices often have no stochastic square root, i.e. one can not take half steps. The square roots of unitary matrices are always unitary.
  • Stochasitc matrices can never turn uncertainty into certainty, i.e. there is no stochastic matrix $A$ such that $A \begin{pmatrix} 1/2 \\ 1/2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. However, unitary matrices act transitively on the unit vectors, e.g. the Hadamard operator $H = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}$ has $H \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.
  • We can measure the the probability of some event $E \subset X$ by taking $P(E) = \int_E \rho(x) \; dx$. For quantum systems, measurements on a complex inner product space $V$ correspond to Hermitian projections (i.e. self-adjoint idempotents).
    • More precisely, a measurement should be viewed as a "complete" collection of such self-adjoint idempotents that sum to the identity. This is mostly semantic and refers to the fact that we usually do not simply ask if a coin flip is heads, we ask if it is heads or tails. Of course, any non-complete collection of "orthogonal" projections can be completed by adding the projection to the orthogonal complement of the sums of the images.

Shor's algorithm

Shor's algorithm is a quantum algorithm for factoring in polynomial time. For sources, I am using this Qiskit tutorial and Wikipedia ...