Tuesday, September 27, 2022

Phase Kickback

Phase kickback appears to refer to a technique in quantum computing where one alters one factor of a rank 1 tensor in a way dependent on the other factor $(\sum v) \otimes w \mapsto \sum v \otimes T_v(w)$ and then, because the $w$ are eigenvectors for the $T_v$, one can rewrite the result as a rank 1 tensor again $\sum v \otimes T_v(w) = (\sum \lambda_v v) \otimes w$. In some sense, the right tensor factor is "auxilliary" and the effect is to attach coefficients we want to the left tensor factor. We prefer rank 1 tensors because we can then perform measurements separately on the factors. Let's give a few examples, from the Qiskit tutorial.


Bernstein-Vazirani Algorithm: Let's first discuss the mathematics. Consider a function $f: \{0, 1\}^n \rightarrow \{0, 1\}$ which is of the form $f(v) = s \cdot v$ for some fixed $s \in \{0, 1\}^n$. We want to determine $s$. Let $V = \mathrm{Fun}(\{0, 1\}, \mathbb{C})$ be the state space with characteristic functions $e_0, e_1$. For $v \in \{0, 1\}^n$, we let $e_v \in V^{\otimes n}$ denote the corresponding basis vector. Recall the Hadamard gate $H = \displaystyle\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & - 1 \end{pmatrix}$, whose higher tensors act by:
$H^{\otimes n} e_v = \displaystyle\sum_{w \in \{0, 1\}^n} (-1)^{w \cdot v} e_w, \;\;\;\;\;\;\;\;\;\; H^{\otimes n} e_0 = \displaystyle\sum_{w \in \{0, 1\}^n} e_w$.

There is a standard way to attach a unitary (in fact, permutation) operator $U_f$, acting on $n+m$ qubits, to a map $f: \{0, 1\}^n \rightarrow \{0, 1\}^m$:
$U_f(e_v \otimes e_i) = e_v \otimes e_{i + f(v)}.$
Here, $m=1$. Let us quickly observe its eigenvectors: they are
$e_x \otimes (e_0 + e_1)$ with eigenvalue $1$,
$e_x \otimes (e_0 - e_1)$ with eigenvalue $(-1)^{f(x)}$.
This is important. One notes that the right tensors are the plus/minus basis, thus we might expect to do a change of basis using the Hadamard operator.

Using the above description of eigenvectors, we have the following:
$(U_f \circ H^{\otimes n+1})(e_0^{\otimes n} \otimes e_1) =\displaystyle\frac{1}{\sqrt{2^{n+1}}} U_f( \sum_{x \in \{0, 1\}^n} e_x \otimes (e_0 - e_1))$
$ = \displaystyle\frac{1}{\sqrt{2^{n+1}}}\displaystyle \sum_{x \in \{0, 1\}^n} (-1)^{f(x)} e_x \otimes (e_{0} - e_{1}).$
This is a rank 1 tensor, which essentially happens because $H(e_1)$ is an eigenvector of $U_f$. Thus, the tensor factors are independent, so they can be dealt with separately. Applying $H^{\otimes n+1}$, we obtain
$H^{\otimes n+1} \left(\displaystyle\frac{1}{\sqrt{2^{n+1}}}\displaystyle \sum_{x \in \{0, 1\}^n} (-1)^{f(x)} e_x \otimes (e_{0} - e_{1}) \right) = e_s \otimes e_1$
and, measuring the first $n$ registers, we are done! Note that we did not need to apply $H$ to the last tensor factor, so the quantum circuit we want to apply is:
$(H^{\otimes n} \otimes \mathbb{I}) \circ U_f \circ H^{\otimes n+1} \vert 0 \cdots 0 1\rangle$
and then we measure the left $n$ registers, discarding the rightmost register. As a practical matter, we need to be able to implement all these gates. Maybe we believe $H$ can be implemented since it is so standard, but it is less clear for $U_f$.


Deutsch-Jozsa Algorithm: This is the same problem as above, except the function $f$ is either constant or balanced (half inputs take value 0, half take 1), and we want to determine which it is. The procedure is the same, except at the very end, where when we apply $H^\otimes$ to the eigenvectors we get
$\displaystyle\frac{1}{\sqrt{2^{n+1}}}\displaystyle \sum_{x \in \{0, 1\}^n} (-1)^{f(x)} e_x \otimes (e_{0} - e_{1}).$
Let us ignore the right tensor factor (and renormalize). Applying $H^{\otimes n}$ to this expression we get
$\displaystyle\frac{1}{\sqrt{2^{2n}}}\displaystyle \sum_{x \in \{0, 1\}^n} (-1)^{f(x)} \sum_{y \in \{0,1\}^n} (-1)^{x \cdot y} e_y = \displaystyle\frac{1}{\sqrt{2^{2n}}}\displaystyle \sum_{x, y \in \{0, 1\}^n} (-1)^{f(x) + y \cdot x} e_y.$
Now, taking $y = 0$, we note that the coefficient $\pm 1$ is the function is constant, and $0$ if the function is balanced. So, if we take a measurement and get $\langle 0\cdots 0 \vert$, then the function is constant, and if we get anything else, then the function is not constant, so in this case, balanced.

In summary, we apply the same circuit as above:
$(H^{\otimes n} \otimes \mathbb{I}) \circ U_f \circ H^{\otimes n+1} \vert 0 \cdots 0 1\rangle$
and also measure the rightmost $n$ registers. Before, under the assumption that $f(x) = s \cdot x$, these registers recovered $s$. Now, under the assumption that $f$ is either balanced or constant, if these registers are measured to be $0^n$ then the function is constant, and otherwise it is balanced. A few remarks:
  • The only function which is both of the form $f(x) = s \cdot x$ and constant is the zero function, in which case $s = 0^n$. This is in agreement with both problems.
  • A function of the form $f(x) = s \cdot x$ is balanced if and only if $s \ne 0^n$. This is in agreement with both problems.

Phase Estimation: Now, suppose we have a unitary operator $U$ on $n$ qubits. We have some state and we observe it, so it becomes some eigenvector $\lvert psi \rangle$ of $U$. The question is, can we extract the eigenvalue $\lambda = e^{2\pi i \theta}$, or at least estimate it?

The idea is to apply the following unitary operator, letting $CU$ be the controlled unitary operator with the control by convention to the right:
$(\mathbb{I}^{\otimes n} \otimes H) \circ CU \circ (\mathbb{I}^{\otimes n} \otimes H)$.
We apply this to the state $\langle \psi 0 \rangle$ and get, first ignoring renormalization:
$\lvert \psi 0 \rangle \mapsto \lvert \psi 0 \rangle + \lvert \psi 1 \rangle \mapsto \lvert \psi 0 \rangle + \lambda \lvert \psi 1 \rangle \mapsto (1 + \lambda) \lvert \psi 0 \rangle + (1 - \lambda) \lvert \psi 1 \rangle.$
Renormalizing, we have
$\displaystyle\frac{1}{2} \lvert \psi \rangle \left((1 + \lambda) \lvert 0 \rangle + (1 - \lambda)\lvert 1 \rangle\right)$
Now, recalling that $\lambda = e^{2 \pi i \theta}$, the probability that we measure 0 and 1 for the rightmost qubit are respectively:
$p_0 := \displaystyle\frac{1}{2}||1 + e^{2 \pi i \theta}|| = \displaystyle\frac{2 + e^{2 \pi i \theta} + e^{-2 \pi i \theta}}{2} = 1 + \cos(2\pi \theta) = \cos(\pi \theta)^2,$
$p_1 :=\displaystyle\frac{1}{2}||1- e^{2 \pi i \theta}|| = \displaystyle\frac{2 - e^{2 \pi i \theta} - e^{-2 \pi i \theta}}{2} = 1 - \cos(2\pi \theta) = \sin(\pi \theta)^2.$
Now, assuming we can measure these probabilities perfectly (which we can't, but we can get probably get close if we do this enough times) allows us to determine $|\theta|$, i.e. we cannot distinguish $\theta$ and $1 - \theta$. In practice, I don't think people do this, because it requires running the algorithm a bunch of times to get within an error, and one cannot extract algebraic information from this. However, we note that if $\theta = 0, 1/2$, we can extract precise algebraic information. This will be the subject of a future post on the quantum Fourier transform.

Monday, September 26, 2022

Why does quantum mechanics involve complex numbers?

First, I am using this helpful expository paper by Ricardo Karam. He gives four reasons, labeled (A) through (D). I want to discuss his third reason, (C), based on the Stern-Gerlach experiment, and my understanding of it.

First, there is an experiment which does the following; I'll be as vague as possible to avoid getting details wrong. A particle is sent through various boxes which observe its spin in the $x$, $y$ and $z$ directions, with the "equal filtering probability" property, i.e. that if a pure state for one observable is then observed by any other, it has equal probability of $1/2$ of being positive or negative spin.

Let's introduce some notation. First, the eigenvectors for the $z$-observation $S_z$ are $\lvert + \rangle$ and $\lvert - \rangle$, and we take this to be our standard basis. After performing a $x$-observation $S_x$ (or a $y$-observation $S_y$), we obtain the states $\lvert + \rangle_x$ and $\lvert - \rangle_x$ (respectively $\lvert + \rangle_y, \lvert - \rangle_y$). The matrices for the observables in this standard basis are
$A_z = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \;\;\; A_x = \begin{pmatrix} \vert & \vert \\ \lvert + \rangle_x & \lvert - \rangle_x \\ \vert & \vert \end{pmatrix}, \;\;\; A_y = \begin{pmatrix} \vert & \vert \\ \lvert + \rangle_y & \lvert - \rangle_y \\ \vert & \vert \end{pmatrix}$.

For convenience, we record that a solution (up to various symmetries) for the problem is:
$A_z = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \;\;\; A_x = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \;\;\; A_y = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}$.
We observe that these matrices are unitary.

Because I'm not too familiar with bra-ket notation, I need to make a digression. Vectors of the form $\langle \bullet \rvert$ are covectors, e.g. ${}_x\langle + \rvert$ is the first column of the conjugate transpose $A_x^*$. What is the meaning of these covectors? By definition, to pair with $\langle + \vert$ and $\langle - \vert$ gives the proportion of particles which have positive or negative $z$-spin. We generalize: ${}_y\langle +\vert - \rangle_x$ is a number whose square magnitude is the probability that a particle which has been observed to have negative $x$-spin will then be observed to have positive $y$-spin. Note that we compute the quantity ${}_y\langle +\vert - \rangle_x$ by taking the corresponding entry of the matrix $A_y^* A_x$.

Why square magnitude and not magnitude? All our matrices will have the property that $A^* A = I$, i.e. they are unitary, because repeating an observation should result in the same thing. The columns of unitary matrices are have norm one, which satisfy the property that the sum of the squares of the magnitudes equals one. Thus the correct quantity to interpret as a probability (i.e. the thing that sums to one) are the squares.

Anyway, the point is, the "equal probability" property means that any pair of eigenvectors (from different observables) should pair to value $1/2$ (and the fact that repeated measurements give the same value means for a single observable the eigenvalues must be orthogonal). If they were all real eigenvectors, this means that their squared dot product is $1/2$, i.e. the angle between the vectors is $45^\circ$. But this is impossible to do with three pairs of eigenvectors in $\mathbb{R}^2$, so the eigenvectors must be complex. Moreover, the matrices have complex entries with respect to the standard basis which we have fixed.

Finally, it's worth constrasting this situation with polarizing filters; I saw an argument on Twitter at some point about whether polarization is a quantum phenomenon or not. I'm a bit confused by this. On the one hand, one can clearly model light polarization in terms of projections, but it appears there is both a quantum and a classical model for polarization of photons. I want to say this is due to the fact that photon polarization only involves two observables, but I haven't really checked this.

Sunday, September 25, 2022

Introduction

This is a learning blog for quantum computing from a mathematical perspective. Anything written here has a high likelihood of being completely wrong. Use at your own risk.

Shor's algorithm

Shor's algorithm is a quantum algorithm for factoring in polynomial time. For sources, I am using this Qiskit tutorial and Wikipedia ...