Single-Qubit Systems


In the introductory chapter to quantum bits and circuits, we presented the model of a qubit inspired by the probabilistic nature of the spin of an electron. However, we only described how the electron behaves when its spin was pointing in the \(\texttt{+}z, \texttt{-}z, \texttt{+}x, \texttt{-}x\) directions (to which we assigned the qubit states \(|0\rangle, |1\rangle, |+\rangle, |-\rangle\), respectively).

But what if the electron spin is pointing along some arbitrary angle with respect to the \(\texttt{+}z\) direction? We did mention that the probability of measuring spin-up or spin-down depends on this angle, but did not give a quantitative rule for how electrons behave.

In this chapter, we will not only expand on this idea, but we will also broaden the definition of the state that describes the spin of an electron pointing in any direction, which represents the most general way to express the state of a qubit. We will then introduce the Bloch Sphere, which is a graphical representation of the mathematical description of the qubit’s state, and lastly discuss the different operations we can apply to a qubit to transform it from one arbitrary state to another.

1. Refining the Qubit

Our goal for this section is to derive the final mathematical definition of a qubit. For this, we will follow a similar approach to that of the previous two chapters; that is, through a discovery process using simple examples. The idea is to derive the vector representation of a qubit, given by:

\[\begin{split} |q\rangle = \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix}, \text{ such that:} \; \alpha_j \in \mathbb{C}, \; \text{and} \; |\alpha_0|^2 + |\alpha_1|^2 = 1 ,\end{split}\]

which is a generalization of our definition for the bit, where the only entries allowed were \(0\) and \(1\):

\[\begin{split} |b\rangle = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \text{ such that:} \; \beta_j \in \{0, 1\}, \; \text{and} \; \beta_0^2 + \beta_1^2 = 1 .\end{split}\]

So, let’s take a first step and see how we go from the few specific statevectors we’ve presented so far, to a more general (but partial) representation using real numbers. After that, we will find why we need to incorporate complex values as well in our definition. Lastly, we will combine these two steps to get a complete representation of a general qubit and its visual representation in the Bloch sphere.

1.1 Real Probability Amplitudes

Let us, once again, revisit the Stern-Gerlach experiment and consider the results of performing repeated experiments on an electron with its spin at an angle \(\theta\) with respect to the \(\texttt{+}z\) axis, as shown in the image below. Observations demonstrate that, in this scenario, the probability of the electron deflecting in the \(\texttt{+}z\) direction (i.e., measuring state \(|0\rangle\)) is given by the cosine squared of the angle \(\theta/2\):

\[ \mathbb{P}_{0} = \cos^2\left(\frac{\theta}{2}\right) \]

Similarly, the probability of the electron deflecting in the \(\texttt{-}z\) direction (i.e., measuring state \(|1\rangle\)) is equal to the sine squared of the same angle \(\theta/2\):

\[ \mathbb{P}_{1} = \sin^2\left(\frac{\theta}{2}\right) \]
../../_images/02_03_01_stern-gerlach_angle_prob.png

From a probability standpoint, these expressions work since the squared cosine and squared sine of any common angle always add to one, just like probabilities ought to:

\[\begin{split} \begin{aligned} \mathbb{P}_{0} &+ \mathbb{P}_{1} = 1 \\ \\ \cos^2\left(\frac{\theta}{2}\right) &+ \sin^2\left(\frac{\theta}{2}\right) = 1 \end{aligned} \end{split}\]

Therefore, as discussed in the section on probability amplitudes, we can construct a statevector that represents the state of a qubit (electron spin) by making sure that when the selected probability amplitudes are squared, we obtain the empirical probabilities that match those from observed results via experimentation (as specified by the Born rule):

\[\begin{split} \begin{aligned} |q\rangle &= \begin{bmatrix} \cos\left(\frac{\theta}{2}\right) \\ \sin\left(\frac{\theta}{2}\right) \end{bmatrix} \\ \\ |q\rangle &= \cos\left(\frac{\theta}{2}\right)|0\rangle + \sin\left(\frac{\theta}{2}\right) |1\rangle \end{aligned} \end{split}\]

It is easy to see that the expression above is of the form:

\[\begin{split} |q\rangle = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}, \text{ such that:} \; a_j \in \mathbb{R}, \; \text{and} \; a_0^2 + a_1^2 = 1 ,\end{split}\]

which is very close to the final definition of a qubit we set ourselves to find. The only exception is that the vector elements here are real-valued rather than complex numbers. We will explore why we need complex values in the following section, but for now, we can see this expression works for the four electron spin cases we have explored so far:

| Spin Direction | Qubit State | Angle \(\theta\) [rad] | \(\cos\left(\frac{\theta}{2}\right)\) | \(\sin\left(\frac{\theta}{2}\right)\) |
|---|---|---|---|---|
| \(\texttt{+}z\) | \(\vert 0\rangle\) | \(0\) | \(1\) | \(0\) |
| \(\texttt{-}z\) | \(\vert 1\rangle\) | \(\pi\) | \(0\) | \(1\) |
| \(\texttt{+}x\) | \(\vert +\rangle\) | \(\frac{\pi}{2}\) | \(\frac{1}{\sqrt{2}}\) | \(\phantom{-}\frac{1}{\sqrt{2}}\) |
| \(\texttt{-}x\) | \(\vert -\rangle\) | \(-\frac{\pi}{2}\) | \(\frac{1}{\sqrt{2}}\) | \(-\frac{1}{\sqrt{2}}\) |

But more generally, it works for any angle \(\theta\). Let’s look at the specific example where \(\theta = \pi/3\). The expectation is that we will get the following respective probabilities for measuring states \(|0\rangle\) and \(|1\rangle\):

\[\begin{split} \begin{aligned} &\mathbb{P}_{0} = \cos^2\left(\frac{\pi/3}{2}\right) = \left(\frac{\sqrt{3}}{2}\right)^2 = \frac{3}{4} = 0.75 \\ \\ &\mathbb{P}_{1} = \sin^2\left(\frac{\pi/3}{2}\right) = \left(\frac{1}{2}\right)^2 = \frac{1}{4} = 0.25 \end{aligned} \end{split}\]

And now let’s use the Statevector class in Qiskit to construct this statevector, and see if the measurement probabilities match our expectation:

import numpy as np
from qiskit.quantum_info import Statevector

θ = np.pi/3                # Spin angle with respect to the +z axis
α0 = np.cos(θ/2)           # Probability amplitude associated with |0⟩
α1 = np.sin(θ/2)           # Probability amplitude associated with |1⟩

q = Statevector([α0, α1])  # Construct statevector |q⟩ = α0|0⟩ + α1|1⟩ = [α0 α1]ᵀ

probs = q.probabilities() # Extract expected probs array [P₀, P₁]
print(probs)
[0.75 0.25]

The result above is the calculation of the analytical probabilities, but we can also sample the statevector (as if we were performing measurements on this system) and get a distribution of results that should give us approximately \(75\%\) of the outcomes being \(0\) and \(25\%\) being \(1\):

counts = q.sample_counts(shots=1000) # "Measure" statevector 1000 times and return dictionary
                                     # of the form: {'0': number_of_0s, '1': number_of_1s}
print(counts)
{np.str_('0'): np.int64(752), np.str_('1'): np.int64(248)}

This generalization of the qubit, where the probability amplitudes are given by the \(\cos\) and \(\sin\) of the angle \(\theta/2\) (so that the probabilities are the \(\cos^2\) and \(\sin^2\) of that angle), works well to describe the results of the Stern-Gerlach experiment; however, there is a bit of an issue. In this particular case, we are only concerned with the direction of the spin in the \(xz\)-plane because we are shooting the electrons along the \(y\) direction. But how do we describe the spin of the electron if it is aligned with, let’s say, the \(\texttt{+}y\) axis? Well, experiments show that, if we were to measure the spin of such an electron with a field oriented along the \(z\) axis, we would still see the electron deflecting up or down with \(50\%\) probability (just like we do for spin in the \(\texttt{+}x\) direction). As a matter of fact, this is true no matter which orientation (given by the angle \(\varphi\)) the electron has in the \(xy\)-plane:

../../_images/02_03_02_elec_phi_angle.png

So, how do we represent such states? After all, for spin along the \(\pm x\) directions, we defined the probability amplitudes by taking the positive and negative square roots of the probability of measuring spin-up and spin-down, so what do we do for spin along, let’s say the \(\pm y\) directions?

¡Complex Numbers to the Rescue!

1.2 Complex Probability Amplitudes

Let us start by recalling the reason why we chose states \(|+\rangle\) and \(|-\rangle\) to be of the form:

\[ |+\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle \quad \text{ and } \quad |-\rangle = \frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle. \]

First off, we selected the values of these probability amplitudes because, when we square them, we still get the correct probabilities of measuring \(0\) and \(1\):

\[\begin{split} \begin{aligned} \text{For state } |+\rangle : \mathbb{P}_0 = \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2} \quad &\text{and} \quad \mathbb{P}_1 = \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2}. \\ \\ \text{For state } |-\rangle : \mathbb{P}_0 = \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2} \quad &\text{and} \quad \mathbb{P}_1 = \left(-\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2}. \end{aligned} \end{split}\]

And secondly, the signs for the probability amplitudes were selected to allow us to express states \(|0\rangle\) and \(|1\rangle\) in terms of states \(|+\rangle\) and \(|-\rangle\) in a way that is consistent with observations when measurements of spin are performed along the \(x\) axis. More concretely, for state \(|0\rangle\):

\[\begin{split} \begin{aligned} |0\rangle &= \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle \\ \\ &= \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle \right) + \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle \right) \\ \\ &= \frac{1}{2}|0\rangle + \frac{1}{2}|1\rangle + \frac{1}{2}|0\rangle - \frac{1}{2}|1\rangle \\ \\ &= |0\rangle . \end{aligned} \end{split}\]

And, for state \(|1\rangle\):

\[\begin{split} \begin{aligned} |1\rangle &= \frac{1}{\sqrt{2}}|+\rangle - \frac{1}{\sqrt{2}}|-\rangle \\ \\ &= \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle \right) - \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle \right) \\ \\ &= \frac{1}{2}|0\rangle + \frac{1}{2}|1\rangle - \frac{1}{2}|0\rangle + \frac{1}{2}|1\rangle \\ \\ &= |1\rangle . \end{aligned} \end{split}\]

So, what do we do for the probability amplitudes of the states corresponding to spins along the \(\pm y\) directions? Well, we need to follow similar rules to those we had for spin along \(\pm x\) axes:

  1. When “squared”, they must be equal to probability values of \(1/2\).

  2. We must be able to consistently express states \(|0\rangle\) and \(|1\rangle\) as superpositions of these states.

Unfortunately, the only real numbers that meet these conditions are \(+1/\sqrt{2}\) and \(-1/\sqrt{2}\), which are what we use for states \(|+\rangle\) and \(|-\rangle\). However, with just a small adjustment to the probability rule given above, complex numbers can save the day and make our model consistent with observations!

To understand the changes we need to make, let’s start by recalling that a complex number \(c\) is of the form:

\[ c = a + bi, \]

where \(a, b \in \mathbb{R}\), and are known as the real and imaginary parts of \(c\), respectively. Furthermore, \(i\) is known as the imaginary unit, which satisfies the property that when squared, it equals negative one: \(i^2 = -1 .\)

Complex numbers can then be graphically represented in what is known as the complex plane, with the horizontal direction representing the real axis, and the vertical direction the imaginary axis:

../../_images/02_03_03_complex_plane.png

With this in mind, we can make the very simple observation that, in this plane, the imaginary numbers \(i\) and \(-i\) lie on a number line orthogonal to that of the real numbers (which includes the values \(1\) and \(-1\)), and that all four of these values lie on a circle of radius \(1\):

../../_images/02_03_04_i_in_plane.png

So, if we superimpose the complex plane on the \(xy\)-plane that defines the direction of the electron spin, we can see that we can associate a complex value in the unit circle with the direction of the spin statevector in this plane\(^*\):

../../_images/02_03_05_spin_in_complex_plane.png

Furthermore, if we multiply this complex number with one of the two probability amplitudes of our statevector, we will indeed get the full and accurate representation of a quantum bit. So, for example, states \(|+\rangle\) and \(|-\rangle\), which are aligned with the \(\texttt{+}x\) and \(\texttt{-}x\) directions, have one of their probability amplitudes multiplied by \(+1\) and \(-1\), respectively. Similarly, we can define the states for spin aligned with the \(\texttt{+}y\) and \(\texttt{-}y\) directions by multiplying one of their probability amplitudes by \(+i\) and \(-i\), accordingly:

\[\begin{split} \begin{aligned} |r\rangle &= \frac{1}{\sqrt{2}}|0\rangle + \frac{i}{\sqrt{2}}|1\rangle, \\ \\ |l\rangle &= \frac{1}{\sqrt{2}}|0\rangle - \frac{i}{\sqrt{2}}|1\rangle. \end{aligned} \end{split}\]

We label these states as \(|r\rangle\) and \(|l\rangle\) because, when looking at the three-dimensional Cartesian space, they point in the “right” and “left” directions, respectively\(^\dagger\).

Now, upon close inspection, these two statevectors do not really meet our probability condition! If we were to square the probability amplitude associated with \(|1\rangle\), for let’s say state \(|r\rangle\), we would get an invalid probability because squaring \(i\) gives us a negative value:

\[ \mathbb{P}_1 = \left(\frac{i}{\sqrt{2}}\right)^2 = -\frac{1}{2} \, \leftarrow \textbf{invalid probability} .\]

Luckily, this has a simple fix. If, instead of squaring a probability amplitude to find the probability, we take its squared modulus, we will always get a non-negative value. For an arbitrary complex number \(c\), we find its squared modulus by multiplying the number by its complex conjugate:

\[\begin{split} \begin{aligned} |c|^2 &= c\, c^{*} = (a+bi)(a-bi) \\ \\ |c|^2 &= a^2 -abi + abi -b^2i^2 \\ \\ |c|^2 &= a^2 + b^2 \end{aligned} \end{split}\]
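The identity above is easy to check with Python's built-in complex numbers. The following is only a small sketch; the value of `c` and the variable names are arbitrary choices made for illustration:

```python
# Squared modulus of a complex number, computed three equivalent ways.
c = 3 + 4j                                 # arbitrary example value: a = 3, b = 4

via_conjugate = (c * c.conjugate()).real   # c c* (its imaginary part is 0)
via_parts = c.real**2 + c.imag**2          # a^2 + b^2
via_abs = abs(c)**2                        # |c|^2

print(via_conjugate, via_parts, via_abs)

# The amplitude i/sqrt(2): its plain square is negative,
# while its squared modulus is a valid probability.
amp = 1j / 2**0.5
print((amp**2).real)    # approximately -0.5
print(abs(amp)**2)      # approximately  0.5
```

All three computations agree, and the squared modulus stays non-negative even when the amplitude is purely imaginary.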

Therefore, our updated (and final) version of the Born rule says that, to find the probability of measuring some given state we must take the squared modulus of the probability amplitude associated with it. Thus, for states \(|r\rangle\) and \(|l\rangle\) the probabilities of measuring \(0\) and \(1\) given below are consistent with what we observe from experimentation:

\[\begin{split} \begin{aligned} \text{For state } |r\rangle : \mathbb{P}_0 = \left|\frac{1}{\sqrt{2}}\right|^2 = \frac{1}{2} \quad &\text{and} \quad \mathbb{P}_1 = \left|\frac{i}{\sqrt{2}}\right|^2 = \frac{1}{2}. \\ \\ \text{For state } |l\rangle : \mathbb{P}_0 = \left|\frac{1}{\sqrt{2}}\right|^2 = \frac{1}{2} \quad &\text{and} \quad \mathbb{P}_1 = \left|-\frac{i}{\sqrt{2}}\right|^2 = \frac{1}{2}. \end{aligned} \end{split}\]

Now, let’s inspect if we can recover states \(|0\rangle\) and \(|1\rangle\) by taking equal superpositions of these vectors, which is a condition derived from observations of running, for example, the Stern-Gerlach experiment with the magnetic field along the \(y\) axis. So, first let’s take a positive superposition:

\[\begin{split} \begin{aligned} \frac{1}{\sqrt{2}}|r\rangle + \frac{1}{\sqrt{2}}|l\rangle &= \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{i}{\sqrt{2}}|1\rangle \right) + \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle - \frac{i}{\sqrt{2}}|1\rangle \right) \\ \\ &= \frac{1}{2}|0\rangle + \frac{i}{2}|1\rangle + \frac{1}{2}|0\rangle - \frac{i}{2}|1\rangle \\ \\ &= |0\rangle, \end{aligned} \end{split}\]

which clearly shows we can express \(|0\rangle\) in terms of \(|r\rangle\) and \(|l\rangle\). And now if we take a negative superposition we get:

\[\begin{split} \begin{aligned} \frac{1}{\sqrt{2}}|r\rangle - \frac{1}{\sqrt{2}}|l\rangle &= \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{i}{\sqrt{2}}|1\rangle \right) - \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}|0\rangle - \frac{i}{\sqrt{2}}|1\rangle \right) \\ \\ &= \frac{1}{2}|0\rangle + \frac{i}{2}|1\rangle - \frac{1}{2}|0\rangle + \frac{i}{2}|1\rangle \\ \\ &= i|1\rangle. \end{aligned} \end{split}\]

So we do not get exactly \(|1\rangle\), since the state above is premultiplied by a factor of \(i\), but states \(|1\rangle\) and \(i|1\rangle\) are effectively equivalent. This is because what we really want out of this quantum-mechanical model is to be able to get accurate predictions of what we observe through experimentation. So we have to ask ourselves: what is the expression \(i|1\rangle\) telling us about what we will measure from this state? Well, we know that we will observe state \(|1\rangle\) with a probability of \(\mathbb{P}_1 = |i|^2 = 1\), which is the same prediction we get from simply having state \(|1\rangle\). We say these two states are equivalent up to a global phase. We will explain what the term “global phase” means in the upcoming section.
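The two superpositions of \(|r\rangle\) and \(|l\rangle\) can also be checked numerically. The following is a minimal sketch in plain NumPy (used instead of Qiskit only to keep it self-contained); the variable names are our own choices for illustration:

```python
import numpy as np

ket_0 = np.array([1, 0], dtype=complex)
ket_1 = np.array([0, 1], dtype=complex)
ket_r = (ket_0 + 1j * ket_1) / np.sqrt(2)    # spin along +y
ket_l = (ket_0 - 1j * ket_1) / np.sqrt(2)    # spin along -y

plus_combo = (ket_r + ket_l) / np.sqrt(2)    # expected: |0>
minus_combo = (ket_r - ket_l) / np.sqrt(2)   # expected: i|1>

print(np.allclose(plus_combo, ket_0))
print(np.allclose(minus_combo, 1j * ket_1))

# Born rule: i|1> and |1> give identical measurement probabilities
print(np.abs(minus_combo)**2)
print(np.abs(ket_1)**2)
```

The last two printed arrays coincide, which is the numerical counterpart of saying that \(i|1\rangle\) and \(|1\rangle\) lead to the same predictions.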

Clearly, these steps work for states \(|r\rangle\) and \(|l\rangle\), but what about the rest of the possible spin directions in the \(xy\)-plane? To generalize this approach, all we need to do is assign to each state the number on the complex unit circle that lies along the direction in which the spin is pointing. And luckily for us, there is a direct correspondence between the angle \(\varphi\) of this direction and the complex number associated with it. Recall that any complex number \(c = a + bi\) can also be expressed in polar form as:

\[ (a + bi) = r \left( \cos(\varphi) + i \sin(\varphi) \right) = r e^{i \varphi} \]

Here, \(r\) is the magnitude of the vector associated with the complex number:

\[ r = |c| = \sqrt{a^2 + b^2}, \]

and \(\varphi\) is the angle it forms with the real axis:

\[ \varphi = \arg(c) = \text{atan2}(b,a), \]

both of which were shown in the images above in blue. And for clarity, here \(\arg(\,\cdot\,)\) stands for argument, and \(\text{atan2}(\,\cdot\,)\) corresponds to the 2-input arctangent function.
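As a quick illustration of the polar form, the sketch below uses Python's standard `cmath` and `math` modules. The example value of `c` is arbitrary; note that `math.atan2` takes the imaginary part as its first argument and the real part as its second:

```python
import cmath
import math

c = 1 + 1j                                # example value: a = 1, b = 1

r = abs(c)                                # magnitude sqrt(a^2 + b^2)
phi = cmath.phase(c)                      # argument of c, in the range (-pi, pi]
phi_atan2 = math.atan2(c.imag, c.real)    # same angle from the 2-input arctangent

rebuilt = r * cmath.exp(1j * phi)         # r e^{i phi} should give back c

print(r, phi, phi_atan2)
print(rebuilt)
```

Up to floating-point rounding, `rebuilt` equals the original number, and both ways of computing the angle agree.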

Now, for the specific case of spin in the \(xy\)-plane, we only need numbers that do not change the magnitude of the measured probabilities, so we are only interested in complex numbers where \(r = 1\). The complex number of interest is then given solely by \(e^{i \varphi}\), which we can use to express any state in the \(xy\)-plane as:

\[ |q\rangle = \frac{1}{\sqrt{2}}|0\rangle + e^{i \varphi} \frac{1}{\sqrt{2}}|1\rangle .\]
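To see that the factor \(e^{i\varphi}\) leaves the measurement probabilities untouched, we can evaluate this state for a few angles. This is a small NumPy sketch; the helper name `equator_state` is hypothetical and chosen only for this illustration:

```python
import numpy as np

def equator_state(phi):
    """Return (|0> + e^{i phi}|1>)/sqrt(2) as a NumPy array."""
    return np.array([1, np.exp(1j * phi)]) / np.sqrt(2)

# phi = 0, pi/2, pi, 3pi/2 correspond to |+>, |r>, |->, |l>; 0.7 is arbitrary
for phi in [0, np.pi / 2, np.pi, 3 * np.pi / 2, 0.7]:
    probs = np.abs(equator_state(phi))**2     # Born rule with squared modulus
    print(f"phi = {phi:.3f} -> probabilities {np.round(probs, 3)}")
```

Every angle gives probabilities of \(1/2\) for both outcomes, as observed for spins lying anywhere in the \(xy\)-plane.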

1.3 Putting it all Together (…and the Bloch Sphere)

We can now combine the results from section 1.1 and 1.2 to get the following general definition for a qubit:

\[ |q\rangle = \cos\left(\frac{\theta}{2}\right)|0\rangle + e^{i \varphi} \sin\left(\frac{\theta}{2}\right)|1\rangle, \]

where the angles \(\theta\) and \(\varphi\) respectively define the spin orientation with respect to the \(\texttt{+}z\) and \(\texttt{+}x\) axes. This visual depiction of spin orientation is so helpful that we actually use it to represent any kind of qubit independent of its physical implementation. But, instead of calling it “spin”, we refer to it as a unit vector inside of what is known as the Bloch sphere. This vector, however, simply points in the same direction the spin of an electron would be oriented if the qubit was of this type:

../../_images/02_03_06_bloch.png

In the image above we have also added the locations where the vector will be pointing for states \(\left\{|0\rangle, |1\rangle \right\}, \left\{|+\rangle, |-\rangle \right\}\) and \(\left\{|r\rangle, |l\rangle \right\}, \) which, as can be seen, respectively lie along the \(z\), \(x\) and \(y\) axes.
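As a sanity check, the general expression in terms of \(\theta\) and \(\varphi\) should reproduce the six axis states when we plug in the corresponding angles. The sketch below does this with plain NumPy; the function name `qubit_state` and the dictionary layout are assumptions made for this example only:

```python
import numpy as np

def qubit_state(theta, phi):
    """cos(theta/2)|0> + e^{i phi} sin(theta/2)|1> as a NumPy array."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

s = 1 / np.sqrt(2)
# label: ((theta, phi), expected amplitudes)
axis_states = {
    "|0> (+z)": ((0, 0), [1, 0]),
    "|1> (-z)": ((np.pi, 0), [0, 1]),
    "|+> (+x)": ((np.pi / 2, 0), [s, s]),
    "|-> (-x)": ((np.pi / 2, np.pi), [s, -s]),
    "|r> (+y)": ((np.pi / 2, np.pi / 2), [s, 1j * s]),
    "|l> (-y)": ((np.pi / 2, 3 * np.pi / 2), [s, -1j * s]),
}

for label, ((theta, phi), expected) in axis_states.items():
    print(label, np.allclose(qubit_state(theta, phi), expected))
```

Each line prints `True`, confirming that the pair of angles \((\theta, \varphi)\) is enough to reach all six of these states.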

Let’s now use Qiskit to draw the Bloch sphere representation of an arbitrary qubit. You can change the values of \(\theta\) and \(\varphi\) to see how the direction of the statevector moves around the sphere:

θ = np.pi/3
φ = 2*np.pi/5

α0 = np.cos(θ/2)
α1 = np.sin(θ/2) * np.exp(1j*φ)

sv = Statevector([α0, α1])
sv.draw('bloch')
../../_images/a9d8db9b99d8cbb6f8dd4dc6e8660dae3f39beccc38988fdd104c003fd8f34aa.png

So this is all well and good; however, there seems to be a small discrepancy between what we had set to be our general definition of a qubit and the expression we derived above. If we look closely at the values of \(\alpha_0\) and \(\alpha_1\), it seems that it suffices for \(\alpha_0\) to be purely real rather than complex:

\[ |q\rangle = \underbrace{\cos\left(\frac{\theta}{2}\right)}_{\alpha_0 \in \mathbb{R}}|0\rangle + \underbrace{e^{i \varphi} \sin\left(\frac{\theta}{2}\right)}_{\alpha_1 \in \mathbb{C}}|1\rangle, \]

On the other hand, our general definition of a qubit has both \(\alpha_0\) and \(\alpha_1\) being complex-valued. Luckily, it is not hard to show how to reconcile this difference. Let’s start by assuming \(\alpha_0, \alpha_1 \in \mathbb{C}\), and expressing them in polar form:

\[\begin{split} \begin{aligned} |q\rangle &= \alpha_0 |0\rangle + \alpha_1 |1\rangle \\ \\ |q\rangle &= r_0 e^{i\varphi_0} |0\rangle + r_1 e^{i\varphi_1} |1\rangle \end{aligned} \end{split}\]

Let’s now note that \(r_0, r_1\) are the respective magnitudes of \(\alpha_0, \alpha_1\), and therefore \(r_0^2 + r_1^2 = 1\). Since both magnitudes are non-negative and their squares add to one, we can equate them to the cosine and sine terms of the trigonometric identity \(\cos^2(\theta/2) + \sin^2(\theta/2) = 1\) for some angle \(0 \leq \theta \leq \pi\):

\[ |q\rangle = \cos\left(\frac{\theta}{2}\right) e^{i\varphi_0} |0\rangle + \sin\left(\frac{\theta}{2}\right) e^{i\varphi_1} |1\rangle . \]

Next, we can factor out the \(e^{i\varphi_0}\) term from this expression, resulting in:

\[ |q\rangle = e^{i\varphi_0} \left[ \cos\left(\frac{\theta}{2}\right) |0\rangle + \sin\left(\frac{\theta}{2}\right) e^{i(\varphi_1-\varphi_0)} |1\rangle \right] . \]

For convenience, we will relabel \(\varphi_0\) as \(\gamma\), and call it the global phase since it premultiplies the whole state. Similarly, we will relabel \(\varphi_1 - \varphi_0\) as simply \(\varphi\), and call it the relative phase of state \(|q\rangle\) since it corresponds to the phase difference between state \(|0\rangle\) and state \(|1\rangle\):

\[ |q\rangle = e^{i\gamma} \left[ \cos\left(\frac{\theta}{2}\right) |0\rangle + e^{i\varphi} \sin\left(\frac{\theta}{2}\right) |1\rangle \right] . \]

It is clear that the expression inside the square brackets is the same as the one we previously derived. Furthermore, the key observation to make here is that the global-phase prefactor \(e^{i\gamma}\) has no impact on the predictions we can make from state \(|q\rangle\). This is because, when we compute the probabilities of measuring either state \(|0\rangle\) or state \(|1\rangle\) by taking the squared modulus, the contribution of this prefactor is always equal to \(1\), independent of the value of \(\gamma\):

\[ \left|e^{i\gamma} \right|^2 = 1 .\]

So, in general, a qubit is defined as a vector of the form:

\[\begin{split} |q\rangle = \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix} = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \end{split}\]

but where we can represent \(\alpha_0, \alpha_1 \in \mathbb{C}\) in polar form to factor out a global phase and express the qubit in terms of angles \(\theta, \varphi\), which correspond to the direction of a unit vector in the Bloch sphere:

\[ |q\rangle = \cos\left(\frac{\theta}{2}\right)|0\rangle + e^{i \varphi} \sin\left(\frac{\theta}{2}\right)|1\rangle. \]
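The factoring procedure above can be sketched numerically: starting from two arbitrary complex amplitudes, we extract \(\gamma\), \(\theta\) and \(\varphi\), and then confirm that dropping the global phase does not change the probabilities. The helper `to_bloch_angles` and the example amplitudes are hypothetical, introduced only for this illustration:

```python
import numpy as np

def to_bloch_angles(alpha0, alpha1):
    """Split a normalized qubit into (gamma, theta, phi)."""
    gamma = np.angle(alpha0)                           # global phase (phase of alpha0)
    phi = np.angle(alpha1) - gamma                     # relative phase
    theta = 2 * np.arctan2(abs(alpha1), abs(alpha0))   # cos(theta/2) = r0, sin(theta/2) = r1
    return gamma, theta, phi

# Arbitrary example amplitudes with |alpha0|^2 + |alpha1|^2 = 0.36 + 0.64 = 1
alpha0 = 0.6 * np.exp(1j * 0.3)
alpha1 = 0.8 * np.exp(1j * 1.5)

gamma, theta, phi = to_bloch_angles(alpha0, alpha1)
original = np.array([alpha0, alpha1])
canonical = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

print(np.allclose(np.exp(1j * gamma) * canonical, original))   # same state once gamma is restored
print(np.allclose(np.abs(canonical)**2, np.abs(original)**2))  # same probabilities without gamma
```

Both checks print `True`: the bracketed form differs from the original vector only by the factor \(e^{i\gamma}\), and that factor is invisible to the Born rule.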

1.4 Kets, Bras, Products, and Bases

Up to this point, we have managed to construct a definition for the qubit based purely on observations. We now know that a qubit is represented by a column vector with complex-valued entries called probability amplitudes, whose squared moduli represent the probabilities of observing the qubit in one of two possible states that we denote as \(0\) and \(1\). Mathematically, we arrived at an expression for this, which we wrote using the ket symbol \(|q\rangle\):

\[\begin{split}\boxed{|q\rangle = \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix}, \text{ such that:} \; \alpha_j \in \mathbb{C}, \; \text{and} \; |\alpha_0|^2 + |\alpha_1|^2 = 1}\end{split}\]

All elements of this kind belong to what is known as a complex Hilbert Space of dimension \(2\), which we will denote as \(\mathcal{H}_2\)\(^{\ddagger}\). So, with this complete definition of a qubit (represented by a ket vector \(|q\rangle \in \mathcal{H}_2\)), we can now generalize the concept of its dual counterpart, the bra vector \(\langle q|\), which we briefly introduced in a previous chapter. Given the qubit state:

\[\begin{split} |q\rangle = \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix}, \quad \quad\end{split}\]

its bra counterpart is given by:

\[ \langle q| = \begin{bmatrix} \alpha_0^* & \alpha_1^* \end{bmatrix}, \]

where \(\alpha_j^*\) is the complex conjugate of \(\alpha_j\).

This notation allows us to represent the inner product between two vectors, let’s say \(|x\rangle, |y\rangle \in \mathcal{H}_2\), by multiplying the bra of one of them with the ket of the other:

\[\begin{split} \begin{aligned} \langle y | x \rangle &= \begin{bmatrix} y_0^* & y_1^* \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = x_0 y_0^* + x_1 y_1^* \\ \\ \langle x | y \rangle &= \begin{bmatrix} x_0^* & x_1^* \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = y_0 x_0^* + y_1 x_1^*. \end{aligned} \end{split}\]

And from the expressions above: \( \langle y | x \rangle = \langle x | y \rangle^* . \)

Furthermore, we can use the inner product of a vector with itself to calculate its length, often referred to as the norm of the statevector:

\[ \|q\| = \sqrt{\langle q | q \rangle}, \]

where:

\[\begin{split} \begin{aligned} \langle q | q \rangle &= \begin{bmatrix} \alpha_0^* & \alpha_1^* \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix} \\ \\ \langle q | q \rangle &= \alpha_0 \alpha_0^* + \alpha_1 \alpha_1^* \\ \\ \langle q | q \rangle &= |\alpha_0|^2 + |\alpha_1|^2. \end{aligned} \end{split}\]
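In NumPy, the bra-ket inner product can be sketched with `np.vdot`, which conjugates its first argument. The two example vectors below are arbitrary normalized states picked for illustration:

```python
import numpy as np

x = np.array([1, 1j]) / np.sqrt(2)                            # example state (this is |r>)
y = np.array([np.sqrt(2 / 3), -np.sqrt(1 / 3)], dtype=complex)  # another example state

# np.vdot(a, b) computes sum_j conj(a_j) * b_j, i.e. <a|b>
yx = np.vdot(y, x)    # <y|x>
xy = np.vdot(x, y)    # <x|y>

print(yx, xy)
print(np.isclose(yx, np.conj(xy)))    # <y|x> is the complex conjugate of <x|y>

norm_x = np.sqrt(np.vdot(x, x).real)  # norm from the inner product of x with itself
print(norm_x)
```

The two inner products are complex conjugates of each other, and the norm of a valid statevector comes out as \(1\) (up to rounding).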

The inner product is also an important tool to define the concept of an orthonormal basis. An orthonormal basis is a subset of elements in our Hilbert space that are both orthogonal (making them linearly independent), and normalized. Two vectors are orthogonal to each other if their inner product is equal to \(0\), and normalized if they have norm equal to \(1\). The concept of a basis is important because it allows us to express any statevector as a linear combination of the basis states. So, you can probably guess at this point that our states \(\{|0\rangle, |1\rangle\}\) form an orthonormal basis since they are orthogonal to each other:

\[\begin{split} \langle 1 | 0 \rangle = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 0, \end{split}\]

and have norm equal to \(1\):

\[ \sqrt{\langle 0 | 0 \rangle} = 1 \quad \text{and} \quad \sqrt{\langle 1 | 1 \rangle} = 1 .\]

In a one-qubit Hilbert space \(\mathcal{H}_2\), there are an infinite number of vector pairs \(\{|u\rangle, |u^{\perp}\rangle\}\) that form a basis, but some of the most important ones are the sets of states that lie along the three main axes in the Bloch sphere, and therefore receive their own flashy names:

  1. States \(\{|0\rangle, |1\rangle\}\), which respectively point along the \(\texttt{+}z\) and \(\texttt{-}z\) axes, are often known as the Computational or Bit basis states.

  2. States \(\{|+\rangle, |-\rangle\}\), which respectively point along the \(\texttt{+}x\) and \(\texttt{-}x\) axes, are often known as the Hadamard or Sign basis states.

  3. States \(\{|r\rangle, |l\rangle\}\), which respectively point along the \(\texttt{+}y\) and \(\texttt{-}y\) axes, are often known as the Y or Hand basis states (as in \(r\) for right-hand, and \(l\) for left-hand).

A good exercise is to show that the sign states and the hand states do indeed form a basis, and to show that we can take a state in the bit basis like the one shown below:

\[|\psi\rangle = \sqrt{\frac{2}{3}}|0\rangle - \sqrt{\frac{1}{3}}|1\rangle\]

And express it in the sign basis to obtain:

\[\begin{split} \begin{aligned} |\psi\rangle &= \frac{2\sqrt{3} - \sqrt{6}}{6}|+\rangle + \frac{2\sqrt{3} + \sqrt{6}}{6}|-\rangle \\ \\ |\psi\rangle &\approx 0.169|+\rangle + 0.986|-\rangle \end{aligned} \end{split}\]
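One way to check this exercise numerically is to use the inner product directly: the coefficient of \(|\psi\rangle\) along a basis state is the inner product of that basis state with \(|\psi\rangle\). The sketch below uses plain NumPy; `is_orthonormal` is a hypothetical helper written for this example:

```python
import numpy as np

s = 1 / np.sqrt(2)
ket_plus = np.array([s, s], dtype=complex)
ket_minus = np.array([s, -s], dtype=complex)
ket_r = np.array([s, 1j * s])
ket_l = np.array([s, -1j * s])

def is_orthonormal(u, v):
    """True if u and v are orthogonal to each other and both have norm 1."""
    return (np.isclose(np.vdot(u, v), 0)
            and np.isclose(np.vdot(u, u), 1)
            and np.isclose(np.vdot(v, v), 1))

print(is_orthonormal(ket_plus, ket_minus))   # sign basis
print(is_orthonormal(ket_r, ket_l))          # hand basis

# Express |psi> = sqrt(2/3)|0> - sqrt(1/3)|1> in the sign basis
psi = np.array([np.sqrt(2 / 3), -np.sqrt(1 / 3)])
c_plus = np.vdot(ket_plus, psi)     # <+|psi>
c_minus = np.vdot(ket_minus, psi)   # <-|psi>

print(np.round(c_plus.real, 3), np.round(c_minus.real, 3))
print(np.allclose(c_plus * ket_plus + c_minus * ket_minus, psi))
```

The two coefficients evaluate to approximately \(0.169\) and \(0.986\), and recombining them with the sign-basis vectors gives back the original state.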

One last concept worth introducing is that of the outer product. Just like the inner product is the vector multiplication of a bra with a ket, the outer product is found by multiplying a ket with a bra, which results in a matrix of the form:

\[\begin{split} \begin{aligned} | x \rangle \langle y | = \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} \begin{bmatrix} y_0^* & y_1^* \end{bmatrix} \\ \\ | x \rangle \langle y | = \begin{bmatrix} x_0 y_0^* & x_0 y_1^* \\ x_1 y_0^* & x_1 y_1^* \end{bmatrix} \end{aligned} \end{split}\]

Outer products allow us to construct important sets of operators necessary to formalize the concept of a quantum measurement. For example, two useful operators are the outer products of the computational basis states \(\{|0\rangle, |1\rangle \}\):

\[\begin{split} \begin{aligned} \Pi_{0} = |0\rangle \langle 0 | = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \Pi_{1} = |1\rangle \langle 1 | = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \end{aligned} \end{split}\]

We will use these in a later section to construct the mathematical tools needed to describe the process of quantum measurement.
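The outer product is also straightforward to sketch in NumPy with `np.outer`, as long as we remember to conjugate the entries of the bra ourselves. The helper name `outer` is our own choice for this illustration:

```python
import numpy as np

def outer(x, y):
    """Outer product |x><y|: ket x times the conjugated row vector of y."""
    return np.outer(x, np.conj(y))

ket_0 = np.array([1, 0], dtype=complex)
ket_1 = np.array([0, 1], dtype=complex)
ket_r = np.array([1, 1j]) / np.sqrt(2)

proj_0 = outer(ket_0, ket_0)    # |0><0|
proj_1 = outer(ket_1, ket_1)    # |1><1|
print(proj_0.real)
print(proj_1.real)
print(np.allclose(proj_0 + proj_1, np.eye(2)))   # the two operators add up to the identity

# A complex example, |r><r|, where the conjugation in the bra matters
print(outer(ket_r, ket_r))
```

For \(|r\rangle\langle r|\), the off-diagonal entries come out as \(-i/2\) and \(i/2\), which is what the general formula \(x_j y_k^*\) predicts.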

2. Refining Quantum Gates

When we first introduced the idea of quantum circuits, we discussed two main quantum gates: the \(X\) gate, which was equivalent to the classical NOT gate (i.e., it flips \(|0\rangle\) to \(|1\rangle\), and vice versa), and the Hadamard or \(H\) gate, which respectively transforms states \(|0\rangle, |1\rangle\) to \(|+\rangle, |-\rangle\), and vice versa. These were just two examples of a family of gates described by what are known as unitary matrices, which are a generalization of the invertible matrices we used to express classical reversible circuits. A matrix \(U\) is unitary if, when we multiply it by its conjugate transpose the result is the identity operator:

\[ U U^\dagger = U^\dagger U = I. \]

Here, the conjugate-transpose matrix \(U^\dagger\) is computed by taking the transpose of \(U\), and replacing each of the matrix entries by their complex conjugates. For example, given the matrix:

\[\begin{split} U = \begin{bmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{bmatrix} \end{split}\]

Its conjugate transpose is:

\[\begin{split} U^{\dagger} = \begin{bmatrix} u_{00}^* & u_{10}^* \\ u_{01}^* & u_{11}^* \end{bmatrix} \end{split}\]

What the relation \( U^\dagger U = I\) implies is that, for every quantum gate (represented by a unitary \(U\)), there exists another quantum gate (given by \(U^\dagger\)) that “reverses” its operation. This is why we say that, unlike conventional classical computing, quantum computing is reversible.

Unitary matrices have the important property that they are norm preserving. This means that, evolving a normalized state through a unitary results in a statevector that is also normalized (i.e., the probabilities of such vector will still add to \(1\)).
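These properties can be checked numerically for the two gates we already know. The sketch below uses plain NumPy with the standard matrix forms of \(X\) and \(H\); the helper `is_unitary`, the test state, and the counterexample matrix are assumptions made for this illustration:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def is_unitary(U):
    """Check U U† = U† U = I up to floating-point tolerance."""
    U_dag = U.conj().T                      # conjugate transpose
    identity = np.eye(len(U))
    return np.allclose(U @ U_dag, identity) and np.allclose(U_dag @ U, identity)

print(is_unitary(X), is_unitary(H))

# Normalized test state with theta = pi/3 and phi = 2*pi/5
q = np.array([np.cos(np.pi / 6), np.exp(2j * np.pi / 5) * np.sin(np.pi / 6)])

for U in (X, H):
    out = U @ q
    print(np.isclose(np.linalg.norm(out), 1.0))   # the norm is preserved
    print(np.allclose(U.conj().T @ out, q))       # U† undoes the action of U

# An invertible matrix that is NOT unitary, for contrast
not_unitary = np.array([[1, 1], [0, 1]], dtype=complex)
print(is_unitary(not_unitary))
```

The last line prints `False`: being invertible is not enough, since this matrix would change the norm of a state and break the rule that probabilities add to \(1\).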

Let’s now discuss some of the more relevant gates used in quantum computing. This will help us unravel some of the most important properties their corresponding unitary matrices have.

2.1 The Pauli Gates

The three Pauli gates \(X\), \(Y\), and \(Z\), play a very important role in quantum computing. Their corresponding unitary matrices (known as the Pauli matrices) are given by:

\[\begin{split} X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad Y = \begin{bmatrix} 0 & -i \\ i & \phantom{-}0 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{bmatrix}\end{split}\]

It is often common to also denote these matrices as \(\sigma_x, \sigma_y, \sigma_z\), or sometimes \(\sigma_1, \sigma_2, \sigma_3\), but in this textbook we will stick with the \(X, Y, Z\) notation. These matrices have a very large number of important properties that we will be uncovering over time. For now, it suffices to analyze how they act on some of the more common states, like the bit- and sign-basis states.

As we’ve seen many times before, the \(X\) gate simply flips the bit-basis states:

\[ X|0 \rangle = |1 \rangle \quad \text{and} \quad X|1 \rangle = |0 \rangle \]

Another way to express this is by saying that the \(X\) gate rotates a vector pointing in the \(\texttt{+}z\) direction in the Bloch sphere about the \(\mathbf{x}\) axis all the way to having it point in the \(\texttt{-}z\) direction. Similarly, if the vector is pointing along \(\texttt{-}z\), the \(X\) gate takes it to point along the \(\texttt{+}z\) direction. Let’s visualize this in Qiskit:

from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector
# initialize states |0⟩ and |1⟩
ket_0 = Statevector.from_label('0')
ket_1 = Statevector.from_label('1')
# Create quantum circuit with X gate
qc_x = QuantumCircuit(1)
qc_x.x(0)
qc_x.draw()
../../_images/db025ba48b35b1c5971a9710b729480eb7fdde2139aed4ce2c983e1d77e87def.png
# Apply X gate to |0⟩ and display in Bloch sphere
ket_out = ket_0.evolve(qc_x)

# Show Bloch sphere before and after X gate
print('State before X gate:')
display(ket_0.draw('bloch'))
print('State after X gate:')
display(ket_out.draw('bloch'))
State before X gate:
../../_images/88c6cc9458723e85ebd4cc55c27044af44b3dbc72d7ef5933050586d3a6d9ada.png
State after X gate:
../../_images/93426789aa4aeb60bc8cb232839aa3d2375337ce5198ef5efca41e493ab3a4d2.png
# Apply X gate to |1⟩ and display in Bloch sphere
ket_out = ket_1.evolve(qc_x)

# Show Bloch sphere before and after X gate
print('State before X gate:')
display(ket_1.draw('bloch'))
print('State after X gate:')
display(ket_out.draw('bloch'))
State before X gate:
../../_images/93426789aa4aeb60bc8cb232839aa3d2375337ce5198ef5efca41e493ab3a4d2.png
State after X gate:
../../_images/88c6cc9458723e85ebd4cc55c27044af44b3dbc72d7ef5933050586d3a6d9ada.png

Now, let’s see how the \(Z\) gate acts on the sign basis states \(|+\rangle\) and \(|-\rangle\):

\[\begin{split} \begin{aligned} Z |+\rangle &= \begin{bmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \\ \\ Z |+\rangle &= \begin{bmatrix} \phantom{-}\frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} \\ \\ Z |+\rangle &= |-\rangle \end{aligned} \end{split}\]

Similarly, we can show that: \( Z |-\rangle = |+\rangle .\)

So the \(Z\) gate “flips” the sign of the sign-basis states. Or, seen another way, the \(Z\) gate rotates a vector pointing in the \(\texttt{+}x\) direction in the Bloch sphere about the \(\mathbf{z}\) axis all the way to having it point in the \(\texttt{-}x\) direction (and vice versa). Let’s see this in action using Qiskit:

# initialize states |+⟩ and |-⟩
ket_p = Statevector.from_label('+')
ket_m = Statevector.from_label('-')
# Create quantum circuit with Z gate
qc_z = QuantumCircuit(1)
qc_z.z(0)
qc_z.draw()
../../_images/d2655d18991e3f02bdabc1f1ede9ec178239854e602e4a46676ded0c0afa4097.png
# Apply Z gate to |+⟩ and display in Bloch sphere
ket_out = ket_p.evolve(qc_z)

# Show Bloch sphere before and after Z gate
print('State before Z gate:')
display(ket_p.draw('bloch'))
print('State after Z gate:')
display(ket_out.draw('bloch'))
State before Z gate:
../../_images/78f2ed2017442ef21032ef579d41e611aa44d17f76bc9469a993d15e2ec99606.png
State after Z gate:
../../_images/7e55e05ddb882e4b4af287048831af87a2ca5e4d7d71b4c30b71b56db337d45e.png
# Apply Z gate to |-⟩ and display in Bloch sphere
ket_out = ket_m.evolve(qc_z)

# Show Bloch sphere before and after Z gate
print('State before Z gate:')
display(ket_m.draw('bloch'))
print('State after Z gate:')
display(ket_out.draw('bloch'))
State before Z gate:
../../_images/7e55e05ddb882e4b4af287048831af87a2ca5e4d7d71b4c30b71b56db337d45e.png
State after Z gate:
../../_images/78f2ed2017442ef21032ef579d41e611aa44d17f76bc9469a993d15e2ec99606.png

More generally, the \(X\) and \(Z\) gates can be interpreted as gates that rotate states in the Bloch sphere about the \(x\) and \(z\) axes by an angle of \(\pi\) (or \(180°\)). This means that, since states \(|0\rangle\) and \(|1\rangle\) lie along the \(z\) axis, the \(Z\) gate should not change their state, because it rotates them about their own axis. And we can indeed show mathematically that this is true:

\[\begin{split} \begin{aligned} Z |0\rangle &= \begin{bmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} = |0\rangle \\ \\ Z |1\rangle &= \begin{bmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \end{bmatrix} = -|1\rangle \end{aligned} \end{split}\]

An interesting observation is that, for state \(|1\rangle\) we do pick up a global phase of \(-1\). This will become relevant when we discuss two-qubit controlled gates.

As for the \(Y\) gate, well, in a similar way, this gate will rotate a statevector in the Bloch sphere by an angle of \(\pi\) about the \(y\) axis.
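To make the action of the \(Y\) gate concrete, here is a small NumPy sketch that applies the Pauli matrices defined above to the bit- and sign-basis states. The variable names are assumptions made for illustration. It shows that \(Y\) also swaps \(|0\rangle\) and \(|1\rangle\), but with extra phase factors of \(i\) and \(-i\), and that applying any Pauli gate twice (a full \(2\pi\) rotation) gives back the identity.

```python
import numpy as np

# Pauli matrices as defined in this section
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Bit-basis and sign-basis states
ket_0 = np.array([1, 0], dtype=complex)
ket_1 = np.array([0, 1], dtype=complex)
ket_p = (ket_0 + ket_1) / np.sqrt(2)
ket_m = (ket_0 - ket_1) / np.sqrt(2)

# Y swaps the bit-basis states, up to phases of i and -i
y_on_0 = Y @ ket_0   # equals  i|1⟩
y_on_1 = Y @ ket_1   # equals -i|0⟩

# Z swaps the sign-basis states
z_on_m = Z @ ket_m   # equals |+⟩

# Every Pauli matrix squares to the identity
squares_to_identity = all(np.allclose(P @ P, np.eye(2)) for P in (X, Y, Z))

print(y_on_0, y_on_1, squares_to_identity)
```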

2.2 The Phase-Gate Family#

Another very important single-qubit gate is the phase gate \(P\), given by:

\[\begin{split} P(\varphi) = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\varphi} \end{bmatrix} \end{split}\]

The role of this gate is to rotate a state in the Bloch sphere about the \(z\) axis by an angle of \(\varphi\). Associated with it, there are two special cases of this gate: the \(S\) gate, which is a phase gate for an angle of \(\pi/2\), and the \(T\) gate, which is a phase gate for an angle of \(\pi/4\). It is also worth noting that the \(Z\) gate is a phase gate for an angle of \(\pi\), so we have:

\[\begin{split} \begin{aligned} Z &= P(\pi) = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \\ \\ S &= P(\pi/2) = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi/2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix} \\ \\ T &= P(\pi/4) = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{(1+i)}{\sqrt{2}} \end{bmatrix} \end{aligned} \end{split}\]
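The relations between these gates can be checked numerically. The sketch below assumes a helper named `phase_gate` (a hypothetical name introduced here) that builds \(P(\varphi)\) from the definition above, and verifies that two \(T\) gates compose into an \(S\) gate, and two \(S\) gates into a \(Z\) gate, as expected from adding the rotation angles.

```python
import numpy as np

def phase_gate(phi):
    """Matrix of P(φ) as defined above (helper name is illustrative)."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

Z = phase_gate(np.pi)
S = phase_gate(np.pi / 2)
T = phase_gate(np.pi / 4)

# P(π) reproduces the Pauli Z matrix
z_matches = np.allclose(Z, [[1, 0], [0, -1]])

# Rotation angles add up: π/4 + π/4 = π/2 and π/2 + π/2 = π
t_squared_is_s = np.allclose(T @ T, S)
s_squared_is_z = np.allclose(S @ S, Z)

print(z_matches, t_squared_is_s, s_squared_is_z)
```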

Let’s define a circuit in Qiskit with 3 qubits to which we individually apply each of these gates to visualize the effect they have on the \(|+\rangle\) state in the Bloch sphere:

# Create quantum circuit with Z, S, and T gates
qc_p = QuantumCircuit(3)
qc_p.h(range(3)) # prepare |+⟩ state in all three qubits
qc_p.z(2)
qc_p.s(1)
qc_p.t(0)
qc_p.draw()
../../_images/bd855220a854b706c106858cb3f977c6b2b0db2a816d6d43bf3e6c232e9a1116.png
Statevector(qc_p).draw('bloch', reverse_bits=True)
../../_images/604c01a3d3de40bb922ff17d5d378fac4e66878389f26e01615ac43993e3353f.png

Another interesting observation about these gates is that their conjugate transposes correspond to matrices that still rotate the state about the \(z\) axis by the same angle, but in the opposite direction:

\[\begin{split} \begin{aligned} Z^{\dagger} &= \begin{bmatrix} 1 & 0 \\ 0 & e^{-i\pi} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi} \end{bmatrix} = Z \\ \\ S^{\dagger} &= \begin{bmatrix} 1 & 0 \\ 0 & e^{-i\pi/2} \end{bmatrix} \\ \\ T^{\dagger} &= \begin{bmatrix} 1 & 0 \\ 0 & e^{-i\pi/4} \end{bmatrix} \end{aligned} \end{split}\]
# Create quantum circuit with Z, S†, and T† gates
qc_pdg = QuantumCircuit(3)
qc_pdg.h(range(3)) # prepare |+⟩ state in all three qubits
qc_pdg.z(2)
qc_pdg.sdg(1)
qc_pdg.tdg(0)
qc_pdg.draw()
../../_images/431cbb3d9a56fe298d29e53cdc84d88a199396ceb2726d54b0006a3e76de27e5.png
Statevector(qc_pdg).draw('bloch',reverse_bits=True)
../../_images/68685a99a618c6891b055eee9894a99cf831c7a5444801a0a36bd8c2b123efd6.png
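The same observation can be verified at the matrix level. In this sketch (the helper `phase_gate` is a hypothetical name used only for illustration), we check that the conjugate transpose of \(S\) equals a phase gate with the negative angle, and that applying \(S\) followed by \(S^\dagger\) returns the \(|+\rangle\) state to where it started.

```python
import numpy as np

def phase_gate(phi):
    """Matrix of P(φ) as defined earlier (helper name is illustrative)."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

phi = np.pi / 2
S = phase_gate(phi)
S_dag = S.conj().T

# S† is the phase gate for the opposite angle
same_as_negative_angle = np.allclose(S_dag, phase_gate(-phi))

# S† undoes S
undoes_S = np.allclose(S_dag @ S, np.eye(2))

# Rotating |+⟩ forward and then backward returns |+⟩
ket_p = np.array([1, 1], dtype=complex) / np.sqrt(2)
back_to_plus = np.allclose(S_dag @ (S @ ket_p), ket_p)

print(same_as_negative_angle, undoes_S, back_to_plus)
```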

The \(T\) gate is of particular importance because it has been demonstrated that, in combination with just two more gates (the \(H\) gate and the \(CX\) gate), one can approximate any other quantum gate. The efficiency of this approximation is guaranteed by the Solovay-Kitaev theorem, which is of extreme importance in the development of fault-tolerant quantum computing.

2.3 General Rotation Gates#

The last set of single-qubit gates we will discuss (at least for the moment) are the rotation gates \(RX, RY, RZ\), which, as their name implies, rotate a statevector in the Bloch sphere about the \(x, y, z\) axes, respectively:

\[\begin{split} \begin{aligned} RX(\theta) &= \begin{bmatrix} \cos\left(\frac{\theta}{2}\right) & -i\sin\left(\frac{\theta}{2}\right) \\ -i\sin\left(\frac{\theta}{2}\right) & \cos\left(\frac{\theta}{2}\right) \end{bmatrix} \\ \\ RY(\theta) &= \begin{bmatrix} \cos\left(\frac{\theta}{2}\right) & -\sin\left(\frac{\theta}{2}\right) \\ \sin\left(\frac{\theta}{2}\right) & \cos\left(\frac{\theta}{2}\right) \end{bmatrix} \\ \\ RZ(\varphi) &= \begin{bmatrix} e^{-i\varphi/2} & 0 \\ 0 & e^{i\varphi/2} \end{bmatrix} . \end{aligned} \end{split}\]

It is worth noting that the \(RZ\) gate is equivalent to the \(P\) gate up to a global phase:

\[ RZ(\varphi) = e^{-i\varphi/2} P(\varphi) .\]
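Here is a short NumPy check of this relation. The helper names `rz` and `phase_gate` are assumptions introduced for this sketch. It also illustrates why the global phase is harmless: the probability of finding the rotated \(|+\rangle\) state back in \(|+\rangle\) is the same for both gates.

```python
import numpy as np

def rz(phi):
    """Matrix of RZ(φ) as defined above (helper name is illustrative)."""
    return np.array([[np.exp(-1j * phi / 2), 0],
                     [0, np.exp(1j * phi / 2)]])

def phase_gate(phi):
    """Matrix of P(φ) as defined earlier (helper name is illustrative)."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

phi = np.pi / 4

# RZ(φ) equals P(φ) multiplied by the global phase e^{-iφ/2}
matrices_match = np.allclose(rz(phi), np.exp(-1j * phi / 2) * phase_gate(phi))

# Apply both gates to |+⟩ and compare the probability of measuring |+⟩
ket_p = np.array([1, 1], dtype=complex) / np.sqrt(2)
prob_plus_rz = abs(np.vdot(ket_p, rz(phi) @ ket_p)) ** 2
prob_plus_p = abs(np.vdot(ket_p, phase_gate(phi) @ ket_p)) ** 2

print(matrices_match, prob_plus_rz, prob_plus_p)
```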

Let’s look at an example for these three gates where we rotate the \(|0\rangle\) state using \(RX\) and \(RY\) by an angle of \(\theta = \pi/4\), and state \(|+\rangle\) using \(RZ\) with \(\varphi = \pi/4\):

import numpy as np

# Create quantum circuit with RZ, RY, and RX gates
qc_r = QuantumCircuit(3)

qc_r.h(2)            # Prepare qubit 2 in |+⟩ state
qc_r.barrier()
qc_r.rz(np.pi/4,2)   # Rotate qubit 2 using rz gate
qc_r.ry(np.pi/4,1)   # Rotate qubit 1 using ry gate
qc_r.rx(np.pi/4,0)   # Rotate qubit 0 using rx gate
qc_r.draw()
../../_images/7a54bb15ee51e3a1ad3bf238c5df0b50042c8dcbe2d8c30e5f39afc551cd404a.png
Statevector(qc_r).draw('bloch', reverse_bits=True)
../../_images/778103991b30a2119bc11dbce44b68a34dcb9920aeeb13325143a597148cd49b.png

Even though there are an infinite number of possible unitary transformations, the ones we covered here are perhaps the most utilized ones. In the next chapter we will expand all of these ideas to multiple qubits and discuss the most common gates used to have two or more qubits interact.

3. Refining Quantum Measurement#

Quantum measurement is the process by which a quantum property, like electron spin, is coupled to some classical property that we can physically observe or interpret, like the vertical position of a mark or flash on a screen:

../../_images/02_03_07_dest_meas.png

In this type of measurement, the electron gets absorbed by the screen, so there is no sense in talking about the spin of the electron after the measurement takes place. We call this type of measurement a destructive measurement.

However, an equally valid way to measure electron spin would be to “carve” holes into the screen and let the electron pass through:

../../_images/02_03_08_nondest_meas.png

If we observe the electron passing through the top opening, we now know that its spin is pointing in the \(+z\) direction. Similarly, if we see it pass through the bottom opening, we know its spin is pointing in the \(-z\) direction. We call this a non-destructive measurement because the electron survived the measurement process and it is therefore available to be modified and measured again. A common way to refer to the process of going from a superposition to one of the possible states associated with the measurement outcome is to say that the statevector of the electron spin collapsed. Here, we will avoid using the term “collapse” since sometimes it can be associated with very specific interpretations of quantum mechanics. We will instead use the term “projection” or “reduction” of the statevector.

There are then three important components associated with this type of measurement:

  1. The classical outcome \(j\) obtained after the measurement (for a qubit: \(j \in \{0,1\}\)).

  2. The probability \(\mathbb{P}_j\) of measuring the classical outcome \(j\).

  3. The quantum state \(|j\rangle\) onto which the pre-measurement statevector projects.

This is why, in some quantum circuit diagrams, we deliberately include a classical register that “stores” the classical result in addition to the quantum register reserved for the quantum state of the qubit even after it has been projected by a measurement:

../../_images/02_03_09_meas_cir.png

What we want to do next is present a slightly more formal way to obtain the three items described above given the statevector of a qubit \(|q\rangle\), and the basis on which we want to perform our measurement.

3.1 Projective Measurements#

So far, the way we have approached how measurements take place has been by simply stating that, given a qubit in a general superposition state:

\[ |q \rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \]

the probability of measuring \(0\) or \(1\) is given by the square modulus of the probability amplitude associated with each of the two corresponding states:

\[ \mathbb{P}_0 = |\alpha_0|^2, \quad \mathbb{P}_1 = |\alpha_1|^2 .\]

We also said that, after a measurement, the superposition state immediately changes to the basis state associated with an outcome of either \(0\) or \(1\):

\[\begin{split} |q \rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle \xrightarrow{\text{measure}} |q' \rangle = \begin{cases} |0\rangle \text{, with prob:} \; \mathbb{P}_0 \\ \\ |1\rangle \text{, with prob:} \; \mathbb{P}_1 . \end{cases} \end{split}\]

What we want, though, is a mathematical procedure that, given a general state \(|q\rangle\), allows us to “extract” the probabilities of getting a certain outcome instead of simply stating that they correspond to the amplitudes squared. We can do this by noticing that taking the inner product between \(|q\rangle\) and each of the computational basis states \(\{|0\rangle, |1\rangle \}\) results in their corresponding probability amplitude:

\[\begin{split} \begin{aligned} \langle 0 | q \rangle &= \langle 0 | (\alpha_0 |0\rangle + \alpha_1 |1\rangle) = \alpha_0 \langle 0|0\rangle + \alpha_1 \langle 0|1\rangle \\ \langle 0 | q \rangle &= \alpha_0, \end{aligned} \end{split}\]

and:

\[\begin{split} \begin{aligned} \langle 1 | q \rangle &= \langle 1 | (\alpha_0 |0\rangle + \alpha_1 |1\rangle) = \alpha_0 \langle 1|0\rangle + \alpha_1 \langle 1|1\rangle \\ \langle 1 | q \rangle &= \alpha_1 . \end{aligned} \end{split}\]

Additionally, we showed in section 1.4 that \(\langle y|x \rangle = \langle x|y \rangle^*\), which we can use here to get:

\[\begin{split} \begin{aligned} \langle q | 0 \rangle &= \alpha_0^* \\ \\ \langle q | 1 \rangle &= \alpha_1^* . \end{aligned} \end{split}\]

Lastly, since we know the probabilities \(\mathbb{P}_j\) are given by \(\alpha_j^* \alpha_j = |\alpha_j|^2\), we can combine the two results above to compute each of the two probabilities:

\[\begin{split} \begin{aligned} \mathbb{P}_0 &= \langle q | 0 \rangle \langle 0 | q \rangle = \alpha_0^* \alpha_0 = |\alpha_0|^2 \\ \\ \mathbb{P}_1 &= \langle q | 1 \rangle\langle 1 | q \rangle = \alpha_1^* \alpha_1 = |\alpha_1|^2 . \end{aligned} \end{split}\]

Recalling (also from section 1.4) that \(\Pi_{0} = | 0 \rangle \langle 0 |\) and \(\Pi_{1} = | 1 \rangle \langle 1 |\), we can rewrite the expressions above as:

\[ \mathbb{P}_0 = \langle q | \Pi_{0} | q \rangle \quad \text{and} \quad \mathbb{P}_1 = \langle q | \Pi_{1} | q \rangle .\]

\(\Pi_0\) and \(\Pi_1\) are what are known as projection operators, which we can use to compute the probability of measuring the basis state \(|j\rangle\) by “sandwiching” the corresponding projection operator \(\Pi_{j}\) in-between the bra and the ket of superposition state \(|q\rangle\). These two projection operators form a set known as a projection-valued measure (PVM), and have several important properties, such as being orthogonal to each other, squaring to themselves, and being complete (summing to identity), but these details are not so relevant for our current discussion.

Now, to find which state our superposition projects onto after a measurement, we can multiply the initial state \(|q\rangle\) by the projection operator associated with each measurement outcome. In the case in which we measure \(0\), we have:

\[\begin{split} \begin{aligned} |q'\rangle &= \Pi_0 |q\rangle = | 0 \rangle \langle 0 |q \rangle = | 0 \rangle(\alpha_0 \langle 0|0\rangle + \alpha_1 \langle 0|1\rangle) \\ \\ |q'\rangle &= \alpha_0 | 0 \rangle . \end{aligned} \end{split}\]

We can follow the same procedure for when we measure \(1\):

\[\begin{split} \begin{aligned} |q'\rangle &= \Pi_1 |q\rangle = | 1 \rangle \langle 1 |q \rangle = | 1 \rangle(\alpha_0 \langle 1|0\rangle + \alpha_1 \langle 1|1\rangle) \\ \\ |q'\rangle &= \alpha_1 | 1 \rangle . \end{aligned} \end{split}\]

However, if we inspect these results closely, having the probability amplitudes \(\alpha_0, \alpha_1\) pre-multiplying the outcome states is problematic because that means the statevectors are not properly normalized. To solve this, we divide the outcomes above by the norm (length) of each of the resulting statevectors, which is given by:

\[\|\Pi_j |q\rangle \| = |\alpha_j| .\]

Here \(|\alpha_j|\) is nothing other than the square root of the probability \(\mathbb{P}_j\). So, the right way to compute the resulting state after a measurement is given by:

\[ |q'\rangle = \frac{\Pi_j |q\rangle}{\|\Pi_j |q\rangle \|} = \frac{\Pi_j |q\rangle}{\sqrt{\mathbb{P}_j}} = \frac{\Pi_j |q\rangle}{\sqrt{\langle q | \Pi_{j} | q \rangle}} .\]

This seems like a ridiculous amount of math to simply show that after measuring \(0\) or \(1\) the outcome state is respectively \(|0\rangle\) or \(|1\rangle\). However, the PVM formalism will come in handy in the context of multi-qubit systems, where it is of interest to find the resulting state after measuring just a subset of qubits.

It is also worth noting that we can generalize this to measuring in any other basis. For example, we can construct projectors for the Hadamard basis:

\[\Pi_+ = |+\rangle \langle+|, \quad \text{and} \quad \Pi_- = |-\rangle \langle-|, \]

and use them to compute the probabilities and outcome states resulting from measuring along the \(x\)-axis of the Bloch sphere. We will almost entirely focus on only measuring in the computational basis, so the details of this generalization are not that critical at this point, but it is at least worth knowing this can be done.
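As an illustration of measuring in the Hadamard basis, the following NumPy sketch builds \(\Pi_+\) and \(\Pi_-\) and applies them to the state \(|0\rangle\). The choice of input state and the variable names are assumptions made for this example.

```python
import numpy as np

# Hadamard-basis states
ket_p = np.array([1, 1], dtype=complex) / np.sqrt(2)
ket_m = np.array([1, -1], dtype=complex) / np.sqrt(2)

# Projectors Π+ = |+⟩⟨+| and Π- = |-⟩⟨-|
Pi_p = np.outer(ket_p, ket_p.conj())
Pi_m = np.outer(ket_m, ket_m.conj())

# State to be measured: |0⟩ (chosen only for illustration)
q = np.array([1, 0], dtype=complex)

# Probabilities ⟨q|Π±|q⟩ (np.vdot conjugates its first argument)
P_plus = np.vdot(q, Pi_p @ q).real
P_minus = np.vdot(q, Pi_m @ q).real

# Normalized post-measurement state for the "+" outcome
q_after_plus = Pi_p @ q / np.sqrt(P_plus)

print(P_plus, P_minus, q_after_plus)
```

Since \(|0\rangle\) is an equal superposition of \(|+\rangle\) and \(|-\rangle\), both outcomes are equally likely, and the post-measurement state for the “+” outcome is \(|+\rangle\) itself.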

So, in summary, given a state \(|q\rangle\), and a PVM defined by the set of projection operators \(\{\Pi_j\}\) we have that:

  1. Each projector \(\Pi_j\) has a corresponding classical outcome \(j\).

  2. The probability of obtaining outcome \(j\) is given by \(\mathbb{P}_j = \langle q | \Pi_j | q \rangle .\)

  3. The projected quantum state associated with outcome \(j\) is computed as: \(|q'\rangle = \frac{1}{\sqrt{\mathbb{P}_j}} \Pi_j |q\rangle.\)
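The three steps in the summary above can be sketched as a small NumPy function. The function name `measure_pvm` and the example state are hypothetical, introduced only for illustration; this is not a Qiskit API.

```python
import numpy as np

def measure_pvm(q, projectors):
    """For each projector Π_j (outcome j is its list index), return
    (probability, normalized post-measurement state).
    Illustrative helper, not part of any library."""
    results = []
    for Pi in projectors:
        prob = np.vdot(q, Pi @ q).real            # P_j = ⟨q|Π_j|q⟩
        # Outcomes with zero probability never occur, so no state is assigned
        post = Pi @ q / np.sqrt(prob) if prob > 1e-12 else None
        results.append((prob, post))
    return results

# Computational-basis projectors Π0 = |0⟩⟨0| and Π1 = |1⟩⟨1|
Pi_0 = np.array([[1, 0], [0, 0]], dtype=complex)
Pi_1 = np.array([[0, 0], [0, 1]], dtype=complex)

# Example state with α0 = sqrt(1/3) and α1 = sqrt(2/3)
q = np.array([np.sqrt(1/3), np.sqrt(2/3)], dtype=complex)

(P0, q0), (P1, q1) = measure_pvm(q, [Pi_0, Pi_1])
print(P0, q0)
print(P1, q1)
```

Passing a different list of projectors (for example, the Hadamard-basis ones) reuses the same procedure for a measurement in another basis.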

3.2 Post-Selection and Reset Operations#

Now that we have a more formal definition for what measurements are, we can go back to two operations we discussed before: post-selection and reset. Post-selection is nothing other than performing a measurement, and keeping only the state we’re interested in. For example, post-selection on \(|0\rangle\) will consist of measuring state

\[ |q\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \]

and discarding the outcome if the classical result is \(1\). This is basically equivalent to applying to \(|q\rangle\) the projection operator associated with the outcome we want with the corresponding normalization factor.
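To get a feel for what post-selection costs, here is a minimal NumPy simulation. It samples measurement outcomes for an example state and keeps only the shots that returned \(0\). The state, the number of shots, and the random seed are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=7)   # fixed seed so the run is repeatable

# Example state with P0 = 1/3 and P1 = 2/3
q = np.array([np.sqrt(1/3), np.sqrt(2/3)], dtype=complex)
probs = np.abs(q) ** 2

# Simulate repeated measurements in the computational basis
shots = 1000
outcomes = rng.choice([0, 1], size=shots, p=probs)

# Post-select on |0⟩: discard every shot whose classical result is 1
kept = outcomes[outcomes == 0]
kept_fraction = kept.size / shots

print(f'Kept {kept.size} of {shots} shots (fraction {kept_fraction:.3f})')
```

Every surviving shot leaves the qubit in \(|0\rangle\), but only a fraction of the shots close to \(\mathbb{P}_0\) survives, which is the price paid for post-selecting.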

A reset operation is similar; however, instead of discarding the result we don’t want, we apply an \(X\) gate conditioned on the classical outcome we got.

Here is an example in Qiskit where we initialize our qubit in a superposition, measure it, and then flip the result if the classical outcome is \(1\). This is equivalent to a \(|0\rangle\) reset operation.

from qiskit_aer import AerSimulator
qc = QuantumCircuit(1,1)
qc.h(0)                           # Initialize in superposition
qc.save_statevector('q_pre')     # Save statevector before reset
qc.measure(0,0)                   # Measure state (should give 50/50 `0` or `1`)
with qc.if_test((0,1)): qc.x(0)   # Apply X gate if classical result is `1`
qc.save_statevector('q_pst')      # Save statevector after reset

qc.draw()
../../_images/46070ae862953c7695f218d3e435ec2efcb778798ac679d126d9df20a072d832.png
AerSimulator().run(qc).result().data()['q_pre']
\[\frac{\sqrt{2}}{2} |0\rangle+\frac{\sqrt{2}}{2} |1\rangle\]
# Simulate circuit above and extract statevector (it should always be |0⟩)
result = AerSimulator().run(qc).result()

q_pre = result.data()['q_pre']
q_pst = result.data()['q_pst']

print('State before reset: ')
display(q_pre.draw('latex', prefix='|q\\rangle_{\\text{in}} = '))

print('State after reset: ')
display(q_pst.draw('latex', prefix='|q\\rangle_{\\text{out}} = '))
State before reset: 
\[|q\rangle_{\text{in}} = \frac{\sqrt{2}}{2} |0\rangle+\frac{\sqrt{2}}{2} |1\rangle\]
State after reset: 
\[|q\rangle_{\text{out}} = |0\rangle\]

3.3 Quantum Observables#

In quantum mechanics, observables correspond to the properties of a quantum system that can be physically measured. For instance, particles such as electrons have multiple observables, including position, momentum, and, as we previously discussed, their spin.

For the specific case of a qubit, we abstracted away the physics of electron spin, and associated the two possible outcomes of a measurement to be the numerical values of 0 and 1. This choice was made for two important reasons:

  1. To allow for qubits to represent other types of quantum systems (e.g., energy levels of atoms, polarization of photons, etc.).

  2. To align our model with the binary number system, which is the natural framework for quantum computation and quantum information.

However, the physical observable property associated with electron spin is actually the particle’s angular momentum, which can take values of:

\[ S = \left\{ + \frac{\hbar}{2}, - \frac{\hbar}{2} \right\},\]

where \(\hbar\) is the reduced Planck constant. Now, since \(\hbar/2\) is just a scaling factor, we could’ve equally defined the two possible outcomes of measuring a qubit to be \(\{+1, -1 \}\), rather than \(\{0, 1\} .\) That way, the results for a qubit measurement would be directly in line with the physical quantity measured for electron spin. This is a useful thing to do when simulating physical systems on a quantum computer, which is something done quite frequently.

Luckily, using the PVM formalism we described above, it is relatively straightforward to construct measurement operators whose classical outcomes are different from \(0\) or \(1\). For example, if we want our qubit states \(|0\rangle\) and \(|1\rangle\) to represent electron spin along the \(z\) axis, with corresponding measurement outcomes of \(1\) and \(-1\), all we need to do is take the sum of the projection operators \(\{\Pi_0, \Pi_1 \}\), weighted by the new measurement outcomes \(\{1, -1\}\):

\[\begin{split} \begin{aligned} Z &= 1 \cdot \Pi_0 + (-1) \cdot \Pi_1 . \\ \\ Z &= 1 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - 1 \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \\ \\ Z &= \begin{bmatrix} 1 & \phantom{-}0 \\ 0 & -1 \end{bmatrix} . \end{aligned} \end{split}\]

It is no coincidence that the resulting operator corresponds to the Pauli \(Z\) matrix. Similarly, we could associate values of \(\{+1, -1\}\) for measuring spin along the \(x\) (or \(y\)) axis with the projection operators \(\{\Pi_+, \Pi_- \}\) (or \(\{\Pi_r, \Pi_l \}\)), and find that the corresponding observable is the Pauli \(X\) (or Pauli \(Y\)) matrix. In general, we can construct an observable \(\mathcal{O}\) with measurement outcomes \( \lambda_j \in \mathbb{R}\) and projectors \(\Pi_j\) (corresponding to an orthonormal set of states \(|j\rangle\)) as:

\[ \mathcal{O} = \sum_j \lambda_j \Pi_j .\]
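The construction above can be checked numerically for all three Pauli observables. In this sketch, the helper names `projector` and `observable` are hypothetical. We also assume that \(|r\rangle\) and \(|l\rangle\) are the states \((|0\rangle \pm i|1\rangle)/\sqrt{2}\) pointing along \(\pm y\), as defined earlier in the chapter.

```python
import numpy as np

def projector(ket):
    """Projection operator |ket⟩⟨ket| (illustrative helper)."""
    return np.outer(ket, ket.conj())

def observable(outcomes, kets):
    """Build O = Σ_j λ_j Π_j from outcomes λ_j and orthonormal states |j⟩."""
    return sum(lam * projector(k) for lam, k in zip(outcomes, kets))

ket_0 = np.array([1, 0], dtype=complex)
ket_1 = np.array([0, 1], dtype=complex)
ket_p = (ket_0 + ket_1) / np.sqrt(2)
ket_m = (ket_0 - ket_1) / np.sqrt(2)
ket_r = (ket_0 + 1j * ket_1) / np.sqrt(2)   # assumed definition of |r⟩
ket_l = (ket_0 - 1j * ket_1) / np.sqrt(2)   # assumed definition of |l⟩

# Outcomes {+1, -1} attached to each pair of basis states
Z = observable([1, -1], [ket_0, ket_1])
X = observable([1, -1], [ket_p, ket_m])
Y = observable([1, -1], [ket_r, ket_l])

print(np.round(Z.real), np.round(X.real), np.round(Y, 3), sep='\n')
```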

The advantage of defining observables this way is that, since quantum mechanics is a probabilistic theory, we can use \(\mathcal{O}\) to directly compute the expectation value (average) of measuring that observable given some arbitrary state \(|q\rangle\). For example, let’s say we want to calculate the average value obtained when measuring the \(Z\) observable for the state:

\[ |q\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle .\]

Well, we know that we will measure either \(1\) or \(-1\) with probabilities of \(|\alpha_0|^2\) and \(|\alpha_1|^2, \) respectively. So to calculate the expectation value, we take each possible outcome, multiply it by its corresponding probability of occurrence, and add them up:

\[\langle Z \rangle_q = 1 |\alpha_0|^2 + (-1) |\alpha_1|^2 .\]

Now recall that, from the PVM formalism, the probabilities \(|\alpha_j|^2\) can be computed from the projection operators as \(\langle q | \Pi_j | q \rangle\), so we can substitute these into the expression above:

\[ \langle Z \rangle_q = 1 \cdot \langle q | \Pi_0 | q \rangle + (-1) \cdot \langle q | \Pi_1 | q \rangle, \]

and then, by linearity, factor in the operators inside the bra-ket product:

\[\begin{split} \begin{aligned} \langle Z \rangle_q &= \langle q |\left( 1 \cdot \Pi_0 + (-1) \cdot \Pi_1 \right)| q \rangle \\ \\ \langle Z \rangle_q &= \langle q | Z | q \rangle . \end{aligned} \end{split}\]

What this shows is that, given some state \(|q\rangle\), we can calculate the expectation value of measuring some observable \(\mathcal{O}\) as:

\[ \langle \mathcal{O} \rangle_{q} = \langle q | \mathcal{O} | q \rangle .\]

For example, given the state:

\[|q\rangle = \sqrt{\frac{1}{3}}|0\rangle + \sqrt{\frac{2}{3}}|1\rangle, \]

and the \(X\) observable, we have:

\[\begin{split} \begin{aligned} \langle X \rangle_q &= \langle q | X | q \rangle \\ \\ \langle X \rangle_q &= \begin{bmatrix} \sqrt{\frac{1}{3}} & \sqrt{\frac{2}{3}} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix}\sqrt{\frac{1}{3}} \\ \sqrt{\frac{2}{3}} \end{bmatrix} \\ \\ \langle X \rangle_q &= \begin{bmatrix} \sqrt{\frac{2}{3}} & \sqrt{\frac{1}{3}} \end{bmatrix} \begin{bmatrix}\sqrt{\frac{1}{3}} \\ \sqrt{\frac{2}{3}} \end{bmatrix} \\ \\ \langle X \rangle_q &= 2 \sqrt{\frac{2}{9}} \approx 0.943 . \end{aligned} \end{split}\]
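We can cross-check this result by computing the expectation value in the two ways discussed above: as the sum of outcomes weighted by their probabilities, and as the “sandwich” \(\langle q | X | q \rangle\). This is a plain NumPy sketch with variable names chosen for illustration.

```python
import numpy as np

# State and observable from the worked example above
q = np.array([np.sqrt(1/3), np.sqrt(2/3)], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)

# Projectors onto the X-basis states |+⟩ and |-⟩
ket_p = np.array([1, 1], dtype=complex) / np.sqrt(2)
ket_m = np.array([1, -1], dtype=complex) / np.sqrt(2)
Pi_p = np.outer(ket_p, ket_p.conj())
Pi_m = np.outer(ket_m, ket_m.conj())

# Route 1: outcomes (+1, -1) weighted by their probabilities ⟨q|Π±|q⟩
P_plus = np.vdot(q, Pi_p @ q).real
P_minus = np.vdot(q, Pi_m @ q).real
expval_from_probs = (+1) * P_plus + (-1) * P_minus

# Route 2: direct sandwich ⟨q|X|q⟩
expval_sandwich = np.vdot(q, X @ q).real

print(expval_from_probs, expval_sandwich)
```

Both routes agree with the value \(2\sqrt{2/9}\) obtained by hand.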

In Qiskit, there are a few different ways to compute expectation values. Let’s first take a look at how to compute the exact value given some Statevector and observable of type Operator:

from qiskit.quantum_info import Operator

# Define state
q = Statevector([np.sqrt(1/3), np.sqrt(2/3)])
display(q.draw('latex', prefix='|q\\rangle = '))

# Define observable
O = Operator.from_label('X')

# Compute the exact expectation value using Statevector.expectation_value
O_expval = q.expectation_value(O)

print(f'⟨X⟩q = {np.around(O_expval, 4)}')
\[|q\rangle = \frac{\sqrt{3}}{3} |0\rangle+\frac{\sqrt{6}}{3} |1\rangle\]
⟨X⟩q = (0.9428+0j)

Alternatively, we can construct a circuit, and run a simulation that computes the expectation value of some observable using the Estimator class from Qiskit Runtime. The results, however, will be approximate, since they are computed based on the sampled results of measurements. The example below uses a circuit that prepares the same state as above, and computes the expectation value with respect to the \(X\) operator:

from qiskit.quantum_info import SparsePauliOp
from qiskit_ibm_runtime import Estimator

# Create circuit that initializes state |q⟩
qc_o = QuantumCircuit(1)
qc_o.ry(2*np.arccos(np.sqrt(1/3)),0)
display(qc_o.draw())

# Create Observable of type SparsePauliOp compatible with Estimator
Obs = SparsePauliOp.from_operator(O)

# Define the estimator using the AerSimulator
estimator = Estimator(mode=AerSimulator())

# Run the estimator by passing the circuit and observable of interest
result = estimator.run([(qc_o, Obs)]).result()
O_expval = result[0].data.evs
print(f'⟨X⟩q = {np.around(O_expval, 4)}')
../../_images/4c7724c32626fd22c81705c0c1d8753d2bc8b86500c3b4b8e41b6daed0659199.png
⟨X⟩q = 0.937
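To see why a sampled estimate differs slightly from the exact value, the sketch below mimics the idea with plain NumPy: it draws \(\pm 1\) outcomes with the probabilities of an \(X\)-basis measurement and averages them. This is only an illustration of sampling noise under assumed settings (number of shots, seed); it is not a description of how the Estimator class works internally.

```python
import numpy as np

rng = np.random.default_rng(seed=11)   # fixed seed so the run is repeatable

# Same state as above
q = np.array([np.sqrt(1/3), np.sqrt(2/3)], dtype=complex)

# Probability of the +1 outcome when measuring X: |⟨+|q⟩|²
ket_p = np.array([1, 1], dtype=complex) / np.sqrt(2)
P_plus = abs(np.vdot(ket_p, q)) ** 2

# Draw a finite number of ±1 outcomes and average them
shots = 4096                            # assumed shot count for illustration
samples = rng.choice([1, -1], size=shots, p=[P_plus, 1 - P_plus])
estimate = samples.mean()

exact = 2 * np.sqrt(2) / 3
print(f'sampled estimate = {estimate:.4f}, exact = {exact:.4f}')
```

Increasing the number of shots reduces the typical deviation from the exact value, which shrinks roughly as one over the square root of the number of shots.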

The matrices associated with observables are known as Hermitian matrices, and they have several important features that guarantee these operators do indeed correspond to a valid representation of quantifiable physical properties. We will not go into these details in this section, but we will slowly uncover some of their most important aspects in future chapters where observables become important.

Footnotes#

\(^*\)Note that here we are just abstractly superimposing the complex plane with the physical \(xy\)-plane to assign a complex number to the corresponding statevector associated with the spin in some given direction on this plane. We are not saying that this complex number is related to the physical length of the spin vector, or directly multiplying the full statevector. The relation between this complex value and the state is clarified in the section discussing the Bloch sphere.

\(^\dagger\)Labeling these statevectors as \(|r\rangle\) and \(|l\rangle\) to respectively denote “right” and “left” seems a bit arbitrary; however, this nomenclature has become relatively standard, so that’s what we will use. Some references use \(|+i\rangle\) and \(|-i\rangle\) instead, but this can also be confusing because it could denote having the tensor product of states plus or minus with some arbitrary state \(i\). For example, \(|+i\rangle\) could be confused with state \(|+\rangle \otimes |i\rangle.\)

\(^\ddagger\)Admittedly, we are being fairly loose with the definition of a Hilbert space in here. More precisely, a finite-dimensional complex Hilbert space is a vector space over \(\mathbb{C}\), equipped with an inner product that induces a complete norm. The definition we have for a qubit constitutes the subset of elements of a Hilbert space with norm equal to 1.