## Grover’s Quantum Search Algorithm

Grover’s algorithm is a quantum search algorithm that runs quadratically faster than any equivalent classical algorithm. With a budget of a million operations, a classical algorithm is limited to searching through about a million (unsorted, unstructured) possibilities. Grover’s algorithm, on the other hand, can use those million operations to search through hundreds of **billions** of possibilities.

Describing how Grover’s algorithm pulls this trick is difficult, because it uses concepts fundamental to quantum mechanics. Instead of checking possibilities one by one, it (warning: incoming jargon) creates a uniform superposition over all possibilities and repeatedly destructively interferes states that are not solutions.

Hopefully, by the end of this post, you’ll understand qubits, quantum circuits, maybe a little bit of quantum mechanics, and (of course) Grover’s algorithm.

### Searching a Function

Jumping ahead a bit, I need to make one important concept clear right away: Grover’s algorithm doesn’t search through lists, it searches through *function inputs*. Grover’s algorithm takes a function, searches through the *implicit* list of possible inputs to that function, and (with high probability) returns the *single* input that causes the function to return true. (To deal with cases with more than one satisfying input, we need a variant of Grover’s algorithm.)

The following C# code solves a problem equivalent, in spirit, to what Grover’s algorithm solves. It uses an entirely different approach, but solves the same basic problem:

```csharp
int FindSingleSatisfyingInput(Func<int, bool> predicate, int inputRange) {
    int solution = 0;
    int resultCount = 0;
    for (int i = 0; i < inputRange; i++) {
        if (predicate(i)) {
            solution = i;
            resultCount += 1;
        }
    }
    if (resultCount != 1) {
        // Violated the 'exactly one solution' constraint
        // Garbage in, probable garbage out
        return RandomResult(0, inputRange);
    }
    return solution;
}
```

I really can't stress this point enough: Grover's algorithm searches a function for a single satisfying input. If you want to search through an explicit list of items, then you need a function backed by said list. Understanding how to do that is outside the scope of this post, so I'm going to stick to searching functions.

So, if Grover's algorithm searches a function, how is that function represented? Good question. I'll get back to that.

### Like probability theory, but over the complex numbers

My favorite summary of quantum computing, from a talk called Quantum information and the Brain by Scott Aaronson, goes as follows: "Like probability theory, but over the complex numbers". It is my favorite summary because, unlike misleading oversimplifications like "it tries every possibility", you can actually use "Like probability theory, but over the complex numbers" to make a reasonable guess at how quantum algorithms will work.

For example, how do we describe the state of a probabilistic system? Well, each possible state is associated with a real number between 0 and 1 (the probability), with the constraint that if we add up all of the probabilities we will get 1. Flip a coin and, before the coin lands, the result of the flip is in the state {heads: 50%, tails: 50%}. If the coin lands on heads then the result of the flip is in the state {heads: 100%, tails: 0%}.

How do we describe the state of a quantum system? Like probabilities, but complex. Each possible state is associated with a complex number with magnitude (absolute value) between 0 and 1 (the amplitude). The squared magnitude of a state's amplitude is the probability of that state, so we have the constraint that the squared magnitudes of the amplitudes must add up to 1. Because many different amplitudes can result in the same probabilities, our coin flip example no longer has a unique representation. Before the coin is flipped, the result of the flip could be represented as the state {heads: 1/√2, tails: 1/√2}, the state {heads: 1/√2, tails: -1/√2}, or many other possibilities (such as {heads: i/√2, tails: 1/√2}).
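As a quick sanity check of the "amplitudes square to probabilities" rule, here's a minimal Python sketch (Python's built-in complex type is enough; the state dictionaries are just illustrative, not any standard representation):

```python
# Two different quantum states for a coin flip: same probabilities,
# different amplitudes (they differ in phase).
state_a = {"heads": complex(2**-0.5), "tails": complex(2**-0.5)}
state_b = {"heads": complex(2**-0.5), "tails": complex(0, -(2**-0.5))}

def probabilities(state):
    # The probability of each outcome is the squared magnitude of its amplitude.
    return {outcome: abs(amp)**2 for outcome, amp in state.items()}

# Both states describe a fair coin, despite having different amplitudes.
print(probabilities(state_a))
print(probabilities(state_b))

# Valid states must have squared magnitudes summing to 1.
assert abs(sum(probabilities(state_a).values()) - 1) < 1e-9
```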

For another example of summary goodness, consider how transitions between states can be represented. Probabilistic systems allow operations like "If in state 2 then go to state 3 with 50% chance". It turns out that these operations can always be represented as linear transformations: multiplying the vector of states-weighted-by-probabilities against a matrix. There's actually a name for the type of matrix that corresponds to a probabilistic operation: stochastic.

How are quantum transitions represented? Like stochastic matrices, but complex. All quantum operations are equivalent to a unitary matrix (a complex matrix whose conjugate transpose is its own inverse). (Knowing that, we can derive further properties like "operations must be reversible".)
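To make "quantum operations are unitary matrices" concrete, here's a small Python sketch (plain lists of complex numbers, no libraries; the particular matrix is just one I picked for illustration) showing that a unitary matrix maps valid states to valid states:

```python
def apply_matrix(matrix, state):
    # Ordinary matrix-vector multiplication over complex numbers.
    return [sum(row[i] * state[i] for i in range(len(state))) for row in matrix]

def squared_norm(state):
    return sum(abs(amp)**2 for amp in state)

# A unitary matrix: its conjugate transpose is its inverse.
# This one mixes the two states and rotates some phases.
s = 2**-0.5
unitary = [[complex(s), complex(s)], [complex(0, s), complex(0, -s)]]

state = [complex(0.6), complex(0, 0.8)]  # squared magnitudes: 0.36 + 0.64 = 1
new_state = apply_matrix(unitary, state)

# Unitarity guarantees the result is still a valid state (norm preserved).
assert abs(squared_norm(state) - 1) < 1e-9
assert abs(squared_norm(new_state) - 1) < 1e-9
```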

So a quantum algorithm will necessarily look like... a bunch of carefully chosen matrix multiplications (involving imaginary numbers) to make 'solution' states have large amplitudes. Hurray. The larger the amplitudes of the solution states, the more likely it will be that the problem is solved when the quantum computer's state is un-isolated and reported.

### Working with Complex Numbers

If you're completely unfamiliar with complex numbers, they're introduced very well during How to Fold a Julia Fractal, by Steve Wittens (creator of MathBox). He starts with *rotations* of normal numbers (with a 180° turn being equivalent to multiplying by -1) and then notes that a 90° turn is suspiciously similar to the result of a negative square root...

The overall state of a quantum computation is made up of complex numbers referred to as amplitudes. I'll often be representing amplitudes using what I call a "square diagram", which visually represents the various components of a complex number:

The arrow rooted in the center of the diagram points to the complex value of the amplitude. The real part is the X offset of the arrow, the imaginary part is the Y offset, the magnitude is the length of the arrow, and the phase is related to the direction of the arrow. Because the squared magnitude of an amplitude is also an important value, since it determines the probability of a state, it is represented by the proportion of filled-in area of the unit radius square (width=height=2) surrounding the arrow.

Treating complex numbers as arrows is a useful intuition. When Richard Feynman gave a lecture on quantum mechanics intended for non-physicists, he took advantage of this and only mentioned in passing that these 'spinning arrows' were equivalent to complex numbers. After all, adding is as easy as placing arrows end to end:

The fact that arrows clearly show magnitude and phase also makes multiplication easier to visualize, because the effect on the phases is independent of the effect on the magnitudes. To multiply two complex numbers, you add their phases and multiply their magnitudes:
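The "add the phases, multiply the magnitudes" rule can be checked directly with Python's standard `cmath` module, which converts between arrows (magnitude and phase) and complex numbers:

```python
import cmath

# Two complex numbers, built as (magnitude, phase) arrows.
a = cmath.rect(2.0, cmath.pi / 6)   # magnitude 2.0, phase 30 degrees
b = cmath.rect(0.5, cmath.pi / 3)   # magnitude 0.5, phase 60 degrees

product = a * b

# Multiplying complex numbers multiplies magnitudes and adds phases.
assert abs(abs(product) - 2.0 * 0.5) < 1e-9                          # magnitude 1.0
assert abs(cmath.phase(product) - (cmath.pi/6 + cmath.pi/3)) < 1e-9  # phase 90 degrees
```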

To represent what happens during a linear transformation / matrix multiplication, I combine the above multiplication and addition animations. In the following animation, a vector of two state amplitudes (each amplitude corresponds to a possible state) is transformed by a unitary matrix, using a combination of the multiplying and summing animations:

Notice how the matrix multiplication mixes both input state amplitudes (the blue values initially on the left) into the two output state amplitudes (the eventual blue values on the right) through the operation of the matrix (the yellow values in the middle). Each input may reinforce or counteract the effect of other inputs due to differences in phase. This ability to create interference may seem like a bad thing, but it's actually what gives quantum computers their power compared to a probabilistic or classical computer with no phase information.

### Quantum Circuits

How do we represent a quantum computation? One way is quantum circuits, which are made up of a fixed number of wires going through a sequence of gates.

The wires of a quantum circuit correspond to qubits, and store the state of the system. I say qubits, instead of bits, because the system can be placed into superpositions of the various possible classical states. A system with n bits has 2^n possible classical states, but a system with n qubits has a continuum of quantum states corresponding to various amplitude-weightings (superpositions) of the 2^n classical states.

The state space of n bits can be visualized as the corners of an n-dimensional cube (maybe "visualized" is the wrong word), whereas the state space of n qubits can be imagined abstractly as the points of a sphere in 2·2^n real dimensions (e.g. the Bloch sphere, for a single qubit).

For example, if there are two wires then the quantum system can be in a pure classical state like {OnOn: 1}, corresponding to both wires definitely being on, or in a superposed state like {OnOff: 1/√2, OffOn: 1/√2}, corresponding to knowing exactly one wire is on but not which one it is. (These states can be represented more compactly in bra-ket notation as |11⟩ and (|01⟩ + |10⟩)/√2, respectively.) The phases of the states matter when interference occurs.

The gates of a quantum circuit correspond to operations to perform on the wires' state. Each one is equivalent to a multiplication of the state by a unitary matrix.

Let's start with a nice simple example of a quantum circuit. The simplest gate I can think of is the NOT gate, which flips the On/Off state of a single wire:

In the above diagram, you can see that applying a NOT gate corresponds to multiplying the state of the system by the unitary matrix [[0, 1], [1, 0]]. The input is in the On state, which effectively selects the first column of the matrix as the output, resulting in an Off output state. If the input had been in the Off state, then the second column would have been selected and the output would have been in the On state.

The NOT gate is pretty boring. It can be performed in non-quantum computers, after all. How about a simple quantum gate with no classical equivalent? Enter the Hadamard Gate, corresponding to the matrix (1/√2)·[[1, 1], [1, -1]] and used for creating uniform mixes. Shown below: an initially-On state (blue) and an initially-Off state (red) are mixed and unmixed by Hadamard gates.

The Hadamard Gate is interesting because it transforms both On and Off states into a mixed 50% On/Off state but, when applied again, undoes its own effects and restores the original state. This would be impossible with a classical or probabilistic computer, because the 50%/50% input would have to give the same result in both cases. If you pay close attention to the above animation, you can see how the magic works: the original value is encoded into the phases of the 50%/50% state (the phases agree if the original state was On and disagree if it was Off).
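The mix-then-unmix behavior is easy to check numerically. Here's a Python sketch (using the convention, an assumption of this sketch, that On is the first basis state):

```python
s = 2**-0.5
H = [[s, s], [s, -s]]  # the Hadamard matrix, (1/sqrt(2)) * [[1, 1], [1, -1]]

def apply(matrix, state):
    return [sum(row[i] * state[i] for i in range(len(state))) for row in matrix]

on = [1.0, 0.0]
off = [0.0, 1.0]

# One Hadamard turns either pure state into a 50/50 mix...
mixed_on = apply(H, on)    # amplitudes [0.707..., 0.707...]  (phases agree)
mixed_off = apply(H, off)  # amplitudes [0.707..., -0.707...] (phases disagree)

# ...but the original value survives in the phases, so a second
# Hadamard can tell the two mixes apart and undo the mixing.
assert all(abs(x - y) < 1e-9 for x, y in zip(apply(H, mixed_on), on))
assert all(abs(x - y) < 1e-9 for x, y in zip(apply(H, mixed_off), off))
```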

### More Wires

Things get a bit more complicated when there are more wires present. Suppose we want to apply a Hadamard gate to a single wire in an n-wire circuit. We have a 2-by-2 matrix, but any operation on an n-wire circuit must be represented by a 2^n-by-2^n matrix. How do we convert it, to simulate what nature does automatically?

Keep in mind that the state of the wire we want to affect isn't confined to 2 out of the 2^n possible states-of-all-wires (the 2^n possible global states). Every possible global state includes the state of the wire we want to affect. If we want to touch one wire, we must touch every global state.

The basic idea of what needs to be done, which is a bit tricky to write down mathematically, is to repeat our 2-by-2 matrix for every combination of states of the other wires. If there are two wires then we need one copy for when the other wire is On and one copy for when the other wire is Off.

The following diagram shows a 2-wire circuit that applies a Hadamard gate on one wire and then on the other. Notice how the repeated entries of the 2-by-2 Hadamard matrix are placed differently based on the wire being affected:

An interesting property of Hadamard gates, which you might have noticed in the above diagram, is that applying them to multiple wires has the same mixing effect as the single-wire case. Every pure input state will become a uniformly mixed output state, with the original state encoded into the phases, and applying the gates again will unmix the mixed state back into the original pure state.

Since applying a Hadamard gate to each of n wires acts so much like applying a Hadamard gate to 1 wire, it makes a lot of sense to pretend we have Hadamard gates that operate on any number of wires. The following diagram shows an example of a 2-wire Hadamard gate in action, which is actually (under the hood) a Hadamard gate on each wire:

Shown above: an initially-OnOn state (blue) and an initially-OffOff state (red) are mixed and unmixed by a combined 2-wire Hadamard gate. The matrix representing the operation of a combined gate is obtained by multiplying the matrices of the underlying gates together.
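One way to sketch the "repeat the 2-by-2 matrix for every combination of the other wires" construction in code is the Kronecker product (a standard name the post doesn't use; this is my rendering of the idea, not the author's):

```python
def kron(A, B):
    # Kronecker product: each entry of A is replaced by that entry times all of B.
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

s = 2**-0.5
H = [[s, s], [s, -s]]
I = [[1.0, 0.0], [0.0, 1.0]]

# Hadamard on the first wire only, then on the second wire only, of a 2-wire circuit.
H_on_first = kron(H, I)   # 4-by-4 matrix
H_on_second = kron(I, H)  # 4-by-4 matrix

# Applying both single-wire gates in sequence equals one combined 2-wire Hadamard:
combined = matmul(H_on_first, H_on_second)
expected = kron(H, H)
assert all(abs(combined[i][j] - expected[i][j]) < 1e-9
           for i in range(4) for j in range(4))
```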

There are many other common quantum gates. However, since Grover's algorithm is almost entirely made up of Hadamard gates, I won't be covering them here.

### Detour: Simulating a Physical System

We can simulate real physical systems with quantum circuits, and doing so can be useful for grounding our understanding in concrete examples. For example, consider the Mach-Zehnder interferometer:

The Mach-Zehnder interferometer is an interesting optical system. It uses a beam splitter to split an incoming photon along two paths, and then uses a second beam splitter to recombine the two paths into a single output path. Send a photon in from the left (on path A) and it will always exit towards the right (on path H), never towards the bottom (on path G). The reason this occurs is that contributions to G from E and from F destructively interfere.

We can represent the Mach-Zehnder interferometer as a quantum circuit using only a single wire. The state of the wire represents whether the photon is on the upper path (A,C,E,G) or the lower path (B,D,F,H). We will use gates to simulate the beam splitters and mirrors. A mirror rotates a photon's phase by 90°, and since both paths have a mirror at the same position we can use the matrix [[i, 0], [0, i]] to represent their effects. Each beam splitter reflects the photon onto the opposite path (also rotating its phase by 90°) or lets it through unaffected, and can be represented by the matrix (1/√2)·[[1, i], [i, 1]]:

Notice how the above animation shows an input of A giving a definite output of H. Our circuit predicts the result of the physical system. The destructive interference from E and F's contributions can be clearly seen in the summation of the top row of the matrix of the last splitter.
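The whole interferometer can be simulated in a few lines of Python. This is a sketch under one assumption of mine: index 0 of the state vector is the upper path and index 1 is the lower path.

```python
s = 2**-0.5
# Beam splitter: pass through, or reflect with a 90-degree phase rotation.
splitter = [[complex(s), complex(0, s)], [complex(0, s), complex(s)]]
# Mirrors: both paths get a 90-degree phase rotation.
mirrors = [[complex(0, 1), complex(0)], [complex(0), complex(0, 1)]]

def apply(matrix, state):
    return [sum(row[i] * state[i] for i in range(len(state))) for row in matrix]

photon = [complex(1), complex(0)]  # photon enters on the upper path (A)
for gate in [splitter, mirrors, splitter]:
    photon = apply(gate, photon)

# The contributions to the upper output (G) destructively interfere:
# the photon always exits on the lower output path (H).
assert abs(photon[0]) < 1e-9
assert abs(abs(photon[1])**2 - 1) < 1e-9
```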

Now to make things a bit more interesting.

One of the counter-intuitive things about Mach-Zehnder interferometers is what happens when you use a detector to determine whether the photon "really" took the upper or lower path. The interference goes away (!):

In the above diagram a detector has been placed on path E. Notice the appearance of opposing waves traveling along G, instead of no wave. Why aren't the opposing waves interfering destructively anymore? Can we use a quantum circuit to shed light on this? Yup.

To represent the state of the detector (on path E), we need to add a second wire to our circuit. The operation of the detector will be performed by a controlled-NOT gate: when a photon passes through E (i.e. the first wire is On), the state of the detector (second wire) is flipped:

Good: the above diagram shows that the output of our circuit also exhibits no interference, matching the physical system. Each possibility (G, G+detect, H, H+detect) has an equal 25% chance of being observed.

The reason we don't get interference anymore comes down to the necessary addition of the second wire, increasing the number of possible states from 2 to 4. We no longer need to jam four possible paths into just two output states, creating interference effects. Instead, we're effectively using the detector to send each possible path to its own output state. Photons from different states don't interfere, even if they happen to be at the same position, because only entire states interfere.

I hope this detour has been helpful. We now return to Grover's algorithm.

### The Unknown Gate

I mentioned earlier that I would get to how the function-to-be-searched-by-Grover was represented. Now that I've covered the basics of a quantum circuit, I can finally explain it.

The input to Grover's algorithm is actually a gate, or combination of gates, whose function is unknown. For example, if we wanted to solve a large instance of the boolean satisfiability problem, then the unknown gate(s) given to Grover's search algorithm would compute whether an input satisfies a boolean equation. Given a pure state that is not a solution, the gates must pass it along unchanged. Given a pure state that *is* a solution, the gates must rotate the state's amplitude's phase by 180°.

For 2 wires, there are 4 allowed possibilities for the effect of the unknown gate(s):

Grover's algorithm will distinguish between these four possibilities. This might sound pointless, but remember that as the number of wires goes up the matrices become exponentially huge. With 40 wires, there's about a trillion places for that -1 to hide. Also, remember that in practice we wouldn't be explicitly picking the position of the -1. Instead, we'd be implicitly constraining its position by providing gates that solve a problem when given the right input.

The technical term for the unknown/unspecified gate is "quantum oracle". I'm just going to keep calling it the unknown gate/gates/operation/whatever.
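In matrix terms, the unknown gate is just the identity with a phase flip at the solution. Here's a toy Python sketch (the explicit matrix defeats the purpose of searching, of course; it's only to show the gate's effect):

```python
# A toy 'unknown gate': identity everywhere, except the single solution
# state's amplitude gets its phase flipped (multiplied by -1).
def make_oracle(num_states, solution_index):
    matrix = [[1.0 if i == j else 0.0 for j in range(num_states)]
              for i in range(num_states)]
    matrix[solution_index][solution_index] = -1.0
    return matrix

oracle = make_oracle(4, 2)  # 2 wires, solution is state #2
state = [0.5, 0.5, 0.5, 0.5]  # uniform superposition
flipped = [sum(row[i] * state[i] for i in range(4)) for row in oracle]
assert flipped == [0.5, 0.5, -0.5, 0.5]  # only the solution's phase changed
```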

### Grover's Gates

The soul of Grover's algorithm is an operation that computes the average of all the amplitudes, and then inverts all of the amplitudes through that average (and additionally negates all the amplitudes). This operation is referred to as the "Grover Diffusion Operator", and it is built out of Hadamard gates surrounding an operation that inverts the phase of the first state.

Shown below: an OnOn state, the only one with opposite phase in the initial uniformly mixed input state, is 'found' by the diffusion operator.

The input state in the above diagram has three states with an amplitude of 1/2 and one state with an amplitude of -1/2. The average of these four amplitudes is 1/4. The states with amplitude 1/2 go from 1/2 to 0. The state with amplitude -1/2 goes from -1/2 to 1 (the operator's extra negation then flips its sign to -1, which doesn't change its probability).

For an explanation of why the diffusion operator is an inversion-about-the-mean, see these lecture notes.
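Inversion about the mean is simple enough to state as plain arithmetic. Here's a Python sketch reproducing the worked numbers (ignoring the operator's extra overall negation, which doesn't affect probabilities):

```python
def invert_about_mean(amps):
    # Each amplitude a is reflected through the mean m: a -> 2m - a.
    mean = sum(amps) / len(amps)
    return [2 * mean - a for a in amps]

# Three amplitudes of 1/2 and one of -1/2; the mean is 1/4.
amps = [0.5, 0.5, 0.5, -0.5]
result = invert_about_mean(amps)
assert result == [0.0, 0.0, 0.0, 1.0]  # the discordant state now holds all the probability
```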

As the above diagram demonstrated, the diffusion operator can enhance the amplitude of a single discordant phase. Combine this knowledge with the fact that the unknown gate rotates the phase of the solution state by 180°, making it different from the other states, and suddenly the mechanism of Grover's algorithm becomes clear. Use the unknown gate to make the solution state's amplitude different from the other states' amplitudes, then use the diffusion operator to amplify the difference, and repeat until the difference is non-negligible.

I call the repeated part of Grover's algorithm the 'Grover Step':

Above: the unknown gate flips the phase of the "On,Off" state of a uniformly mixed input, and then the diffusion operation 'finds' this flipped phase. Only a single iteration is needed to find the solution when there are two wires.

Now we have all the elements needed to put the full algorithm together. The input to the circuit is all-wires-on. The first thing the circuit does is apply an n-wire Hadamard gate that, because of the input state we choose, will create a uniform mix where every state has an amplitude with magnitude 1/√(2^n). Then the circuit applies the Grover Step (that contains the unknown gate that determines what we're looking for) an appropriate number of times, amplifying the sought-after solution state until the probability of observing it when the wire states are measured is maximized.

Below: finding the solution "Off,On,On,On,On", out of a space of size 2^5 = 32, in 4 Grover Steps. The matrix multiplications and underlying states are not shown in the diagram because they're too large (32-by-32) to fit comfortably.

Notice that the correct solution is the most likely answer after just one step, but it takes three more steps before the solution is near-certain to be observed. (Warning: the probabilities shown for each individual wire being in the correct state are partially dependent. The probability of actually observing the solution, i.e. every wire being in the right state, progresses as follows: 3.1%, 25.8%, 60.2%, 89.7%, 99.9%.)
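The whole algorithm fits in a short simulation. Here's a Python sketch where the gates are collapsed into their arithmetic effect on the state vector (a phase flip at the solution, then inversion about the mean):

```python
import math

def grover_search(num_states, is_solution, steps):
    # Start in the uniform superposition.
    amps = [1 / math.sqrt(num_states)] * num_states
    for _ in range(steps):
        # Unknown gate: flip the phase of the solution state.
        amps = [-a if is_solution(i) else a for i, a in enumerate(amps)]
        # Diffusion operator: invert every amplitude about the mean.
        mean = sum(amps) / num_states
        amps = [2 * mean - a for a in amps]
    return amps

N = 32  # 5 wires
steps = round(math.pi / 4 * math.sqrt(N))  # 4 steps
amps = grover_search(N, lambda i: i == 17, steps)  # 17 is an arbitrary hidden solution
probability = abs(amps[17])**2
print(f"P(measure solution) = {probability:.3f}")  # ≈ 0.999
```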

### Counting Steps

How did I know to apply four iterations when there were five wires? How many iterations of the Grover step do we need to apply, in general? Let's do a bit of math.

Suppose we're searching through a state space of size N, requiring lg(N) wires (assume N is a power of 2). Before the first Grover step, all the amplitudes will be equal to 1/√N. Let s_t be the amplitude of the state we want to find, after step t, and a_t be the amplitude of each of the other states. Within the Grover Step, the unknown operation will negate the target amplitude and then the diffusion operator will invert about the mean amplitude m_t = ((N-1)·a_t - s_t)/N. The next amplitudes are s_(t+1) = 2·m_t + s_t and a_(t+1) = 2·m_t - a_t (dropping the diffusion operator's extra overall negation, which doesn't affect any probabilities).

The probability that the measured result will be the target state, after step t, is |s_t|². Shove the above equations into Excel, graph the chance of success against various numbers of steps and problem sizes, and you get:

One interesting thing to note here is the periodic behavior. If we apply too many steps, the chance of success starts going back down (and then back up, and back down, and...)! Also, the achievable probability of success is not always 100%. The best number of steps is, presumably, just shy of things going downhill for the first time.

Visually, I can tell from the graph that the best number of steps is going up slower than N. More analysis of the math will show it takes approximately (π/4)·√N steps to reach the first maximal chance of success. Thus classical algorithms are comparatively slower and slower, as the problem gets larger and larger, because they must take a linear number of steps with respect to N.
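The recurrence above is easy to iterate in Python instead of Excel. This sketch tracks the target amplitude s and the common amplitude a of the other N-1 states, and confirms both the (π/4)·√N peak and the periodic fall-off for N = 32:

```python
import math

def success_probabilities(N, max_steps):
    # Before any steps, every amplitude is 1/sqrt(N).
    s = a = 1 / math.sqrt(N)
    probs = [s * s]
    for _ in range(max_steps):
        # The oracle negates the target, then diffusion inverts about the mean.
        mean = ((N - 1) * a - s) / N
        s, a = 2 * mean + s, 2 * mean - a
        probs.append(s * s)
    return probs

probs = success_probabilities(32, 10)
# The chance of success peaks at step round(pi/4 * sqrt(32)) = 4...
assert max(range(len(probs)), key=lambda i: probs[i]) == 4
# ...and over-rotating past the peak makes it start falling again.
assert probs[5] < probs[4]
```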

### Summary

Grover's algorithm searches for satisfying inputs to a function by creating a uniform quantum superposition of states, and then cancelling out non-solutions. It repeatedly applies an operation (representing the problem to solve) that only inverts the phase of the solution, and then an operation that amplifies amplitudes that are different, until the most likely result-to-measure is the solution.
