Let's Learn Algorithms: An Introduction to Binary Search

Welcome to another article in the Let's Learn Algorithms series. In this article we are going to introduce the binary search algorithm.

Binary search is a little different than other algorithms we have covered because it can be applied to a large variety of problems, and the implementation of each could vary quite a bit. For instance, you can use a binary search to find the square root of a number, to determine if a list contains a specific number, or to determine which git commit caused a bug in your code (git bisect does this).

To help with this, I have selected practice problems that each illustrate a unique way that a binary search can be used. I definitely recommend checking out all of them (and more online), even if you feel you understand binary search; chances are you will learn at least one new way to apply the algorithm.

What is binary search?

In it’s simplest form, binary search is an algorithm used to search for an item in a sorted listed. We are going to discuss it shortly, but fist I want to present you with a problem and by walking through it we will eventually arrive at the binary search algorithm.

Finding the index of a number in a sorted list

Okay, so the problem is pretty simple. Imagine that you have the list of numbers, lets say 1, 3, 4, 6, 7, 9, 10, 11, 13 (shown below) and I asked you “Is 6 in your list, and if so what index is it at?”

A list with the numbers 1, 3, 4, 6, 7, 9, 10, 11, 13 in it

The naive way to solve this problem is to iterate over the list from left to right (lowest index to highest) checking each number to see if it is equal to 6.

This would work, but unfortunately there is an issue. What if I gave you a list of 1 million numbers and asked you about random numbers in the list? Or what if I asked you about numbers not in the list at all?

By iterating over the list and looking at every number, we could potentially have to look at 1 million different numbers just to answer the simple question, “Where is 6 in this list?” That sucks!

But what if I told you that the list was already sorted? Could we take advantage of that to improve our code?

Well one thing we could do is start trying to optimize our code to stop searching when we know the number can’t possibly be in our list. For example, if I asked you where 2 was in the list, you could look at 1 and then 3, and after seeing the 3 you could safely say “3 is greater than 2, and the list is sorted, so if we didn’t see a 2 before we ran into a 3 then 2 must not be in the list.”

2 should be between 1 and 3 if it is in this list

This is great for numbers where we can terminate our search early, but unfortunately it doesn’t work all the time. For example, if I asked if 100 was in your list, you might still have to look at every number in the list.

Well, we could try starting from the right side of the list so that we look at the highest numbers first, but unfortunately this has the same flaw. We will still end up looking at a majority of the list for queries about small numbers like 0 or -1.

What we really need is a way to quickly decide “Where in this list should 6 be if it was in the list?”

Binary search

The binary search algorithm does just that; It helps us quickly determine where in the list a 6 would be if it was in the list!

It does this by taking advantage of what we discussed in the last section; If we ever look at a number that is greater than the number we are looking for, we know that the value we are looking for must be somewhere to the left of that value in the list. Similarly, if we look at a number that is less than what we are looking for, then the number we are looking for must be in the right half of the list (if it is in the list).

Since we don’t know anything about the values in our list, the safest way to take advantage of this is to start by looking at the value in the very middle of the list. That way we are guaranteed to at least eliminate half of the list from our search.

Going back to our previous example, if we wanted to find 6 in the list we would first start with the very middle number in the list. In this case it happens to be a 7.

We look at the middle value, 7, first

After we look at the 7 we first ask ourselves “Is this a 6?” The answer to this is obviously no, so then we ask “Is this number greater than 6 or less than 6?”

In this case 7 is greater than 6, which means that if a 6 was in our list, it couldn’t possibly be anywhere to the right of the 7 because all of those values are greater than or equal to 7 in a sorted list.

6 can't be to the right of 7

While we didn’t find the number we are looking for, we did successfully remove half of the list from our search. We now know that if 6 is in the list, it must be in the left half of the list.

6 must be in the left half of the list

Now we want to find out, where in this smaller list would 6 be if it existed? But wait… that questions sounds very familiar. When we started this section we were asking ourselves that exact question, but with a larger list. That means we can use the same approach we did with the larger list to find out where a 6 would be in the smaller list!

Rinse and repeat

The first thing we are going to do is look at the middle value in our list. Our list has 4 numbers in it which means there are two different numbers we could choose from; We are going to always pick the the first number, but that isn’t a requirement of binary search. It is just a choice I decided to make.

Look at the middle number in the smaller list

Once again we ask ourselves, is this number equal to, greater than, or less than 6?

It is less than 6, so we know that if 6 is in the list it must be in the right half of the list if it exists at all.

6 must be in the right half of the smaller list

Just like before, we are able to discard half of our list after checking only one value in the list! And now we are safe to repeat the same steps on this smaller list. We first start by looking at the middle number - 4.

Look at the middle number - 4

This number is less than 6, so we would determine that 6 must be in the right half of the list. We would then repeat this one final time with the “list” that only contains the number 6, at which point we would find the number we are looking for.

We found our 6!

Complexity of binary search

Earlier we talked about how searching an entire list of numbers by iterating over each could be incredibly slow. If we were to write this in Big-O notation, we would say that each query would be O(N). So how does a binary search compare to this?

Well lets start by just looking at some rough numbers. We started with a list of nine numbers. After that we looked at a list with four numbers in it, followed by a list with two numbers, and finally a list with just one number in it.

If you are familiar with binary numbers, this should look pretty familiar. When we start out with a list of size N, each time we split the list we (roughly) cut it in half, getting a new list of the size N/2. Then we cut that in half and get a list of the size N/2/2. This pattern continues until we finally have a list of the size 1.

This is the same as saying we will perform log2(N) operations in our code, but in Big-O notation we can ignore the base of the the logarithm, giving us O(log N).

log2 n = log n / log 2

Log bases are irrelevant in Big-O

When we are talking about Big-O notation you will notice that we don’t write log2(N) and instead opt for log(N), which is another notation for a log with a base of 10 (ie log10(N)).

You can convert any log to a log of another base using the formula, LogB(X) = LogA(X)/LogA(B). This means that we can convert log base 2 to log base 10 like so: log2(N) = log N / log 2 = log N / 0.301.... That list bit - the 0.301... - is a constant, and in Big-O notation constants are dropped because they are typically only concerned with long term growth rates which won’t be affected by constants.

What does that mean in practical terms?

Well, if we were searching for numbers in a list of one million numbers, the naive approach of checking every number could take a million checks to find each number.

A binary search, on the other hand, would take at most 20 (log2(1 million) = 19.93) checks to find each number. That is a massive improvement!

Up Next…

You should now understand how a simpler binary search works to search a list, but that is by no means the extent of what a binary search can be used for.

Binary search is an incredibly versatile algorithm that can be applied to a wide variety of problems. For example, it is used in the git bisect command, which is useful for determining the first commit that introduced a bug in your code.

Unfortunately, many of these unique ways to use a binary search is deserving of its own post, so I will be saving a few of them for the practice problems.

Up next we will go over implementing a binary search that searches for numbers in a list, just like we discussed in this article. After that we will get into the really interesting problems where we start searching for things that aren’t necessary stored in lists, but can still be found using a binary search.

This article is part of the series, Let's Learn Algorithms.

Learn Web Development with Go!

Sign up for my mailing list and I'll send you a FREE sample from my course - Web Development with Go. The sample includes 19 screencasts and the first few chapters from the book.

You will also receive emails from me about Go coding techniques, upcoming courses (including FREE ones), and course discounts.

Email Address