Let's Learn Algorithms: Binary Searching for Case Insensitive Strings

In the last two posts we discussed how the binary search algorithm works as well as how to implement it. In this post we are going to put everything we learned to the test and we are going to look at a practice problem that is similar to our implementation, but will need some slight changes to make it work with 100% accuracy.

This post is part of the Let's Learn Algorithms series where we learn about how different algorithms work, how to implement them, and practice on problems using that algorithm. Check it out if you haven’t already!

 

If you want to stay up to date with new publications in this series I suggest you sign up for my mailing list. Not only will I keep you updated when I publish new articles, but you will also get content exclusive to my mailing list, including two free chapters from my upcoming book “Web Development with Go”.

The practice problem we are going to cover is quite simple to describe. Given a sorted list of words (types of animals, to be exact), we are going to write a binary search that will search for a specific word in that list without being case sensitive.

That last part is the tricky but important bit. Without it, this really wouldn’t be any different from our binary search using integers.

The initial code

The initial code for this problem is going to handle a good bit for us. It is going to already have a sorted list of strings, and in that sorted list we will already have animals with strings with both uppercase and lowercase letters. The list will already be sorted ignoring case.

We are also going to have a bit of code that asks the user for a word to search for and then passes that word into the binary search function. If you are writing in another language, accepting user input might be a little different. If you can’t figure it out, rather than getting stuck on it, I suggest you replace the part that reads the input from the user with a string variable and just update your code before each run to test different search strings.

Finally, we also will have a stubbed out binSearch() function that takes in a list of words and a single word to find and returns an integer representing the index of the word if found, and -1 otherwise.

You can see the code below, or over at the Go Playground.

package main

import "fmt"

// Given a list of sorted words (strings with no spaces),
// search for a user provided word in the list without
// being case sensitive. Return -1 if the word isn't found
// and return the index of the word if it is found.
func main() {
  var words []string = []string{
    "ALLigator",
    "bat",
    "bEEtle",
    "camel",
    "cat",
    "cheetah",
    "COLT",
    "cow",
    "dog",
    "eagle",
    "froG",
    "hamster",
    "horse",
    "mink",
    "moose",
    "porcupine",
    "RaT",
    "rooster",
    "steer",
  }
  fmt.Println("Sorted Words:", words)
  var toFind string
  fmt.Println("What word should we search for? No spaces please!")
  fmt.Scanf("%s", &toFind)
  var index int
  index = binSearch(words, toFind)
  if index < 0 {
    fmt.Println("The word", toFind, "could not be found!")
  } else {
    fmt.Println("The word", toFind, "was found at index:", index, words[index])
  }
}

func binSearch(words []string, word string) int {
  return -1
}

At this point our code doesn’t work, as it always returns negative one. Go ahead and take a bit to try to implement the binSearch() function using what we learned while implementing binary search.

You may also want to reference the practice problem sorting a list of strings in alphabetical order with bubble sort as you are likely going to need a greater() function like we used there, and a similar equal() function. Strictly speaking, you don’t have to write these as functions, but I suspect it will help if you do.

The solution

If you haven’t already given it a stab go back to the last section and try to code up a solution to the problem. I’ll wait here.

Alright, ready to check your solution with what I came up with?

The first thing we are going to do is start writing our code pretty much exactly like we did in the last article when we first implemented binary search. This won’t fix our problem with our code not being case sensitive, but at the very least it should get us a rough skeleton to work with.

After renaming a few variables, you should come up with some code like below.

func binSearch(words []string, word string) int {
  var lo int = 0
  var hi int = len(words) - 1

  for lo <= hi {
    var mid int = lo + (hi-lo)/2
    var midValue string = words[mid]

    if midValue == word {
      return mid
    } else if midValue > word {
      // We want to use the left half of our list
      hi = mid - 1
    } else {
      // We want to use the right half of our list
      lo = mid + 1
    }
  }

  // If we get here we tried to look at an invalid sub-list
  // which means the number isn't in our list.
  return -1
}

This might look like it works at first, but it definitely doesn’t 100% of the time. Even if you type RaT exactly as it appears in our list, our search will fail to find it because our search is currently case sensitive. In a case sensitive list, “RaT” shoudl come before any words starting with a lowercase letter, but that isn’t how our list is sorted.

As a result, our binary search won’t work as we expect unless write a greater() function that compares items just like they were compared when our list was sorted. Or in other words, we need to copy the greater() function from the second bubble sort practice problem.

func greater(a, b string) bool {
  if strings.ToLower(a) > strings.ToLower(b) {
    return true
  } else {
    return false
  }
}

Now we can update the line } else if midValue > word { to instead use our custom greater() function. The resulting code can be found here: https://play.golang.org/p/U3opRdqUfv. This code no longer asks the user to type in a string so that you can run the code on the Go Playground.

Our code will now find words as long as type them exactly as they are in the list, but what if we want to type “rat” and find the word “RaT” in the list?

Fixing this is similar to fixing our comparison function, but instead of writing a custom greater comparison, we instead want an equality comparison.

We will call this function equal() and it will be pretty much identical to greater() except it will check to see if two strings are equal, and return false otherwise.

func equal(a, b string) bool {
  if strings.ToLower(a) == strings.ToLower(b) {
    return true
  } else {
    return false
  }
}

To make use of this we need to update the first if statement in our search. The line reads if midValue == word { and should instead read if equal(midValue, word) {.

Putting that all together we have the working solution. You can see a runnable version on the Go Playground.

A comparator function

Wouldn’t it be nice if we didn’t have to write all of these custom comparison functions? If we needed a less than function we would end up with three custom comparison functions in our code, and that doesn’t even take less than or equal to and greater than or equal to into consideration. Yikes!

Luckily some intelligent engineers before us have come up with a pretty neat way of handling this. Instead of writing a custom function for each comparison, we are instead going to update our code to use a single compare() function. This function won’t return a boolean, but will instead return an integer representing the result of the comparison.

If the integer is negative, that means the first value is less than the second value. If the integer is zero, that means the first value is equal to the second value, and finally if the integer is positive (1 or higher - not 0!) that means the first value is greater than the second.

The code for this function is shown below.

func compare(a, b string) int {
  var aLow string = strings.ToLower(a)
  var bLow string = strings.ToLower(b)
  if aLow == bLow {
    return 0
  } else if aLow < bLow {
    return -1
  } else {
    return 1
  }
}

Using this in our code is actually pretty simple Instead of writing if greater(a, b) { ... } we would instead write if compare(a, b) > 0 { ... }. Similarly, less than can be expressed by checking if the result is less than 0, and equal can be checked by testing whether the result is equal to zero.

Updating our binary search code to use this we get the code shown below.

package main

import (
  "fmt"
  "strings"
)

// Given a list of sorted words (strings with no spaces),
// search for a user provided word in the list without
// being case sensitive. Return -1 if the word isn't found
// and return the index of the word if it is found.
func main() {
  var words []string = []string{
    "ALLigator",
    "bat",
    "bEEtle",
    "camel",
    "cat",
    "cheetah",
    "COLT",
    "cow",
    "dog",
    "eagle",
    "froG",
    "hamster",
    "horse",
    "mink",
    "moose",
    "porcupine",
    "RaT",
    "rooster",
    "steer",
  }
  fmt.Println("Sorted Words:", words)
  var toFind string
  toFind = "rat"
  var index int
  index = binSearch(words, toFind)
  if index < 0 {
    fmt.Println("The word", toFind, "could not be found!")
  } else {
    fmt.Println("The word", toFind, "was found at index:", index, words[index])
  }
}

func binSearch(words []string, word string) int {
  var lo int = 0
  var hi int = len(words) - 1

  for lo <= hi {
    var mid int = lo + (hi-lo)/2
    var midValue string = words[mid]

    if compare(midValue, word) == 0 {
      return mid
    } else if compare(midValue, word) > 0 {
      // We want to use the left half of our list
      hi = mid - 1
    } else {
      // We want to use the right half of our list
      lo = mid + 1
    }
  }

  // If we get here we tried to look at an invalid sub-list
  // which means the number isn't in our list.
  return -1
}

func compare(a, b string) int {
  var aLow string = strings.ToLower(a)
  var bLow string = strings.ToLower(b)
  if aLow == bLow {
    return 0
  } else if aLow < bLow {
    return -1
  } else {
    return 1
  }
}

Up Next…

Want to stay up to date with new releases in the series? Sign up for my mailing list and I will notify you when I publish new articles. I will also send you the first three chapters from my upcoming book, Web Development with Go, and other exclusive content that I only send to my mailing list.

As you can see, adapting the binary search to work with all sorts of lists is pretty easy to do, but as I said before, there are a lot of other types of problems that can be solved with a binary search.

In our next practice problem, which should be published later this week, we will focus on using a binary search to calculate the square root of a number.

Learn Web Development with Go!

Sign up for my mailing list and I'll send you a FREE sample from my course - Web Development with Go. The sample includes three chapters from the book, and over 2.5 hours of screencasts.

You will also receive notifications when I release new articles, along with other freebies that I only share with my mailing list.

Avatar of Jon Calhoun
Written by
Jon Calhoun

Jon Calhoun is a full stack web developer who also teaches about Go, web development, algorithms, and anything programming related. He also consults for other companies who have development needs. (If you need some development work done, get in touch!)

Jon is a co-founder of EasyPost, a shipping API that many fortune 500 companies use to power their shipping infrastructure, and prior to founding EasyPost he worked at google as a software engineer.

Related articles

Spread the word

Did you find this page helpful? Let others know about it!

Vote on Hacker News

Sharing helps me continue to create both free and premium Go resources.

Want to discuss the article?

See something that is wrong, think this article could be improved, or just want to say thanks? I'd love to hear what you have to say!

You can reach me via email or via twitter.

Recent Articles All Articles Mini-Series Tags About Me Go Courses

©2018 Jonathan Calhoun. All rights reserved.