In this tutorial we’ll look at one of the fundamental algorithms of computer science, binary search. We’ll also look at a practical application of binary search: implementing fast autocompletion.
Introduction
Consider the problem of finding an item in an array. The obvious solution is to go through all the items in the array and check if any of them is equal to our item. This approach is called linear search and an implementation can be seen below.
func linearSearch(array: [Int], key: Int) -> Bool {
for item in array {
if item == key {
return true
}
}
return false
}
Linear search is often fast enough for most practical purposes. However linear search slows down significantly when the number of elements in an array is large or comparing elements is expensive.
Another much faster approach is binary search. Binary search only works on sorted arrays so we have to sort our array or maintain it in sorted order for binary search to work.
Binary search works by dividing the array into 2 halves around the middle element. The search only continues in one of the halves depending on the found element.
If the found element is less than our target we continue searching in the upper half of the array.
If the element is greater than our target we continue searching in the lower half of the array.
If the found element is equal to our target we are done.
We continue this process until we have either found the element or we’re splitting an array of length 1
.
To get some intuition on how binary search works, consider a phone book that contains an alphabetic list of people. Let’s say you want to find the phone number of Nash John
, a good idea would be to open the book in the middle and search for the number from there. Now you only have to search half of the book because Nash John
will either be after the name you find in the middle or before it. Binary search works on a similar idea but repeats the process of halving the search space until the element is found.
The below diagrams illustrate the steps binary search takes when finding the element 36
in the sorted array below.
let array = [1, 2, 3, 5, 5, 7, 10, 36, 50]
Binary search is much more effective than linear search because it halves the search space at each step. This is not significant for our array of length 9
, here linear search takes at most 9
steps and binary search takes at most 4
steps. But consider an array with 1000
elements, here linear search takes at most 1000
steps while binary search takes at most 10
steps (subsequently considering arrays of size 1000, 500, 250, 75, 37, 18, 9, 4, 2, 1).
Even if we had 1 billion elements binary search will still be able to find our key in at most 30 steps.
Consider an array with a Googol(10^100) entries. That’s way more items than the number of particles in the observable universe. Binary search can find an item in that array in at most 333steps, while linear search would take longer than the life of the universe.
To go a little further if our array has K
elements, linear search will take K
steps while binary search will take [log₂(K)]
steps.
log₂(K)
is the base 2 logarithm of K, that is the power we have to raise 2
too so that we get K
. In other words how many times do we have to repeatedly multiply 1
by 2
so that we get K
.
If log₂(K) = n
then 2ⁿ = K
.
Swift implementation
Until now we only talked about binary search from a theoretical perspective, let’s see how we can implement the algorithm in swift. The algorithm will be implemented as a function that takes an array of ints and a target and returnstrue
/false
depending on whether the target is in our array.
We’ll maintain two indices (left
and right
) that indicate what slice of the array we’re considering for search. Initially this correspond to the first and last index of the array.
Next we’ll repeat the process of getting the middle element from our slice and comparing it to our target
. We want to repeat these steps in a while loop while the length of our slice is at least 1
. That is left <= right
.
At each step of the loop we compute the middle index as the average of left and right and retrieve the value from the middle.
Now we can encounter 3 cases: 1. We find our target. We’re done! We return true
. 2. The found value is less than our target. We set left
to mid + 1
and continue our search in the upper slice of our array. 3. The found value is greater than our target. We set right
to mid - 1
and continue our search in the lower slice of our array.
If we exit the loop then our target is definitely not in the array and we return false
.
func binarySearch(array: [Int], target: Int) -> Bool {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (value == target) {
return true
}
if (value < target) {
left = mid + 1
}
if (value > target) {
right = mid - 1
}
}
return false
}
Our implementation currently only works on integers but we can easily expand it to work on generic types that areComparable
.
func binarySearch<T : Comparable>(array: [T], target: T) -> Bool {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (value == target) {
return true
}
if (value < target) {
left = mid + 1
}
if (value > target) {
right = mid - 1
}
}
return false
}
Searching for prefixes
Our generic version of binary search can be used to search for a string in an array of strings. We can further expand binary search on strings such that instead of finding an exact match it looks if a string in our collection starts with our target strings. If a string starts with our target string we say that the target string is a prefix of that string.
To implement this we only have to replace our equality check with a call to the hasPrefix
method.
func binarySearchPrefix(array: [String], target: String) -> Bool {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (value.hasPrefix(target)) {
return true
}
if (value < target) {
left = mid + 1
}
if (value > target) {
right = mid - 1
}
}
return false
}
There can potentially be many items in our array that start with the same prefix. We can implement a function that instead of true/false returns the index of the first string in our array that starts with a given prefix. Similarly we can implement a function that returns the index of the last string in our array that starts with a given prefix. We’ll use these methods to implement autocomplete.
The trick is that we only return an index in one of 2 cases:
1. Our left
and right
indices are equal and the target is a prefix of the current value
2. The target is a prefix of the current value and the previous element in our array is not a prefix of the current value. This works for the binarySearchFirst
method. We also have to be careful not to access an element outside our array.
3. For binarySearchLast
we use a similar approach but we check if the next element is not a prefix of the current value.
If no string matches the given target the value -1
is returned.
binarySearchFirst
func binarySearchFirst(array: [String], target: String) -> Int {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (left == right && value.hasPrefix(target)) {
return left
}
if value.hasPrefix(target) {
if mid > 0 {
if !array[mid - 1].hasPrefix(target) {
return mid
}
}
right = mid - 1
} else if (value < target) {
left = mid + 1
} else if (value > target) {
right = mid - 1
}
}
return -1
}
binarySearchLast
func binarySearchLast(array: [String], target: String) -> Int {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (left == right && value.hasPrefix(target)) {
return left
}
if value.hasPrefix(target) {
if mid < array.count - 1 {
if !array[mid + 1].hasPrefix(target) {
return mid
}
}
left = mid + 1
} else if (value < target) {
left = mid + 1
} else if (value > target) {
right = mid - 1
}
}
return -1
}
Autocompletion
Consider the following problem, you’re building a browser and want to provide autocomplete when a user types in the url to a page. A user can potentially have thousands of URLs in their search history, doing a linear search through them for autocomplete just won’t cut it after a while.
Below you can see how autocomplete behaves with ~90K entries. Notice that linear search is extremely laggy at this point making for a bad user experience.
Linear Search
Binary Search
The UI code for this project is pretty straightforward: a textField
where the user types the URL and a tableView
below it that provides autocomplete suggestions. The data for the tableview is provided by the class SearchManager
. This class contains the following methods:
func updateFilter(filter: String)
– updates the string used for filtering the URLs
func filteredURLAtIndex(index: Int) -> String
– returns the URL at an index for the list of filtered URLs
func filteredURLCount() -> Int
– returns the number of URLs that match the current filter.
UI Code
class ViewController: UIViewController, UITableViewDataSource, UITableViewDelegate {
@IBOutlet weak var tableView: UITableView!
@IBOutlet weak var textField: UITextField!
var searchManager = SearchManager()
override func viewDidLoad() {
super.viewDidLoad()
}
@IBAction func textFieldDidChange(sender: AnyObject) {
searchManager.updateFilter(textField.text)
tableView.reloadData()
}
func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath) -> UITableViewCell {
let cell = UITableViewCell()
cell.textLabel!.text = searchManager.filteredURLAtIndex(indexPath.row)
return cell
}
func tableView(tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
return searchManager.filteredURLCount()
}
func tableView(tableView: UITableView, didSelectRowAtIndexPath indexPath: NSIndexPath) {
self.textField.text = searchManager.filteredURLAtIndex(indexPath.row)
}
}
Search manager
Search manager works by maintaining a list of all urls that it reads from a file at initialization. Two indices are maintained: filterStart
– the index of the first URL that matches the given filter and filterEnd
– the index of the last URL that matches the given filter.
Below you can see an implementation of SearchManager
that uses linear search.
class SearchManager {
private var urls: [String]
private var filterStart: Int
private var filterEnd: Int
init() {
let path = NSBundle.mainBundle().pathForResource("urls", ofType: "txt")!
let content = String(contentsOfFile: path, encoding: NSUTF8StringEncoding, error: nil)!
urls = content.componentsSeparatedByString("\n")
filterStart = 0
filterEnd = urls.count - 1
}
func filteredURLCount() -> Int {
if filterStart == -1 {
return 0
}
return filterEnd - filterStart + 1
}
func filteredURLAtIndex(index: Int) -> String {
return urls[filterStart + index]
}
func updateFilter(filter: String) {
if filter == "" {
filterStart = 0
filterEnd = urls.count - 1
return
}
var foundFirst = false
var index = 0
for url in urls {
if url.hasPrefix(filter) {
if !foundFirst {
filterStart = index
foundFirst = true
}
filterEnd = index
}
index++
}
if !foundFirst {
filterStart = -1
}
}
}
A version of SearchManager
that uses binary search is straightforward to write. Note that we don’t have to change our UI code in any way when changing the internals of SearchManager
.
class SearchManager {
private var urls: [String]
private var filterStart: Int
private var filterEnd: Int
init() {
let path = NSBundle.mainBundle().pathForResource("urls", ofType: "txt")!
let content = String(contentsOfFile: path, encoding: NSUTF8StringEncoding, error: nil)!
urls = content.componentsSeparatedByString("\n")
filterStart = 0
filterEnd = urls.count - 1
}
func filteredURLCount() -> Int {
if filterStart == -1 {
return 0
}
return filterEnd - filterStart + 1
}
func filteredURLAtIndex(index: Int) -> String {
return urls[filterStart + index]
}
func binarySearchLast(array: [String], target: String) -> Int {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (left == right && value.hasPrefix(target)) {
return left
}
if value.hasPrefix(target) {
if mid < array.count - 1 {
if !array[mid + 1].hasPrefix(target) {
return mid
}
}
left = mid + 1
} else if (value < target) {
left = mid + 1
} else if (value > target) {
right = mid - 1
}
}
return -1
}
func binarySearchFirst(array: [String], target: String) -> Int {
var left = 0
var right = array.count - 1
while (left <= right) {
let mid = (left + right) / 2
let value = array[mid]
if (left == right && value.hasPrefix(target)) {
return left
}
if value.hasPrefix(target) {
if mid > 0 {
if !array[mid - 1].hasPrefix(target) {
return mid
}
}
right = mid - 1
} else if (value < target) {
left = mid + 1
} else if (value > target) {
right = mid - 1
}
}
return -1
}
func updateFilter(filter: String) {
if filter == "" {
filterStart = 0
filterEnd = urls.count - 1
return
}
filterStart = binarySearchFirst(urls, target: filter)
filterEnd = binarySearchLast(urls, target: filter)
}
}
Extra Credits
Binary search works great when we want to autocomplete strings based on prefixes. Sometimes you want to check if any of the strings in your collection contains the target string as a substring. For this purpose binary search doesn’t cut it anymore.
If you’re interested in more advanced string matching algorithms have a look at:
Tries
Suffix arrays
Suffix trees
and in case you want to do fuzzy autocomplete:
Great post.
The way you calculate mid will overflow if (left + right > Int.max)
let mid = (left + right) / 2
vs.
let mid = low + ((high – low) / 2)
This of course won’t happen in most iOS apps but it’s something to watch out for at scale.
Hey, thanks for the comment. Yes, I was aware of the overflow issue, I didn’t discuss it because it would have just increased the complexity of the article.
And yes, it’s not possible to allocate an array with 2 billion+ entries on iOS. (2GB RAM max)
I think you article is very good, but you have an error in the your assigment:
let mid = (left + right) / 2
which can lead you to overflow. A very common solution is use
let mid = left + (right – left) / 2. You can read more about the topic in the book Programming Pearls or in this post http://stackoverflow.com/questions/504335/what-are-the-pitfalls-in-implementing-binary-search.
I hope this help you.
Hi, thanks for pointing this out. Programming Perls is one of my favorite coding books. I was aware of the overflow issue while writing the article. I omitted it because I want to make the article as accessible as possible, the overflow issue you mention won’t matter for someone who’s encountering binary search for the first time.
Great tutorial, you guys have become my favourite tutorial site. I think your tutorials are better than for example Ray Wenderlich tutorials where the code is not as clean and professional as yours.
Please do keep up the good work!
Wow, thanks for the comment! I appreciate it
Can I translated this post into Chinese? I hope more people could learn from this post. ^ ^
Nice article.
It would be better if you use a more Swift-like way for the functions that return the index into the array, binarySearchFirst etc. Your code falls back on the old way of indicating failure, i.e. to use an invalid value of the type returned. But in Swift, we don’t do that anymore. We have evolved, become smarter, etc etc… and now we use Optional, right?
Regards
Mark
Good article. I wrote a Swift Extension applying what I learned here:
https://github.com/giannif/BinarySearch
I abstracted whatever I could, like what the truth test is (hasPrefix in the example above), and what the value you’re comparing is (just a String in the above example, but maybe you have a collection with a String property that you want to extract and compare).
See the github repo for a better explanation.
Quick example:
if let index = [“apples”, “bananas”, “oranges”].binarySearch(“bana”, predicate: {$0.hasPrefix($1)} {
// index = 1
}
Feedback appreciated!