Contents

Complexity

When comparing algorithms or data structures, it is useful to have some way of measuring how beneficial one is over the other. We usually do this in terms of complexity, which we further divide into time complexity and space complexity. In this class, we'll focus almost exclusively on time complexity, but just know that the other exists.

So what is time complexity? It's a measure of how many steps an algorithm takes to perform an operation, with respect to the size of the data (or data structure) it's operating over. For example, say we have a sorted array with 100 things in it and we want to describe the complexity of two algorithms that will search the array for a specific entry. If algorithm one has to search the entire list in order to find the item, we might say that it has a time complexity of 100. Say algorithm two only needs to compare a few things; maybe it has a time complexity of 7. This comparison makes it easy to say that algorithm two is faster than algorithm one.
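To make those numbers concrete, here is a quick sketch in Python (the language is just an illustration choice, not something the course requires), assuming the second algorithm halves the remaining portion of the array on every step, the way binary search (covered below) does:

    import math

    n = 100                                  # size of the sorted array from the example

    # Algorithm one: in the worst case it examines every entry.
    worst_case_full_scan = n                 # 100 steps

    # Algorithm two: assume it halves the remaining entries on each step.
    worst_case_halving = math.ceil(math.log2(n))   # 7 steps when n = 100

    print(worst_case_full_scan, worst_case_halving)   # 100 7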

(Back to top)

Big-O

In the example above, we used the exact number of steps. That's not helpful for comparing algorithms in general. What we do instead is express the number of steps relative to the size of the input (the data or data structure that we're operating over), which we universally refer to as being of size n.

In addition, the typical complexity analysis considers the worst case scenario for an algorithm. In our example from earlier, if our first search algorithm performs a linear search, then in the worst case, it has to search every single entry, of which there are n. The notation we use to describe this is called Big-O (or Big-Oh), and for this algorithm, we write it as: O(n). This is pronounced "big oh of n".
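To see that worst case concretely, here is a linear search sketch (Python again, purely for illustration) that also counts how many entries it examines; when the target isn't in the array at all, the count is exactly n:

    def linear_search(array, target):
        """Return (index, steps): index of target (or -1) and entries examined."""
        steps = 0
        for i, value in enumerate(array):
            steps += 1
            if value == target:
                return i, steps
        return -1, steps

    # Worst case: the target is not in the array, so every entry is checked.
    index, steps = linear_search(list(range(100)), 999)
    print(index, steps)   # -1 100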

If the second algorithm is binary search, it has a running complexity of O(log₂ n). That's because it keeps splitting the part of the list to search in half (hence the base 2 portion of the log), and we keep performing those splits until we run out of things to search. Any time we keep dividing something as many times as possible, we can count the number of divisions using the log of the original length of the thing (in this case an array of size n). Because virtually any time we discuss log in complexity analysis it will be base 2, I will generally write log to mean log₂.
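Here is a matching binary search sketch with the same kind of step counter; for an array of 100 entries the number of splits never exceeds 7, which lines up with log 100 ≈ 6.6:

    def binary_search(array, target):
        """Return (index, steps): index of target in a sorted array (or -1)."""
        low, high = 0, len(array) - 1
        steps = 0
        while low <= high:
            steps += 1
            mid = (low + high) // 2
            if array[mid] == target:
                return mid, steps
            elif array[mid] < target:
                low = mid + 1       # target can only be in the right half
            else:
                high = mid - 1      # target can only be in the left half
        return -1, steps

    index, steps = binary_search(list(range(100)), 999)
    print(index, steps)   # -1 7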

(Back to top)

Dropping coefficients and terms

An interesting thing about Big-O notation is that we drop any coefficients of n and any constants. For example, say we came up with an algorithm that searched an array of entries by making two passes over it. According to the way I described Big-O earlier, we might say this algorithm has a time complexity of O(2n). However, because we strip coefficients, we should write this as O(n). Likewise, if we had an algorithm that searched the list once, then looked at the first five entries, its unreduced complexity might look like O(n+5), but would ultimately be O(n).
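As an illustration (a made-up two-pass routine, not an algorithm we'll study), here is a function that walks the array once to find the smallest entry and a second time to find the largest; it takes about 2n steps, yet we still describe it as O(n):

    def min_and_max(array):
        """Two separate passes over the array: about 2n steps total, still O(n)."""
        smallest = array[0]
        for value in array:          # first pass: n steps
            if value < smallest:
                smallest = value
        largest = array[0]
        for value in array:          # second pass: another n steps
            if value > largest:
                largest = value
        return smallest, largest

    print(min_and_max([3, 7, 1, 9, 4]))   # (1, 9)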

To understand why we do this, first consider this formal way of writing the complexity of an algorithm, f(n):

f(n) = O(g(n)) as n → ∞

The point of g is to provide an upper bound on the complexity of f as n gets really large, up to a constant factor. So, we don't have to worry about coefficients or constants.
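Concretely, f(n) = O(g(n)) means there are a constant C > 0 and a threshold n₀ such that f(n) ≤ C · g(n) for every n ≥ n₀. For the two-pass example above, f(n) = 2n is bounded by g(n) = n with C = 2, and for f(n) = n + 5, taking C = 2 and n₀ = 5 works, which is why both O(2n) and O(n+5) collapse to O(n).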

We do, however, have to keep track of growth in the exponent. So, if we have to look over an array of length n, n times (as we do in bubble sort), then the complexity is O(n × n) = O(n²). We need the exponent, 2, to stay there. However, if you have an algorithm that first passes over the array n times, then makes one additional pass, then the unreduced complexity is O(n² + n), which can be reduced to O(n²) because n² grows to be overwhelmingly larger than n as n gets really large.
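For instance, here is a bubble sort sketch (a simple, unoptimized version) with a comparison counter; for an array of length n it makes n passes of roughly n comparisons each, so its complexity is O(n²):

    def bubble_sort(array):
        """Unoptimized bubble sort: n passes over the array, each of about n steps."""
        comparisons = 0
        n = len(array)
        for _ in range(n):                    # n passes
            for j in range(n - 1):            # about n comparisons per pass
                comparisons += 1
                if array[j] > array[j + 1]:
                    array[j], array[j + 1] = array[j + 1], array[j]
        return array, comparisons

    data, steps = bubble_sort([5, 3, 8, 1, 9, 2])
    print(data, steps)   # [1, 2, 3, 5, 8, 9] and 6 * 5 = 30 comparisons for n = 6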

(Back to top)

In closing

We will analyze the complexity of all the algorithms and data structures we see this semester, so you will get plenty of experience with Big-O notation. If you are curious or would like to see more examples, take a look at the Big-O Wikipedia page.