
Use STL lower_bound even though it’s slower

Summarizing the last few posts on STL lower_bound: instead of falling back to a naive linear scan when it isn’t given random-access iterators, STL’s lower_bound still performs a binary search over bidirectional or forward-only iterators. That costs roughly 2N list-traversal steps: N to measure the length of the range, plus about N more spread across the successive halving advances. That is four times the naive algorithm’s average of N/2 traversal steps. The STL algorithm does guarantee that there will be only O( log N ) comparisons, but we’ve seen that’s not much of an advantage unless the comparison requires an extra memory access.
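
For reference, here is a minimal sketch of what a linear-scan lower_bound over forward iterators might look like. The name naive_lower_bound follows the earlier posts, but the exact signature shown here is my assumption, not the original code.

    #include <functional>

    // Sketch of a linear-scan lower_bound for forward iterators.
    // Walks the range once: at most N traversal steps, N/2 on average,
    // at the cost of O(N) comparisons instead of O(log N).
    template <typename ForwardIt, typename T, typename Compare = std::less<T> >
    ForwardIt naive_lower_bound(ForwardIt first, ForwardIt last,
                                const T& value, Compare comp = Compare())
    {
        while (first != last && comp(*first, value))
            ++first;
        return first;  // first element not less than value, or last
    }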

You might be thinking: why would anybody ever call lower_bound on a list anyway? Just keep the data in a vector: that gives O(1) amortized append, and STL lower_bound will run in O( log N ). Remember, though, that the disadvantage of a vector is not merely the O(N) cost of inserting or deleting in the middle: adding an element to a vector can invalidate all of its iterators, because the addition may, in principle, reallocate the underlying storage.
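
To make that trade-off concrete, here is a hedged sketch (the function name is mine, not from the earlier posts) of inserting into a sorted vector: the search costs O( log N ) comparisons, but the insert shifts elements and, if the vector reallocates, invalidates every outstanding iterator.

    #include <algorithm>
    #include <vector>

    // Keep a vector sorted by inserting each value where lower_bound says.
    // The search is O(log N) comparisons; the insert is O(N) element moves
    // and may reallocate, invalidating all previously obtained iterators.
    void sorted_insert(std::vector<int>& v, int value)
    {
        std::vector<int>::iterator pos =
            std::lower_bound(v.begin(), v.end(), value);
        v.insert(pos, value);  // any iterator saved before this may dangle
    }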

So given that, for lists, naive_lower_bound is clearer, shorter, and faster for large N even when comparisons are costly, when should you use it instead of STL lower_bound? Never. Always use STL lower_bound. The maintenance costs of naive_lower_bound are too high. If you put it in your code, even if you comment it clearly and provide a link to this discussion, then in six months, or two years, or ten, some maintenance programmer is going to look at that line of code and wonder “What was wrong with STL lower_bound? Is it still true that naive_lower_bound is faster? Is it even correct?” And that maintenance programmer might well be you. It’s like the old adage: nobody ever got fired for buying IBM. Or Microsoft. Or, today, STL.

Of course, like all generalizations, this one is partially false. If your program is the client for a popular MMORPG, you offer your users a discount if they let you run numerical algorithms on their machines while the client is running, you need to maintain a linked list of all other available clients sorted by latency or some other simple metric of speed or quality, and you frequently partition that list with lower_bound, then perhaps it makes sense for you to avoid STL lower_bound. But maybe you could use a vector instead. Or maybe, instead of maintaining a single sorted list, you could decide a priori to divide your clients into fast, medium, and slow links: latency under 50 ms, between 50 and 200 ms, and over 200 ms. There are many ways to approach the problem, and improving the algorithm’s running time, although it is in some ways the easiest improvement to measure and implement, carries maintenance costs that the other approaches don’t.
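
As an illustration of that last alternative, here is a rough sketch of the a-priori bucketing. The Client struct and the threshold constants are hypothetical, chosen only to match the numbers in the paragraph above.

    #include <list>

    // Hypothetical client record; only latency matters for this sketch.
    struct Client {
        int latency_ms;
    };

    // Three unsorted buckets keyed on fixed thresholds, replacing a single
    // sorted list that would otherwise be partitioned with lower_bound.
    struct ClientBuckets {
        std::list<Client> fast;    // latency under 50 ms
        std::list<Client> medium;  // 50 ms to 200 ms
        std::list<Client> slow;    // over 200 ms

        void add(const Client& c)
        {
            if (c.latency_ms < 50)
                fast.push_back(c);
            else if (c.latency_ms < 200)
                medium.push_back(c);
            else
                slow.push_back(c);
        }
    };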