posts/cpp-binary-search: added

2024-04-01 20:33:42 -04:00 · 2024-04-01 20:33:42 -04:00 · 16bec66daa
commit 16bec66daa
parent 1924c1e305
2 changed files with 140 additions and 1 deletions
--- a/posts/cpp-binary-search.md
+++ b/posts/cpp-binary-search.md
@@ -0,0 +1,139 @@
+# lower\_bound, upper\_bound in c++, visually explained
+
+2024-04-02
+
+One of the most common problems programmers have to solve is retrieving specific data — finding the needle in the haystack.
+To make this simpler, languages provide standard tools to perform this task:
+in particular, this post focuses on C++'s `lower_bound` and `upper_bound`.
+
+Personally, I find that the documentation for these functions is quite unclear and verbose.
+For example, [cppreference.com](https://en.cppreference.com/w/cpp/algorithm/lower_bound)
+describes `lower_bound` like this:
+
+> Searches for the first element in the partitioned range `[first, last)` which is
+> **not** ordered before `value`.
+
+That sounds like gibberish, and is too technical to quickly understand.
+For that reason, I'm making this blog post to explain my own mental model of these functions.
+
+## refresher on binary search
+
+First, it's important to understand how `lower_bound` and `upper_bound` work under the hood.
+
+As you know, finding words in a dictionary is relatively fast.
+This is possible because the words are in alphabetical order.
+If the words weren't ordered, you'd have to look through every single word in the dictionary, one by one.
+That would be an excruciating and much slower process.
+Because of the ordering, you can rapidly narrow down the word you want.
+
+Computers can do the same with ordered data: this is called *binary search*,
+and is what powers `lower_bound` and `upper_bound`.
+Binary search is like searching for a word in the dictionary, but more structured.
+For example, say our dictionary is 1000 pages, and the computer wants to look for the word "rabbit".
+These are the steps it takes:
+
+1. Start at exactly page 500.
+2. See the word "murmur", so go forwards to page 750.
+3. See the word "sunny", so go backwards to page 625.
+4. And so on.
+
+This is called "binary search" because we halve the region we are looking in every time (we pick either the left half or the right half).
+In step 1, the computer halves the page range `1-1000`.
+In step 2, it halves `500-1000`, and in step 3, `500-750`.
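+
+The halving steps above can be sketched in code.
+This is a minimal illustration, not how any particular standard library implements it:
+`first_not_less` is a made-up name, and it searches a sorted vector of integers instead of dictionary pages.
+
+```cpp
+#include <algorithm>
+#include <cassert>
+#include <cstddef>
+#include <vector>
+
+// Hypothetical helper (not a standard function): returns the index of the
+// first element of the sorted vector `v` that is not less than `k`,
+// or v.size() if every element is less than `k`.
+std::size_t first_not_less(const std::vector<int>& v, int k) {
+    // Debug-only check: halving only works when the data is ordered.
+    assert(std::is_sorted(v.begin(), v.end()));
+
+    std::size_t lo = 0;         // first index still under consideration
+    std::size_t hi = v.size();  // one past the last index under consideration
+    while (lo < hi) {
+        std::size_t mid = lo + (hi - lo) / 2;  // middle of the current region
+        if (v[mid] < k) {
+            lo = mid + 1;  // the answer can only be in the right half
+        } else {
+            hi = mid;      // v[mid] might be the answer: keep the left half
+        }
+    }
+    return lo;
+}
+```
+
+With `v = {1, 2, 2, 3, 3, 3, 3, 4, 5, 6}`, `first_not_less(v, 3)` returns index `3`,
+and `first_not_less(v, 7)` returns `10` (the size of the vector), since no element qualifies.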
+
+Anyways, this is not intended to be a full explanation of binary search: refer to [Tom Scott's video](https://youtube.com/watch?v=KXJSjte_OAI) about it for more information.
+
+## lower bound and upper bound
+
+Back to the real subject of this post: `lower_bound` and `upper_bound` in C++.
+All I used to understand about these functions was that they use binary search to find elements in a sorted container.
+However, I didn't get what differentiated them.
+Again, the documentation alone doesn't make the difference easy to grasp.
+
+First of all, say we wish to search for the integer `k` (k for key) in a sorted vector (array) of integers `v`.
+We can find the lower and upper bounds with these function calls:
+
+```
+// (you could use auto here instead of the verbose type)
+std::vector<int>::iterator lb = std::lower_bound(v.begin(), v.end(), k);
+std::vector<int>::iterator ub = std::upper_bound(v.begin(), v.end(), k);
+```
+
+Based on the documentation, we know
+the first two arguments specify the region of `v` we're looking in.
+Here, it's the entire vector (from the beginning to the end).
+Put simply, with the default comparison (`<`), each function returns an iterator to:
+
+- `lower_bound`: the first element `e` where `k <= e`;
+- `upper_bound`: the first element `e` where `k < e`.
+
+> Note: Both functions return `v.end()` if no valid element is found.
+> This iterator points just **after** the last element of `v`.
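+
+To spell out those two definitions, here is a sketch that compares the standard functions against slow, one-by-one scans.
+The `slow_*` helpers and `matches_standard` are names invented for this example; they only exist to make the definitions explicit.
+
+```cpp
+#include <algorithm>
+#include <cassert>
+#include <vector>
+
+using Iter = std::vector<int>::const_iterator;
+
+// Hypothetical helper: linear scan for the first element e where k <= e.
+Iter slow_lower_bound(const std::vector<int>& v, int k) {
+    for (Iter it = v.begin(); it != v.end(); ++it) {
+        if (k <= *it) return it;
+    }
+    return v.end();  // no valid element: one past the last element
+}
+
+// Hypothetical helper: linear scan for the first element e where k < e.
+Iter slow_upper_bound(const std::vector<int>& v, int k) {
+    for (Iter it = v.begin(); it != v.end(); ++it) {
+        if (k < *it) return it;
+    }
+    return v.end();  // no valid element: one past the last element
+}
+
+// Hypothetical helper: true if the standard functions agree with the scans.
+bool matches_standard(const std::vector<int>& v, int k) {
+    assert(std::is_sorted(v.begin(), v.end()));  // sorted input is assumed
+    return std::lower_bound(v.begin(), v.end(), k) == slow_lower_bound(v, k) &&
+           std::upper_bound(v.begin(), v.end(), k) == slow_upper_bound(v, k);
+}
+```
+
+The standard functions find the same positions as the scans, just with far fewer comparisons on large inputs.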
+
+This is the technical definition; it doesn't mean much by itself.
+However, once I saw a concrete example with actual numbers, it clicked in my mind.
+For example, let `k = 3`.
+Here is an example sorted array `v`, with upper and lower bounds marked:
+
+```
+    lower   upper
+      ↓       ↓
+1 2 2 3 3 3 3 4 5 6
+      ───────
+         ↑
+ matching interval
+```
+
+The first `3` is the lower bound: it's the first element greater than or equal to our key.
+The `4` is the upper bound: the first element strictly greater than our key.
+
+Laid out visually, it's now clear what the lower and upper bounds mean:
+they are the *bounds of the interval* that matches our search key.
+This is mostly useful if the array has duplicate elements.
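+
+As a rough sketch of what the diagram shows, the two iterators can be turned into indices by subtracting `v.begin()`.
+The `Interval` struct and `matching_interval` function are hypothetical names made up for this example.
+
+```cpp
+#include <algorithm>
+#include <cassert>
+#include <cstddef>
+#include <vector>
+
+// Hypothetical type for this example: the matching interval as indices.
+struct Interval {
+    std::size_t first;      // index of the lower bound
+    std::size_t past_last;  // index of the upper bound (one past the last match)
+};
+
+// Hypothetical helper: finds where the elements equal to `k` sit in sorted `v`.
+Interval matching_interval(const std::vector<int>& v, int k) {
+    assert(std::is_sorted(v.begin(), v.end()));  // sorted input is assumed
+    auto lb = std::lower_bound(v.begin(), v.end(), k);
+    auto ub = std::upper_bound(v.begin(), v.end(), k);
+    // Subtracting iterators gives a distance, which we use as an index.
+    return Interval{static_cast<std::size_t>(lb - v.begin()),
+                    static_cast<std::size_t>(ub - v.begin())};
+}
+```
+
+For the array in the diagram and `k = 3`, this gives `first = 3` and `past_last = 7`,
+so `past_last - first = 4` elements match.
+When the key is absent, both indices are equal and the interval is empty.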
+
+Notice how the upper bound is one past the end of the interval,
+just like how `v.end()` is one past the last element of the vector.
+This "one past the end" convention is how C++ iterator ranges usually work, and it makes some tasks more convenient.
+Take this regular for loop:
+
+```
+for (int i = 0; i < 10; i++) { ... }
+```
+
+This loop will iterate over the numbers `0` to `9`,
+excluding the upper bound `10`.
+The same logic applies to C++ iterators.
+If we want to iterate over all elements of a vector, we'd use:
+
+```
+for (auto it = v.begin(); it != v.end(); it++) { ... }
+```
+
+Here, we use `!=` instead of `<` for iterators, but it does practically the same thing.
+When the iterator goes past the end of the vector, it'll hit `v.end()` (which is one past the last element),
+and as such the loop stops.
+
+> Note: Usually, you'd do `for (auto number : v)` to iterate over the entire array.
+
+So, having the upper bound be right past the end of the interval makes this possible:
+
+```
+for (auto it = lb; it != ub; it++) {
+    // *it is like pointer dereference:
+    // it gets the number pointed to by the iterator
+    std::cout << *it << std::endl;
+}
+```
+
+Anyways, I'll repeat it: `lower_bound` and `upper_bound` give you the two ends of the *interval* that matches what you're looking for.
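+
+If you want both ends at once, the standard library also has `std::equal_range`, which returns the lower and upper bound as a pair.
+Here is a small sketch of it; `matching_elements` is a name made up for the example, not a standard function.
+
+```cpp
+#include <algorithm>
+#include <cassert>
+#include <utility>
+#include <vector>
+
+// Hypothetical helper: copies out the elements of sorted `v` that equal `k`.
+std::vector<int> matching_elements(const std::vector<int>& v, int k) {
+    assert(std::is_sorted(v.begin(), v.end()));  // sorted input is assumed
+    // range.first is the lower bound, range.second is the upper bound.
+    auto range = std::equal_range(v.begin(), v.end(), k);
+    // Copy the half-open interval [range.first, range.second).
+    return std::vector<int>(range.first, range.second);
+}
+```
+
+For the example array and `k = 3`, this returns the four `3`s; for a key that isn't in the array, it returns an empty vector.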
+
+## conclusion
+
+So, that is my "visual" explanation of how lower and upper bound work in C++.
+In hindsight, this seems obvious, but back when I was first told about these functions,
+I could not understand them because of the confusing descriptions.
+Having this intuition for concepts is pretty helpful for truly understanding them:
+you don't want to be stuck memorizing things that don't make sense.
--- a/public/css/style.css
+++ b/public/css/style.css
@@ -201,7 +201,7 @@ p, pre, table, blockquote {
 	text-align: justify;
 }

-li {
+ul li {
 	list-style: square;
 }