Here’s something that I didn’t mention in my post the other day about my holiday book orders: each employee can only order books in discrete $100 blocks. I’m not allowed to get Mark’s $100 and blorb it into mine to make a single $200 order. And anything that goes over my $100 allotment in a single order, I have to cover that out of pocket. Kinda sucks, right?
Nope! It actually presents an incredibly interesting mathematical dilemma colloquially known as “the knapsack problem“:
The knapsack problem or rucksack problem is a problem in combinatorial optimization: Given a set of items, each with a mass and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items.
(I swear I heard a Radiolab podcast about this very problem once, but I can’t seem to find it.)
The first year I did the holiday book order, I had never heard of this problem before but came face-to-face with it as I tried squeezing $533 worth of books out of the $600 I’d amassed. Just looking at the numbers that seems easy, right? Not so when one of the books you want is an $85 textbook, and you can’t find any books under $15 on the site to take up the remaining space under that $100 cap. So if I wanted that textbook, I know right off the bat that I’m going to burn $15 on nothing. Now the problem is to squeeze $448 worth of books out of $500. And it gets tighter as you start arranging the books around to each of the $100 bins.
So I put together a perl script that brute forced a bunch of combinations for me, and it turned out that the best solutions it came up with weren’t any more efficient than if I just went down the list and assigned the most expensive book remaining to a new bin. If I’ve got a list of 16 books that I’ve got to squeeze into five $100 bins, it looks like this: most expensive book in bin A, second most expensive book in bin B, third most expensive book in bin C, … , sixth most expensive book in–oh, no more bins, better start back over at bin A, seventh most expensive book in bin-B, etc.
Now, that’s just one seemingly simple way to solve this problem. There’s a couple others that work just as well. And in all, this type of algorithm works, like, 60-80% of the time.* There are some optimizations that I did, like matching a $60 and a $40 books together and two $49.95 books together (gotta squeeze $248 out of $300 now), but in all, it’s amazing that what you might think of as the dumbest way to do something is also, generally-speaking, the most efficient.
And the proof that something like this works so well can be found by going back to my post the other day. I ordered $589.40 worth of books on $600. Now, I’m not going to lie here: part of my success lies in the fact that I didn’t know what to spend about $200 of that on (a couple of coworkers foisted their money on me without asking, and I certainly wasn’t going to let free books go to waste), so I just searched around for books that looked like they might be something I might read someday. But still: to go from wasting $67 my first year to $19 my second year to $10 my third year, man, that’s pretty awesome, right?
* Interesting side note: I think the fact that book prices appear to follow a normal distribution is what makes this work so well, but I could be wrong. It’s been four years since I’ve done anything approaching “real” math so, you know, grain of salt and all that.