Advent of Code in review

On December 1st, /u/topaz/2078 posted a challenge to /r/programming announcing that he had created a series of small programming challenges, from which he had made an advent calendar of code. Every day from December 1st to December 25th one of those challenges became available on his website for the world to solve.

Every day the question consists of two parts. One you can initially see, and one is shown when you correctly anwser the first part.

I took this as a nice way to learn Python, and so I did my best to solve each of these challenges using Python, while still being mostly efficient. All of my solutions can be found in my github repo. In this post, I will highlight parts of them. Others have done impossible things with bash, awk and grep, but these are mostly straightforward solutions.

Day 1 - The beginning

The first day was relatively easy. You start on floor 0 and read a string of parentheses. Every opening bracket raises you a floor and every closing brace lowers you a floor.

The first part asks you what floor you end up. This can be solved easily by just counting the number of both brackets, and substracting. Every standard library in every language provides an easy way to do this.

The second part nicely breaks the previous simplicity and asks you when you first arrive at the basement (floor -1). This requires you to simulate it. This way of the second part partly ruining your original algorithm is a running theme throughout the contest.

Day 4 - Breaking MD5

This day was interesting mostly because I could not find an efficient algorithm to do it. The challenge today was, given a random string (in my case bgvyzdsv) find the lowest numerical suffix such that the MD5 hash of the string starts with a specific number of zeroes.

I took the brute force approach by starting at 0 and increasing until I had the proper number of zeros. If anyone knows whether there is a faster way to do it, please let me know.

Day 6 - All the little lights

For this day, you are asked to simulate an array of 1000x1000 lights, with a set of instructions. Every instruction asks you to either turn on, turn off, or toggle a specified rectangle of lights.

The naive way to do this would be to actually take an array of the appropriate size, and loop over the rectangles doing the appropriate operation on every light. This is rather slow however, so I went for a different solution.

Intead of having a boolean for every cel, I have for every row a list of runs of consecutive lights in the same state. Initially, this means that for every row I have a list containing one run in state off of length 1000.

Then, when I get a rectangle, I iterate over all rows that are in that rectangle. Within each row, I look whether a specific run overlaps with the rectangle, and if so, I split it if it is not entirely contained, and then toggle the state of the run that is covered.

Finally, for every row modified, I merge adjacent runs that are in the same state. This greatly reduces the number of runs I need to process, speeding up the algorithm.

When testing my approach, the result was nearly instantaneous. This is a huge improvement over the naive approach, which took multiple seconds.

And now for something completely different

On day 1 I mentioned that the second part has a habit of ruining your algorithm for part 1. This is very true for this day. The algorithm described above works because it can do a simple operation on a large number of lights at the same time. This works because a lot of them are in the same state, and the end result only depends on their current state.

For part 2, the rules are slightly changed. Lights are not on or off, but have a numerical brightness. Turning them off lowers the brightness by 1 (to a minimum of 0), turning them on raises the brightness by 1 and toggle raises the brightness by 2. Now there are a lot more states lights can be in, and runs of lights in the same state are very short.

Now the algorithm is only marginally faster than the original, because while it still does less operations, its operations are significantly more expensive.

Day 17 - Subset sum works better with eggnog

Today, we have received a tonne of eggnog, and we need to store it precisely in some buckets we have, which are of various sizes. We want to know how many ways we can fit it exactly into a selection of our buckets.

This is related to the subset sum problem, which is NP-complete and has a nice exponential complexity when brute forcing it. It asks for a list of numbers whether there is a subset of the list for which the sum of numbers is equal to 0, or an integer n which is what we are interested in.

In our exercise, we have 20 buckets, which means that there are 2^20 subsets to consider. While this is still possible, it is slow when done using python.

Instead, we split the set of buckets into two sets, and compute the capacity of all of the subsets of those halves. There are 2 * 2 ^ 10 = 2 ^ 11 of these, which is way less.

Now we sort these two lists by subset capacity. This allows us to easily determine in each list, how many subsets are there with capacity m. We then know, that when we have a subset in the first list of capacity m, we need to find a subset in the second list of capacity n - m. Since we can compute quickly how many there are of both of those, we multiply them for every m and sum the results.

Other days

There are other days that probably deserve a write up, but these are the ones that stood out most to me. Complete solutions are found in my github repo for all other days. They should run in either

Python 3,
Python 2.7,
or Pypy. (equivalent version)

Since this was my first serious attempt at Python, and I’d appreciate feedback on the weird things I’ve done not knowing enough about the language. If you’ve got something, let me know.