Everyone who worked with C++ for longer than twelve seconds has noticed that keeping track of pointers is somehwat complicated. You have all kinds of wonderful new problems that you never knew you could have in other languages:

  • Null-pointer dereferencing
  • Use-after free
  • Memory leaks

To help with this, C++11 introduced among other things “smart pointers” to at least help with the last two. We now have a std::unique_ptr which attempts to be the only one with access to a pointer, a std::shared_ptr that keeps a reference count, and std::weak_ptr that can have access to a std::shared_ptr but does not contribute to the reference count.

Since it’s the holidays and I need to keep myself busy somehow, and since someone called it a stupid idea in a stackoverflow answer, I set out to use those constructs to implement a functional twin of std::list. Below is an overview of the design and lessons learned.

TL;DR. I made a linked list with smart pointers.

Goals

Because I want to make the most of smart pointers but also still want a serious linked list, I set out with the following goals:

  • The code should be fully compatible with C++11 and C++14 compilers.
  • All methods of std::list should be implemented according to their C++11/C++14 spec.
  • All specified computational complexities should be met.

But, given that, I gave myself some leeway with respect to the move overloads regarding the safelist itself, i.e. the ones with a signature of foo(..., safelist&&, ...). The main point of these is to avoid unecessary allocations and complex clean-up, but most of these don’t do that either way.

Design

The list I want to implement is a doubly linked list, so each element has a pointer forward and one backwards pointer. I chose to use an std::shared_ptr for the forwards pointer, since this enables me to use std::weak_ptrs for iterators. Using std::shared_ptrs for backwards pointers as well would force the reference count for each pointer to be at least 1, so those are going to be std::weak_ptrs as well.

The interesting one remains what to do with the end iterator, since it needs to be mostly valid, and have a reference to the last element. Here I copied the STL and implemented added the value to my list entries as a pointer, which is is a null pointer for the end element. This pointer is a std::unique_ptr since no other construct never actually needs access to the raw pointer of the value.

Finally, since size() needs to be a constant-time operation, I added a property that stores the current size. This, and the above, results in the data structure as shown below.

Structure overview

Compared to std::list

The most notable API changes visible are, as mentioned before, the lack of move overloads for merge and splice. Those are left out since they can always be cast to a reference, and used that way.

Originally, I did not have an O(n log n) algorithm for sorting, because I was lazy and just implemented bubble sort. I thought it would be way harder to implement a fast sorting algorithm without random access in your data structure. I was wrong of course, since you can easily build merge sort, using pieces already included in the list:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
void safelist<T>::sort(Compare compare)
{
    if (size() < 2) {
        return; // Already sorted.
    }

    safelist other;
    auto it = begin();
    std::advance(it, size() / 2);

    other.splice(other.end(), *this, begin(), it);

    sort(compare);
    other.sort(compare);

    merge(other, compare);
}

Unlike the STL version, it is not technically constant memory (it is logarithmic) but other than that the algorithm works exactly as fast as it should.

Performance

Does between two and three times as many allocations as std::list. This mostly comes down to the extras guard blocks that need to be allocated for shared pointers (the guard block stores the use counts). Additionally, I need to do two allocations for empty lists while the STL does none.

In terms of absolute memory usage, my safelist uses significantly more memory, because smart pointers use more memory than raw pointers: the pointer itself is twice as large and a control block with two integers needs to be allocated.

I wrote a small stress program that fills a list with a specified number of random numbers, and then sorts it. Using this, I measured my list to be about 16 times slower without optimizations, and 10 times with optimizations.

An unfortunate side-effect of my design is that the destructor is recursive: the reference count for the first element drops to zero, calls the destructor it and drops the reference count for the second element to zero, which calls its destructor, which drops the reference count for the third element, and so forth. This is a surefire way to reach a stack overflow for larger lists.

This problem can be worked around by explicitly iterating from the end of the list and removing elements one by one, but that kind of defeats the purpose of this project. Also, reordering the elements within the entry structure might allow for tail-call optimization, reducing the needed stack size to something constant.

Actual safety

The main reason for the use of smart pointers is to increase the memory safety of your program. So what does this library provide that std::list does not?

In your program you will often times have iterator handles while you are actively modifying your list. The documentation for std::list specifies a lot about when they remain valid and when they do not. However, it does not specify what happens when you actually use an invalidated iterator. The answer: segfault. If you’re lucky. More likely, you will have a use-after-free error that may or may not cause a vulnerability in your program.

This library instead will makes sure that the pointer for an invalidated iterator is always equivalent to a null pointer. A dereference will still not be a good idea, but you have a reliable crash when this happens.

Closing thoughts

As an academic excersise, this project has been useful in pointing me to all the things that go into making data structures work with common STL interfaces. As an actual library, I urge everyone to never use this. My checking has not been very thourough and its performance is way less than that of std::list Maybe stackoverflow answers are on to something.