The Infinite Loop

Tales from a lean programmer.



Custom deleters for smart pointers in modern C++

Introduction

Many C++ programmers are not aware of a big difference in how std::unique_ptr and std::shared_ptr handle custom deleters in the C++ standard library. std::unique_ptr carries the custom deleter as part of its type (template<class T, class Deleter> class unique_ptr). In contrast, the custom deleter of std::shared_ptr is not part of the type (template<class T> class shared_ptr) but a template argument of the constructor (template<class Y, class Deleter> shared_ptr(Y *ptr, Deleter d)). Most of the time this difference doesn’t matter much. There are use cases, however, such as factories returning std::unique_ptr with a custom deleter, where it does matter.

Design choices

std::unique_ptr

The advantage of making the custom deleter part of std::unique_ptr’s type is that, as long as the deleter is stateless (e.g. a lambda that doesn’t capture anything or a function object with no member variables), storing it doesn’t take up any additional memory thanks to the empty base optimization. This makes std::unique_ptr a zero-overhead abstraction, which means that:

  1. Its size is identical to the size of a raw pointer on the underlying architecture.
  2. All calls to the deleter can be inlined.

One possible implementation which makes use of the empty base optimization is to store the wrapped pointer together with the deleter in a compressed pair. The obvious disadvantage of making the custom deleter part of the type is that two std::unique_ptrs with different custom deleters are of different types, even if they wrap the same pointer type.

std::shared_ptr

In contrast to std::unique_ptr, std::shared_ptr provides the convenience of a type erased deleter. Type erased means that the type of the custom deleter is not dragged into std::shared_ptr’s type. Hence, one cannot tell just by looking at the type whether two std::shared_ptr instances have different custom deleters or not.
The type erasure makes std::shared_ptr more flexible. For example, changing the allocation strategy of a factory, and with it the custom deleter of the returned std::shared_ptrs, doesn’t change the return type. It therefore doesn’t break source or binary compatibility and doesn’t require recompiling client software.
The drawback is that storing the custom deleter takes up additional memory, because the deleter has to be kept in some type erased form (conceptually like a std::function or a raw function pointer). The rationale behind this design choice is that std::shared_ptr must heap allocate memory for its shared control block anyway, which contains the wrapped pointer and the reference counter. Additionally including the custom deleter there didn’t seem like a big cost, taking the increased flexibility into account.

Type erased custom deleters with std::unique_ptr

Imagine you’re building an object factory which returns std::unique_ptrs. The return type of the factory’s Create() function must allow casting instances of different derived classes to the same std::unique_ptr type. One way to do that is to use a std::unique_ptr to the base class. This, however, requires the base class’ destructor to be virtual. What if the destructor cannot be virtual for some reason or the implications for source and binary compatibility are limiting?

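An alternative is to create a type erased custom deleter for std::unique_ptr by wrapping the deleter, e.g. in a std::function. The wrapped function is then responsible for casting the void * back to the correct type before deleting it. This construction works for classes with and without virtual functions, as long as the void * really holds the address of the most derived object. Note that std::unique_ptr hands its stored Base * to the deleter, which is implicitly converted to void *. That is the address of the most derived object only if the Base subobject lives at the same address, which is typically the case for single, non-virtual inheritance. With multiple inheritance, where Base may not be the first base class, the deleter should take a Base * instead and static_cast it to the derived type, so that the compiler can adjust the address.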

#include <functional>
#include <memory>

// Base, Derived0, Derived1 and Factory are assumed to be declared elsewhere.

// Casts 'ptr' back to its real type and deletes it
template<typename Type>
void MyDelete(void *ptr)
{
    delete static_cast<Type *>(ptr);
}

// Unique pointer with type erased custom deleter
using UniquePtr = std::unique_ptr<Base, std::function<void(void *)>>;

UniquePtr Factory::Create(int typeId)
{
    switch (typeId)
    {
    case 0:  return UniquePtr(new Derived0, MyDelete<Derived0>);
    case 1:  return UniquePtr(new Derived1, MyDelete<Derived1>);
    // ...
    default: return UniquePtr(nullptr, MyDelete<Base>); // unknown id: empty pointer
    }
}

The applied type erasure doesn’t come for free. There are two penalties to pay:

  1. Destroying the pointer cannot be inlined anymore and therefore always requires an additional function call.
  2. Additional memory is required to store the deleter.

It turns out that the std::function wrapper increases the memory footprint of the std::unique_ptr type considerably (32 bytes with GCC 5.3.1’s libstdc++ and 64 bytes with Visual C++ 2015, both 64 bit). Luckily, we can use a simple function pointer instead to reduce the total size of the final std::unique_ptr to 16 bytes.

using UniquePtr = std::unique_ptr<Base, void(*)(void *)>;



Mutex lock guards in C++11

The new concurrency library of C++11 comes with two different classes for managing mutex locks: std::lock_guard and std::unique_lock. How do they compare with each other, and which of the two should be preferred in which situation?

The std::lock_guard class keeps its associated mutex locked during its entire lifetime by acquiring the lock on construction and releasing it on destruction. This makes it impossible to forget to unlock a critical section, and it guarantees exception safety because the mutex is automatically unlocked when the stack is unwound after an exception has been thrown. The std::lock_guard class should be used when a limited scope, like a class method, is to be locked.

void Foo::Bar()
{
    std::lock_guard<std::mutex> guard(this->Mutex);
    // mutex is locked now
}   // mutex is unlocked when lock guard goes out of scope

In contrast, the std::unique_lock class is a lot more flexible when dealing with mutex locks. It covers everything std::lock_guard offers but provides additional methods for explicitly locking and unlocking the mutex and for deferring locking on construction. By passing std::defer_lock to the constructor, the mutex remains unlocked when a std::unique_lock instance is constructed. The lock can then be obtained later by calling lock() on the std::unique_lock instance or, alternatively, by passing it to the std::lock() function. To check whether a std::unique_lock currently owns its associated mutex, the owns_lock() method can be used. Hence, the mutex associated with a std::unique_lock doesn’t have to be locked (sometimes also referred to as owned) during the lock guard’s entire lifetime. Because a std::unique_lock tracks whether it owns the mutex, that ownership can also be transferred between instances. This is why std::unique_lock is movable whereas std::lock_guard is not. Thus, more flexible locking schemes can be implemented by passing locks around between scopes.
For example, a std::unique_lock can be returned from a function, or instances of a class containing a std::unique_lock attribute can be stored in containers. Consider the following example in which a mutex is locked in the function Foo(), returned to the function Bar() and only then unlocked on destruction.

std::mutex Mutex;

std::unique_lock<std::mutex> Foo()
{
    std::unique_lock<std::mutex> lock(Mutex);
    return lock;
    // mutex isn't unlocked here!
}

void Bar()
{
    auto lock = Foo();
}   // mutex is unlocked when lock goes out of scope

Keeping std::unique_lock’s additional lock status up to date incurs a minimal space and speed overhead in comparison to std::lock_guard. Hence, as a general rule, std::lock_guard should be preferred when the additional features of std::unique_lock are not needed.



A pitfall with initialization lists in C++

Welcome to this quick post on constructor initialization lists in C++. If you are not familiar with initialization lists, here is a very short introduction. When a class is instantiated in C++, first all base classes and then all class attributes are constructed. Initialization lists are used to control that process: in them, the constructors of the base classes and of the class attributes can be called explicitly. Otherwise, they are initialized by calling their default constructor. For efficiency reasons it is important to use initialization lists, because all member initialization takes place before the body of the constructor is entered, so assigning in the body means initializing first and assigning afterwards. So much for the basics of initialization lists. Now a little riddle. The following piece of code has undefined behavior and will most likely crash. Can you find the problem? Give it a try yourself before you continue reading.

struct Foo
{
    Foo(int newId) : id(newId) {}
    int id;
};

struct Bar
{
    Bar(const Foo &newFoo) : foo(newFoo), fooId(foo.id) {}

    const int   fooId;
    const Foo & foo;
};

int main(int argc, char **argv)
{
    Foo foo(303);
    Bar bar(foo);
    return 0;
}

If you don’t know the nasty details of how class attributes are initialized via initialization lists, you most probably couldn’t figure out what causes the crash in those few lines of code. The issue with the code above is that the order of attribute initialization is not determined by the order of appearance in the initialization list, but by the order of declaration within the class. Note that foo appears in the initialization list before fooId, but fooId is declared before foo. Hence, fooId is initialized first, not foo, and it accesses the still uninitialized attribute foo.

So why is that, you might ask? This behavior, strange at first glance, actually makes a lot of sense. When an object is destroyed, its attributes are destroyed in the reverse order of their initialization. As there can be more than one constructor, each with a different order in its initialization list, the order of destruction would be ambiguous if the initialization list determined it. To resolve the ambiguity, the order of declaration is simply used.