Nicolas Polomack
C++17: Optionals and Visitor Pattern
2018-11-10 - One step closer to better type safety and true pattern matching

C++17 arrived with a lot of features. Some notable ones include:

  • Better type deduction for auto and templated classes
  • std::string_view
  • Parallel STL algorithms
  • etc...

But the most important new thing to me is the introduction of std::optional, std::variant and std::visit.

The following bits of code have been tested using clang++ 10.0.0 on macOS 10.14 (Mojave).
All major compilers (including g++ and MSVC) support the C++17 features used here by now.

# std::optional

std::optional<T> is a wrapper type for a value that might be missing (from a failed computation, for instance).
To me, this is better than throwing exceptions, because:

  • The possibility of failure is expressed explicitly in the type signature.
  • It reminds the caller to check the return value.
  • It preserves value semantics.

It can be used this way:

#include <iostream>
#include <optional>

auto divide(int a, int b) -> std::optional<int>
{
    if (b == 0) {
        return std::nullopt; // or just '{}'
    }
    return a / b; // implicitly wrapped in the std::optional
}

auto main() -> int
{
    const auto result = divide(5, 0);

    if (result.has_value()) {
        std::cout << "The result is: " << result.value() << std::endl;
    } else {
        std::cout << "Division by zero !" << std::endl;
    }

    return 0;
}

# std::variant

std::variant<Ts...> allows you to represent what are called sum types, discriminated unions or tagged unions in C++.
It allows you to hold an instance of one of the specified types.
So, std::variant<int, std::string> holds either an int or a std::string, but never both at once.
It is type-safe because it knows which type it holds at any given time, and you can query for it.

Imagine we're writing a game, and we want to represent a player.
The player can be either alive or dead.

# First approach: Single class

We could represent it as a single class:

#include <cstdint>
#include <iostream>

using u32 = uint32_t;

class Player {
public:
    // Creates an alive player
    Player(u32 id, u32 health, u32 mana)
        : _is_alive(true), _id(id), _health(health), _mana(mana) {};
    // Creates a dead player
    Player(u32 id, u32 killed_by)
        : _is_alive(false), _id(id), _killed_by(killed_by) {};
    ~Player() = default;

    // const, so that it can be called on the const player in main()
    auto is_alive() const -> bool { return _is_alive; };
    // And other methods...

protected:
private:
    bool _is_alive;
    u32 _id;
    u32 _health;
    u32 _mana;
    u32 _killed_by;
};

auto main() -> int
{
    const auto player = Player(1, 100, 100);

    if (player.is_alive()) {
        std::cout << "Alive !" << std::endl;
    } else {
        std::cout << "Dead !" << std::endl;
    }

    return 0;
}

This is error-prone, because the _killed_by field is only valid if _is_alive is false and, similarly, _health and _mana are valid only if _is_alive is true (a dead player does not have any hp or mana left).

In this simple case, some people don't consider this a big deal, but we might have methods that are only valid to call if the player is alive (Player::cast_spell() for instance), and throwing exceptions is a lazy way of fixing it.

# Second approach: Dynamic dispatch (with polymorphism)

So we could use polymorphism, and create an AlivePlayer class and a DeadPlayer class both inheriting from a Player base class.

#include <cstdint>
#include <iostream>
#include <memory>

using u32 = uint32_t;

class Player {
public:
    virtual ~Player() = default;

    virtual auto id() const -> u32 { return _id; };

protected:
    Player(u32 id): _id(id) {};
    u32 _id;

private:
};

class DeadPlayer : public Player {
public:
    DeadPlayer(u32 id, u32 killed_by)
        : Player(id), _killed_by(killed_by) {};
    ~DeadPlayer() = default;

    auto killed_by() const -> u32 { return _killed_by; };

protected:
private:
    u32 _killed_by;
};

class AlivePlayer : public Player {
public:
    AlivePlayer(u32 id, u32 health, u32 mana)
        : Player(id), _health(health), _mana(mana) {};
    ~AlivePlayer() = default;

    auto health() const -> u32 { return _health; };
    auto mana() const -> u32 { return _mana; };

protected:
private:
    u32 _health;
    u32 _mana;
};

auto main() -> int
{
    std::unique_ptr<Player> player = std::make_unique<AlivePlayer>(1, 100, 100);

    if (auto ptr = dynamic_cast<AlivePlayer *>(player.get()); ptr != nullptr) {
        std::cout << "Alive !" << std::endl;
    } else if (auto ptr = dynamic_cast<DeadPlayer *>(player.get()); ptr != nullptr) {
        std::cout << "Dead !" << std::endl;
    }

    return 0;
}

This has major drawbacks:

  • It requires heap allocations (hence the std::unique_ptr) and is therefore usually slower
  • dynamic_cast is neither the nicest nor the safest of constructs (it deals with raw pointers and nullptr)
  • We lose all value semantics
  • We now need to think hard about ownership of this resource (or just wrap it in a std::shared_ptr, I guess)

# Third approach: Sum type with std::variant

To fix all of these, a third approach we could take here is to use a std::variant:

#include <cstdint>
#include <iostream>
#include <variant>

using u32 = uint32_t;

struct Alive {
    u32 id;
    u32 health;
    u32 mana;
};

struct Dead {
    u32 id;
    u32 killed_by;
};

using Player = std::variant<Alive, Dead>;

auto main() -> int
{
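    // Plain aggregate initialization, in member order: id, health, mana
    // (designated initializers like '.id = 1' are only standard since C++20)
    Player player = Alive{1, 100, 100};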

    if (std::holds_alternative<Alive>(player)) {
        const auto [id, health, mana] = std::get<Alive>(player);
        std::cout << "Alive { id: " << id << ", health: " << health;
        std::cout << ", mana: " << mana << " }" << std::endl;
    } else {
        const auto [id, killed_by] = std::get<Dead>(player);
        std::cout << "Dead { id: " << id << ", killed_by: " << killed_by;
        std::cout << " }" << std::endl;
    }

    return 0;
}

As opposed to dynamic dispatch (the inheritance example), we:

  • keep value semantics
  • don't need to allocate on the heap
  • don't deal with pointers

But the code to branch based on the type held by the std::variant is quite verbose (still shorter than the dynamic_cast thing but not ideal).

A different way to branch is to use a 'visitor' and apply it to the std::variant using std::visit.
A visitor is simply a callable object overloaded for the types we want to handle.

We can use one like this:

struct Visitor {
    auto operator()(const Alive &player) -> void {
        const auto [id, health, mana] = player;
        std::cout << "Alive { id: " << id << ", health: " << health;
        std::cout << ", mana: " << mana << " }" << std::endl;
    };
    auto operator()(const Dead &player) -> void {
        const auto [id, killed_by] = player;
        std::cout << "Dead { id: " << id << ", killed_by: " << killed_by;
        std::cout << " }" << std::endl;
    };
};

auto main() -> int
{
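    // Plain aggregate initialization, in member order: id, health, mana
    // (designated initializers like '.id = 1' are only standard since C++20)
    Player player = Alive{1, 100, 100};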

    std::visit(Visitor(), player);

    return 0;
}

But again, creating a struct each time we want to branch is not ideal.
Sadly, the C++17 standard library provides no helper to build a visitor out of lambdas.
But luckily, a piece of template magic can save us here.

It looks like this:

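// Inherits from every lambda and exposes all of their call operators
template <class... Ts> struct match : Ts... { using Ts::operator()...; };
// Deduction guide: lets 'match{...}' deduce Ts from the lambdas passed in
template <class... Ts> match(Ts...) -> match<Ts...>;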

auto main() -> int
{
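    // Plain aggregate initialization, in member order: id, health, mana
    // (designated initializers like '.id = 1' are only standard since C++20)
    Player player = Alive{1, 100, 100};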

    auto visitor = match{
        [](const Alive &player) {
            const auto [id, health, mana] = player;
            std::cout << "Alive { id: " << id << ", health: " << health;
            std::cout << ", mana: " << mana << " }" << std::endl;
        },
        [](const Dead &player) {
            const auto [id, killed_by] = player;
            std::cout << "Dead { id: " << id << ", killed_by: " << killed_by;
            std::cout << " }" << std::endl;
        },
    };

    std::visit(visitor, player);

    return 0;
}

We can now just write a lambda for each case, without having to manually call std::holds_alternative<T> or create structs.

We can also return things from std::visit like this:

auto main() -> int
{
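    // Plain aggregate initialization, in member order: id, health, mana
    // (designated initializers like '.id = 1' are only standard since C++20)
    Player player = Alive{1, 100, 100};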

    auto visitor = match{
        [](const Alive &player) { return true; },
        [](const Dead &player) { return false; },
    };

    if (std::visit(visitor, player)) {
        std::cout << "Alive !" << std::endl;
    } else {
        std::cout << "Dead !" << std::endl;
    }

    return 0;
}

And since the cases are lambdas, they can capture their environment by reference, so each branch can trigger its own side effects depending on the type held by the variant.

It's all about making invalid states unrepresentable and increasing type safety, without much impact on performance or on the semantics of the underlying types.

This is only a first step on the way to true pattern matching support, whether in the language itself or in the standard library.
It is, for me, a great direction for C++, and std::variant is a great addition in support of it.