C++23 is introducing some more data structures, some more associative containers. We are going to get the flat versions of map
/set
/multimap
/multiset
:
flat_map
flat_set
flat_multimap
flat_multiset
These new types will work as drop-in replacements for their non-flat types. The goal of these new types is to provide different time and space complexities compared to the original containers. The non-flat versions' implementations are using balanced binary trees under the hood which more-or-less defines their main characteristics. C++11 introduced the unordered versions of these containers, and even though in most cases they should be preferred, they are often neglected. As the name suggests, unordered containers are not sorted.
Now we are getting sorted containers that are more effective in a big chunk of our use cases.
As mentioned, the original containers use binary search trees, the unordered versions use hashmaps. The flat ones use sequence containers. In fact, the flat versions are not even containers, they are container adapters.
Container adapters are interfaces created by limiting functionality in a pre-existing container and providing a different set of functionality. When you declare a container adapter, you have the option of specifying which sequence container should be the underlying container.
The underlying data structure is configurable through template parameters, but they must be sequence containers with random access iterators. Why do I speak about template parameters in plural? Because for a flat_map
, you can use different containers for keys and values. From now on, I'll simply write about flat_map
, but the observations are also valid for the other flat container adapters unless I explicitly write so.
Specialties of a flat_{map|set}
So what are the specialities of these container adapters over the usual associate containers? What are those different time and space complexities that it has to provide? Let's start with enumerating the disadvantages, just to avoid thinking that a flat_map
is too good to be true.
- Insertion and deletion are going to be slower
- On insertion and deletion iterators will become unstable
- The exception safety is weaker because moves and copies do happen, we don't just pass around pointers anymore
- And we cannot store non-copyable, non-movable types in flat structures
And now let's see what we get for this price. We'll get:
- Faster iteration
- Random access iterators instead of bidirectional iterators
- Smaller memory consumption
- Improved cache-friendliness due to contiguous memory layout
Actually, all this comes from the fact that under a map
you'll find a balanced search tree, while under a flat_map
, you'll find a sequence container, like a std::vector
.
Let's have a closer look at some of these items above.
What happens when we insert into a map
or when we delete from it and the tree has to be rebalanced? There will be no copies or moves, because each node in the map
has a handle, a pointer to the data and only these handles will be moved around not the pointed objects.
Once we understand this, we also take it evident that the flat versions have a better memory footprint. When you work on a contiguous memory area when you deal with a sequence container, you have no handles, no metadata to store. You simply have to deal with the data. You also have to work more on them when you insert/delete as it's not enough to move around the handles anymore.
If you want to go deeper into the cache friendliness topic, I'd recommend watching the talk of Bjorn Fahller at C++OnSea. He explained that even in use cases when we might think that a linked list would serve us better than a sequence container, the latter might be a better choice. Even if it has to perform more work. More work is sometimes faster as the bottleneck is not the CPU anymore, but the cache. With sequence containers, the CPU has to access contiguous parts of memory, which is particularly cache-friendly.
One more word on speed
As it was already mentioned, flat_map
is an ordered map. If you give some inputs to it either at construction time or later, it will take care of keeping itself sorted. That requires some resources. But what if your data is already sorted? After all, C++ has the concept of not paying for what you don't use.
flat_map
provides a solution for that!
The standard library provides a tag type called sorted_unique_t
and the flat_map
constructors have overloads taking that tag as a first parameter. Not only the constructors but also those overloads of the insert
method that take multiple elements to insert. If you construct a flat_map
from elements that are already sorted, or when you insert
a range of items that are already sorted, don't forget to use the overloads with sorted_unique_t
, because you can benefit from a significant performance gain.
But beware! If you use a sorted_unique_t
overload with unsorted data, the behaviour is undefined! All bets are off!
How it differs in its API
flat_map
stores separately the keys and values, and the storage for those can be of different types. Because of that, there is a bunch of new constructors available. In addition, there are several new overloads for the constructor and the insert
method taking the previously explained sorted_unique_t
tag so that we don't resort to already sorted items.
The extract
member method moves out both underlying storage containers. The extract
function is overloaded with &&
showing that the original object should be used anymore.
The other direction of moving data is also possible through the replace
function. It takes containers as rvalue references and replaces the underlying containers with what was passed in.
What if you want to use the new adaptors?
You'll still have to wait for the standard versions. At the moment of writing this article, no compiler supports the flat container adapters.
At the same time, the proposal didn't just come out of the blue. boost
has had this feature for quite a while, which served as a basis for the standardization. You can go and experiment with it, if not on your local, you go to Compiler Explorer or Coliru.
Conclusion
In C++23, we are going to get some exciting new container adaptors in the standard library! The flat_{map|multimap|set|multiset}
containers offer different space and time complexities compared to their normal, original versions. It favors fast iteration, lookup and lower memory consumption at the expense of potentially slower writes. At the same time, it still offers the benefits of having sorted containers, unlike the unordered_*
versions.
Although there is no compiler support for the time being, we can already learn about both from the standard or by using the boost versions!
Connect deeper
If you liked this article, please
- hit on the like button,
- subscribe to my newsletter
- and let's connect on Twitter!