MemArena¶
Synopsis¶
#include "swoc/MemArena.h"
-
template<typename T>
class MemArena¶
MemArena provides a memory arena or pool for allocating memory. Internally MemArena reserves
memory in large blocks and allocates pieces of those blocks when memory is requested. Upon
destruction all of the reserved memory is released which also destroys all of the allocated memory.
This is useful when the goal is any (or all) of
amortizing allocation costs for many small allocations.
better memory locality for containers.
de-allocating memory in bulk.
Note that intrusive containers such as IntrusiveDList and
IntrusiveHashMap work well with a MemArena. If the container and
its elements are placed in the MemArena then no specific cleanup is needed beyond destroying the
MemArena.
A MemArena can also be inverted. This means placing the MemArena instance
in its own memory pool so the MemArena and associated objects can be created with a single
base library memory allocation and cleaned up with a single delete.
Usage¶
When a MemArena instance is constructed no memory is reserved. A hint can be provided so that the
first internal reservation of memory will have close to but at least that amount of free space
available to be allocated.
In normal use memory is allocated from MemArena using MemArena::alloc() to get chunks
of memory, or MemArena::make() to get constructed class instances. MemArena::make()
takes an arbitrary set of arguments which it attempts to pass to a constructor for the type
T after allocating memory (sizeof(T) bytes) for the object. If there isn’t enough
free reserved memory, a new internal block is reserved. The size of the new reserved memory will be at least
the size of the currently reserved memory, making each reservation larger than the last.
The arena can be frozen using MemArena::freeze() which locks down the currently reserved memory and forces the internal reservation of memory for the next allocation. By default this internal reservation will be the size of the frozen allocated memory. If this isn’t the best value a hint can be provided to the MemArena::freeze() method to specify a different value, in the same manner as the hint to the constructor. When the arena is thawed (unfrozen) using MemArena::thaw() the frozen memory is released, which also destroys the frozen allocated memory. Doing this can be useful after a series of allocations, which can result in the allocated memory being in different internal blocks, along with possibly no longer in use memory. The result is to coalesce (or garbage collect) all of the in use memory in the arena into a single bulk internal reserved block. This improves memory efficiency and memory locality. This coalescence is done by
Freezing the arena.
Copying all objects back in to the arena.
Thawing the arena.
Because the default reservation hint is large enough for all of the previously allocated memory, all
of the copied objects will be put in the same new internal block. If this for some reason this
sizing isn’t correct a hint can be passed to MemArena::freeze() to specify a different value
(if, for instance, there is a lot of unused memory of known size). Generally this is most useful for
data that is initialized on process start and not changed after process startup. After the process
start initilization, the data can be coalesced for better performance after all modifications have
been done. Alternatively, a container that allocates and de-allocates same sized objects (such as a
std::map) can use a free list to re-use objects before going to the MemArena for more
memory and thereby avoiding collecting unused memory in the arena.
Other than a freeze / thaw cycle, there is no mechanism to release memory except for the destruction
of the MemArena. In such use cases either wasted memory must be small enough or temporary enough
to not be an issue, or there must be a provision for some sort of garbage collection.
Generally MemArena is not as useful for classes that allocate their own internal memory (such as
std::string or std::vector), which includes most container classes. Intrusive
container classes, such as IntrusiveDList and IntrusiveHashMap, are more useful
because the links are in the instance and therefore also in the arena.
Objects created in the arena must not have delete called on them as this will corrupt
memory, usually leading to an immediate crash. The memory for the instance will be released when the
arena is destroyed. The destructor can be called if needed but in general if a destructor is needed
it is probably not a class that should be constructed in the arena. Looking at
IntrusiveDList again for an example, if this is used to link objects in the arena, there is
no need for a destructor to clean up the links - all of the objects will be de-allocated when the
arena is destroyed. Whether this kind of situation can be arranged with reasonable effort is a good
heuristic on whether MemArena is an appropriate choice.
While MemArena will normally allocate memory in successive chunks from an internal block, if the
allocation request is large (more than a memory page) and there is not enough space in the current
internal block, a block just for that allocation will be created. This is useful if the purpose of
MemArena is to track blocks of memory more than reduce the number of system level allocations.
Generally the MemArena will be declared as a member variable of the the class that needs the
local memory. The class destructor will then clean up the memory when the containing class is
destructed. An archetypical example is Lexicon which uses a MemArena to store its
data. IntrusiveHashMap instances are used to track the data which is owned by the arena
which in turn is owned by the Lexicon instance. Destroying the Lexicon releases
all of the memory. There is no explicit cleanup code in Lexicon precisely because the
MemArena destructor does that.
Temporary Allocation¶
In some situations it is useful to allocate memory temporarily. While alloca can be used
for this, that can be a bit risky because the stack is generally much smaller than the heap and
may be heavily used. MemArena provides the MemArena::require() method as an alternative.
MemArena allocates memory in blocks and then hands out pieces of these blocks when it is asked for
memory. Therefore it is usually the case that some amount of not currently used memory is available
in the active block. Access to this memory can be obtained via MemArena::remnant(). This
may be large enough for use as the needed temporary memory, in which case no allocation needs to be
done. What MemArena::require() does is guarantee the remnant is at least the specified
size. If it is already large enough, nothing is done, otherwise a new block is allocated such that
it has enough free space. If this is done multiple times without allocating from the arena, the same
memory space is reused. It is also guaranteed that if the next call to MemArena::alloc() or
MemArena::make() does not need more than the remnant, it will be allocated from the
remnant. This makes it possible to do speculative work in the arena and “commit” it (via allocation)
after the work is successful, or abandon it if not.
Static Memory¶
MemArena has a constructor <MemArena::MemArena(MemSpan<void>)> that enables using a
pre-defined block of memory. This memory is never released or destructed by the MemArena. This is
useful in local contexts where an abitrary amount of memory may be needed but commonly less than a
known amount of memory will actually be needed. In such a case the MemArena can be seeded with a
block of memory (static or on the stack) of that size so that no allocation is done in the common
case, only in the few cases where it is needed without special handling. Such a increase in
performance is essentially the only reason to use this feature.
The static block must be large enough to hold the internal block headers and still have at least the minimum free space available. As of this writing this requires a buffer of at least 64 bytes.
If (roughly) two kilobytes would normally be enough, that could be done with
static constexpr size_t SIZE = 2048;
std::byte buffer[SIZE];
MemArena arena{{buffer, SIZE}};
All methods are available and functional in this case, including freezing and thawing. Any allocations while frozen will never be in the static block, as it won’t be recycled until the thaw. Generally this is not a good technique as a situation where freeze / thaw is useful is almost certainly not a situation where a static block is useful and vice versa.
Although such an arena could be inverted (and thereby placed in the static block) this is also very unlikely to be useful, as the stack space for the arena would still be required during construction.
The expectation is such an instance would be scoped locally to a function or method and destroyed upon return, being used only for temporary storage.
Examples¶
It is expected that MemArena will be used most commonly for string storage, where a container needs
strings and can’t guarantee the lifetime of strings in arguments. A common function that is used is
a “localize” function that takes a string, allocates a copy in the arena, and returns a view of it. E.g.
TextView
localize(MemArena &arena, TextView view) {
auto span = arena.alloc(view.size()).rebind<char>();
memcpy(span, view);
return span;
}
This can then be used to create a view inside the arena.
REQUIRE(thing->n == 956);
That’s about it for putting strings in the arena. Now, consider a class to be put in an arena.
self_type *_next{nullptr};
self_type *_prev{nullptr};
Thing() {}
Thing(int x, std::string_view s) : n(x), name(s) {}
Thing(std::string_view const &s, int x) : n(x), name(s) {}
struct Linkage : swoc::IntrusiveLinkage<self_type> {
static std::string_view
key_of(self_type *thing) {
return thing->name;
}
thing = arena.make<Thing>(text, 956);
A key point to note is name is a view, not a std::string. This means it requires
storage elsewhere, i.e. in the arena. The string text has already been localized so instances
can be created safely
arena.alloc(arena.remaining() - 16);
FixedBufferWriter w(arena.remnant());
w.print("Much ado about not much text");
if (w.error()) {
FixedBufferWriter lw(arena.require(w.extent()).remnant());
lw.print("Much ado about not much text");
All of these are in the arena and are de-allocated when the arena is cleared or destructed. Note a
literal constant can be safely used because it has a process life time. That’s a bit obscure - in
actual use it’s unlikely the API will know this and will localize such strings. Also note
MemArena::make() can use any of the constructors in Thing.
Another case for string storage is formatted strings. Naturally BufferWriter will be used
because it is just so awesome. The MemArena::remnant() method will also be used to do
speculative formatted output. A problem is that at the time the formatted is provided, the final
size is not known, and so (in general) two passes are needed, one to size and one to do the actual
output. This can be optimized somewhat by using MemArena::remnant(). The concept is to do
the sizing pass in to the space. If it fits, win - the string is there and no more needs to be done.
Otherwise the size needed is known and MemArena::require() is used to get the contiguous
memory needed for the string. MemArena::alloc() is then used to allocate the already
initialized memory.
auto tv1 = bw_localize(arena, "Text: {} - '{}'", 956, "Additional");
REQUIRE(tv1 == "Text: 956 - 'Additional'");
REQUIRE(arena.contains(tv1.data()));
arena.clear();
using Map = swoc::IntrusiveHashMap<Thing::Linkage>;
Map *ihm = arena.make<Map>();
The code is a bit awkward and repetitious due to being example code. Production code would look something like
auto arg_tuple{std::forward_as_tuple(args...)};
w.print_v(fmt, arg_tuple);
if (w.error()) {
FixedBufferWriter(arena.require(w.extent()).remnant()).print_v(fmt, arg_tuple);
}
return arena.alloc(w.extent()).rebind<char>();
}
TEST_CASE("MemArena example", "[libswoc][MemArena][example]") {
struct Thing {
using self_type = Thing;
Because this is a function there’s no more repetition of the format string and arguments in use.
{
std::string key_1{"Key One"};
std::string key_2{"Key Two"};
Inversion¶
“Inversion” is the technique of putting the MemArena inside its own arena. For collections of
objects which all have the same lifetime this makes cleanup simple, as only the MemArena needs to
be destructed. In addition, it can minimize calls to the base allocator. If a good guess as to the
total size needed can be made, all of the objects, including the arena, can be provided for in a
single call to malloc. This is used in Errata for performance reasons. Here is a
simplified example.
{
std::unique_ptr<MemArena> arena{new MemArena};
Things of note:
A local stack variable is used to bootstrap the
MemArena.A new
MemArenais created in theMemArenaand the essence of the local arena moved there.This works even if data has already been allocated in the arena, but it’s best to do first.
The resulting pointer is cleaned up by calling the destructor, not by
delete. This is because, as noted earlier, it is not valid to calldeleteon an object in the arena.The destructor is carefully implemented to work correctly even if the class instance is in its own internal memory blocks.
In actual use, the pointer would be a class member, not another local variable, and the inversion delayed until the memory was actually needed so that if the arena isn’t needed, no allocation is done.
Although delete cannot be used on an inverted MemArena, it is still possible to use
code:std::unique_ptr by overriding the cleanup.
Now for the actual example. This uses the localize function from the previous example.
1 TextView local_tv = localize(ta, tv);
2 REQUIRE(local_tv == tv);
3 REQUIRE(ta.contains(local_tv.data()));
4
5 // 16 bytes.
6 std::unique_ptr<MemArena, void (*)(MemArena *)> arena(ta.make<MemArena>(std::move(ta)), destroyer);
7
8 REQUIRE(ta.size() == 0);
9 REQUIRE(ta.contains(local_tv.data()) == false);
10
11 REQUIRE(arena->size() >= local_tv.size());
12 REQUIRE(local_tv == tv);
13 REQUIRE(arena->contains(local_tv.data()));
14
15 TextView local_text = localize(*arena, text);
16 REQUIRE(local_text == text);
17 REQUIRE(local_tv != local_text);
18 REQUIRE(local_tv.data() != local_text.data());
19 REQUIRE(arena->contains(local_text.data()));
20 REQUIRE(arena->size() >= local_tv.size() + local_text.size());
21 }
22
23 {
24 MemArena ta;
In this code, ta is the local bootstrap temporary. The std::unique_ptr is declared to
take as a “deleter” a function taking a MemArena *. The actual function is the lambda in
destroyer declared in line 1. This could have been done directly inline, such as
}
{
auto destroyer = [](MemArena *arena) -> void { arena->~MemArena(); };
but I think separating it is better looking code.
The checking code shows that memory can be allocated in the arena first, but is not in the arena
after the inversion. It is, however, in the inverted arena at the same address as demonstrated by
line 16. All of the allocated memory involved gets cleaned up when the std::unqiue_ptr is
destructed.
Inversion is a bit tricky to understand but that’s all there is to writing code to use it, although
setting up the std::unique_ptr can be done in a variety of ways. Although these are
functionally identical, the memory footprint varies as noted in the comments. This can make a
difference for objects that are intended to be very small if no memory is allocated.
Two styles of lambda use have been shown, but there is one more of interest.
}
};
template <typename... Args>
This requires the separate declaration of the lambda but cuts the std::unique_ptr size in half. Note
every lambda is a distinct type, even if the code is character for character identical. This means the
pointer must be passed destroyer - there is no other object that will be valid. Despite this,
it must be passed as an argument, it cannot be omitted. But, hey, it saves 8 bytes!
Instead of a lambda, one can define a function
arena->~MemArena();
}
TEST_CASE("MemArena inversion", "[libswoc][MemArena][example][inversion]") {
TextView tv{"You done messed up A-A-Ron"};
and use that
{
MemArena ta;
Finally one can define a functor
void
operator()(T *t) {
t->~T();
}
};
void
and use that
{
MemArena ta;
Note in this case the std::unqiue_ptr is only 8 bytes (a single pointer) and doesn’t
require an argument to the constructor.
Internals¶
Allocated memory is tracked by two linked lists, one for current memory and the other for frozen memory. The latter is used only while the arena is frozen. The list of blocks is maintained by IntrusiveDList which means deallocating the memory blocks suffices to clean up. Memory blocks are kept in an active pool so that if a new block is allocated to satisfy a particular allocation or requirement, unused space in previous blocks is still available for use.