MemArena

Synopsis

#include "swoc/MemArena.h"

class MemArena

Reference documentation.

MemArena provides a memory arena or pool for allocating memory. Internally MemArena reserves memory in large blocks and allocates pieces of those blocks when memory is requested. Upon destruction all of the reserved memory is released which also destroys all of the allocated memory. This is useful when the goal is any (or all) of

  • amortizing allocation costs for many small allocations.

  • better memory locality for containers.

  • de-allocating memory in bulk.

Note that intrusive containers such as IntrusiveDList and IntrusiveHashMap work well with a MemArena. If the container and its elements are placed in the MemArena then no specific cleanup is needed beyond destroying the MemArena.

A MemArena can also be inverted. This means placing the MemArena instance in its own memory pool so the MemArena and associated objects can be created with a single base library memory allocation and cleaned up with a single delete.

Usage

When a MemArena instance is constructed no memory is reserved. A hint can be provided so that the first internal reservation of memory will have close to but at least that amount of free space available to be allocated.

In normal use memory is allocated from MemArena using MemArena::alloc() to get chunks of memory, or MemArena::make() to get constructed class instances. MemArena::make() takes an arbitrary set of arguments which it attempts to pass to a constructor for the type T after allocating memory (sizeof(T) bytes) for the object. If there isn’t enough free reserved memory, a new internal block is reserved. The size of the new reserved memory will be at least the size of the currently reserved memory, making each reservation larger than the last.
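
For instance, a minimal sketch of typical use might look like this (the 4096 hint and the Item type are purely illustrative):

#include "swoc/MemArena.h"

swoc::MemArena arena{4096}; // hint - the first internal reservation will provide at least this much.

auto raw = arena.alloc(512); // a 512 byte chunk of the reserved memory.

struct Item {
  int x;
  double y;
  Item(int xx, double yy) : x(xx), y(yy) {}
};
Item *item = arena.make<Item>(1, 2.0); // allocates sizeof(Item) and constructs in place.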

The arena can be frozen using MemArena::freeze() which locks down the currently reserved memory and forces an internal reservation of memory for the next allocation. By default this internal reservation will be the size of the frozen allocated memory. If this isn’t the best value, a hint can be provided to the MemArena::freeze() method to specify a different value, in the same manner as the hint to the constructor. When the arena is thawed (unfrozen) using MemArena::thaw() the frozen memory is released, which also destroys the frozen allocated memory. Doing this can be useful after a series of allocations, which can leave the allocated memory spread across different internal blocks, along with memory that is possibly no longer in use. The result is to coalesce (or garbage collect) all of the in-use memory in the arena into a single bulk internal reserved block. This improves memory efficiency and memory locality. This coalescence is done by

  1. Freezing the arena.

  2. Copying all objects back in to the arena.

  3. Thawing the arena.

Because the default reservation hint is large enough for all of the previously allocated memory, all of the copied objects will be put in the same new internal block. If for some reason this sizing isn’t correct, a hint can be passed to MemArena::freeze() to specify a different value (if, for instance, there is a lot of unused memory of known size). Generally this is most useful for data that is initialized on process start and not changed afterwards. Once that startup initialization is complete and all modifications have been done, the data can be coalesced for better performance. Alternatively, a container that allocates and de-allocates same sized objects (such as a std::map) can use a free list to re-use objects before going to the MemArena for more memory, thereby avoiding the accumulation of unused memory in the arena.
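
In code, the cycle looks roughly like this (a sketch - the live container and the re-copy step stand in for whatever the application uses to track its data):

#include <cstring>
#include <vector>
#include "swoc/MemArena.h"
#include "swoc/TextView.h"

swoc::MemArena arena;
std::vector<swoc::TextView> live; // views of strings currently stored in the arena.

// ... many allocations, some of which are no longer in use ...

arena.freeze(); // 1. lock the current blocks and force a fresh reservation.
for (auto &view : live) {
  // 2. copy each live string back into the (new) active block.
  auto span = arena.alloc(view.size()).rebind<char>();
  memcpy(span.data(), view.data(), view.size());
  view = swoc::TextView{span.data(), span.size()};
}
arena.thaw(); // 3. release the frozen blocks along with any dead data they held.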

Other than a freeze / thaw cycle, there is no mechanism to release memory except for the destruction of the MemArena. In such use cases either wasted memory must be small enough or temporary enough to not be an issue, or there must be a provision for some sort of garbage collection.

Generally MemArena is not as useful for classes that allocate their own internal memory (such as std::string or std::vector), which includes most container classes. Intrusive container classes, such as IntrusiveDList and IntrusiveHashMap, are more useful because the links are in the instance and therefore also in the arena.

Objects created in the arena must not have delete called on them as this will corrupt memory, usually leading to an immediate crash. The memory for the instance will be released when the arena is destroyed. The destructor can be called if needed but in general if a destructor is needed it is probably not a class that should be constructed in the arena. Looking at IntrusiveDList again for an example, if this is used to link objects in the arena, there is no need for a destructor to clean up the links - all of the objects will be de-allocated when the arena is destroyed. Whether this kind of situation can be arranged with reasonable effort is a good heuristic on whether MemArena is an appropriate choice.
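
For example (Thing stands in for whatever class was placed in the arena):

  Thing *thing = arena.make<Thing>();
  // delete thing;  // WRONG - the memory belongs to the arena and this will corrupt it.
  thing->~Thing();  // permitted, if the destructor really must run.
  // Otherwise, simply let the arena destructor release everything.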

While MemArena will normally allocate memory in successive chunks from an internal block, if the allocation request is large (more than a memory page) and there is not enough space in the current internal block, a block just for that allocation will be created. This is useful if the purpose of MemArena is to track blocks of memory more than reduce the number of system level allocations.

Generally the MemArena will be declared as a member variable of the class that needs the local memory. The class destructor will then clean up the memory when the containing class is destructed. An archetypical example is Lexicon, which uses a MemArena to store its data. IntrusiveHashMap instances are used to track the data, which is owned by the arena, which in turn is owned by the Lexicon instance. Destroying the Lexicon releases all of the memory. There is no explicit cleanup code in Lexicon precisely because the MemArena destructor does that.
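
The pattern is roughly this (a sketch only - StringTable is illustrative, not the actual Lexicon implementation):

#include <cstring>
#include "swoc/MemArena.h"
#include "swoc/TextView.h"

class StringTable {
public:
  // Copy a string into the arena and return a stable view of it.
  swoc::TextView localize(swoc::TextView src) {
    auto span = _arena.alloc(src.size()).rebind<char>();
    memcpy(span.data(), src.data(), src.size());
    return swoc::TextView{span.data(), span.size()};
  }
  // No explicit cleanup - destroying the instance destroys _arena and,
  // with it, every string handed out by localize().
private:
  swoc::MemArena _arena;
};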

Temporary Allocation

In some situations it is useful to allocate memory temporarily. While alloca can be used for this, that can be a bit risky because the stack is generally much smaller than the heap and may be heavily used. MemArena provides the MemArena::require() method as an alternative.

MemArena allocates memory in blocks and then hands out pieces of these blocks when it is asked for memory. Therefore it is usually the case that some amount of not currently used memory is available in the active block. Access to this memory can be obtained via MemArena::remnant(). This may be large enough for use as the needed temporary memory, in which case no allocation needs to be done. What MemArena::require() does is guarantee the remnant is at least the specified size. If it is already large enough, nothing is done, otherwise a new block is allocated such that it has enough free space. If this is done multiple times without allocating from the arena, the same memory space is reused. It is also guaranteed that if the next call to MemArena::alloc() or MemArena::make() does not need more than the remnant, it will be allocated from the remnant. This makes it possible to do speculative work in the arena and “commit” it (via allocation) after the work is successful, or abandon it if not.
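
A sketch of the commit pattern (do_speculative_work is a stand-in for whatever fills the remnant; 512 is an arbitrary size):

  // Guarantee at least 512 contiguous bytes without allocating them.
  auto span = arena.require(512).remnant().rebind<char>();
  size_t n = do_speculative_work(span); // writes into some prefix of the remnant.
  if (n > 0) {
    arena.alloc(n); // commit - the first n bytes are now a real allocation.
  }
  // If nothing is committed, the remnant is simply reused by later calls.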

Static Memory

MemArena has a constructor, MemArena::MemArena(MemSpan<void>), that enables using a pre-defined block of memory. This memory is never released or destructed by the MemArena. This is useful in local contexts where an arbitrary amount of memory may be needed but commonly less than a known amount will actually be used. In such a case the MemArena can be seeded with a block of memory (static or on the stack) of that size so that no allocation is done in the common case, and happens automatically in the few cases where more is needed, without special handling. Such an increase in performance is essentially the only reason to use this feature.

The static block must be large enough to hold the internal block headers and still have at least the minimum free space available. As of this writing this requires a buffer of at least 64 bytes.

If (roughly) two kilobytes would normally be enough, that could be done with

static constexpr size_t SIZE = 2048;
std::byte buffer[SIZE];
MemArena arena{{buffer, SIZE}};

All methods are available and functional in this case, including freezing and thawing. Any allocations while frozen will never be in the static block, as it won’t be recycled until the thaw. Generally this is not a good technique as a situation where freeze / thaw is useful is almost certainly not a situation where a static block is useful and vice versa.

Although such an arena could be inverted (and thereby placed in the static block) this is also very unlikely to be useful, as the stack space for the arena would still be required during construction.

The expectation is such an instance would be scoped locally to a function or method and destroyed upon return, being used only for temporary storage.

Examples

It is expected that MemArena will be used most commonly for string storage, where a container needs strings and can’t guarantee the lifetime of strings in arguments. A commonly used helper is a “localize” function that takes a string, allocates a copy in the arena, and returns a view of it. E.g.

TextView
localize(MemArena &arena, TextView view) {
  auto span = arena.alloc(view.size()).rebind<char>();
  memcpy(span, view);
  return span;
}

This can then be used to create a view inside the arena.

  MemArena arena;
  TextView text = localize(arena, transient); // "transient" is some string of unknown or short lifetime.
  REQUIRE(arena.contains(text.data()));

That’s about it for putting strings in the arena. Now, consider a class to be put in an arena.

struct Thing {
  using self_type = Thing; // convenience alias for the linkage.

  int n{0};
  std::string_view name;

  self_type *_next{nullptr};
  self_type *_prev{nullptr};

  Thing() {}
  Thing(int x, std::string_view s) : n(x), name(s) {}
  Thing(std::string_view const &s, int x) : n(x), name(s) {}

  // Linkage for intrusive containers - the links and the key live in the instance.
  struct Linkage : swoc::IntrusiveLinkage<self_type> {
    static std::string_view
    key_of(self_type *thing) {
      return thing->name;
    }
    static auto
    hash_of(std::string_view key) {
      return std::hash<std::string_view>()(key);
    }
    static bool
    equal(std::string_view const &lhs, std::string_view const &rhs) {
      return lhs == rhs;
    }
  };
};

A key point to note is that name is a view, not a std::string. This means it requires storage elsewhere, i.e. in the arena. The string text has already been localized, so instances can be created safely.

  Thing *thing = arena.make<Thing>(text, 956);
  REQUIRE(thing->name == text);
  REQUIRE(thing->n == 956);
  // Any of the constructors can be used; a literal is safe because it has process lifetime.
  Thing *other = arena.make<Thing>(956, "Another thing");

All of these are in the arena and are de-allocated when the arena is cleared or destructed. Note a literal constant can be safely used because it has a process life time. That’s a bit obscure - in actual use it’s unlikely the API will know this and will localize such strings. Also note MemArena::make() can use any of the constructors in Thing.
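
Since Thing carries its own links and key via Linkage, instances can also be collected in an intrusive container that itself lives in the arena. A sketch, assuming Linkage supplies everything IntrusiveHashMap requires (as above):

  using Map = swoc::IntrusiveHashMap<Thing::Linkage>;
  Map *ihm = arena.make<Map>(); // the map object is in the arena as well.
  ihm->insert(thing);           // the links and key come from the Thing itself.
  // ... use the map ...
  arena.clear(); // releases the map and every Thing in one step - no per element cleanup.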

Another case for string storage is formatted strings. Naturally BufferWriter will be used because it is just so awesome. The MemArena::remnant() method will also be used to do speculative formatted output. A problem is that at the time the format is provided, the final size is not known, and so (in general) two passes are needed, one to size and one to do the actual output. This can be optimized somewhat by using MemArena::remnant(). The concept is to do the sizing pass into the remnant space. If it fits, win - the string is there and no more needs to be done. Otherwise the size needed is now known and MemArena::require() is used to get the contiguous memory needed for the string. MemArena::alloc() is then used to allocate the already initialized memory.

  arena.alloc(arena.remaining() - 16); // leave only a small remnant, to force the overflow path.

  FixedBufferWriter w(arena.remnant());
  w.print("Much ado about not much text");
  if (w.error()) {
    FixedBufferWriter lw(arena.require(w.extent()).remnant());
    lw.print("Much ado about not much text");
  }
  TextView result = arena.alloc(w.extent()).rebind<char>(); // commit the formatted string.
  REQUIRE(arena.contains(result.data()));

The code is a bit awkward and repetitious due to being example code. Production code would look something like

template <typename... Args>
TextView
bw_localize(MemArena &arena, TextView const &fmt, Args &&...args) {
  FixedBufferWriter w(arena.remnant());
  auto arg_tuple{std::forward_as_tuple(args...)};
  w.print_v(fmt, arg_tuple);
  if (w.error()) {
    FixedBufferWriter(arena.require(w.extent()).remnant()).print_v(fmt, arg_tuple);
  }
  return arena.alloc(w.extent()).rebind<char>();
}

Because this is a function, the format string and arguments are no longer repeated at the point of use.

  auto tv1 = bw_localize(arena, "Text: {} - '{}'", 956, "Additional");
  REQUIRE(tv1 == "Text: 956 - 'Additional'");
  REQUIRE(arena.contains(tv1.data()));

Inversion

“Inversion” is the technique of putting the MemArena inside its own arena. For collections of objects which all have the same lifetime this makes cleanup simple, as only the MemArena needs to be destructed. In addition, it can minimize calls to the base allocator. If a good guess as to the total size needed can be made, all of the objects, including the arena, can be provided for in a single call to malloc. This is used in Errata for performance reasons. Here is a simplified example.

  MemArena ta;                                        // temporary, stack based, bootstrap arena.
  MemArena *arena = ta.make<MemArena>(std::move(ta)); // the arena now lives in its own memory.
  // ... allocate from *arena ...
  arena->~MemArena(); // clean up via the destructor - never delete.

Things of note:

  • A local stack variable is used to bootstrap the MemArena.

  • A new MemArena is created inside the local MemArena and the essence of the local arena is moved there.

  • This works even if data has already been allocated in the arena, but it’s best to do the inversion first.

  • The resulting pointer is cleaned up by calling the destructor, not by delete. This is because, as noted earlier, it is not valid to call delete on an object in the arena.

  • The destructor is carefully implemented to work correctly even if the class instance is in its own internal memory blocks.

In actual use, the pointer would be a class member, not another local variable, and the inversion delayed until the memory was actually needed so that if the arena isn’t needed, no allocation is done.

Although delete cannot be used on an inverted MemArena, it is still possible to use std::unique_ptr by overriding the cleanup.

Now for the actual example. This uses the localize function from the previous example.

 1    auto destroyer = [](MemArena *arena) -> void { arena->~MemArena(); };
 2    {
 3      MemArena ta;
 4      TextView local_tv = localize(ta, tv);
 5      REQUIRE(local_tv == tv);
 6      REQUIRE(ta.contains(local_tv.data()));
 7
 8      // 16 bytes.
 9      std::unique_ptr<MemArena, void (*)(MemArena *)> arena(ta.make<MemArena>(std::move(ta)), destroyer);
10
11      REQUIRE(ta.size() == 0);
12      REQUIRE(ta.contains(local_tv.data()) == false);
13
14      REQUIRE(arena->size() >= local_tv.size());
15      REQUIRE(local_tv == tv);
16      REQUIRE(arena->contains(local_tv.data()));
17
18      TextView local_text = localize(*arena, text);
19      REQUIRE(local_text == text);
20      REQUIRE(local_tv != local_text);
21      REQUIRE(local_tv.data() != local_text.data());
22      REQUIRE(arena->contains(local_text.data()));
23      REQUIRE(arena->size() >= local_tv.size() + local_text.size());
24    }

In this code, ta is the local bootstrap temporary. The std::unique_ptr is declared to take as a “deleter” a function taking a MemArena *. The actual function is the lambda in destroyer declared in line 1. This could have been done directly inline, such as

  std::unique_ptr<MemArena, void (*)(MemArena *)> arena(ta.make<MemArena>(std::move(ta)),
                                                         [](MemArena *arena) -> void { arena->~MemArena(); });

but I think separating it is better looking code.

The checking code shows that memory can be allocated in the arena first, but is not in the arena after the inversion. It is, however, in the inverted arena at the same address, as demonstrated by line 16. All of the allocated memory involved gets cleaned up when the std::unique_ptr is destructed.

Inversion is a bit tricky to understand but that’s all there is to writing code to use it, although setting up the std::unique_ptr can be done in a variety of ways. Although these are functionally identical, the memory footprint varies as noted in the comments. This can make a difference for objects that are intended to be very small if no memory is allocated.

Two styles of lambda use have been shown, but there is one more of interest.

  {
    MemArena ta;
    // 8 bytes.
    std::unique_ptr<MemArena, decltype(destroyer)> arena(ta.make<MemArena>(std::move(ta)), destroyer);
  }

This requires the separate declaration of the lambda but cuts the std::unique_ptr size in half. Note every lambda is a distinct type, even if the code is character for character identical, so destroyer itself must be passed - no other object will have the correct type. Even so, it must be passed as an argument; it cannot be omitted. But, hey, it saves 8 bytes!

Instead of a lambda, one can define a function

void
destroy_arena(MemArena *arena) { // the function name here is illustrative.
  arena->~MemArena();
}

TEST_CASE("MemArena inversion", "[libswoc][MemArena][example][inversion]") {
  TextView tv{"You done messed up A-A-Ron"};

and use that

  {
    MemArena ta;
    // 16 bytes.
    std::unique_ptr<MemArena, void (*)(MemArena *)> arena(ta.make<MemArena>(std::move(ta)), &destroy_arena);
  }

Finally one can define a functor

template <typename T> struct Destructor { // the struct name is illustrative.
  void
  operator()(T *t) {
    t->~T();
  }
};


and use that

  {
    MemArena ta;
    // 8 bytes - no deleter argument needed.
    std::unique_ptr<MemArena, Destructor<MemArena>> arena(ta.make<MemArena>(std::move(ta)));
  }

Note in this case the std::unique_ptr is only 8 bytes (a single pointer) and doesn’t require an argument to the constructor.

Internals

Allocated memory is tracked by two linked lists, one for current memory and the other for frozen memory. The latter is used only while the arena is frozen. The list of blocks is maintained by IntrusiveDList which means deallocating the memory blocks suffices to clean up. Memory blocks are kept in an active pool so that if a new block is allocated to satisfy a particular allocation or requirement, unused space in previous blocks is still available for use.