### Magic Buddy

A buddy allocator with zero metadata overhead.

#### Benefits over buddy allocators that store a free-or-not bit on the block:

1. When you request an allocation of 2^k bytes, it allocates only 2^k bytes, not 2^{k+1} to make room for the extra free-or-not bit.

#### Benefits over buddy allocators that store a bitset:

1. You save at least a few MB of metadata, or more if you're managing a huge amount of space. This can be pretty important if, e.g., you're doing a per-process buddy allocator over the entire virtual memory space in an embedded operating system context.
2. Comparably simple implementation; only ~250 LOC, and there is no need to track a metadata region.

Benchmarking magic_buddy as the Lua interpreter's allocator against another single-file buddy allocator (https://spaskalev.github.io/buddy_alloc/) shows magic_buddy giving ~4x better end-to-end performance on my machine for some benchmarks, even though we have roughly a tenth as much code (see lua_benchmark/ and run "bash benchmark.sh tests/Lua-Benchmarks/binary-trees.lua"). The reason seems to be that Spaskalev's buddy_alloc always walks the tree, whereas we keep free lists. So if your program makes frequent allocations of similar sizes, we can do much better.

#### Features of the allocator shared with other buddy allocators include:

1. All operations are constant time or linear only in the number of address bits.
2. The library is small, simple, and convenient to use. It even supports unaligned base addresses.
3. Support for realloc and for reserving subregions (e.g., device MMIO addresses).
4. Core functionality is reasonably well-tested with an exhaustive implementation model checker.

#### Caveats are:

1. The minimum allocation size is nontrivial: approximately 64 bytes.
2. Your application must keep track of allocation sizes and provide them when liberating a region.
3. There is a minuscule-but-nonzero chance that the buddy allocator gets corrupted; the probability is configurable by the user. By default, there is a 1/2^128 chance of corruption per program operation.
4. Some advanced allocator features, e.g., iterating over allocated regions or attempting to dynamically improve cache locality, are not supported.

#### What should you use instead?

1. Zach Yedidia suggests a K&R-style free list using two interleaved AVL trees instead of a linked list. This provides essentially the same worst-case time and space guarantees as our approach, without the 2^-128 chance of failure.

#### How it works

Traditional optimized buddy allocators (see Knuth Vol. 1) require at least 1/2 extra bit of metadata per allocation region. But this means (as Akshay Srivatsan pointed out to me recently) that an allocation of size 2^k might cause an actual allocation of size 2^k + 1/8, rounding up to 2^{k+1} because buddy allocators can only allocate power-of-two byte amounts. This is surprisingly wasteful.

Instead, this repo has a buddy allocator that requires no extra metadata overhead. It stores the "freed-or-not bit" implicitly: a block is considered freed iff its first 128 bits are some magic cookie value determined randomly and kept secret from the application. The only sources of memory overhead are:

1. Fragmentation from allocations being forced to be powers of 2.
2. Fragmentation from a (rather large) minimum allowed allocation size of ~64 bytes.

But it retains the nice property of buddy allocators that liberation is linear in the computer's address size.
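To make the scheme above concrete, here is a rough sketch of what the implicit free check can look like. This is not the library's actual code: the struct layouts, function names, and the idea of keeping the block order and free-list links inside free blocks are assumptions for illustration; only the 128-bit cookie comparison and the standard buddy-address computation follow directly from the description above.

```c
/* Illustrative sketch only -- not magic_buddy's real internals.
 * Shows the core idea: a block counts as free iff it begins with a
 * secret 128-bit cookie, so no external metadata is needed. */
#include <stdint.h>
#include <string.h>

#define COOKIE_BYTES 16                  /* 128-bit magic value */

/* Hypothetical header embedded at the start of every *free* block.
 * Allocated blocks carry no header at all, which is the whole point. */
struct free_header {
    uint8_t cookie[COOKIE_BYTES];        /* equals the secret => "free" */
    uint8_t order;                       /* log2 of the block size */
    struct free_header *next, *prev;     /* per-order free-list links */
};

struct allocator {
    uint8_t cookie[COOKIE_BYTES];        /* secret random value from init */
    uint8_t *base;                       /* start of the managed pool */
    struct free_header *free_lists[64];  /* one list per block order */
};

/* The implicit "freed-or-not bit": compare the first 128 bits of the
 * block against the secret cookie.  An allocated block only looks free
 * if user data happens to collide with the cookie (probability 2^-128). */
static int looks_free(const struct allocator *a, const uint8_t *block) {
    return memcmp(block, a->cookie, COOKIE_BYTES) == 0;
}

/* Buddy address: flip the bit of the pool-relative offset that
 * corresponds to the block size. */
static uint8_t *buddy_of(const struct allocator *a, uint8_t *block,
                         uint64_t size) {
    return a->base + (((uint64_t)(block - a->base)) ^ size);
}

/* Liberation then works as in a textbook buddy allocator: stamp the
 * cookie into the freed block and coalesce with its buddy whenever
 * looks_free() says the buddy is free -- no stored bit or bitset. */
```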
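For a feel for the call sequence, here is a hypothetical usage sketch. The operation names (init_buddy, allocate, liberate) come from the test-harness list in the Model Checking section below, but the prototypes, the pool, and the cookie handling shown here are guesses; the real usage example lives in main.c and is described under Usage below.

```c
/* Hypothetical usage sketch -- the prototypes are assumptions, not the
 * library's actual API.  See main.c in this repo for the real example. */
#include <stddef.h>
#include <stdint.h>

/* Assumed declarations, mirroring the operations listed under
 * "Model Checking" (init_buddy, allocate, liberate, ...). */
void  init_buddy(void *base, size_t size, const uint8_t cookie[16]);
void *allocate(size_t size);
void  liberate(void *ptr, size_t size);   /* caller supplies the size */

static uint8_t pool[1 << 20];             /* 1 MiB pool to manage */

int main(void) {
    /* "Very random" secret data; in practice this should come from a
     * real entropy source so the application cannot guess it. */
    uint8_t cookie[16] = {0};

    /* Hand the allocator the base + size of the pool and the secret. */
    init_buddy(pool, sizeof pool, cookie);

    void *p = allocate(4096);             /* rounded to a power of two */
    /* ... use p ... */
    liberate(p, 4096);                    /* the size must be passed back */
    return 0;
}
```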
### Model Checking

The imc/ directory includes code to "exhaustively fuzz" the allocator. The basic approach is based on the EXPLODE paper by Yang, Sar, and Engler.

Current status of the test harness(es):

- [X] init_buddy
- [X] allocate
- [X] liberate
- [X] reallocate
- [X] move_buddy
- [X] reserve
- [X] grow_buddy
- [X] shrink_buddy

### Usage

Usage example in main.c.

First, initialize the library by giving it both the base + size for the allocation pool to be managed, along with some "very random" (or otherwise secret) data. Then, you can allocate and liberate as normal.

### License

AGPLv3