README


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113

### Magic Buddy

A buddy allocator with zero metadata overhead.

#### Benefits over buddy allocators that store a free-or-not bit on the block:

    1. When you request an allocation of 2^k, it only allocates 2^k space, not
       2^{k+1} to account for the extra free-or-not-bit.

#### Benefits over buddy allocators that store a bitset:

    1. You save at least a few MB in metadata, or more if you're allocating
       huge amounts of space. This can be pretty important if, e.g., you're
       doing a per-process buddy allocator over the entire virtual memory
       space in an embedded operating system context.

    2. Comparably simple implementation; only ~250 loc, and there is no need to
       track a metadata region.

Benchmarking magic_buddy allocator for the Lua interpreter against another
single-file buddy allocator (https://spaskalev.github.io/buddy_alloc/) shows
magic_buddy giving ~4x better end-to-end performance on my machine for some
benchmarks, even though we have about 10x less code (see lua_benchmark/ and
run "bash benchmark.sh tests/Lua-Benchmarks/binary-trees.lua").

The reason for this seems to be that Spaskalev's buddy_alloc always walks the
tree, whereas we keep free lists. So if your program involves frequent
allocations of similar sizes, we can do much better.

#### Features of the allocator shared with other buddy allocators include:

    1. All operations are constant time/linear only in the number of address
       bits.

    2. The library is small, simple, and convenient to use. It even supports
       unaligned base addresses.

    3. Support for realloc and reserving subregions (e.g., device MMIO
       addresses).

    4. Core functionality is reasonably well-tested with an exhaustive
       implementation model checker.

#### Caveats are:

    1. The minimum allocation size is nontrivial, approximately 64 bytes

    2. Your application must keep track of allocation sizes, and provide them
       when liberating a region.

    3. There is a miniscule-but-nonzero chance the buddy allocator gets
       corrupted; the probability is configurable by the user. The default has
       a 1/2^128-chance of getting corrupted per program operation.

    4. Some advanced allocator features, e.g., iterating over allocated regions
       or attempting to dynamically improve cache locality, are not supported.

#### What should you use instead?

    1. Zach Yedidia suggests a K&R style free list using two interleaved AVL
       trees instead of a linked list. This provides essentially the same
       worst-case time and space guarantees of our approach, without the
       2^-128 chance of failure.

Traditional optimized buddy allocators (see Knuth Vol 1) require at least 1/2
extra bit per allocation region.

But this means (as Akshay Srivatsan pointed out to me recently) that an
allocation of size 2^k might cause an actual allocation of size 2^k + 1/8,
rounding up to 2^{k+1} because buddy allocators can only allocate in
powers-of-two byte amounts. This is surprisingly wasteful.

Instead, this repo has a buddy allocator that requires no extra metadata
overhead. It stores the "freed-or-not bit" implicitly: a block is considered
freed iff its first 128 bits is some magic cookie value determined randomly and
kept secret from the application.

The only sources of memory overhead are:

    1. Fragmentation from allocations being forced to be powers of 2.

    2. Fragmentation from a (rather large) minimum allowed allocation size of
       ~64 bytes.

But it retains the nice property of buddy allocators that liberation is linear
in the computer's address size.

### Model Checking
The imc/ directory includes code to "exhaustively fuzz" the allocator. The
basic approach is based on the EXPLODE paper by Yang, Sar, and Engler.

Current status of the test harness(es):

    - [X] init_buddy
    - [X] allocate
    - [X] liberate
    - [X] reallocate
    - [X] move_buddy
    - [X] reserve
    - [X] grow_buddy
    - [X] shrink_buddy

### Usage

Usage example in main.c.

First, initialize the library by giving it both the base + size for the
allocation pool to be managed, along with some "very random" (or otherwise
secret) data. Then, you can allocate and liberate as normal.

### License

AGPLv3