#ifndef HEADER_fd_src_util_alloc_fd_alloc_h
#define HEADER_fd_src_util_alloc_fd_alloc_h

/* fd_alloc is a high performance lockfree fast O(1) (typically)
   allocator.

   It is optimized for high concurrency use and small-ish clustered /
   multi-modal distributed allocation sizes.  It is further optimized
   for single-threaded use cases and/or when malloc-free pairs have
   good thread affinity (i.e. frees done by the same thread that did the
   corresponding malloc).  It can also be used optimally in more complex
   threading use cases (e.g. malloc in one or more producer threads,
   free in one or more consumer threads).  It behaves well with
   irregular sizes and exploits ultra fine grained alignment for good
   packing (e.g. reasonably low memory overhead packing of byte strings
   with irregular small-ish sizes).

   A fd_alloc stores its state in a wksp in a persistent way and backs
   its allocations by that same wksp.  This avoids many of the severe
   performance and reliability issues of malloc.

   Critically, it _doesn't_ _lie_ and it _doesn't_ _blow_ _up_.

   fd_alloc_malloc will not stall your program behind your back, calling
   the OS to grow or shrink the program's memory footprint during the
   call; it will never use more memory than has already been procured
   for the underlying wksp.  And, if fd_alloc_malloc succeeds, the
   returned memory is real and is ready for use.

   Obligatory dynamic allocation rant *********************************

   That is, fd_alloc is not the absolute unforgivable garbage of
   Linux/libc malloc.  malloc often just reserves page table entries and
   returns, irrespective of whether or not the request can be satisfied
   (on the apparent belief that the malloc call was a bluff and the user
   is probably a bad dev who doesn't bother with error trapping anyway),
   in hopes that a later glacially slow page fault to the OS will
   actually reserve the memory.

   Which, even when it does work, will by its very nature be at
   the worst possible times (e.g. in the middle of incoming line rate
   network traffic bursts ... data structures try to grow to accommodate
   but slowing down throughput faster than they are growing at a time
   when keeping up is critical to surviving ... and then on a
   ridiculously awful normal page by normal page basis), exposing the
   caller to non-deterministic performance and reduced throughput.

   Unfortunately, getting overrun by DoS-like traffic patterns is the
   least of the worries.  When Linux can't back one of the pages with
   DRAM on a page fault (skipping over some additional TLB and NUMA
   dubiousness that goes on under the hood), it goes from glacial
   performance to continental drift levels of performance.  It will try
   to honor the request by shuffling things to swap, exacerbating the
   above.  Suddenly it is a feat to even keep up with a 1980s modem.

   But that's not the end of the horror.  Because Linux thinks it cool
   to overcommit beyond physical limits for no discernible reason and
   gets flaky if you try to disable swap and/or overcommit, the page
   fault might not be able to honor the commitment.  Finding itself
   caught in a lie (it can't go back in time and rescind the success
   that malloc already returned to the unsuspecting developer), the
   Linux kernel goes full HAL-9000 and starts randomly killing things.
   A dead process can't complain about malloc lying to it after all.
   And, cherry on top, the victims of the oom killer are frequently not
   even the culprits.

   Sigh ... all completely unacceptable behaviors in any situation, much
   less mission critical ones.

   TL;DR Friends don't let friends malloc.

   If you truly need malloc-free semantics, use fd_alloc.  This at least
   eliminates the most egregious horrors above.  It can't help the
   intrinsic horrors though.

   (Though it is ingrained in CS teaching and languages to the extent
   there's rarely even recognition of the faintest possibility of the
   existence of alternatives, people rarely truly need malloc/free
   semantics.  But, after they convince themselves they still do because
   of the brainwashing, they need to remind themselves that computers
   don't work remotely like malloc/free suggest and then should try to
   think about resource acquisition more fundamentally.  And, after they
   still manage to talk themselves back into needing it because of the
   teaching and linguistic traps, repeat ... at least if they want to
   make something fast and robust.  Even if they can prove dynamic
   allocation requests have an attainable worst level at all points in
   time, they still have to prove that heap fragmentation over time will
   never cause malloc to fail.  Good luck with that.)

   The above rant applies to any paired dynamic memory strategies,
   including non-placement new, implicit copy constructors, dynamic
   resizing containers, etc.  Real world computers aren't just funky
   implementations of infinite tape Turing machines.  This make-believe
   that they are, in code that interacts with the real world, is a
   recipe for real world disaster.

   End of obligatory dynamic allocation rant **************************

   Since it is backed by a wksp, allocations have the same NUMA, TLB,
   IPC and persistence properties of the underlying wksp.  This allows
   fd_alloc to go far beyond the capabilities of a typical allocator.
   Allocations done by fd_alloc can be shared between processes (one can
   even malloc in one process, translate the pointer into the address
   space of another process, and free it there, even after the first
   process has terminated), a process can be stopped and then other
   processes can still find the stopped process's allocations and use
   them / free them / etc.

   Regarding time efficiency and concurrency, large allocations are
   passed through to the underlying wksp allocator (which is neither
   O(1) nor lockfree, though it is "quasi"-lockfree in the sense
   described in fd_wksp.h).  But the allocation strategies used under
   the hood (loosely inspired by Hoard-style lockfree allocators but
   with a lot of optimizations and tweaks for the above) are such that,
   in the common case of not needing to fall back to the underlying wksp
   allocator, the allocator is lockfree O(1).

   Regarding spatial efficiency, it is reasonably space efficient
   (overhead for a cstr-style allocation is ~4 bytes) and adapts over
   time to try to bound the amount of pre-allocation for small requests. */

#include "../wksp/fd_wksp.h"

/* FD_ALLOC_{ALIGN,FOOTPRINT} give the required alignment and footprint
   needed for a wksp allocation to be suitable as a fd_alloc.  ALIGN is
   an integer power of 2 and FOOTPRINT is an integer multiple of
   ALIGN.  These are provided to facilitate compile time declarations.
   4096 for ALIGN was picked to be normal-page like and match
   the minimum alignment of a fd_wksp_alloc. */

#define FD_ALLOC_ALIGN     (4096UL)
#define FD_ALLOC_FOOTPRINT (20480UL)

/* FD_ALLOC_MALLOC_ALIGN_DEFAULT gives the alignment that will be used
   when the user does not specify an alignment.  This will be an integer
   power of 2 of at least 16 for C/C++ allocator alignment conformance.
   (16 instead of 8 on the grounds that 128-bit is a primitive type on
   platforms with FD_HAS_INT128.) */

#define FD_ALLOC_MALLOC_ALIGN_DEFAULT (16UL)

/* FD_ALLOC_JOIN_CGROUP_HINT_MAX is the maximum value for a cgroup hint.
   This is an integer power of 2 minus 1 of at most FD_ALLOC_ALIGN. */

#define FD_ALLOC_JOIN_CGROUP_HINT_MAX (15UL)

/* A "fd_alloc_t *" is an opaque handle of an fd_alloc. */

struct fd_alloc;
typedef struct fd_alloc fd_alloc_t;

FD_PROTOTYPES_BEGIN

/* fd_alloc_{align,footprint} return FD_ALLOC_{ALIGN,FOOTPRINT}. */

FD_FN_CONST ulong
fd_alloc_align( void );

FD_FN_CONST ulong
fd_alloc_footprint( void );

/* fd_alloc_new formats an unused wksp allocation with the appropriate
   alignment and footprint as a fd_alloc.  Caller is not joined on
   return.  Returns shmem on success and NULL on failure (shmem NULL,
   shmem misaligned, shmem is not backed by a wksp ... logs details).  A
   workspace can have multiple fd_alloc created for it.  They will
   dynamically share the underlying workspace along with any other
   non-fd_alloc usage but will otherwise act as completely separate
   non-conflicting arenas (useful for logical grouping and improved
   concurrency).  To help with various diagnostics, garbage collection
   and what not, all allocations to the underlying wksp are tagged with
   the given tag, which should be positive.  Ideally, the tag used here
   should be distinct from all other tags used by this workspace. */

void *
fd_alloc_new( void * shmem,
              ulong  tag );

/* fd_alloc_join joins the caller to a fd_alloc.  shalloc points to the
   first byte of the memory region backing the alloc in the caller's
   address space.  Returns an opaque handle of the join on success
   (IMPORTANT! THIS IS NOT JUST A CAST OF SHALLOC) and NULL on failure
   (NULL shalloc, misaligned shalloc, bad magic, ... logs details).
   Every successful join should have a matching leave.  The lifetime of
   the join is until the matching leave or the thread group is
   terminated (joins are local to a thread group).

   cgroup_hint is a concurrency hint used to optimize parallel and
   persistent use cases.  Ideally each thread (regardless of thread
   group) should join the allocator with a different cgroup_hint system
   wide (note that joins are practically free).  And if using a fd_alloc
   in a persistent way, logical streams of execution would ideally
   preserve the cgroup_hint across starts and stops of that stream for
   the most optimal affinity behaviors.  0 is fine in single threaded
   use cases and 0 and/or collisions are fine in more general cases
   though concurrent performance might be reduced due to additional
   contention between threads that share the same cgroup_hint.  If
   cgroup_hint is not in [0,FD_ALLOC_JOIN_CGROUP_HINT_MAX], it will be
   wrapped to be in that range.

   TL;DR A cgroup_hint of 0 is often a practical choice single threaded.
   A cgroup_hint of fd_tile_idx() or just a uniform random 64-bit value
   is a practical choice in more general situations. */

fd_alloc_t *
fd_alloc_join( void * shalloc,
               ulong  cgroup_hint );

/* fd_alloc_leave leaves an existing join.  Returns the underlying
   shalloc (IMPORTANT! THIS IS NOT A SIMPLE CAST OF JOIN) on success and
   NULL on failure.  Reasons for failure include join is NULL (logs
   details). */

void *
fd_alloc_leave( fd_alloc_t * join );

/* fd_alloc_delete unformats a wksp allocation used as a fd_alloc.
   Assumes nobody is or will be joined to the fd_alloc.  The caller
   further promises there are no allocations outstanding.  If there are
   still some outstanding allocations, it will try to clean up as many
   as it can find but it is not guaranteed to find all of them (those
   will continue to consume wksp space but could theoretically be
   cleaned up in an application specific way by operating directly on
   the underlying workspace ... of course, if the application could do
   that, it should probably just clean up after itself before calling
   delete).  Returns shalloc on success and NULL on failure (logs
   details).  Reasons for failure include shalloc is NULL, misaligned
   shalloc, bad magic, etc. */

void *
fd_alloc_delete( void * shalloc );
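
/* For example, a minimal lifecycle sketch (hypothetical names: assumes
   wksp is a current local join to the backing workspace and my_tag is a
   positive, ideally wksp-unique, tag chosen by the caller; error
   trapping omitted for brevity):

     void *       shalloc = fd_wksp_alloc_laddr( wksp, fd_alloc_align(), fd_alloc_footprint(), my_tag );
     fd_alloc_t * alloc   = fd_alloc_join( fd_alloc_new( shalloc, my_tag ), 0UL );

     ... use alloc here ...

     fd_wksp_free_laddr( fd_alloc_delete( fd_alloc_leave( alloc ) ) ); */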

/* fd_alloc_join_cgroup_hint returns the cgroup_hint of the current
   join.  Assumes join is a current local join.  The return will be in
   [0,FD_ALLOC_JOIN_CGROUP_HINT_MAX].

   fd_alloc_join_cgroup_hint_set returns join with the cgroup_hint
   updated to the provided cgroup_hint.  If cgroup_hint is not in
   [0,FD_ALLOC_JOIN_CGROUP_HINT_MAX], it will be wrapped into this
   range.  Assumes join is a current local join.  The return value is
   not a new join. */

FD_FN_CONST static inline ulong
fd_alloc_join_cgroup_hint( fd_alloc_t * join ) {
  return ((ulong)join) & FD_ALLOC_JOIN_CGROUP_HINT_MAX;
}

FD_FN_CONST static inline fd_alloc_t *
fd_alloc_join_cgroup_hint_set( fd_alloc_t * join,
                               ulong        cgroup_hint ) {
  return (fd_alloc_t *)((((ulong)join) & (~FD_ALLOC_JOIN_CGROUP_HINT_MAX)) | (cgroup_hint & FD_ALLOC_JOIN_CGROUP_HINT_MAX));
}
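
/* For example, a minimal sketch (assuming fd_tile_idx() from fd_util
   gives the caller's tile index): a thread handed a join created
   elsewhere can rebind it to its own concurrency group before doing
   mallocs / frees:

     fd_alloc_t * my_join = fd_alloc_join_cgroup_hint_set( join, fd_tile_idx() ); */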

/* fd_alloc_wksp returns a pointer to a local wksp join of the wksp
   backing the fd_alloc with the current local join.  Caller should not
   call fd_wksp_leave on the returned value.  Lifetime of the returned
   wksp handle is as long as the shalloc used on the fd_alloc_join is
   still mapped into the caller's address space.

   fd_alloc_tag returns the tag that will be used for allocations from
   this workspace. */

FD_FN_PURE fd_wksp_t * fd_alloc_wksp( fd_alloc_t * join ); // NULL indicates NULL join
FD_FN_PURE ulong       fd_alloc_tag ( fd_alloc_t * join ); // Positive, 0 indicates NULL join

/* fd_alloc_malloc_at_least allocates at least sz bytes with alignment
   of at least align from the wksp backing the fd_alloc.  join is a
   current local join to the fd_alloc.  align should be an integer power
   of 2 or 0.

   An align of 0 indicates to use FD_ALLOC_MALLOC_ALIGN_DEFAULT for the
   request alignment.  This will be large enough such that
   fd_alloc_malloc is conformant with C/C++ alignment specifications
   (i.e. can trivially wrap fd_alloc_malloc to use as a drop in
   replacement for malloc).

   Small values of align will NOT be rounded up to some minimum (e.g.
   allocating lots of 1 byte aligned short strings is fine and
   relatively space and time efficient ... the overhead is ~4 bytes per
   allocation).  fd_alloc is not particularly optimized when align>~sz
   and/or for large alignments (>~4096B).  While large values for align
   are supported by fd_alloc_malloc, directly using fd_wksp_alloc is
   recommended in such cases.

   If an allocation is "large" (align + sz >~ 64KiB for the current
   implementation), it will be handled by fd_wksp_alloc under the hood.
   Otherwise, it will be handled by fd_alloc_malloc algorithms (which
   are ultimately backed by fd_wksp_alloc).  As such, if a small
   allocation is "new" (e.g. first allocation of a size around sz, an
   allocation that can't be packed near other existing allocations
   around that sz, etc), this might also fall back on fd_wksp_alloc.
   Typically though, after initial allocation and/or program warmup,
   fd_alloc_malloc calls will be reasonably fast lockfree O(1).

   Returns a pointer to the allocation in the local address space on
   success.  Note that this pointer will be a wksp laddr.  As such, it
   can be converted to a gaddr, passed to other threads in other thread
   groups, and converted to a wksp laddr in their address spaces, freed
   via a join to the fd_alloc in that thread group, persisted beyond the
   lifetime of the calling thread, etc.

   Returns NULL on failure (silent to support HPC usage) or when sz is
   0.  Reasons for failure include NULL join, invalid align, sz overflow
   (sz+align>~2^64), no memory available for request (e.g. workspace has
   insufficient room or is too fragmented).

   On return, *max will contain the actual number of bytes available at
   the returned location.  On success, this will be at least sz and it
   is not guaranteed to be a multiple of align.  On failure, *max will
   be zero.

   fd_alloc_malloc is a simple wrapper around fd_alloc_malloc_at_least
   for use when applications do not care about the actual size of their
   allocation. */

void *
fd_alloc_malloc_at_least( fd_alloc_t * join,
                          ulong        align,
                          ulong        sz,
                          ulong *      max );

static inline void *
fd_alloc_malloc( fd_alloc_t * join,
                 ulong        align,
                 ulong        sz ) {
  ulong max[1];
  return fd_alloc_malloc_at_least( join, align, sz, max );
}
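
/* For example, a minimal sketch of an allocation that can exploit any
   allocator slop (hypothetical names; assumes alloc is a current local
   join; error trapping omitted):

     ulong  buf_max;
     char * buf = (char *)fd_alloc_malloc_at_least( alloc, 1UL, 1000UL, &buf_max );

     ... a non-NULL buf here has buf_max (>=1000) usable bytes ...

     fd_alloc_free( alloc, buf ); */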

/* fd_alloc_free frees the outstanding allocation whose first byte is
   pointed to by laddr in the caller's local address space.  join is a
   current local join to the fd_alloc.  The caller promises laddr was
   allocated by the underlying fd_alloc (but not necessarily on the
   calling thread or even in this calling process or even by a thread /
   process that is still running).  Silent for HPC usage (NULL join and
   NULL laddr are a no-op).

   Like fd_alloc_malloc, if the allocation was large, this will be
   handled by fd_wksp_free under the hood, which is neither lockfree nor
   O(1).  If the allocation was small, this will typically be lockfree
   O(1).  It is possible that, if the amount of outstanding small
   allocations has reduced significantly, fd_alloc_free on a small
   allocation might trigger a fd_wksp_free to free up wksp space for
   other usage (including uses not through this fd_alloc).

   (It would be possible to implement this less efficiently in space and
   time such that join didn't need to be passed.  The current design has
   picked efficiency and consistency with other APIs though.)

   Note that this will implicitly optimize the freed memory to be
   preferentially reused by the join's concurrency group.  Thus the
   caller should have at least one join for each concurrency group to
   which it might want to return memory for reuse and then call free
   with the appropriate join. */

void
fd_alloc_free( fd_alloc_t * join,
               void *       laddr );
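
/* For example, a minimal sketch of handing an allocation to another
   thread group (hypothetical names: msg_t is some shared message type
   and wksp is fd_alloc_wksp( alloc ); error trapping omitted).  In the
   producing process:

     void * laddr = fd_alloc_malloc( alloc, alignof(msg_t), sizeof(msg_t) );
     ulong  gaddr = fd_wksp_gaddr( wksp, laddr );
     ... publish gaddr to the consumer ...

   And in the consuming process (with its own joins to the same wksp and
   fd_alloc):

     msg_t * msg = (msg_t *)fd_wksp_laddr( wksp, gaddr );
     ... use msg ...
     fd_alloc_free( alloc, msg ); */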

/* fd_alloc_compact frees all wksp allocations that are not required
   for any outstanding user mallocs (note that fd_alloc_free lazily
   returns unused memory from the underlying wksp to accelerate
   potential future allocations).  join is a current local join to the
   alloc.  This cannot fail from a user's POV but logs any wonkiness
   detected.

   fd_alloc_compact has the property that it minimizes the amount of
   wksp utilization for the set of outstanding user mallocs when there
   is no other concurrent alloc usage.  As such, if there is no
   concurrent alloc usage _and_ there are no outstanding mallocs, on
   return, all wksp allocations (except the user provided memory region
   that holds the state of the allocator) will be returned to the wksp.
   This can then be used to reset the alloc and/or implement robust
   leak detection at program teardown.

   This function is safe to use even when there is other concurrent
   alloc usage.  It is best effort in that case; it is not guaranteed
   that there was some point in time between call and return when the
   wksp utilization was minimized for the contemporaneous set of
   outstanding user mallocs.

   Also note that this function is not O(1) and the fd_alloc_free lazy
   return mechanism does not permit unbounded growth of unreturned free
   memory.  So this should be used sparingly at best (e.g. in teardown
   leak detection or rare non-critical path housekeeping). */

void
fd_alloc_compact( fd_alloc_t * join );

/* fd_alloc_is_empty returns 1 if the alloc has no outstanding mallocs
   and 0 otherwise.  join is a current local join to the alloc.  NULL
   join silently returns 0.

   Important safety tip!  This should only be run when there is no
   concurrent alloc usage.  It is not algorithmically fast.  This might
   temporarily lock the underlying wksp while running and might call
   fd_alloc_compact under the hood.  It assumes the user provided memory
   region holding the alloc state is contained within a region returned
   by a single fd_wksp_alloc call (it would be hard to create an alloc
   where that isn't the case).  It assumes alloc is the only user of the
   alloc's tag in the wksp.  As such this should be used carefully and
   sparingly (e.g. at program teardown for leak detection).

   It will "work" with concurrent alloc usage in that the return value
   will be 0 or 1 and it will not corrupt the alloc or underlying
   wksp.  But the return value will not be well-defined (e.g. it is not
   guaranteed to correspond to the state of the alloc at some point in
   time between when this was called and when it returned). */

int
fd_alloc_is_empty( fd_alloc_t * join );
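
/* For example, a minimal teardown-time leak check sketch (assumes no
   concurrent alloc usage at this point and the usual fd_util logging
   macros):

     fd_alloc_compact( alloc );
     if( FD_UNLIKELY( !fd_alloc_is_empty( alloc ) ) )
       FD_LOG_WARNING(( "outstanding allocations detected at teardown" ));
     fd_wksp_free_laddr( fd_alloc_delete( fd_alloc_leave( alloc ) ) ); */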

/* fd_alloc_max_expand computes a recommended value to use for max when
   needing to dynamically resize structures.  The below is very subtle
   and fixes a lot of pervasive errors with dynamic resizing
   implementations (whether explicit or implicitly done under the hood).
   It doesn't fix the main error with dynamic resizing though.  The main
   error being deciding to use anything with dynamic resizing (outside
   of, maybe, initialization at program start).

   Consider an all too common case of an initially too small dynamically
   sized array that is getting elements appended to it one at a time.
   E.g. without proper error trapping, overflow handling and the like:

     foo_t * foo       = NULL;
     ulong   foo_max   = 0UL;
     ulong   foo_cnt   = 0UL;
     ulong   foo_delta = ... some reasonable increment ...;

     while( ... still appending ... ) {

       if( foo_cnt==foo_max ) { // Need to resize
         foo_max += foo_delta;
         foo = (foo_t *)realloc( foo, foo_max*sizeof(foo_t) );
       }

       foo[ foo_cnt++ ] = ... next val to append ...;
     }

   This is terrible theoretically and practically and yet it looks like
   it does everything right.

   The theoretical issue is that, if the realloc can't be done in-place
   (which is more common than most realize ... it depends on the
   underlying realloc implementation details), the memory will have to
   be copied from the original location to the resized location with a
   typical cost of final_foo_max/2 -> O(final_foo_cnt).  Because max is
   increased by a fixed absolute amount each resizing, there will be
   final_foo_cnt/foo_delta -> O(final_foo_cnt) such resizes.

   That is, we've accidentally written a method that has a slow
   O(final_foo_cnt^2) worst case even though it superficially looks like
   a fast O(final_foo_cnt) method.  Worse still, this behavior might
   appear suddenly in previously fine code if the realloc implementation
   changes or, yet again worse, because a larger problem size was used
   in the wild than used in testing.

   The practical issue is realloc is painfully slow and it gets worse
   for large sizes because large sizes are usually handled by operating
   system calls (e.g. mmap or sbrk under the hood).  We've also now done
   O(final_foo_cnt) slow operating system calls in our already
   algorithmically slow O(final_foo_cnt^2) worst case algorithm that
   still superficially looks like a fast O(final_foo_cnt).  (And throw
   in the other issues with malloc described above about TLB and NUMA
   inefficiency and the gaslighting the kernel does ... "clearly the
   crash had nothing to do with the OOM killer shooting processes
   randomly in the head, your program probably just had a bug ... yeah
   ... that's the ticket" ... for good measure.)

   We can get an algorithmic improvement if we change the above to
   increase max by a fixed relative amount each resize.  Since we are
   dealing with integers though, we should make sure that we always
   increase max by some minimal amount.  Instead of:

     foo_max += foo_delta;

   we can use something like:

     foo_max = fd_ulong_max( foo_max*gamma, foo_max + foo_delta );

   If gamma>1, asymptotically, we will only do O(lg cnt) resizes.
   Theoretically, we've gone from an O(final_foo_cnt^2) worst case
   method to an O(final_foo_cnt lg final_foo_cnt) worst case method.  It
   is still irritating that it looks superficially like a fast
   O(final_foo_cnt) method but this is amongst the many reasons why
   dynamic resizing is gross and wrong and to be avoided when possible.

   The larger gamma is, the smaller the leading coefficient is in the
   O(final_foo_cnt lg final_foo_cnt) and thus the better this
   approximates the fast O(final_foo_cnt) method that it superficially
   seems to be.  But using a very large gamma is clearly absurd.  There
   are obvious memory footprint limitations for large sizes and each
   resize would trigger an ever larger amount of OS work.  This raises
   the question:

   What is the optimal gamma?

   Suppose we have a worst case realloc implementation (alloc new
   memory, copy, free old memory and, when no free fragment large enough
   is available, use sbrk like semantics to get memory from the O/S ...
   not uncommon as it is trivial to implement and often works "good
   enough" in lab settings).  It always works out-of-place and it always
   just appends new memory at the end of the heap when the heap runs out
   of space.  Then, while doing the above, asymptotically, we expect the
   heap to look something like:

     other allocs | M foo_t alloc | padding free | unmapped

   On the next resize, we'd request space for M gamma foo_t.  Since
   there are no free fragments large enough for this, realloc is going
   to have to map some space from the operating system, copy our memory
   into it and free up the original space for reuse.  Post resize, we
   expect the heap to look like:

     other allocs | M foo_t free | M gamma foo_t alloc | padding free | unmapped

   On the next resize, we'd request space for M gamma^2 foo_t.  This
   also can't fit within any free fragment above for gamma>1 (noting
   that, in this worst case realloc, we have to allocate the memory
   first and then copy and then free the old).  So we end up with:

     other allocs | M (1+gamma) foo_t free | M gamma^2 foo_t alloc | padding free | unmapped

   On the next resize, we'd request space for M gamma^3 foo_t.  If we
   have:

     gamma^3 < 1 + gamma

   we can fit this request in the hole left by the two previous resizes.
   This implies we need gamma<1.32471... where the magic number is the
   positive real root of:

     x^3 - x - 1 = 0

   This is the "silver ratio" in the sense that the positive real root
   of x^2 - x - 1 is the "golden ratio" of 1.61803...  (Note that the
   golden ratio would apply if we had a more sophisticated realloc under
   the hood that aliased the resized allocation over top the M foo_t
   free and the existing M gamma foo_t alloc and then moved the aliased
   memory.  Presumably such a sophisticated realloc would also just
   append to the end of the heap without any move or copy at all but
   that eventually leads to a question about how much overallocation and
   operating system overhead is acceptable on resize, discussed further
   below.)

   After a resize with something near but smaller than the silver ratio,
   we expect the heap to look like:

     other allocs | M gamma^3 foo_t alloc | padding free | unmapped

   which is back to where we started, except with a larger allocation.

   We don't want to be doing floating point math in methods like this.
   Noting that gamma = 1 + 1/4 + 1/16 = 1.3125 is very close to the
   silver ratio yields the very practical:

     new_max = fd_ulong_max( max + (max>>2) + (max>>4), max + delta );

   This is friendly with even the worst case realloc behaviors under the
   hood.  It also works well in similar situations with linear storage
   media (e.g. disk storage).  The limit also means that the worst case
   overallocation for cases like the above is at most ~32% and on
   average ~16%.  This is a comparable level of overallocation to what
   already happens under the hood (e.g. on par with the level of waste
   that naturally happens in allocators for metadata and padding and
   much less waste than the golden ratio or larger growth rates, even if
   we dubiously trust the realloc method under the hood).
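
   As a concrete check, with max = 1024 and delta = 64, the growth step
   gives new_max = 1024 + 256 + 64 = 1344 = 1.3125*1024, and indeed
   1.3125^3 ~ 2.261 < 2.3125 = 1 + 1.3125, so the hole left by the two
   previous worst case resizes can absorb the next one.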

   In cases where we might need to resize to even larger than this, we
   just resize to the caller's requested amount and keep our fingers
   crossed that the caller realized by this time dynamic resizing was a
   mistake and is allocating the correct size this time.

   Adding arithmetic overflow handling then yields the below.

   TL;DR  Example usage (ignoring size calculation overflow handling and
   allocation error trapping):

     ulong   foo_cnt   = 0UL;
     ulong   foo_max   = ... good estimate of the actual amount needed ...;
     ulong   foo_delta = ... reasonable minimum resizing increment ...;
     foo_t * foo       = (foo_t *)malloc( foo_max*sizeof(foo_t) );

     while( ... still appending ... ) {

       if( FD_UNLIKELY( foo_cnt==foo_max ) ) {
         foo_max = fd_alloc_max_expand( foo_max, foo_delta, foo_cnt + foo_delta );
         foo     = (foo_t *)realloc( foo, foo_max*sizeof(foo_t) );
       }

       foo[ foo_cnt++ ] = ... next val to append ...;

     }

     ... at this point
     ... - foo has foo_cnt elements initialized
     ... - foo has room for foo_max elements total
     ... - when the initial foo_max estimate was correct or oversized,
     ...   no resizing was done
     ... - when the initial foo_max was undersized, asymptotically,
     ...   foo_max is at most ~32% larger worst case (~16% larger
     ...   average case) than foo_cnt with at most O(lg foo_cnt)
     ...   reallocs needed to initialize foo.
     ... - the resizing test branch is highly predictable
     ... - the underlying heap shouldn't be too fragmented or
     ...   overallocated regardless of the allocator implementation
     ...   details. */

FD_FN_CONST static inline ulong       /* new_max, new_max>=max(needed,max), if max<ULONG_MAX, will be new_max>max */
fd_alloc_max_expand( ulong max,
                     ulong delta,     /* Assumed > 0 */
                     ulong needed ) {
  ulong t0 = max + delta;               t0 = fd_ulong_if( t0<max, ULONG_MAX, t0 ); /* Handle overflow */
  ulong t1 = max + (max>>2) + (max>>4); t1 = fd_ulong_if( t1<max, ULONG_MAX, t1 ); /* Handle overflow */
  return fd_ulong_max( fd_ulong_max( t0, t1 ), needed );
}
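
/* E.g. continuing the concrete check above, fd_alloc_max_expand( 1024UL,
   64UL, 1025UL ) computes t0 = 1088 and t1 = 1344 and returns 1344. */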

/* fd_alloc_vtable is the virtual function table implementing fd_valloc
   for fd_alloc. */

extern const fd_valloc_vtable_t fd_alloc_vtable;

/* fd_alloc_virtual returns an abstract handle to the fd_alloc join.
   Valid for lifetime of join. */

FD_FN_CONST static inline fd_valloc_t
fd_alloc_virtual( fd_alloc_t * alloc ) {
  fd_valloc_t valloc = { alloc, &fd_alloc_vtable };
  return valloc;
}
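
/* For example, a minimal sketch of passing the alloc to code written
   against the abstract fd_valloc API (assuming fd_valloc_malloc and
   fd_valloc_free from fd_valloc.h; error trapping omitted):

     fd_valloc_t valloc = fd_alloc_virtual( alloc );
     void * p = fd_valloc_malloc( valloc, 16UL, 256UL );
     ... use p ...
     fd_valloc_free( valloc, p ); */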

FD_PROTOTYPES_END

#endif /* HEADER_fd_src_util_alloc_fd_alloc_h */