Line data Source code
1 : #ifndef HEADER_fd_src_funk_fd_funk_h
2 : #define HEADER_fd_src_funk_fd_funk_h
3 :
4 : /* Funk is a hybrid of a database and version control system designed
5 : for ultra high performance blockchain applications.
6 :
7 : The data model is a flat table of records. A record is a xid/key-val
8 : pair and records are fast O(1) indexable by their xid/key. xid is
9 : short for "transaction id" and xids have a compile time fixed size
10 : (e.g. 32-bytes). keys also have a compile time fixed size (e.g.
11 : 64-bytes). Record values can vary in length from zero to a compile
12 : time maximum size. The xid of all zeros is reserved for the "root"
13 : transaction described below. Outside this, there are no
14 : restrictions on what a record xid, key or val can be. Individual
15 : records can be created, updated, and deleted arbitrarily. They are
16 : just binary data as far as funk is concerned.
17 :
18 : The maximum number of records is practically only limited by the size
19 : of the workspace memory backing it. At present, each record requires
20 : 160 bytes of metadata (this includes records that are published and
21 : records that are in the process of being updated). In other words,
22 : about 15 GiB of record metadata per hundred million records. The
23 : maximum number of records that can be held by a funk instance is set
24 : when it was created (given the persistent and relocatable
25 : properties described below though, it is straightforward to resize
26 : this).
27 :
28 : The transaction model is richer than what is found in a regular
29 : database. A transaction is a xid-"updates to parent transaction"
30 : pair and transactions are fast O(1) indexable by xid. There is no
31 : limitation on the number of updates in a transaction. Updates to the
32 : record value are represented as the complete value record to make it
33 : trivial to apply cryptographic operations like hashing to all updated
34 : values in a transaction without file I/O, operating system calls, memory
35 : data marshalling overhead, etc.
36 :
37 : Like records, the maximum number of transactions in preparation is
38 : practically only limited by the size of the workspace memory backing
39 : it. At present, a transaction requires 96 bytes of memory. As such,
40 : it is practical to track a large number of forks during an extended
41 : period of time of consensus failure in a block chain application
42 : without using much workspace memory at all. The maximum number of
43 : transactions that can be in preparation at any given time by a funk
44 : instance is set when it was created (as before, given the
45 : persistent and relocatable properties described below, it is
46 : straightforward to resize this).
47 :
48 : That is, a transaction is a compact representation of the entire
49 : history of _all_ the database records up to that transaction. We can
50 : trace a transaction's ancestors back to the "root" to give the complete
51 : history of all database records up to that transaction. The "root"
52 : transaction is the ancestor of all transactions. The transaction
53 : history is linear from the root transaction until the "last
54 : published" transaction and cannot be modified.
55 :
56 : To start "preparing" a new transaction, we pick the new transaction's
57 : xid (ideally unique among all transactions thus far) and fork off a
58 : "parent" transaction. This operation virtually clones all database
59 : records in the parent transaction, even if the parent itself has not
60 : yet been "published". Given the above, the parent transaction can be
61 : the last published transaction or another in-preparation transaction.
62 :
63 : Record creates, reads, writes, erases take place within the context
64 : of a transaction, effectively isolating them to a private view of the
65 : world. If a transaction is "cancelled", the changes to a record are
66 : harmlessly discarded. Records in a transaction that has children
67 : cannot be changed ("frozen").
68 :
69 : As such, it is not possible to modify the records in transactions
70 : strictly before the last published transaction. However, it is
71 : possible to modify the records of the last published transaction if
72 : there are no transactions in preparation. This is useful, for example,
73 : when loading up a transaction from a checkpointed state on startup. A
74 : common idiom at the start of a block though is to fork the potential
75 : transaction of that block from its parent (freezing its parent) and
76 : then fork a child of the potential transaction that will hold
77 : updates to the block that are incrementally "merged" into the
78 : potential transaction as block processing progresses.
79 :
80 : Critically, in-preparation transactions form a tree of dependent and
81 : competing histories. This model matches blockchains, where
82 : speculative work can proceed on several blocks at once long before
83 : the blocks are finalized. When a transaction is published, all its
84 : ancestors are also published, any competing histories are
85 : cancelled, leaving only a linear history up to the published
86 : transaction. There is no practical limitation on the complexity of
87 : this tree.
88 :
89 : Funk tolerates applications crashing or being killed. On a clean
90 : process termination, the state of the database will correspond to the
91 : last published transactions and all in-preparation transactions as
92 : they were at termination. Extensive memory integrity checkers are
93 : provided to help with resuming / recovering if a process is killed
94 : uncleanly / crashes / etc in the middle of funk operations. Hardware
95 : failures (or abrupt power loss) are not handled. These latter
96 : scenarios require hardware solutions such as redundant disk arrays and
97 : uninterruptible power supplies and/or background methods for writing
98 : published records to permanent storage described below.
99 :
100 : Under the hood, the database state is stored in NUMA and TLB
101 : optimized shared memory (i.e. fd_wksp) such that various database
102 : operations can be used concurrently by multiple threads distributed
103 : arbitrarily over multiple processes zero copy.
104 :
105 : Database operations are at algorithmic minimums with reasonably high
106 : performance implementations. Most are fast O(1) time and all are
107 : small O(1) space (e.g. in complex transaction tree operations, there
108 : is no use of dynamic allocation to hold temporaries and no use of
109 : recursion to bound stack utilization at trivial levels). Further,
110 : there are no explicit operating system calls and, given a well
111 : optimized workspace (i.e. the wksp pages fit within a core's TLBs) no
112 : implicit operating system calls. Critical operations (e.g. those
113 : that actually might impact transaction history) are fortified against
114 : memory corruption (e.g. robust against DoS attack by corrupting
115 : transaction metadata to create loops in transaction trees or going
116 : out of bounds in memory). Outside of record values, all memory used
117 : is preallocated. And record values are O(1) lockfree concurrent
118 : allocated via fd_alloc using the same wksp as funk (the
119 : implementation is structured in layers that are straightforward to
120 : retarget for particular applications as might be necessary).
121 :
122 : The shared memory used by a funk instance is within a workspace such
123 : that it is also persistent and remotely inspectable. For example, a
124 : process attached to a funk instance can be terminated and a new
125 : process can resume exactly where the original process left off
126 : instantly (e.g. no file I/O). Or a real-time monitor could
127 : visualize the ongoing activity in a database non-invasively (e.g.
128 : forks in flight, records updated by forks, etc). Or an auxiliary
129 : process could be lazily and non-invasively writing all published
130 : records to permanent storage in the background in parallel with
131 : on-going operations.
132 :
133 : The records are further stored in the workspace memory relocatably.
134 : For example, workspace memory could just be committed to a persistent
135 : memory as is (or backed by NVMe or such directly), copied to a
136 : different host, and processes on the new host could resume (indeed,
137 : though it wouldn't be space efficient, the shared memory region is
138 : usable as is as an on-disk checkpoint file). Or the workspace could
139 : be resized and what not to handle larger needs than when the database
140 : was initially created and it all "just works". */
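The fork / publish / cancel rules described in the comment above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: every name here (toy_funk_t, toy_prepare, toy_publish, toy_is_frozen, TOY_*) is hypothetical and not part of the funk API, transaction ids are small integers rather than xids, and records are not modeled at all. It only shows how publishing a transaction publishes its ancestors and cancels competing histories.

```c
#include <assert.h>

/* Toy model only: none of these names exist in funk. */

#define TOY_TXN_MAX 16
#define TOY_ROOT    (-1) /* stands in for the all-zeros root xid */

enum { TOY_FREE, TOY_PREP, TOY_PUBLISHED, TOY_CANCELLED };

typedef struct {
  int state [ TOY_TXN_MAX ]; /* one of the TOY_* states above */
  int parent[ TOY_TXN_MAX ]; /* id of the parent, TOY_ROOT if forked off the root */
  int last_publish;          /* TOY_ROOT until something is published */
} toy_funk_t;

static void toy_init( toy_funk_t * f ) {
  for( int i=0; i<TOY_TXN_MAX; i++ ) { f->state[i] = TOY_FREE; f->parent[i] = TOY_ROOT; }
  f->last_publish = TOY_ROOT;
}

/* Start preparing id as a child of parent.  The parent must be the last
   published transaction or another in-preparation transaction.
   Returns 0 on success and -1 on failure. */
static int toy_prepare( toy_funk_t * f, int parent, int id ) {
  if( id<0 || id>=TOY_TXN_MAX || f->state[id]!=TOY_FREE ) return -1;
  if( parent!=f->last_publish ) {
    if( parent<0 || parent>=TOY_TXN_MAX || f->state[parent]!=TOY_PREP ) return -1;
  }
  f->state[id]  = TOY_PREP;
  f->parent[id] = parent;
  return 0;
}

/* Returns 1 if id is anc or has anc among its ancestors, 0 otherwise. */
static int toy_descends_from( toy_funk_t const * f, int id, int anc ) {
  while( id!=TOY_ROOT ) { if( id==anc ) return 1; id = f->parent[id]; }
  return anc==TOY_ROOT;
}

/* Publish id: its in-preparation ancestors are published too and every
   in-preparation transaction that does not descend from id is cancelled. */
static int toy_publish( toy_funk_t * f, int id ) {
  if( id<0 || id>=TOY_TXN_MAX || f->state[id]!=TOY_PREP ) return -1;
  for( int a=id; a!=TOY_ROOT && f->state[a]==TOY_PREP; a=f->parent[a] ) f->state[a] = TOY_PUBLISHED;
  for( int i=0; i<TOY_TXN_MAX; i++ )
    if( f->state[i]==TOY_PREP && !toy_descends_from( f, i, id ) ) f->state[i] = TOY_CANCELLED;
  f->last_publish = id;
  return 0;
}

/* A transaction with in-preparation children is frozen. */
static int toy_is_frozen( toy_funk_t const * f, int id ) {
  for( int i=0; i<TOY_TXN_MAX; i++ ) if( f->state[i]==TOY_PREP && f->parent[i]==id ) return 1;
  return 0;
}
```

For example, forking 0 off the root, 1 and 2 off 0, and 3 off 1, then publishing 1, leaves 0 and 1 published, 2 cancelled and 3 still in preparation on top of the new last published transaction.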
141 :
142 : //#include "fd_funk_base.h" /* Includes ../util/fd_util.h */
143 : //#include "fd_funk_txn.h" /* Includes fd_funk_base.h */
144 : //#include "fd_funk_rec.h" /* Includes fd_funk_txn.h */
145 : #include "fd_funk_val.h" /* Includes fd_funk_rec.h */
146 : #include "fd_funk_part.h"
147 : #include "fd_funk_archive.h"
148 :
149 : /* FD_FUNK_{ALIGN,FOOTPRINT} describe the alignment and footprint needed
150 : for a funk. ALIGN should be a positive integer power of 2.
151 : FOOTPRINT is a multiple of ALIGN. These are provided to facilitate
152 : compile time declarations. */
153 :
154 : #define FD_FUNK_ALIGN (128UL)
155 : #define FD_FUNK_FOOTPRINT (256UL)
156 :
157 : /* The details of a fd_funk_private are exposed here to facilitate
158 : inlining various operations. */
159 :
160 110874 : #define FD_FUNK_MAGIC (0xf17eda2ce7fc2c01UL) /* firedancer funk version 1 */
161 :
162 : struct __attribute__((aligned(FD_FUNK_ALIGN))) fd_funk_private {
163 :
164 : /* Metadata */
165 :
166 : ulong magic; /* ==FD_FUNK_MAGIC */
167 : ulong funk_gaddr; /* wksp gaddr of this in the backing wksp, non-zero gaddr */
168 : ulong wksp_tag; /* Tag to use for wksp allocations, positive */
169 : ulong seed; /* Seed for various hashing functions used under the hood, arbitrary */
170 : ulong cycle_tag; /* Next cycle_tag to use, used internally for various data integrity checks */
171 : volatile ulong write_lock; /* Incremented at the start of a write operation, and again at the end */
172 :
173 : /* The funk transaction map stores the details about transactions
174 : in preparation and their relationships to each other. This is a
175 : fd_map_giant and more details are given in fd_funk_txn.h
176 :
177 : txn_max is the maximum number of transactions that can be in
178 : preparation. Due to the use of compressed map indices to reduce
179 : workspace memory footprint required, txn_max is at most
180 : FD_FUNK_TXN_IDX_NULL (currently ~4B). This should be more than
181 : ample for anticipated use cases ... e.g. if every single validator in
182 : a pool of tens of thousands of Solana validators had its own fork
183 : with no consensus ever being achieved, a funk with txn_max at the
184 : limits of a compressed index would chug along for days to weeks
185 : before running out of indexing space. But if ever needing to
186 : support more, it is straightforward to change the code to not use
187 : index compression. Then, a funk (with a planet sized workspace
188 : backing it) would survive a similar scenario for millions of years.
189 : Presumably, if such a situation arose, in the weeks to eons while
190 : there was no consensus, somebody would notice and care enough to
191 : intervene (if not it is probably irrelevant to the real world
192 : anyway).
193 :
194 : txn_map_gaddr is the wksp gaddr of the fd_funk_txn_map_t used by
195 : this funk. Since this is a fd_map_giant under the hood and those
196 : are relocatable, it is possible to move this around within the wksp
197 : backing the funk if necessary. Such can be helpful if needing to
198 : do offline rebuilding, resizing, serialization, deserialization,
199 : etc.
200 :
201 : child_{head,tail}_cidx are compressed txn map indices. After
202 : decompression, they give the txn map index of the {oldest,youngest}
203 : child of funk (i.e. an in-preparation transaction whose parent
204 : transaction id is last_publish). FD_FUNK_TXN_IDX_NULL indicates
205 : the funk is childless. Thus, if head/tail is FD_FUNK_TXN_IDX_NULL,
206 : tail/head will be too. Records in a childless funk can be
207 : modified. Will be FD_FUNK_TXN_IDX_NULL if txn_max is zero.
208 :
209 : last_publish is the ID of the last published transaction. It will
210 : be the root transaction if no transactions have been published.
211 : Will be the root transaction immediately after construction. */
212 :
213 : ulong txn_max; /* In [0,FD_FUNK_TXN_IDX_NULL] */
214 : ulong txn_map_gaddr; /* Non-zero wksp gaddr with tag wksp_tag
215 : seed ==fd_funk_txn_map_seed (txn_map)
216 : txn_max==fd_funk_txn_map_key_max(txn_map) */
217 : uint child_head_cidx; /* After decompression, in [0,txn_max) or FD_FUNK_TXN_IDX_NULL, FD_FUNK_TXN_IDX_NULL if txn_max 0 */
218 : uint child_tail_cidx; /* " */
219 :
220 : /* Padding to FD_FUNK_TXN_XID_ALIGN here */
221 :
222 : fd_funk_txn_xid_t root[1]; /* Always equal to the root transaction */
223 : fd_funk_txn_xid_t last_publish[1]; /* Root transaction immediately after construction, not root thereafter */
224 :
225 : /* The funk record map stores the details about all the records in
226 : the funk, including all those in the last published transaction and
227 : all those getting updated in an in-preparation transaction. This
228 : is a fd_map_giant and more details are given in fd_funk_rec.h
229 :
230 : rec_max is the maximum number of records that can exist in this
231 : funk.
232 :
233 : rec_map_gaddr is the wksp gaddr of the fd_funk_rec_map_t used by
234 : this funk. Since this is a fd_map_giant under the hood and those
235 : are relocatable, it is possible to move this around within the wksp
236 : backing the funk if necessary. Such can be helpful if needing to
237 : do offline rebuilding, resizing, serialization, deserialization,
238 : etc. */
239 :
240 : ulong rec_max;
241 : ulong rec_map_gaddr; /* Non-zero wksp gaddr with tag wksp_tag
242 : seed ==fd_funk_rec_map_seed (rec_map)
243 : rec_max==fd_funk_rec_map_key_max(rec_map) */
244 : ulong rec_head_idx; /* Record map index of the first record, FD_FUNK_REC_IDX_NULL if none (from oldest to youngest) */
245 : ulong rec_tail_idx; /* " last " */
246 :
247 : ulong partvec_gaddr; /* Address of partition header vector */
248 :
249 : /* The funk alloc is used for allocating wksp resources for record
250 : values. This is a fd_alloc and more details are given in
251 : fd_funk_val.h. Allocations from this allocator will be tagged with
252 : wksp_tag and operations on this allocator will use concurrency
253 : group 0.
254 :
255 : TODO: Consider letting the user just pass a join of alloc (and maybe
256 : the cgroup_idx to give the funk), inferring the wksp, cgroup from
257 : that and allocating exclusively from that? */
258 :
259 : ulong alloc_gaddr; /* Non-zero wksp gaddr with tag wksp_tag */
260 :
261 : int speed_load; /* Is "speed load mode" active */
262 : /* Address and size of remaining bump allocation space */
263 : ulong speed_bump_gaddr;
264 : ulong speed_bump_remain;
265 :
266 : /* Padding to FD_FUNK_ALIGN here */
267 : };
268 :
269 : FD_PROTOTYPES_BEGIN
270 :
271 : /* Constructors */
272 :
273 : /* fd_funk_{align,footprint} return FD_FUNK_{ALIGN,FOOTPRINT}. */
274 :
275 : FD_FN_CONST ulong
276 : fd_funk_align( void );
277 :
278 : FD_FN_CONST ulong
279 : fd_funk_footprint( void );
280 :
281 : /* fd_funk_new formats an unused wksp allocation with the appropriate
282 : alignment and footprint as a funk. Caller is not joined on return.
283 : Returns shmem on success and NULL on failure (shmem NULL, shmem
284 : misaligned, zero wksp_tag, shmem is not backed by a wksp ... logs
285 : details). A workspace can be used by multiple funks concurrently.
286 : They will dynamically share the underlying workspace (along with any
287 : other non-funk usage) but will otherwise act as completely separate
288 : non-conflicting funks. To help with various diagnostics, garbage
289 : collection and what not, all allocations to the underlying wksp are
290 : tagged with the given tag (positive). Ideally, the tag used here
291 : should be distinct from all other tags used by this workspace but
292 : this is not required. */
293 :
294 : void *
295 : fd_funk_new( void * shmem,
296 : ulong wksp_tag,
297 : ulong seed,
298 : ulong txn_max,
299 : ulong rec_max );
300 :
301 : /* fd_funk_join joins the caller to a funk instance. shfunk points to
302 : the first byte of the memory region backing the funk in the caller's
303 : address space. Returns an opaque handle of the join on success
304 : (IMPORTANT! DO NOT ASSUME THIS IS A CAST OF SHFUNK) and NULL on
305 : failure (NULL shfunk, misaligned shfunk, shfunk is not backed by a
306 : wksp, bad magic, ... logs details). Every successful join should
307 : have a matching leave. The lifetime of the join is until the
308 : matching leave or the thread group is terminated (joins are local to
309 : a thread group). */
310 :
311 : fd_funk_t *
312 : fd_funk_join( void * shfunk );
313 :
314 : /* fd_funk_leave leaves an existing join. Returns the underlying
315 : shfunk (IMPORTANT! DO NOT ASSUME THIS IS A CAST OF FUNK) on success
316 : and NULL on failure. Reasons for failure include funk is NULL (logs
317 : details). */
318 :
319 : void *
320 : fd_funk_leave( fd_funk_t * funk );
321 :
322 : /* fd_funk_delete unformats a wksp allocation used as a funk
323 : (additionally frees all wksp allocations used by that funk). Assumes
324 : nobody is or will be joined to the funk. Returns shmem on success
325 : and NULL on failure (logs details). Reasons for failure include
326 : shfunk is NULL, misaligned shfunk, shfunk is not backed by a
327 : workspace, etc. */
328 :
329 : void *
330 : fd_funk_delete( void * shfunk );
331 :
332 : /* Accessors */
333 :
334 : /* fd_funk_wksp returns the local join to the wksp backing the funk.
335 : The lifetime of the returned pointer is at least as long as the
336 : lifetime of the local join. Assumes funk is a current local join. */
337 :
338 1553687040 : FD_FN_PURE static inline fd_wksp_t * fd_funk_wksp( fd_funk_t * funk ) { return (fd_wksp_t *)(((ulong)funk) - funk->funk_gaddr); }
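The accessor above recovers the workspace base by subtracting the funk's own global address (funk_gaddr) from its local address. A minimal sketch of that arithmetic follows, assuming a workspace can be modeled as a flat byte region whose global addresses are plain byte offsets. The names toy_hdr_t, toy_laddr and toy_base are hypothetical stand-ins and say nothing about how fd_wksp is actually laid out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an object that remembers its own offset
   (its "gaddr") inside the region that holds it. */
typedef struct { unsigned long self_gaddr; } toy_hdr_t;

/* Translate an offset within the region to a local pointer. */
static inline void * toy_laddr( void * base, unsigned long gaddr ) {
  assert( base );
  return (void *)((uintptr_t)base + gaddr);
}

/* Recover the region base from the object itself: local address minus
   the object's own offset. */
static inline void * toy_base( toy_hdr_t * hdr ) {
  assert( hdr );
  return (void *)((uintptr_t)hdr - hdr->self_gaddr);
}
```

Because only offsets are stored, the same region could be mapped at a different address in another process and both translations would still agree, which is the property the relocatability discussion relies on.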
339 :
340 : /* fd_funk_wksp_tag returns the workspace allocation tag used by the
341 : funk for its wksp allocations. Will be positive. Assumes funk is a
342 : current local join. */
343 :
344 22026411 : FD_FN_PURE static inline ulong fd_funk_wksp_tag( fd_funk_t * funk ) { return funk->wksp_tag; }
345 :
346 : /* fd_funk_seed returns the hash seed used by the funk for various hash
347 : functions. Arbitrary value. Assumes funk is a current local join.
348 : TODO: consider renaming hash_seed? */
349 :
350 3 : FD_FN_PURE static inline ulong fd_funk_seed( fd_funk_t * funk ) { return funk->seed; }
351 :
352 : /* fd_funk_txn_max returns the maximum number of in-preparation
353 : transactions the funk can support. Assumes funk is a current local
354 : join. Return in [0,FD_FUNK_TXN_IDX_NULL]. */
355 :
356 3 : FD_FN_PURE static inline ulong fd_funk_txn_max( fd_funk_t * funk ) { return funk->txn_max; }
357 :
358 : /* fd_funk_txn_map returns a pointer in the caller's address space to
359 : the funk's transaction map. */
360 :
361 : FD_FN_PURE static inline fd_funk_txn_t * /* Lifetime is that of the local join */
362 : fd_funk_txn_map( fd_funk_t * funk, /* Assumes current local join */
363 313954953 : fd_wksp_t * wksp ) { /* Assumes wksp == fd_funk_wksp( funk ) */
364 313954953 : return (fd_funk_txn_t *)fd_wksp_laddr_fast( wksp, funk->txn_map_gaddr );
365 313954953 : }
366 :
367 : /* fd_funk_last_publish_child_{head,tail} returns a pointer in the
368 : caller's address space to the {oldest,youngest} child of funk, NULL if
369 : the funk is childless. All pointers are in the caller's address space.
370 : These are all fast O(1) but not fortified against memory data
371 : corruption. */
372 :
373 : FD_FN_PURE static inline fd_funk_txn_t * /* Lifetime as described in fd_funk_txn_query */
374 : fd_funk_last_publish_child_head( fd_funk_t * funk, /* Assumes current local join */
375 5541 : fd_funk_txn_t * map ) { /* Assumes map == fd_funk_txn_map( funk, fd_funk_wksp( funk ) ) */
376 5541 : ulong idx = fd_funk_txn_idx( funk->child_head_cidx );
377 5541 : if( fd_funk_txn_idx_is_null( idx ) ) return NULL; /* TODO: Consider branchless? */
378 5538 : return map + idx;
379 5541 : }
380 :
381 : FD_FN_PURE static inline fd_funk_txn_t * /* Lifetime as described in fd_funk_txn_query */
382 : fd_funk_last_publish_child_tail( fd_funk_t * funk, /* Assumes current local join */
383 5541 : fd_funk_txn_t * map ) { /* Assumes map == fd_funk_txn_map( funk, fd_funk_wksp( funk ) ) */
384 5541 : ulong idx = fd_funk_txn_idx( funk->child_tail_cidx );
385 5541 : if( fd_funk_txn_idx_is_null( idx ) ) return NULL; /* TODO: Consider branchless? */
386 5538 : return map + idx;
387 5541 : }
388 :
389 : /* fd_funk_root returns a pointer in the caller's address space to the
390 : transaction id of the root transaction. Assumes funk is a current
391 : local join. Lifetime of the returned pointer is the lifetime of the
392 : current local join. The value at this pointer will always be the
393 : root transaction id. */
394 :
395 274550217 : FD_FN_CONST static inline fd_funk_txn_xid_t const * fd_funk_root( fd_funk_t * funk ) { return funk->root; }
396 :
397 : /* fd_funk_last_publish returns a pointer in the caller's address space
398 : to transaction id of the last published transaction. Assumes funk is
399 : a current local join. Lifetime of the returned pointer is the
400 : lifetime of the current local join. The value at this pointer will
401 : be constant until the next transaction is published. */
402 :
403 3145734 : FD_FN_CONST static inline fd_funk_txn_xid_t const * fd_funk_last_publish( fd_funk_t * funk ) { return funk->last_publish; }
404 :
405 : /* fd_funk_last_publish_is_frozen returns 1 if the records of the last
406 : published transaction are frozen (i.e. the funk has children) and 0
407 : otherwise (i.e. the funk is childless). Assumes funk is a current local join. */
408 :
409 : FD_FN_PURE static inline int
410 173665497 : fd_funk_last_publish_is_frozen( fd_funk_t const * funk ) {
411 173665497 : return fd_funk_txn_idx( funk->child_head_cidx )!=FD_FUNK_TXN_IDX_NULL;
412 173665497 : }
413 :
414 : /* fd_funk_rec_max returns maximum number of records that can be held
415 : in the funk. This includes both records of the last published
416 : transaction and records for transactions that are in-flight. */
417 :
418 0 : FD_FN_PURE static inline ulong fd_funk_rec_max( fd_funk_t * funk ) { return funk->rec_max; }
419 :
420 : /* fd_funk_rec_map returns a pointer in the caller's address space to
421 : the funk's record map. */
422 :
423 : FD_FN_PURE static inline fd_funk_rec_t * /* Lifetime is that of the local join */
424 : fd_funk_rec_map( fd_funk_t * funk, /* Assumes current local join */
425 1510842912 : fd_wksp_t * wksp ) { /* Assumes wksp == fd_funk_wksp( funk ) */
426 1510842912 : return (fd_funk_rec_t *)fd_wksp_laddr_fast( wksp, funk->rec_map_gaddr );
427 1510842912 : }
428 :
429 : /* fd_funk_rec_global_cnt returns current number of records that are held
430 : in the funk. This includes both records of the last published
431 : transaction and records for transactions that are in-flight. */
432 : FD_FN_PURE static inline ulong
433 : fd_funk_rec_global_cnt( fd_funk_t * funk, /* Assumes current local join */
434 0 : fd_wksp_t * wksp ) { /* Assumes wksp == fd_funk_wksp( funk ) */
435 0 : fd_funk_rec_t * map = (fd_funk_rec_t *)fd_wksp_laddr_fast( wksp, funk->rec_map_gaddr );
436 0 : return fd_funk_rec_map_key_cnt( map );
437 0 : }
438 :
439 : /* fd_funk_last_publish_rec_{head,tail} returns a pointer in the
440 : caller's address space to the {oldest,youngest} record (by creation) of
441 : all records in the last published transaction, NULL if the last published
442 : transaction has no records. All pointers are in the caller's address
443 : space. These are all fast O(1) but not fortified against memory
444 : data corruption. */
445 :
446 : FD_FN_PURE static inline fd_funk_rec_t const * /* Lifetime as described in fd_funk_rec_query */
447 : fd_funk_last_publish_rec_head( fd_funk_t const * funk, /* Assumes current local join */
448 105369282 : fd_funk_rec_t const * rec_map ) { /* Assumes == fd_funk_rec_map( funk, fd_funk_wksp( funk ) ) */
449 105369282 : ulong rec_head_idx = funk->rec_head_idx;
450 105369282 : if( fd_funk_rec_idx_is_null( rec_head_idx ) ) return NULL; /* TODO: consider branchless */
451 105364695 : return rec_map + rec_head_idx;
452 105369282 : }
453 :
454 : FD_FN_PURE static inline fd_funk_rec_t const * /* Lifetime as described in fd_funk_rec_query */
455 : fd_funk_last_publish_rec_tail( fd_funk_t const * funk, /* Assumes current local join */
456 102223554 : fd_funk_rec_t const * rec_map ) { /* Assumes == fd_funk_rec_map( funk, fd_funk_wksp( funk ) ) */
457 102223554 : ulong rec_tail_idx = funk->rec_tail_idx;
458 102223554 : if( fd_funk_rec_idx_is_null( rec_tail_idx ) ) return NULL; /* TODO: consider branchless */
459 102223551 : return rec_map + rec_tail_idx;
460 102223554 : }
461 :
462 : /* fd_funk_alloc returns a pointer in the caller's address space to
463 : the funk's allocator. */
464 :
465 : FD_FN_PURE static inline fd_alloc_t * /* Lifetime is that of the local join */
466 : fd_funk_alloc( fd_funk_t * funk, /* Assumes current local join */
467 32000739 : fd_wksp_t * wksp ) { /* Assumes wksp == fd_funk_wksp( funk ) */
468 32000739 : return fd_alloc_join_cgroup_hint_set( (fd_alloc_t *)fd_wksp_laddr_fast( wksp, funk->alloc_gaddr ), fd_tile_idx() );
469 32000739 : }
470 :
471 : /* Operations */
472 :
473 : /* fd_funk_last_publish_descendant returns the funk's youngest descendant that has no
474 : globally competing transaction history currently or NULL if funk
475 : has no children or all of the children of funk are in competition.
476 : That is, this is as far as fd_funk_txn_publish can publish before it
477 : needs to start canceling competing transaction histories. This is
478 : O(length of descendant history) and this is not fortified against
479 : transaction map data corruption. Assumes funk is a current local
480 : join. The returned pointer lifetime and address space is as
481 : described in fd_funk_txn_query. */
482 :
483 : FD_FN_PURE static inline fd_funk_txn_t *
484 : fd_funk_last_publish_descendant( fd_funk_t * funk,
485 3145731 : fd_funk_txn_t * txn_map ) { /* Assumes == fd_funk_txn_map( funk, fd_funk_wksp( funk ) ) */
486 3145731 : ulong child_idx = fd_funk_txn_idx( funk->child_head_cidx );
487 3145731 : if( fd_funk_txn_idx_is_null( child_idx ) ) return NULL;
488 3040383 : return fd_funk_txn_descendant( txn_map + child_idx, txn_map );
489 3145731 : }
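One way to read the "youngest descendant with no competing history" rule is as a walk down the chain of only children that stops at the first transaction with zero or several children. The sketch below models that reading on a plain parent array. It is an assumption about the semantics and not the fd_funk_txn_descendant implementation; toy_only_child, toy_descendant and TOY_ROOT are hypothetical names, and the linear child scan is for clarity only.

```c
#include <assert.h>

#define TOY_ROOT (-1) /* stands in for the last published transaction */

/* Returns the only child of node, or TOY_ROOT if node has zero or
   several children.  parent[i] is the parent of in-preparation
   transaction i (TOY_ROOT if it forks off the last published one). */
static int toy_only_child( int const * parent, int n, int node ) {
  int child = TOY_ROOT;
  int cnt   = 0;
  for( int i=0; i<n; i++ ) if( parent[i]==node ) { child = i; cnt++; }
  return cnt==1 ? child : TOY_ROOT;
}

/* Follows only children starting from the last published transaction.
   Returns the deepest transaction reached, or TOY_ROOT if the last
   published transaction has no children or its children compete. */
static int toy_descendant( int const * parent, int n ) {
  int cur = TOY_ROOT;
  for(;;) {
    int child = toy_only_child( parent, n, cur );
    if( child==TOY_ROOT ) break;
    cur = child;
  }
  return cur;
}
```

With parent = { TOY_ROOT, 0, 1, 1 } the walk goes 0, then 1, and stops there because 2 and 3 compete, so publishing up to 1 would not require cancelling anything.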
490 :
491 : /* Misc */
492 :
493 : /* Enable/disable "speed load mode". When in this mode, record values
494 : are bump allocated and never freed. This speeds up the case where
495 : we are initializing the database with a vast number of
496 : mostly read-only records. */
497 :
498 : void
499 : fd_funk_speed_load_mode( fd_funk_t * funk, int flag );
500 :
501 : /* fd_funk_verify verifies the integrity of funk. Returns
502 : FD_FUNK_SUCCESS if funk appears to be intact and FD_FUNK_ERR_INVAL
503 : otherwise (logs details). Assumes funk is a current local join (NULL
504 : returns FD_FUNK_ERR_INVAL and logs details.) */
505 :
506 : int
507 : fd_funk_verify( fd_funk_t * funk );
508 :
509 : /* fd_funk_log_mem_usage logs useful statistics about memory usage */
510 :
511 : void
512 : fd_funk_log_mem_usage( fd_funk_t * funk );
513 :
514 : /* APIs for marking the start and end of an operation that modifies
515 : the database. These should be called by the application before and
516 : after doing an update. */
517 :
518 : void fd_funk_start_write( fd_funk_t * funk );
519 : void fd_funk_end_write( fd_funk_t * funk );
520 :
521 : /* Checks that we are inside a start_write/end_write block. Fails if
522 : * we are not. */
523 :
524 : void fd_funk_check_write( fd_funk_t * funk );
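The write_lock field is documented as being incremented at the start of a write operation and again at the end. One plausible reading, sketched below, is that an odd value means a write is in progress and an even value means the database is quiescent. This is an assumption for illustration: toy_lock_t, toy_start_write, toy_end_write and toy_in_write are hypothetical, the increments are not atomic (a single writer is assumed), and the real fd_funk_check_write fails rather than returning a flag.

```c
#include <assert.h>

/* Hypothetical stand-in for the write_lock field of a funk. */
typedef struct { volatile unsigned long write_lock; } toy_lock_t;

/* Mark the start of a write: the counter goes from even to odd. */
static void toy_start_write( toy_lock_t * l ) {
  assert( !(l->write_lock & 1UL) ); /* no write may already be in progress */
  l->write_lock++;
}

/* Mark the end of a write: the counter goes from odd back to even. */
static void toy_end_write( toy_lock_t * l ) {
  assert( l->write_lock & 1UL ); /* must be paired with a start */
  l->write_lock++;
}

/* Returns 1 inside a start/end pair and 0 otherwise. */
static int toy_in_write( toy_lock_t const * l ) {
  return (int)(l->write_lock & 1UL);
}
```

Under this reading, a reader that samples the counter before and after a lookup and sees the same even value both times could conclude that no write overlapped the lookup.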
525 :
526 : FD_PROTOTYPES_END
527 :
528 : #endif /* HEADER_fd_src_funk_fd_funk_h */