Line data Source code
1 : #ifndef HEADER_fd_src_discof_forest_fd_forest_h
2 : #define HEADER_fd_src_discof_forest_fd_forest_h
3 :
4 : /* Forest is an API for repairing blocks as they are discovered from the
5 : cluster via Turbine or Gossip. Shreds (from Turbine) and
6 : confirmations (from Tower) inform forest that slot exists. Repair
7 : ensures that this block is received in its entirety by requesting
8 : repairs for missing shreds for the block.
9 :
10 : Note that forest needs to track the strict subset of shreds that are
11 : known by fec_resolver, store, and reasm. If any of these structures
12 : have evicted shreds, forest needs to clear out the corresponding FEC
13 : sets from forest to be re-requested. It's okay if shreds are evicted
14 : from reasm and we re-request for them and they pass through
15 : fec_resolver again. Although we could be creating duplicate ctxs,
16 : thats fine! We might later on get an evict notice for the second
17 : incomplete ctx but that's okay too!!! just have a bunch of useless
18 : messages that eventually will get ignored when we publish past it!!!!
19 :
20 : Like other fork-aware structures, forest maintains a tree that
21 : records the ancestry of slots. It also maintains references to the
22 : tips of each known fork (the frontier map), and also the latest
23 : slot we finished repairing on each fork (the consumed map). Any slot
24 : that doesn't have a known ancestry connecting it back to the root yet
25 : is part of the orphaned map. And the head of every orphaned tree is
26 : part of the subtrees map. While this seems very verbose, it allows
27 : for fast iteration and lookup of the different types of slots.
28 :
29 : fd_policy makes orphan requests that recover gaps between orphaned
30 : subtrees and the main ancestry tree, and then the forest iterator
31 : suggest repairs to make progress on the tree forwards (using BFS). */
32 :
33 : /* Merkle root tracking.
34 : For each FEC set in the slot, we record the merkle root of the first
35 : shred we receive in `.merkle_roots[ fec_set_idx / 32 ]`. Then for any
36 : shred in the same FEC inserted later, the merkle root of the new
37 : shred is compared to the merkle root we have stored.
38 :
39 : If they are the same -> good.
40 : If they are different -> we're going to mark this merkle root as
41 : incorrect. We do this by setting the merkle
42 : root to a null hash for later detection.
43 :
44 : Note we don't verify the chain on each FEC arrival, because we can't
45 : tell whether the CMR of the following FEC is incorrect or if the
46 : current MR we have is incorrect. We can only verify the chain when
47 : we get a confirmation of a block_id.
48 :
49 : Eventually one of two things happen:
50 : 1. We are able to complete the version of the FEC with the merkle root
51 : we have stored. This is the common case, and means we only saw one
52 : version of the merkle root.
53 :
54 : 2. We are not able to complete any version of the FEC.
55 : - Imagine we get shred 0-15 of FEC_A. then get shreds 16-31 of
56 : FEC_B. We would have set the merkle root to the null hash for
57 : that FEC set, but fec_resolver would not be able to complete the
58 : FEC because from a shred index POV, we don't have anything we
59 : need to repair (and we won't be making any new requests for that
60 : FEC set).
61 : It's difficult to differentiate between a slot where we haven't
62 : finished repairing, and a slot we can't repair because the
63 : version we have is a bad version. So merkle chaining
64 : verification can only be performed on slots that have all the
65 : shreds received.
66 :
67 : 3. We receive some shreds for both FEC_A and FEC_B, but get a FEC
68 : completion for FEC_B.
69 : - Could possibly happen during turbine, like we repair some data
70 : shreds from FEC_A, but get a completion for FEC_B through
71 : turbine. At this point we'll take whatever we have completed
72 : first, so overwrite our merkle root entry. It's likely being
73 : overwritten from the null_hash to the FEC_B merkle root.
74 :
75 : So unfortunately...because of case 2, we determine "slot completion"
76 : status when all the shreds in the slot have been received, NOT when
77 : the slot completes with all the FEC completions. We can rely on that
78 : at least some version of all the shreds in the slot will arrive
79 : eventually.
80 :
81 : As soon as we have a confirmed block id, we can verify the slot by
82 : verifying the chain of merkle roots backwards. As the CMRs correctly
83 : chain, the verified status on each FEC set is set. If they don't
84 : chain, we dump & repair that specific FEC set. For example, say the
85 : 2nd & 3rd FEC set is incorrect. In this case, the merkle roots array
86 : and bitset will look like the following after one call of
87 : chain_verify(slot, confirmed_bid):
88 : actual last fec
89 : |
90 : v
91 : merkle_roots [ A, B', C', D, E, F, confirmed_bid] <- confirmed_bid stored for convenience
92 : merkle_verified [ 0, 0, 0, 1, 1, 1, 1 ]
93 :
94 : At this point, C' will be dumped and repaired. Since D is verified,
95 : and the CMR entry contains the correct version of C's merkle root, we
96 : can now verify any shred of FEC set C that arrives and reject if the
97 : merkle root doesn't match the cmr entry in D.
98 :
99 : After C is successfully repaired, the after_fec call in repair_tile
100 : will re-trigger chain_verify on the slot again. After this call of
101 : chain_verify, the merkle roots array and bitset will look like this:
102 :
103 : merkle_roots [ A, B', C, D, E, F, confirmed_bid] <- confirmed_bid stored for convenience
104 : merkle_verified [ 0, 0, 1, 1, 1, 1, 1 ]
105 :
106 : At this point, C is verified, but B' is detected as incorrect. The same
107 : dump and repair process is repeated for B'. Once that after_fec on B
108 : is called, the merkle roots array and bitset will look like this:
109 :
110 : merkle_roots [ A, B, C, D, E, F, confirmed_bid] <- confirmed_bid stored for convenience
111 : merkle_verified [ 1, 1, 1, 1, 1, 1, 1 ]
112 : confirmed = 1
113 :
114 : The chain verify progresses beyond this slot, and the ancestors of
115 : this slots will also be traversed until a confirmed slot is found, or
116 : another incorrect FEC is detected. Note that because earlier
117 : confirmations may have confirmed ancestors, and because there is once
118 : verification "in-progress at all times", confirmation status can look
119 : like:
120 :
121 : slot 1 - slot 2 - slot 3 - slot 4 - slot 5 - slot 6 - slot 7 ....
122 : confirmed: 1 1 0 0 0 1 1
123 :
124 : i.e. there will be up to two contiguous chains of confirmed slots in
125 : the forest, but not more. There can be unconfirmed slots after slot 7.
126 : There may be forks as well, but only one fork can be confirmed.
127 : */
128 :
129 : #include "../../disco/fd_disco_base.h"
130 : #include "../../disco/shred/fd_fec_set.h"
131 :
132 96 : #define FD_FOREST_MAGIC (0xf17eda2ce7b1c0UL) /* firedancer forest version 0 */
133 :
134 : #define SET_NAME fd_forest_blk_idxs
135 0 : #define SET_MAX FD_SHRED_BLK_MAX
136 : #include "../../util/tmpl/fd_set.c"
137 :
138 : /* fd_forest_blk_t implements a left-child, right-sibling n-ary
139 : tree. Each ele maintains the `pool` index of its left-most child
140 : (`child_idx`), its immediate-right sibling (`sibling_idx`), and its
141 : parent (`parent_idx`).
142 :
143 : This tree structure is gaddr-safe and supports accesses and
144 : operations from processes with separate local forest joins. */
145 :
146 : struct __attribute__((aligned(128UL))) fd_forest_blk {
147 : ulong slot; /* map key */
148 : ulong parent_slot; /* map key of the parent. invariant: if parent is populated, parent_slot is populated. the converse is not necessarily true. */
149 : ulong next; /* internal use by fd_pool, fd_map_chain */
150 : ulong parent; /* pool idx of the parent in the tree */
151 : ulong child; /* pool idx of the left-child */
152 : ulong sibling; /* pool idx of the right-sibling */
153 :
154 : ulong head; /* reserved by dlist. not all blks will be part of a dlist. */
155 : ulong tail; /* reserved by dlist */
156 :
157 : uint buffered_idx; /* highest contiguous buffered shred idx */
158 : uint complete_idx; /* shred_idx with SLOT_COMPLETE_FLAG ie. last shred idx in the slot */
159 :
160 : fd_forest_blk_idxs_t idxs[fd_forest_blk_idxs_word_cnt]; /* received data shred idxs */
161 : struct {
162 : fd_hash_t mr;
163 : fd_hash_t cmr;
164 : } merkle_roots[ FD_FEC_BLK_MAX ]; /* received merkle roots. mr is initialized to null hash, written to when a shred is
165 : received. invalidated to invalid_mr on multiple versions of the merkle root are detected. */
166 :
167 : fd_hash_t confirmed_bid; /* confirmed block id - can't be wrapped in the above struct because we can create sentinel blocks
168 : on confirmation, and don't know the index of the last fec set until we repair the slot.
169 : hash_null if unknown. Otherwise populated by the child slot's CMR on confirmation,
170 : or by a confirmation msg from tower. Has no bearing on if the full slot is correct or not. */
171 : uint lowest_verified_fec; /* lowest fec index that has been verified so far, inclusive. Equivalent to complete_idx / 32UL
172 : if the last merkle root is verified, n if every merkle root after fec set n*32 is verified.
173 : Otherwise, it is UINT_MAX. If non-UINT_MAX, then confirmed_bid must be populated (but not
174 : the vice versa). */
175 :
176 : uchar chain_confirmed; /* 1 if all the FECs the slot have been confirmed via fec_chain_verify, 0 otherwise. Note confirmed_bid
177 : can be populated before this is set to 1. */
178 :
179 : int est_buffered_tick_recv; /* tick of shred at buffered_idx. Note since we don't track all the
180 : ticks received, this will be a lower bound estimate on the highest tick we have seen.
181 : But this is only used for limiting eager repair, so an exact value is not necessary. */
182 :
183 : /* Metrics */
184 :
185 : fd_forest_blk_idxs_t code[fd_forest_blk_idxs_word_cnt]; /* code shred idxs */
186 : long first_shred_ts; /* tick of first shred rcved in slot != complete_idx */
187 : long first_req_ts; /* tick of first request sent in slot != complete_idx */
188 : uint turbine_cnt; /* number of shreds received from turbine */
189 : uint repair_cnt; /* number of data shreds received from repair */
190 : uint recovered_cnt; /* number of shreds recovered from reedsol recovery */
191 : };
192 : typedef struct fd_forest_blk fd_forest_blk_t;
193 :
194 : #define POOL_NAME fd_forest_pool
195 192 : #define POOL_T fd_forest_blk_t
196 : #include "../../util/tmpl/fd_pool.c"
197 :
198 : #define MAP_NAME fd_forest_ancestry
199 : #define MAP_ELE_T fd_forest_blk_t
200 474 : #define MAP_KEY slot
201 : #include "../../util/tmpl/fd_map_chain.c"
202 :
203 : #define MAP_NAME fd_forest_frontier
204 : #define MAP_ELE_T fd_forest_blk_t
205 654 : #define MAP_KEY slot
206 : #include "../../util/tmpl/fd_map_chain.c"
207 :
208 : #define MAP_NAME fd_forest_orphaned
209 : #define MAP_ELE_T fd_forest_blk_t
210 12363 : #define MAP_KEY slot
211 : #include "../../util/tmpl/fd_map_chain.c"
212 :
213 : #define MAP_NAME fd_forest_subtrees
214 : #define MAP_ELE_T fd_forest_blk_t
215 153 : #define MAP_KEY slot
216 : #include "../../util/tmpl/fd_map_chain.c"
217 :
218 : #define DLIST_NAME fd_forest_subtlist /* thread a dlist through the subtree elements for fast iteration */
219 : #define DLIST_ELE_T fd_forest_blk_t
220 48267 : #define DLIST_NEXT head
221 258 : #define DLIST_PREV tail
222 : #include "../../util/tmpl/fd_dlist.c"
223 :
224 : /* A reference to a forest element
225 :
226 : The following maps/pools are used to track future requests.
227 :
228 : Requests:
229 : - slots that branch from the main tree (ancestry) that are being
230 : repaired / have yet to be repaired. Maintained in a dlist, where
231 : the head is the current slot being repaired. Any slot in the
232 : requests list must be in ancestry or frontier.
233 :
234 : Orphreqs (orphaned requests):
235 : - slots that branch from the unconnected trees (subtrees/orphans) that are being repaired /
236 : have yet to be repaired. Maintained in a dlist, where the head
237 : is the current orphan request being repaired.
238 :
239 : Note that orphan requests are specifically an optimization from when
240 : we are catching up from very far behind. In the usual case when we
241 : boot and we are catching up from close behind, need orphans is
242 : very fast and has a non-negligible cost on total repair time. But
243 : during special cases where we are catching up from very far behind,
244 : need orphans can take a significant time because orphan requests
245 : cannot be pipelined. In this case, we can use time waiting for
246 : orphan requests to respond to also repair the full slots of these
247 : orphan trees.
248 :
249 : Consumed:
250 : - slots where the entire ancestry up to the root has been completed.
251 : There should be <= num forks elements in the consumed map.
252 : */
253 : struct fd_forest_ref {
254 : ulong idx; /* forest pool idx of the ele this ref refers to */
255 : ulong next; /* reserved by dlist */
256 : ulong prev; /* reserved by dlist */
257 : ulong hash; /* reserved by pool and map_chain */
258 : };
259 : typedef struct fd_forest_ref fd_forest_ref_t;
260 :
261 : #define MAP_NAME fd_forest_requests
262 : #define MAP_ELE_T fd_forest_ref_t
263 405 : #define MAP_KEY idx
264 771 : #define MAP_NEXT hash
265 : #include "../../util/tmpl/fd_map_chain.c"
266 :
267 : #define DLIST_NAME fd_forest_reqslist
268 : #define DLIST_ELE_T fd_forest_ref_t
269 834 : #define DLIST_NEXT next
270 546 : #define DLIST_PREV prev
271 : #include "../../util/tmpl/fd_dlist.c"
272 :
273 : #define POOL_NAME fd_forest_reqspool
274 192 : #define POOL_T fd_forest_ref_t
275 7785 : #define POOL_NEXT hash
276 : #include "../../util/tmpl/fd_pool.c"
277 :
278 : /* Below for fast tracking of contiguous completes slots */
279 : #define MAP_NAME fd_forest_consumed
280 : #define MAP_ELE_T fd_forest_ref_t
281 519 : #define MAP_KEY idx
282 1722 : #define MAP_NEXT hash
283 : #include "../../util/tmpl/fd_map_chain.c"
284 :
285 : #define DLIST_NAME fd_forest_conslist
286 : #define DLIST_ELE_T fd_forest_ref_t
287 1071 : #define DLIST_NEXT next
288 915 : #define DLIST_PREV prev
289 : #include "../../util/tmpl/fd_dlist.c"
290 :
291 : #define POOL_NAME fd_forest_conspool
292 192 : #define POOL_T fd_forest_ref_t
293 8043 : #define POOL_NEXT hash
294 : #include "../../util/tmpl/fd_pool.c"
295 :
296 : /* Reuse reqslist for orphan requests list, and share pool */
297 :
298 : /* Internal use only for BFSing */
299 : #define DEQUE_NAME fd_forest_deque
300 142839 : #define DEQUE_T ulong
301 : #include "../../util/tmpl/fd_deque_dynamic.c"
302 :
303 :
304 : /* fd_forest_t is the top-level structure that holds the root of
305 : the tree, as well as the memory pools and map structures.
306 :
307 : These structures are bump-allocated and laid out contiguously in
308 : memory from the fd_forest_t * pointer which points to the
309 : beginning of the memory region.
310 :
311 : --------------------- <- fd_forest_t *
312 : | metadata |
313 : |-------------------|
314 : | pool |
315 : |-------------------|
316 : | ancestry |
317 : |-------------------|
318 : | frontier |
319 : |-------------------|
320 : | subtrees |
321 : |-------------------|
322 : | orphaned |
323 : |-------------------|
324 : | requests |
325 : |-------------------|
326 : | reqslist |
327 : |-------------------|
328 : | reqspool |
329 : |-------------------|
330 : | orphreqs |
331 : |-------------------|
332 : | orphlist (reqlist)|
333 : |-------------------|
334 : | consumed |
335 : |-------------------|
336 : | conspool |
337 : |-------------------|
338 : | deque |
339 : ---------------------
340 :
341 : A valid, initialized forest is always non-empty. After
342 : `fd_forest_init` the forest will always have a root ele unless
343 : modified improperly out of forest's API.*/
344 :
345 : struct fd_forest_iter {
346 : ulong ele_idx;
347 : uint shred_idx;
348 : ulong list_gaddr; /* wksp gaddr of the list this iterator corresponds to */
349 : };
350 : typedef struct fd_forest_iter fd_forest_iter_t;
351 : struct __attribute__((aligned(128UL))) fd_forest {
352 : ulong root; /* pool idx of the root */
353 : ulong wksp_gaddr; /* wksp gaddr of fd_forest in the backing wksp, non-zero gaddr */
354 : ulong pool_gaddr; /* wksp gaddr of fd_pool */
355 : ulong ancestry_gaddr; /* wksp_gaddr of fd_forest_ancestry */
356 : ulong frontier_gaddr; /* leaves that needs repair */
357 : ulong subtrees_gaddr; /* head of orphaned trees */
358 : ulong orphaned_gaddr; /* map of parent_slot to singly-linked list of ele orphaned by that parent slot */
359 :
360 : ulong subtlist_gaddr; /* wksp gaddr of fd_forest_subtlist - linkedlist of subtree elements*/
361 :
362 : /* Request trackers */
363 :
364 : ulong requests_gaddr; /* map of slot to pool idx of the completed repair frontier */
365 : ulong reqslist_gaddr; /* wksp gaddr of fd_forest_reqslist */
366 : ulong reqspool_gaddr; /* wksp gaddr of fd_forest_reqspool */
367 :
368 : ulong consumed_gaddr; /* wksp gaddr of fd_forest_consumed */
369 : ulong conslist_gaddr; /* wksp gaddr of fd_forest_conslist */
370 : ulong conspool_gaddr; /* wksp gaddr of fd_forest_conspool */
371 :
372 : ulong orphreqs_gaddr; /* wksp gaddr of fd_forest_orphreqs */
373 : ulong orphlist_gaddr; /* wksp gaddr of fd_forest_orphlist */
374 :
375 : fd_forest_iter_t iter; /* requests iterator corresponding to head of requests deque */
376 : fd_forest_iter_t orphiter; /* orphan requests iterator corresponding to head of orphan requests list */
377 :
378 : ulong deque_gaddr; /* wksp gaddr of fd_forest_deque. internal use only for BFSing */
379 : ulong magic; /* ==FD_FOREST_MAGIC */
380 : };
381 : typedef struct fd_forest fd_forest_t;
382 :
383 : FD_PROTOTYPES_BEGIN
384 :
385 : /* Constructors */
386 :
387 : /* fd_forest_{align,footprint} return the required alignment and
388 : footprint of a memory region suitable for use as forest with up to
389 : ele_max eles and vote_max votes. */
390 :
391 : FD_FN_CONST static inline ulong
392 1110 : fd_forest_align( void ) {
393 1110 : return alignof(fd_forest_t);
394 1110 : }
395 :
396 : FD_FN_CONST static inline ulong
397 192 : fd_forest_footprint( ulong ele_max ) {
398 192 : return FD_LAYOUT_FINI(
399 192 : FD_LAYOUT_APPEND(
400 192 : FD_LAYOUT_APPEND(
401 192 : FD_LAYOUT_APPEND(
402 192 : FD_LAYOUT_APPEND(
403 192 : FD_LAYOUT_APPEND(
404 192 : FD_LAYOUT_APPEND(
405 192 : FD_LAYOUT_APPEND(
406 192 : FD_LAYOUT_APPEND(
407 192 : FD_LAYOUT_APPEND(
408 192 : FD_LAYOUT_APPEND(
409 192 : FD_LAYOUT_APPEND(
410 192 : FD_LAYOUT_APPEND(
411 192 : FD_LAYOUT_APPEND(
412 192 : FD_LAYOUT_APPEND(
413 192 : FD_LAYOUT_APPEND(
414 192 : FD_LAYOUT_APPEND(
415 192 : FD_LAYOUT_INIT,
416 192 : alignof(fd_forest_t), sizeof(fd_forest_t) ),
417 192 : fd_forest_pool_align(), fd_forest_pool_footprint ( ele_max ) ),
418 192 : fd_forest_ancestry_align(), fd_forest_ancestry_footprint( ele_max ) ),
419 192 : fd_forest_frontier_align(), fd_forest_frontier_footprint( ele_max ) ),
420 192 : fd_forest_subtrees_align(), fd_forest_subtrees_footprint( ele_max ) ),
421 192 : fd_forest_orphaned_align(), fd_forest_orphaned_footprint( ele_max ) ),
422 192 : fd_forest_subtlist_align(), fd_forest_subtlist_footprint( ) ),
423 :
424 192 : fd_forest_requests_align(), fd_forest_requests_footprint( ele_max ) ),
425 192 : fd_forest_reqslist_align(), fd_forest_reqslist_footprint( ) ),
426 192 : fd_forest_reqspool_align(), fd_forest_reqspool_footprint( ele_max ) ),
427 192 : fd_forest_consumed_align(), fd_forest_consumed_footprint( ele_max ) ),
428 192 : fd_forest_conslist_align(), fd_forest_conslist_footprint( ) ),
429 192 : fd_forest_conspool_align(), fd_forest_conspool_footprint( ele_max ) ),
430 192 : fd_forest_requests_align(), fd_forest_requests_footprint( ele_max ) ),
431 192 : fd_forest_reqslist_align(), fd_forest_reqslist_footprint( ) ),
432 192 : fd_forest_deque_align(), fd_forest_deque_footprint ( ele_max ) ),
433 192 : fd_forest_align() );
434 192 : }
435 :
436 : /* fd_forest_new formats an unused memory region for use as a
437 : forest. mem is a non-NULL pointer to this region in the local
438 : address space with the required footprint and alignment. */
439 :
440 : void *
441 : fd_forest_new( void * shmem, ulong ele_max, ulong seed );
442 :
443 : /* fd_forest_join joins the caller to the forest. forest
444 : points to the first byte of the memory region backing the forest
445 : in the caller's address space. Returns a pointer in the local
446 : address space to forest on success. */
447 :
448 : fd_forest_t *
449 : fd_forest_join( void * forest );
450 :
451 : /* fd_forest_leave leaves a current local join. Returns a pointer
452 : to the underlying shared memory region on success and NULL on failure
453 : (logs details). Reasons for failure include forest is NULL. */
454 :
455 : void *
456 : fd_forest_leave( fd_forest_t const * forest );
457 :
458 : /* fd_forest_delete unformats a memory region used as a
459 : forest. Assumes only the nobody is joined to the region.
460 : Returns a pointer to the underlying shared memory region or NULL if
461 : used obviously in error (e.g. forest is obviously not a
462 : forest ... logs details). The ownership of the memory region is
463 : transferred to the caller. */
464 :
465 : void *
466 : fd_forest_delete( void * forest );
467 :
468 : /* fd_forest_init initializes a forest. Assumes forest
469 : is a valid local join and no one else is joined. root is the initial
470 : root forest will use. This is the snapshot slot if booting from
471 : a snapshot, 0 if the genesis slot.
472 :
473 : In general, this should be called by the same process that formatted
474 : forest's memory, ie. the caller of fd_forest_new. */
475 :
476 : fd_forest_t *
477 : fd_forest_init( fd_forest_t * forest, ulong root );
478 :
479 : /* Accessors */
480 :
481 : /* fd_forest_wksp returns the local join to the wksp backing the
482 : forest. The lifetime of the returned pointer is at least as
483 : long as the lifetime of the local join. Assumes forest is a
484 : current local join. */
485 :
486 : FD_FN_PURE static inline fd_wksp_t *
487 3557937 : fd_forest_wksp( fd_forest_t const * forest ) {
488 3557937 : return (fd_wksp_t *)( ( (ulong)forest ) - forest->wksp_gaddr );
489 3557937 : }
490 :
491 : /* fd_forest_{pool, pool_const} returns a pointer in the caller's address
492 : space to forest's element pool. */
493 :
494 : FD_FN_PURE static inline fd_forest_blk_t *
495 727224 : fd_forest_pool( fd_forest_t * forest ) {
496 727224 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->pool_gaddr );
497 727224 : }
498 :
499 : FD_FN_PURE static inline fd_forest_blk_t const *
500 29412 : fd_forest_pool_const( fd_forest_t const * forest ) {
501 29412 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->pool_gaddr );
502 29412 : }
503 :
504 : /* fd_forest_{ancestry, ancestry_const} returns a pointer in the caller's
505 : address space to forest's ancestry map. */
506 :
507 : FD_FN_PURE static inline fd_forest_ancestry_t *
508 525663 : fd_forest_ancestry( fd_forest_t * forest ) {
509 525663 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->ancestry_gaddr );
510 525663 : }
511 :
512 : FD_FN_PURE static inline fd_forest_ancestry_t const *
513 114 : fd_forest_ancestry_const( fd_forest_t const * forest ) {
514 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->ancestry_gaddr );
515 114 : }
516 :
517 : /* fd_forest_{frontier, frontier_const} returns a pointer in the caller's
518 : address space to forest's frontier map. */
519 :
520 : FD_FN_PURE static inline fd_forest_frontier_t *
521 536652 : fd_forest_frontier( fd_forest_t * forest ) {
522 536652 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->frontier_gaddr );
523 536652 : }
524 :
525 : FD_FN_PURE static inline fd_forest_frontier_t const *
526 135 : fd_forest_frontier_const( fd_forest_t const * forest ) {
527 135 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->frontier_gaddr );
528 135 : }
529 :
530 : /* fd_forest_{subtrees, subtrees_const} returns a pointer in the caller's
531 : address space to forest's subtrees map. */
532 :
533 : FD_FN_PURE static inline fd_forest_subtrees_t *
534 525579 : fd_forest_subtrees( fd_forest_t * forest ) {
535 525579 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->subtrees_gaddr );
536 525579 : }
537 :
538 : FD_FN_PURE static inline fd_forest_subtrees_t const *
539 114 : fd_forest_subtrees_const( fd_forest_t const * forest ) {
540 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->subtrees_gaddr );
541 114 : }
542 :
543 : /* fd_forest_{subtlist, subtlist_const} returns a pointer in the caller's
544 : address space to forest's subtlist. */
545 :
546 : FD_FN_PURE static inline fd_forest_subtlist_t *
547 24276 : fd_forest_subtlist( fd_forest_t * forest ) {
548 24276 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->subtlist_gaddr );
549 24276 : }
550 :
551 : FD_FN_PURE static inline fd_forest_subtlist_t const *
552 99 : fd_forest_subtlist_const( fd_forest_t const * forest ) {
553 99 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->subtlist_gaddr );
554 99 : }
555 :
556 : /* fd_forest_{orphaned, orphaned_const} returns a pointer in the caller's
557 : address space to forest's orphaned map. */
558 :
559 : FD_FN_PURE static inline fd_forest_orphaned_t *
560 536445 : fd_forest_orphaned( fd_forest_t * forest ) {
561 536445 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->orphaned_gaddr );
562 536445 : }
563 :
564 : FD_FN_PURE static inline fd_forest_orphaned_t const *
565 114 : fd_forest_orphaned_const( fd_forest_t const * forest ) {
566 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->orphaned_gaddr );
567 114 : }
568 :
569 : /* fd_forest_{consumed, consumed_const} returns a pointer in the caller's
570 : address space to forest's consumed map. */
571 :
572 : FD_FN_PURE static inline fd_forest_consumed_t *
573 189225 : fd_forest_consumed( fd_forest_t * forest ) {
574 189225 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->consumed_gaddr );
575 189225 : }
576 :
577 : FD_FN_PURE static inline fd_forest_consumed_t const *
578 135 : fd_forest_consumed_const( fd_forest_t const * forest ) {
579 135 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->consumed_gaddr );
580 135 : }
581 :
582 : /* fd_forest_{conslist, conslist_const} returns a pointer in the caller's
583 : address space to forest's consumed list. */
584 :
585 : FD_FN_PURE static inline fd_forest_conslist_t *
586 954 : fd_forest_conslist( fd_forest_t * forest ) {
587 954 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->conslist_gaddr );
588 954 : }
589 :
590 : FD_FN_PURE static inline fd_forest_conslist_t const *
591 99 : fd_forest_conslist_const( fd_forest_t const * forest ) {
592 99 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->conslist_gaddr );
593 99 : }
594 :
595 : /* fd_forest_{conspool, conspool_const} returns a pointer in the caller's
596 : address space to forest's consumed pool. */
597 :
598 : FD_FN_PURE static inline fd_forest_ref_t *
599 189264 : fd_forest_conspool( fd_forest_t * forest ) {
600 189264 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->conspool_gaddr );
601 189264 : }
602 :
603 : FD_FN_PURE static inline fd_forest_ref_t const *
604 234 : fd_forest_conspool_const( fd_forest_t const * forest ) {
605 234 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->conspool_gaddr );
606 234 : }
607 :
608 : /* fd_forest_{requests, requests_const} returns a pointer in the caller's
609 : address space to forest's requests map. */
610 :
611 : FD_FN_PURE static inline fd_forest_requests_t *
612 24471 : fd_forest_requests( fd_forest_t * forest ) {
613 24471 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->requests_gaddr );
614 24471 : }
615 :
616 : FD_FN_PURE static inline fd_forest_requests_t const *
617 114 : fd_forest_requests_const( fd_forest_t const * forest ) {
618 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->requests_gaddr );
619 114 : }
620 :
621 : /* fd_forest_{reqslist, reqslist_const} returns a pointer in the caller's
622 : address space to forest's reqslist. */
623 :
624 : FD_FN_PURE static inline fd_forest_reqslist_t *
625 11400 : fd_forest_reqslist( fd_forest_t * forest ) {
626 11400 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->reqslist_gaddr );
627 11400 : }
628 :
629 : FD_FN_PURE static inline fd_forest_reqslist_t const *
630 114 : fd_forest_reqslist_const( fd_forest_t const * forest ) {
631 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->reqslist_gaddr );
632 114 : }
633 :
634 : /* fd_forest_{orphreqs, orphanreqs_const} returns a pointer in the caller's
635 : address space to forest's orphanreqs. */
636 :
637 : FD_FN_PURE static inline fd_forest_requests_t *
638 11229 : fd_forest_orphreqs( fd_forest_t * forest ) {
639 11229 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->orphreqs_gaddr );
640 11229 : }
641 :
642 : FD_FN_PURE static inline fd_forest_requests_t const *
643 114 : fd_forest_orphreqs_const( fd_forest_t const * forest ) {
644 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->orphreqs_gaddr );
645 114 : }
646 :
647 : /* fd_forest_{orphlist, orphanlist_const} returns a pointer in the caller's
648 : address space to forest's orphanlist. */
649 :
650 : FD_FN_PURE static inline fd_forest_reqslist_t *
651 11238 : fd_forest_orphlist( fd_forest_t * forest ) {
652 11238 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->orphlist_gaddr );
653 11238 : }
654 :
655 : FD_FN_PURE static inline fd_forest_reqslist_t const *
656 114 : fd_forest_orphlist_const( fd_forest_t const * forest ) {
657 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->orphlist_gaddr );
658 114 : }
659 :
660 : /* fd_forest_{reqspool, reqspool_const} returns a pointer in the caller's
661 : address space to forest's reqspool pool. */
662 :
663 : FD_FN_PURE static inline fd_forest_ref_t *
664 35907 : fd_forest_reqspool( fd_forest_t * forest ) {
665 35907 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->reqspool_gaddr );
666 35907 : }
667 :
668 : FD_FN_PURE static inline fd_forest_ref_t const *
669 114 : fd_forest_reqspool_const( fd_forest_t const * forest ) {
670 114 : return fd_wksp_laddr_fast( fd_forest_wksp( forest ), forest->reqspool_gaddr );
671 114 : }
672 :
673 : /* fd_forest_root_slot returns forest's root slot. Assumes
674 : forest is a current local join. */
675 :
676 : FD_FN_PURE static inline ulong
677 13134 : fd_forest_root_slot( fd_forest_t const * forest ) {
678 13134 : if( FD_UNLIKELY( forest->root == fd_forest_pool_idx_null( fd_forest_pool_const( forest ) ) )) return ULONG_MAX; /* uninitialized */
679 13134 : return fd_forest_pool_ele_const( fd_forest_pool_const( forest ), forest->root )->slot;
680 13134 : }
681 :
682 : fd_forest_blk_t *
683 : fd_forest_query( fd_forest_t * forest, ulong slot );
684 :
685 : /* Operations */
686 :
687 : /* fd_forest_blk_insert inserts a new block into the forest. Assumes
688 : slot >= forest->root. blk_insert can also be called to create a
689 : sentinel block, i.e. a placeholder block that we know exists but
690 : don't know the parent slot of. The caller should pass in parent_slot
691 : == ULONG_MAX. In this case, the block inserted will remain an
692 : orphan/subtree at least until the next blk_insert is called with a
693 : different parent_slot, after which point no more updates to the
694 : parent_slot will be allowed. For non-sentinel blocks, blk insert is
695 : idempotent, and can be called multiple times with the same slot.
696 :
697 : If the forest pool is full at the time of insertion, a block will be
698 : chosen for eviction (see fd_forest.c:evict for more details). If the
699 : caller passes in a non-NULL evicted pointer, the evicted slot will be
700 : stored to the pointer.
701 :
702 : Returns the inserted (or existing) forest ele. NULL if the forest
703 : pool is full and no block could be evicted. */
704 :
705 : fd_forest_blk_t *
706 : fd_forest_blk_insert( fd_forest_t * forest, ulong slot, ulong parent_slot, ulong * evicted );
707 :
708 153144 : #define SHRED_SRC_TURBINE 0
709 153327 : #define SHRED_SRC_REPAIR 1
710 305610 : #define SHRED_SRC_RECOVERED 2
711 :
712 : /* fd_forest_shred_insert inserts a new shred into the forest. Assumes
713 : slot is already in forest, and should typically be preceded by a
714 : fd_forest_blk_insert. Returns the forest ele corresponding to the
715 : shred slot if the shred is accepted, and NULL if the shred is
716 : rejected. A shred can only be rejected if slot is able to verify
717 : that this shred does not belong to the canonical FEC set.
718 :
719 : A possible side effect of data_shred_insert is that it may update the
720 : parent slot of the block IF the inserted shred has a verifiably
721 : correct merkle root. Note this is different from a sentinel block
722 : parent update. A data shred insert can only update the parent slot if
723 : the data shred has a verified merkle root, and the parent slot is
724 : incorrect. A sentinel block will update its parent with the first
725 : parent slot it receives, but it can be later updated with a
726 : data_shred_insert. Each update type can only be performed once. */
727 :
728 : fd_forest_blk_t *
729 : fd_forest_data_shred_insert( fd_forest_t * forest,
730 : ulong slot,
731 : ulong parent_slot,
732 : uint shred_idx,
733 : uint fec_set_idx,
734 : int slot_complete,
735 : int ref_tick,
736 : int src,
737 : fd_hash_t * mr,
738 : fd_hash_t * cmr );
739 :
740 : fd_forest_blk_t *
741 : fd_forest_code_shred_insert( fd_forest_t * forest, ulong slot, uint shred_idx );
742 :
743 : /* fd_forest_fec_insert inserts a new fully completed FEC set into the
744 : forest. Assumes slot is already in forest, and should typically be
745 : called directly after fd_forest_block_insert. Returns the forest ele
746 : corresponding to the shred slot if the FEC was accepted, NULL
747 : otherwise. */
748 :
749 : fd_forest_blk_t *
750 : fd_forest_fec_insert( fd_forest_t * forest,
751 : ulong slot,
752 : ulong parent_slot,
753 : uint last_shred_idx,
754 : uint fec_set_idx,
755 : int slot_complete,
756 : int ref_tick,
757 : fd_hash_t * mr,
758 : fd_hash_t * cmr );
759 :
760 : /* fd_forest_fec_clear clears the FEC set at the given slot and
761 : fec_set_idx.
762 : Can fec_clear break requests frontier invariants? No.
763 :
764 : 2) If slot n is in scope of the forest root, then the shred
765 : delivered to repair will trigger a data_shred_insert call
766 : that does nothing, as repair already has record of that
767 : shred. Eventually the fec_completes or fec_clear msg will be
768 : delivered to repair. fec_insert will do nothing. fec_clear
769 : will remove the idxs for the shreds from the bitset, and
770 : update the buffered_idx. This doesn't matter though! because
771 : we already have moved past slot n on the requests frontier.
772 : No need to request those shreds again.
773 :
774 : Except 2) breaks a bit with in specific leader slot cases. See
775 : fd_forest_fec_clear for more details. */
776 : void
777 : fd_forest_fec_clear( fd_forest_t * forest, ulong slot, uint fec_set_idx, uint max_shred_idx );
778 :
779 : /* fd_forest_fec_chain_verify verifies the chain of merkle roots for a
780 : given block. Should only be called on a block that has all the shreds
781 : received. Returns a pointer to the first slot that does not confirm,
782 : or NULL if the chain is valid. */
783 : fd_forest_blk_t *
784 : fd_forest_fec_chain_verify( fd_forest_t * forest, fd_forest_blk_t * ele, fd_hash_t const * mr );
785 :
786 : void
787 : fd_forest_confirm( fd_forest_t * forest, fd_forest_blk_t * ele, fd_hash_t const * bid );
788 :
789 : /* fd_forest_merkle_last_incorrect_idx returns the highest incorrect FEC
790 : index for a given block. */
791 : static inline uint
792 33 : fd_forest_merkle_last_incorrect_idx( fd_forest_blk_t * ele ) {
793 33 : ulong first_verified_fec = ele->lowest_verified_fec;
794 : /* UNLIKELY because this is being called because we've detected an incorrect FEC */
795 33 : if( FD_UNLIKELY( first_verified_fec == 0 ) ) return UINT_MAX;
796 :
797 30 : uint bad_fec_idx = first_verified_fec == UINT_MAX ? ele->complete_idx / 32UL /* last FEC is wrong */
798 30 : : (uint)first_verified_fec - 1;
799 30 : return bad_fec_idx * 32UL;
800 33 : }
801 :
802 : /* fd_forest_publish publishes slot as the new forest root, setting
803 : the subtree beginning from slot as the new forest tree (ie. slot
804 : and all its descendants). Prunes all eles not in slot's forest.
805 : Assumes slot is present in forest. Returns the new root. */
806 :
807 : fd_forest_blk_t const *
808 : fd_forest_publish( fd_forest_t * forest, ulong slot );
809 :
810 : /* fd_forest_highest_repaired_slot returns the highest child of a fully,
811 : contiguously repaired slot. */
812 : ulong
813 : fd_forest_highest_repaired_slot( fd_forest_t const * forest );
814 :
815 : /* fd_forest_iter_* takes either the standard iterator or the orphan
816 : iterator and returns the next shred to request. The iterator must
817 : one of the two iterators that is owned by the forest.
818 :
819 : The iterator will be in an iter_done state if there are no current
820 : shreds to request.
821 :
822 : The forward forest iterator will visit each shred at most once over
823 : the lifetime of the forest, without revisiting past shreds, so it is
824 : up to the caller to track which shreds will need re-requesting. The
825 : exception to the rule is slots where the slot_complete shred is still
826 : not known - the highest window idx will be requested for that slot,
827 : and the slot will be added to the tail of the requests deque so that
828 : later we may revisit it. As a result, the children of that slot may
829 : also be revisited multiple times.
830 :
831 : Note this case is pretty rare.
832 :
833 : An iterator signifies to the repair tile to request the
834 : highest_window_index when the ele_idx is not null and shred_idx is
835 : UINT_MAX.
836 :
837 : Otherwise, the iterator signifies to the repair tile to request a
838 : regular shred window_idx.
839 :
840 : Invariants for requests map and requests deque:
841 :
842 : There can only be one occurence of the slot in the requests deque at
843 : any time. Any slot in the requests deque must exist in the requests
844 : map, and vice versa. Any slot in the requests map must also exist in
845 : the forest. During publish the requests map must also be pruned.
846 :
847 : If we are mid-request of a slot that gets pruned, forest will take
848 : responsibility to update the iterator to a valid slot.
849 :
850 : TODO: should this really be an iterator?? or just a _next function? */
851 :
852 : fd_forest_iter_t *
853 : fd_forest_iter_next( fd_forest_iter_t * iter, fd_forest_t * forest );
854 :
855 : int
856 : fd_forest_iter_done( fd_forest_iter_t * iter, fd_forest_t * forest );
857 :
858 : /* Misc */
859 :
860 : /* fd_forest_verify checks the forest is not obviously corrupt.
861 : Returns 0 if verify succeeds, -1 otherwise. */
862 :
863 : int
864 : fd_forest_verify( fd_forest_t const * forest );
865 :
866 : /* fd_forest_print pretty-prints a formatted forest tree. Printing begins
867 : from `ele` (it will appear as the root in the print output).
868 :
869 : The most straightforward and commonly used printing pattern is:
870 : `fd_forest_print( forest, fd_forest_root( forest ) )`
871 :
872 : This would print forest beginning from the root.
873 :
874 : Alternatively, caller can print a more localized view, for example
875 : starting from the grandparent of the most recently executed slot:
876 :
877 : ```
878 : fd_forest_blk_t const * ele = fd_forest_query( slot );
879 : fd_forest_print( forest, fd_forest_parent( fd_forest_parent( ele ) ) )
880 : ```
881 :
882 : Callers should add null-checks as appropriate in actual usage. */
883 :
884 : void
885 : fd_forest_print( fd_forest_t const * forest );
886 :
887 : FD_PROTOTYPES_END
888 :
889 : #endif /* HEADER_fd_src_discof_forest_fd_forest_h */
|