#ifndef HEADER_fd_src_choreo_tower_fd_tower_h
#define HEADER_fd_src_choreo_tower_fd_tower_h

/* fd_tower presents an API for Solana's TowerBFT algorithm.

   What is TowerBFT? TowerBFT is an algorithm for converging a
   supermajority of stake in the validator cluster on the same fork.

          /-- 3-- 4 (A)
   1-- 2
          \-- 5 (B)

   Above is a diagram of a fork. The leader for slot 5 decided to build
   off slot 2, rather than slot 4. This can happen for various reasons,
   for example network propagation delay. We now have two possible forks
   labeled A and B. The consensus algorithm has to pick one of them.

   So, how does the consensus algorithm pick? As detailed in fd_ghost.h,
   we pick the fork based on the most stake from votes, called the
   "heaviest". Validators vote for blocks during replay, and
   simultaneously use other validators' votes to determine which block
   to vote for. This encourages convergence, because as one fork gathers
   more votes, more and more votes pile on, solidifying its position as
   the heaviest fork.

          /-- 3-- 4 (10%)
   1-- 2
          \-- 5 (9%)

   However, network propagation delay of votes can lead us to think one
   fork is heaviest, before observing new votes that indicate another
   fork is heavier. So our consensus algorithm also needs to support
   switching.

          /-- 3-- 4 (10%)
   1-- 2
          \-- 5 (15%)

   At the same time, we don't want excessive switching. The more often
   validators switch, the more difficult it will be to achieve that
   pile-on effect just described.

   Note that to switch forks, you need to roll back a given slot and its
   descendants on that fork. In the example above, to switch to 1, 2, 5,
   we need to roll back 3 and 4. The consensus algorithm makes it more
   costly the further back you want to roll back a fork: every vote
   carries a "lockout" that doubles for every additional vote you would
   need to roll back (the tower diagrams below make this concrete).

   Eventually you have traversed far enough down a fork that the lockout
   is so great it is infeasible to imagine it ever rolling back in
   practice. So you can make that fork permanent, or "commit" it. Once
   all validators do this, the blockchain now just has a single fork.

   Armed with some intuition, let's now begin defining some terminology.
   Here is a diagram of a validator's "vote tower":

   slot | confirmation count (conf)
   --------------------------------
      4 | 1
      3 | 2
      2 | 3
      1 | 4

   It is a stack structure in which each element is a vote. The vote
   slot column indicates which slots the validator has voted for,
   ordered from most to least recent.

   The confirmation count column indicates how many consecutive votes on
   the same fork have been pushed on top of that vote (counting the vote
   itself). You are confirming your own votes for a fork every time you
   vote on top of the same fork.

   Two concepts related to confirmation count are lockout and the
   expiration slot. Lockout equals 2 to the power of confirmation count.
   Every time we "confirm" a vote by voting on top of it, we double the
   lockout. The expiration slot is the sum of the vote slot and its
   lockout, so it also increases when lockouts double. It is the slot at
   which the vote will expire. When a vote expires, it is popped from
   the top of the tower. An important Tower rule is that a validator
   cannot vote for a fork different from a given vote slot's fork until
   reaching the expiration slot for that vote slot. To summarize, the
   further a validator wants to roll back their fork (or vote slots),
   the longer the validator needs to wait without voting (in slot time).

   Here is the same tower, fully-expanded to include all the fields:

   slot | conf | lockout | expiration
   ----------------------------------
      4 |    1 |       2 |          6
      3 |    2 |       4 |          7
      2 |    3 |       8 |         10
      1 |    4 |      16 |         17

   Based on this tower, the validator is locked out from voting for any
   slot <= 6 that is on a different fork than slot 4. I'd like to
   emphasize that the expiration is with respect to the vote slot, and
   is _not_ related to the Proof-of-History slot or what the "current"
   slot is. So even if the current slot is now 7, the validator can't go
   back and vote for slot 5, if slot 5 were on a different fork than 4.
   The earliest valid vote slot this validator could submit for a
   different fork from 4 would be slot 7 or later.

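   In C-like notation (an illustrative sketch only, not part of this
   API):

     ulong lockout    = 1UL << conf;          // doubles per confirmation
     ulong expiration = vote_slot + lockout;  // eg. 4 + (1UL<<1) = 6
     // a vote for `slot` on a different fork is allowed iff
     // slot > expiration, so the earliest such slot above is 7
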
   Next let's look at how the tower makes state transitions. Here we
   have the previous example tower, with a before-and-after view with
   respect to a vote for slot 9:

   (before) slot | conf
            -----------
               4 | 1
               3 | 2
               2 | 3
               1 | 4

   (after)  slot | conf
            -----------
               9 | 1
               2 | 3
               1 | 4

   As you can see, we added a vote for slot 9 to the top of the tower.
   But we also removed the votes for slots 4 and 3. What happened? This
   is an example of vote expiry in action. When we voted for slot 9, we
   exceeded the expirations of vote slots 4 and 3, which were 6 and 7
   respectively. The act of voting triggered the popping of those
   expired votes from the top of the tower.

   Next, we add a vote for slot 11:

   (before) slot | conf
            -----------
               9 | 1
               2 | 3
               1 | 4

   (after)  slot | conf
            -----------
              11 | 1
               9 | 2
               2 | 3
               1 | 4

   The vote for slot 11 doesn't involve any expirations, so we just push
   it onto the top of the tower. This also demonstrates an important
   property of lockouts: the lockout for vote slot 9 doubled (ie. its
   confirmation count increased by 1), but the lockouts of vote slots 2
   and 1 remained unchanged.

   The reason is that confirmation counts only increase while they are
   consecutive in the vote tower. Because 4 and 3 were expired
   previously by the vote for 9, that consecutive property was broken.
   In this case, the vote for slot 11 is only consecutive with slot 9,
   but not with 2 and 1. Specifically, there is a gap in the
   before-tower at confirmation count 2.

   In the after-tower, all the votes are again consecutive (confirmation
   counts 1, 2, 3, 4 are all accounted for), so the next vote will
   result in all lockouts doubling, as long as it doesn't result in more
   expirations.

   One other thing to point out about this vote for slot 11: even though
   11 exceeds the expiration slot of vote slot 2, which is 10, voting
   for 11 did not expire the vote for 2. This is because expiration
   happens top-down and contiguously. Because vote slot 9 was not
   expired, we do not proceed with expiring 2.

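   A minimal sketch of this expire-then-push rule (hypothetical helper
   names; the real vote operation is fd_tower_vote, declared below):

     // pop expired votes top-down, stopping at the first unexpired
     while( !tower_empty( tower ) ) {
       vote_t top = tower_peek_top( tower );
       if( slot <= top.slot + (1UL<<top.conf) ) break;  // unexpired
       tower_pop_top( tower );
     }
     tower_push_top( tower, (vote_t){ .slot = slot, .conf = 1UL } );
     // then increment the conf (ie. double the lockout) of each vote
     // that remains consecutive with the votes above it
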
   In the Tower rules, once a vote reaches a confirmation count of 32
   (max lockout), it is considered rooted and is popped from the bottom
   of the tower. Here is an example:

   (before) slot | conf
            -----------
              50 | 1
             ... | ... (29 votes elided)
               1 | 31

   (after)  slot | conf
            -----------
              51 | 1
             ... | ... (29 votes elided)
               2 | 31

   So the tower is really a double-ended queue rather than a stack. Note
   that the vote slots are not necessarily consecutive: the votes
   sandwiched between the newest and oldest votes are elided here for
   brevity.

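   As a sketch, continuing the pseudo-C above (hypothetical helper
   names): the pop from the bottom happens at vote time, when the tower
   is already full:

     ulong root = FD_SLOT_NULL;
     if( tower_cnt( tower )==31UL ) {       // bottom conf would reach 32
       vote_t bottom = tower_pop_bottom( tower );
       root = bottom.slot;                  // the new root
     }
     tower_push_top( tower, (vote_t){ .slot = slot, .conf = 1UL } );
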
   Rooting has implications beyond the Tower. It's what we use to prune
   our state. Every time the tower makes a new root slot, we prune any
   old state that does not originate from that new root slot. Our
   blockstore will discard blocks below that root, our forks structure
   will discard stale banks, funk (which is our accounts database) will
   discard stale transactions (which in turn track modifications to
   accounts), and ghost (which is our fork select tree) will discard
   stale nodes tracking stake percentages. We call this operation
   publishing.

   Next, let's go over three additional tower checks. These three checks
   further reinforce the consensus algorithm we established with
   intuition, in this case getting a supermajority (ie. 2/3) of stake to
   converge on a fork.

   The first is the threshold check. The threshold check makes sure at
   least 2/3 of stake has voted for the same fork as the vote at depth 8
   in our tower. Essentially, this guards our tower from getting too out
   of sync with the rest of the cluster. If we get too out of sync, we
   can't vote for a long time, because we would have to roll back a vote
   we had already confirmed many times and that carries a large lockout.
   This might otherwise happen as the result of a network partition
   where we can only communicate with a subset of stake.

   Next is the lockout check. We went into detail on this earlier when
   going through the lockout and expiration slot. As before, the rule is
   we can only vote for a slot on a different fork from a previous vote
   after that vote's expiration slot.

   Given this fork and tower from earlier:

          /-- 3-- 4
   1-- 2
          \-- 5

   slot | conf
   -----------
      4 | 1
      3 | 2
      2 | 3
      1 | 4

   You're locked out from voting for 5, because it's on a different fork
   from 4 and the expiration slot of your previous vote for 4 is 6.

   However, if we introduce a new slot 9:

          /-- 3-- 4
   1-- 2
          \-- 5-- 9

   slot | conf
   -----------
      9 | 1
      2 | 3
      1 | 4

   Here the new slot 9 descends from 5 and, unlike 5, exceeds the
   expiration slots of vote slots 4 and 3 (6 and 7 respectively).

   After your lockout expires, the tower rules allow you to vote for
   descendants of the fork slot you wanted to switch to in the first
   place (here, 9 descending from 5). So we eventually switch to the
   fork we wanted, by voting for 9 and expiring 3 and 4.

   Importantly, notice that the fork slots and vote slots are not
   exactly 1-to-1. While conceptually our tower is voting for the fork
   1, 2, 5, 9, the vote for 5 is only implied. Our tower votes
   themselves still can't include 5 due to lockout.

   Finally, the switch check. The switch check is used to safeguard
   optimistic confirmation. Optimistic confirmation is when a slot gets
   2/3 of stake-weighted votes. This is then treated as a signal that
   the slot will eventually get rooted. However, to actually guarantee
   this, we need a rule that prevents validators from arbitrarily
   switching forks (even when their vote lockout has expired). This rule
   is the switch check.

   The switch check is additional to the lockout check. Before switching
   forks, we need to make sure at least 38% of stake has voted for a
   different fork than our own. "Different fork" is defined by finding
   the greatest common ancestor of our last voted fork slot and the slot
   we want to switch to. Any forks descending from the greatest common
   ancestor (which I will subsequently call the GCA) that are not our
   own fork are counted towards the switch check stake.

   Here we visualize the switch check:

              /-- 7
         /-- 3-- 4
   1-- 2-- 6
         \-- 5-- 9

   First, we find the GCA of 4 and 9, which is 2. Then we look at all
   the descendants of the GCA that do not share a fork with us, and make
   sure their stake sums to more than 38%.

   I'd like to highlight that 7 here is not counted towards the switch
   proof, even though it is on a different fork from 4. This is because
   it's on the same fork relative to the GCA.

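   A sketch of the stake accumulation (hypothetical helper names; the
   real API is fd_tower_switch_check, declared below):

     ulong gca          = gca_of( ghost, our_vote_slot, switch_slot );  // 2
     ulong switch_stake = 0UL;
     for( ulong i=0UL; i<voter_cnt; i++ ) {
       // count voter i iff their latest vote descends from the GCA but
       // is outside the GCA subtree containing our own last vote (so
       // the 6 and 5--9 subtrees count, while 7 does not)
       if( descends_from( ghost, voter[i].vote_slot, gca ) &&
           !descends_from( ghost, voter[i].vote_slot, our_gca_child ) )
         switch_stake += voter[i].stake;
     }
     int ok = ( switch_stake > (total_stake*38UL)/100UL );
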
   So that covers the checks. Next, there are two additional important
   concepts: "reset slot" and "vote slot". The reset slot is the slot
   you reset PoH to when it's your turn to be leader. Because you are
   responsible for producing a block, you need to decide which fork to
   build your block on. For example, if there are two competing slots 3
   and 4, you would decide whether to build 3 <- 5 or 4 <- 5. In general
   the reset slot is on the same fork as the vote slot, but not always.
   There is an important reason for this. Recall this fork graph from
   earlier:

          /-- 3-- 4 (10%)
   1-- 2
          \-- 5-- 6 (9%)

   In this diagram, 4 is the winner of fork choice. All future leaders
   now want to reset to slot 4. Naively, this makes sense, because you
   maximize the chance of your block finalizing (and earning the
   rewards) if you greedily (in the algorithmic, and perhaps also
   literal, sense) pick what's currently the heaviest.

   However, say most validators actually voted for fork 5, even though
   we currently observe 4's fork as the heavier one. For whatever
   reason, those votes for 5 just didn't land (the leader for 6 missed
   the votes, network blip, etc.).

   All these validators that voted for 5 are now constrained by the
   switch check (38% of stake), and none of them can actually switch
   their vote to 4 (which only has 10%). But if they all continue to
   build blocks on top of fork 4, votes for 5 will never be able to
   propagate. The validators that can't switch keep refreshing their
   votes for 5, but those votes never "land", because no one is building
   blocks on top of fork 5 anymore (everyone is building on 4, because
   that's currently the heaviest).

   Therefore, it is important to reset to the same fork as your last
   vote slot, which is usually also the heaviest fork, but not always.

   Note that with both the vote slot and the reset slot, the tower uses
   ghost to determine the last vote slot's ancestry. So what happens if
   the last vote slot isn't in ghost? There are two separate cases in
   which this can happen that tower needs to handle:

   1. Our last vote slot > ghost root slot, but is not a descendant of
      the ghost root. This can happen if we get stuck on a minority fork
      with a long lockout. In the worst case, lockout duration is
      2^{threshold_check_depth} ie. 2^8 = 256 slots. In other words, we
      voted for and confirmed a minority fork 8 times in a row. We
      assume we won't vote past 8 times for the minority fork, because
      the threshold check would have stopped us (recall the threshold
      check requires 2/3 of stake to be on the same fork at depth 8
      before we can keep voting for that fork).

      While waiting for those 256 slots of lockout to expire, it is
      possible that in the meantime a supermajority (ie. >2/3) of the
      cluster actually roots another fork that is not ours. During
      regular execution, we would not publish ghost until we have an
      updated tower root. So as long as the validator stays running
      while it is locked out from the supermajority fork, it keeps track
      of its vote slot's ancestry.

      If the validator were to stop running while locked out though (eg.
      the operator needed to restart the box), the validator attempts to
      repair the ancestry of its last vote slot.

      In the worst case, if we cannot repair that ancestry, then we do
      not vote until replay reaches the expiration slot of that last
      vote slot. We can assume the votes deeper than depth 8 in the
      tower do not violate lockout, because again the threshold check
      would have guarded them.

      TODO CURRENTLY THIS IS UNHANDLED. WHAT THE VALIDATOR DOES IF IT
      HAS LOST THE GHOST ANCESTRY IS IT WILL ERROR OUT.

   2. Our last vote slot < ghost root slot. In this case we simply
      cannot determine whether our last vote slot is on the same fork as
      our ghost root slot, because we no longer have ancestry
      information before the ghost root slot. This can happen if the
      validator was not running for a long time and then started up
      again. It will have to use the snapshot slot as the beginning of
      the ghost ancestry, which could be well past the last vote slot in
      the tower.

      In this case, before the validator votes again, it makes sure that
      the last vote's confirmation count >= THRESHOLD_CHECK_DEPTH
      (stated differently, it makes sure the next time it votes it will
      expire at least the first THRESHOLD_CHECK_DEPTH votes in the
      tower), and then it assumes that the last vote slot is on the same
      fork as the ghost root slot.

      TODO VERIFY AGAVE BEHAVIOR IS THE SAME.

   Now let's switch gears from theory back to practice. What does it
   mean to send a vote?

   As a validator, you aren't sending individual tower votes. Rather,
   you are sending your entire updated tower to the cluster every time.
   Essentially, the validator is continuously syncing their local tower
   with the cluster. That tower state is then stored inside a vote
   account, like any other state on Solana.

   On the flip side, we also must stay in sync the other way, from
   cluster to local. If we have previously voted, we need to make sure
   our tower matches up with what the cluster has last seen. We know the
   most recent tower is in the last vote we sent, so we durably store
   every tower (by checkpointing it to disk) whenever we send a vote. In
   case this checkpointed tower is out-of-date, Funk, our accounts
   database, conveniently stores all the vote accounts including our
   own, so on bootstrap we simply load in our own vote account state to
   initialize our local view of the tower.

   Finally, a note on the difference between the Vote Program and
   TowerBFT. The Vote Program runs during transaction (block) execution.
   It checks that certain invariants about the tower inside a vote
   transaction are upheld (recall a validator sends their entire tower
   as part of a "vote"); otherwise, it fails the transaction. For
   example, it checks that every vote contains a tower in which the vote
   slots are strictly monotonically increasing. As a consequence, only
   valid towers are committed to the ledger. Another important detail of
   the Vote Program is that it only has a view of the current fork on
   which it is executing. Specifically, it can't observe what state is
   on other forks, like what a validator's tower looks like on fork A
   vs. fork B.

   The TowerBFT rules, on the other hand, run after transaction
   execution. Also unlike the Vote Program, the TowerBFT rules do not
   take the vote transactions as inputs: rather, the inputs are the
   towers that have already been written to the ledger by the Vote
   Program. As described above, the Vote Program validates every tower,
   and in this way the TowerBFT rules leverage the validation already
   done by the Vote Program to (mostly) assume each tower is valid.
   Every validator runs TowerBFT to update their own tower with votes
   based on the algorithm documented above. Importantly, TowerBFT has a
   view of all forks, and the validator makes a voting decision based on
   all forks. */

#include "../fd_choreo_base.h"
#include "../epoch/fd_epoch.h"
#include "../ghost/fd_ghost.h"
#include "../voter/fd_voter.h"
#include "../../disco/pack/fd_microblock.h"
#include "../../flamenco/runtime/fd_blockstore.h"
#include "../../flamenco/runtime/fd_system_ids.h"
#include "../../flamenco/txn/fd_txn_generate.h"
#include "../../funk/fd_funk.h"

/* FD_TOWER_USE_HANDHOLDING: Define this to non-zero at compile time
   to turn on additional runtime checks and logging. */

#ifndef FD_TOWER_USE_HANDHOLDING
#define FD_TOWER_USE_HANDHOLDING 1
#endif

#define FD_TOWER_VOTE_MAX (31UL)

struct fd_tower_vote {
  ulong slot; /* vote slot */
  ulong conf; /* confirmation count */
};
typedef struct fd_tower_vote fd_tower_vote_t;

#define DEQUE_NAME fd_tower_votes
#define DEQUE_T    fd_tower_vote_t
#define DEQUE_MAX  FD_TOWER_VOTE_MAX
#include "../../util/tmpl/fd_deque.c"

/* fd_tower is a representation of a validator's "vote tower" (described
   in detail in the preamble at the top of this file). The votes in the
   tower are stored in an fd_deque, ordered from lowest to highest vote
   slot (highest to lowest confirmation count) relative to the head and
   tail. There can be at most FD_TOWER_VOTE_MAX (31) votes in the tower.
   This invariant is upheld with every call to `fd_tower_vote`.

   The definition of `fd_tower_t` is a simple typedef alias for
   `fd_tower_vote_t` and is a transparent wrapper around the vote deque.
   Relatedly, the tower API takes a local pointer to the first vote in
   the deque (the result of `fd_tower_votes_join`) as a parameter in all
   its function signatures. */

typedef fd_tower_vote_t fd_tower_t;

/* FD_TOWER_{ALIGN,FOOTPRINT} specify the alignment and footprint needed
   for tower. ALIGN is double x86 cache line to mitigate various kinds
   of false sharing (eg. ACLPF adjacent cache line prefetch). FOOTPRINT
   is the size of the fd_deque including the private header's start and
   end fields, and is an exact multiple of ALIGN. These are provided to
   facilitate compile time tower declarations. */

#define FD_TOWER_ALIGN     (128UL)
#define FD_TOWER_FOOTPRINT (512UL)
FD_STATIC_ASSERT( FD_TOWER_FOOTPRINT==sizeof(fd_tower_votes_private_t), FD_TOWER_FOOTPRINT );
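
/* For example, a tower may be declared at compile time like so (a
   sketch; any scheme that satisfies the footprint and alignment
   requirements works):

     static uchar tower_mem[ FD_TOWER_FOOTPRINT ]
       __attribute__((aligned(FD_TOWER_ALIGN)));
*/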

/* fd_tower_{align,footprint} return the required alignment and
   footprint of a memory region suitable for use as a tower. align
   returns FD_TOWER_ALIGN. footprint returns FD_TOWER_FOOTPRINT. */

FD_FN_CONST static inline ulong
fd_tower_align( void ) {
  return FD_TOWER_ALIGN;
}

FD_FN_CONST static inline ulong
fd_tower_footprint( void ) {
  return FD_TOWER_FOOTPRINT;
}

/* fd_tower_new formats an unused memory region for use as a tower. mem
   is a non-NULL pointer to this region in the local address space with
   the required footprint and alignment. */

void *
fd_tower_new( void * mem );

/* fd_tower_join joins the caller to the tower. tower points to the
   first byte of the memory region backing the tower in the caller's
   address space.

   Returns a pointer in the local address space to tower on success. */

fd_tower_t *
fd_tower_join( void * tower );
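
/* For example, a typical lifecycle (a sketch; tower_mem is a region
   with the required footprint and alignment, such as the compile time
   declaration above, and error handling is elided):

     fd_tower_t * tower = fd_tower_join( fd_tower_new( tower_mem ) );
     ...
     fd_tower_delete( fd_tower_leave( tower ) );
*/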

/* fd_tower_leave leaves a current local join. Returns a pointer to the
   underlying shared memory region on success and NULL on failure (logs
   details). Reasons for failure include tower is NULL. */

void *
fd_tower_leave( fd_tower_t * tower );

/* fd_tower_delete unformats a memory region used as a tower. Assumes
   only the local process is joined to the region. Returns a pointer to
   the underlying shared memory region or NULL if used obviously in
   error (e.g. tower is obviously not a tower ... logs details). The
   ownership of the memory region is transferred to the caller. */

void *
fd_tower_delete( void * tower );

/* fd_tower_lockout_check checks if we are locked out from voting for
   `slot`. Returns 1 if we can vote for `slot` without violating
   lockout, 0 otherwise. Assumes tower is non-empty.

   After voting for a slot n, we are locked out for 2^k slots, where k
   is the confirmation count of that vote. Once locked out, we cannot
   vote for a different fork until that previously-voted fork expires at
   slot n+2^k. This implies the earliest slot in which we can switch
   from the previously-voted fork is (n+2^k)+1. We use `ghost` to
   determine whether `slot` is on the same or different fork as previous
   vote slots.

   In the case of the tower, every vote has its own expiration slot
   depending on confirmations. The confirmation count is the max number
   of consecutive votes that have been pushed on top of the vote, and
   not necessarily its current height in the tower.

   For example, the following is a diagram of a tower pushing and
   popping with each vote:

   slot | confirmation count
   -----|-------------------
      4 | 1 <- vote
      3 | 2
      2 | 3
      1 | 4

   slot | confirmation count
   -----|-------------------
      9 | 1 <- vote
      2 | 3
      1 | 4

   slot | confirmation count
   -----|-------------------
     10 | 1 <- vote
      9 | 2
      2 | 3
      1 | 4

   slot | confirmation count
   -----|-------------------
     11 | 1 <- vote
     10 | 2
      9 | 3
      2 | 4
      1 | 5

   slot | confirmation count
   -----|-------------------
     18 | 1 <- vote
      2 | 4
      1 | 5

   In the final tower, note the gap in confirmation counts between slot
   18 and slot 2, even though slot 18 is directly above slot 2. */

int
fd_tower_lockout_check( fd_tower_t const * tower,
                        fd_ghost_t const * ghost,
                        ulong              slot );

/* fd_tower_switch_check checks if we can switch to `fork`. Returns 1
   if we can switch, 0 otherwise. Assumes tower is non-empty.

   There are two forks of interest: our last vote fork ("vote fork") and
   the fork we want to switch to ("switch fork"). The switch fork is
   what the caller passes in as `fork`.

   In order to switch, FD_TOWER_SWITCH_PCT of stake must have voted for
   a different descendant of the GCA of vote_fork and switch_fork, and
   also must be locked out from our last vote slot.

   Recall from the lockout check that a validator is locked out from
   voting for our last vote slot when their last vote slot is on a
   different fork and that vote's expiration slot is > our last vote
   slot.

   The following pseudocode describes the algorithm:

   ```
   find the greatest common ancestor (gca) of vote_fork and switch_fork
   for all validators v
      if v is locked out from voting for our latest vote slot
         add v's stake to switch stake
   return switch stake >= FD_TOWER_SWITCH_PCT
   ```

   The switch check is used to safeguard optimistic confirmation.
   Specifically: FD_TOWER_OPT_CONF_PCT + FD_TOWER_SWITCH_PCT >= 1. */

int
fd_tower_switch_check( fd_tower_t const * tower,
                       fd_epoch_t const * epoch,
                       fd_ghost_t const * ghost,
                       ulong              slot );

/* fd_tower_threshold_check checks if we pass the threshold required to
   vote for `slot`. This is only relevant after voting for (and
   confirming) the same fork ie. the tower is FD_TOWER_THRESHOLD_DEPTH
   deep. Returns 1 if we pass the threshold check, 0 otherwise.

   The following pseudocode describes the algorithm:

   ```
   for all vote accounts in the current epoch

      simulate that the validator has voted for `slot`

      pop all votes expired by that simulated vote

      if the validator's latest tower vote after expiry >= our threshold
      slot ie. our vote from FD_TOWER_THRESHOLD_DEPTH back (after
      simulating a vote on our own tower the same way), then add the
      validator's stake to threshold_stake.

   return threshold_stake >= FD_TOWER_THRESHOLD_PCT
   ```

   The threshold check simulates voting for the current slot to expire
   stale votes. This is to prevent validators that haven't voted in a
   long time from counting towards the threshold stake. */

int
fd_tower_threshold_check( fd_tower_t const *    tower,
                          fd_epoch_t const *    epoch,
                          fd_funk_t *           funk,
                          fd_funk_txn_t const * txn,
                          ulong                 slot,
                          fd_spad_t *           runtime_spad );

/* fd_tower_reset_slot returns the slot to reset PoH to when building
   the next leader block. Assumes tower and ghost are both valid local
   joins and in-sync ie. every vote slot in tower corresponds to a node
   in ghost. Returns FD_SLOT_NULL if this is not true.

   In general our reset slot is the fork head of our last vote slot, but
   there are 3 cases in which that doesn't apply:

   1. If we haven't voted before, we reset to the ghost head.

   2. If our last vote slot is older than the ghost root, we know we
      don't have ancestry information about our last vote slot anymore,
      so we also reset to the ghost head.

   3. If our last vote slot is newer than the ghost root, but we are
      locked out on a minority fork that does not chain back to the
      ghost root, we know that we should definitely not reset to a slot
      on this fork to build a block, given a supermajority of the
      cluster has already rooted a different fork. So we build off the
      best fork instead.

   See the top-level documentation in fd_tower.h for more context. */

ulong
fd_tower_reset_slot( fd_tower_t const * tower,
                     fd_epoch_t const * epoch,
                     fd_ghost_t const * ghost );

/* fd_tower_vote_slot returns the correct vote slot to pick given the
   ghost tree. Returns FD_SLOT_NULL if we cannot vote, because we are
   locked out, do not meet the switch threshold, or fail the threshold
   check.

   Specifically, these are the two scenarios in which we can vote:

   1. the ghost head is on the same fork as our last vote slot, and
      we pass the threshold check.
   2. the ghost head is on a different fork than our last vote slot,
      but we pass both the lockout and switch checks, so we can
      switch to the ghost head's fork. */

ulong
fd_tower_vote_slot( fd_tower_t *          tower,
                    fd_epoch_t const *    epoch,
                    fd_funk_t *           funk,
                    fd_funk_txn_t const * txn,
                    fd_ghost_t const *    ghost,
                    fd_spad_t *           runtime_spad );

/* fd_tower_simulate_vote simulates a vote on the vote tower for `slot`,
   returning the new height (cnt) of the tower after all the votes that
   would have been expired by the simulated vote are popped. Assumes
   tower is non-empty. */

ulong
fd_tower_simulate_vote( fd_tower_t const * tower, ulong slot );

/* Operations */

/* fd_tower_vote votes for slot. Assumes caller has already performed
   the relevant tower checks (lockout_check, etc.) to ensure it is valid
   to vote for `slot`. Returns a new root if this vote results in the
   lowest vote slot in the tower reaching max lockout. The lowest vote
   will also be popped from the tower.

   Max lockout is equivalent to 1UL << (FD_TOWER_VOTE_MAX+1) (which
   implies a confirmation count of FD_TOWER_VOTE_MAX + 1). As a result,
   fd_tower_vote also maintains the invariant that the tower contains at
   most FD_TOWER_VOTE_MAX votes, because (in addition to vote expiry)
   there will always be a pop before reaching FD_TOWER_VOTE_MAX + 1
   votes. */

ulong
fd_tower_vote( fd_tower_t * tower, ulong slot );
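
/* For example, a typical voting sequence might look like the following
   (a sketch: treating FD_SLOT_NULL as the no-new-root return value is
   an assumption, and publish() is a hypothetical stand-in for pruning
   state at the new root):

     ulong slot = fd_tower_vote_slot( tower, epoch, funk, txn, ghost,
                                      runtime_spad );
     if( slot!=FD_SLOT_NULL ) {              // all tower checks passed
       ulong root = fd_tower_vote( tower, slot );
       if( root!=FD_SLOT_NULL ) publish( root );
     }
*/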

/* Misc */

/* fd_tower_from_vote_acc writes the saved tower inside the vote account
   state of `vote_acc` (queried from funk at `txn`) to the
   caller-provided `tower`. Assumes `tower` is a valid join of an
   fd_tower that is currently empty. */

void
fd_tower_from_vote_acc( fd_tower_t *              tower,
                        fd_funk_t *               funk,
                        fd_funk_txn_t const *     txn,
                        fd_funk_rec_key_t const * vote_acc );

/* fd_tower_to_vote_txn converts an fd_tower_t into a fd_tower_sync_t
   to be sent out as a vote program ix inside a txn, which is written to
   the caller-provided `vote_txn`. */

void
fd_tower_to_vote_txn( fd_tower_t const *  tower,
                      ulong               root,
                      fd_hash_t const *   bank_hash,
                      fd_hash_t const *   recent_blockhash,
                      fd_pubkey_t const * validator_identity,
                      fd_pubkey_t const * vote_authority,
                      fd_pubkey_t const * vote_acc,
                      fd_txn_p_t *        vote_txn,
                      fd_spad_t *         runtime_spad );

/* fd_tower_verify checks the tower is in a valid state. The cnt should
   be <= FD_TOWER_VOTE_MAX, the vote slots should be monotonically
   increasing (and the confirmation counts monotonically decreasing)
   from the bottom to the top of the tower, and the root should be <
   the bottom vote's slot. */

int
fd_tower_verify( fd_tower_t const * tower );

/* fd_tower_print pretty-prints tower as a formatted table.

   Sample output:

        slot | confirmation count
   --------- | ------------------
   279803918 | 1
   279803917 | 2
   279803916 | 3
   279803915 | 4
   279803914 | 5
   279803913 | 6
   279803912 | 7
*/

void
fd_tower_print( fd_tower_t const * tower, ulong root );

#endif /* HEADER_fd_src_choreo_tower_fd_tower_h */