#ifndef HEADER_fd_src_choreo_tower_fd_tower_h
#define HEADER_fd_src_choreo_tower_fd_tower_h

/* fd_tower presents an API for Solana's TowerBFT algorithm.

   What is TowerBFT? TowerBFT is an algorithm for converging a
   supermajority of stake in the validator cluster on the same fork.

        /-- 3-- 4 (A)
   1-- 2
        \-- 5 (B)

   Above is a diagram of a fork. The leader for slot 5 decided to build
   off slot 2, rather than slot 4. This can happen for various reasons,
   for example network propagation delay. We now have two possible
   forks labeled A and B. The consensus algorithm has to pick one of
   them.

   So, how does the consensus algorithm pick? As detailed in
   fd_ghost.h, we pick the fork based on the most stake from votes,
   called the "heaviest". Validators vote for blocks during replay, and
   simultaneously use other validators' votes to determine which block
   to vote for. This encourages convergence, because as one fork
   gathers more votes, more and more votes pile on, solidifying its
   position as the heaviest fork.

        /-- 3-- 4 (10%)
   1-- 2
        \-- 5 (9%)

   However, network propagation delay of votes can lead us to think one
   fork is heaviest, before observing new votes that indicate another
   fork is heavier. So our consensus algorithm also needs to support
   switching.

        /-- 3-- 4 (10%)
   1-- 2
        \-- 5 (15%)

   At the same time we don't want excessive switching. The more often
   validators switch, the more difficult it will be to achieve that
   pile-on effect I just described.

   Note that to switch forks, you need to roll back a given slot and
   its descendants on that fork. In the example above, to switch to 1,
   2, 5, we need to roll back 3 and 4. The consensus algorithm makes it
   more costly the further back you want to roll a fork. This cost is
   called lockout, and it doubles for every additional slot you want to
   roll back.

   Eventually you have traversed far enough down a fork that the
   lockout is so great it is infeasible to imagine it ever rolling back
   in practice. So you can make that fork permanent or "commit" it.
   Once all validators do this, the blockchain now just has a single
   fork.

   Armed with some intuition, let's now begin defining some
   terminology. Here is a diagram of a validator's "vote tower":

   slot | confirmation count (conf)
   --------------------------------
   4    | 1
   3    | 2
   2    | 3
   1    | 4

   It is a stack structure in which each element is a vote. The vote
   slot column indicates which slots the validator has voted for,
   ordered from most to least recent.

   The confirmation count column indicates how many consecutive votes
   on the same fork have been pushed on top of that vote. You are
   confirming your own votes for a fork every time you vote on top of
   the same fork.

   Two related concepts to confirmation count are lockout and
   expiration slot. Lockout equals 2 to the power of confirmation
   count. Every time we "confirm" a vote by voting on top of it, we
   double the lockout. The expiration slot is the sum of vote slot and
   lockout, so it also increases when lockouts double. It is the slot
   at which the vote expires. When a vote expires, it is popped from
   the top of the tower. An important Tower rule is that a validator
   cannot vote for a different fork from a given vote slot until
   reaching the expiration slot for that vote slot. To summarize, the
   further a validator wants to roll back their fork (or vote slots),
   the longer the validator needs to wait without voting (in slot
   time).

   Here is the same tower, fully-expanded to include all the fields:

   slot | conf | lockout | expiration
   ----------------------------------
   4    | 1    | 2       | 6
   3    | 2    | 4       | 7
   2    | 3    | 8       | 10
   1    | 4    | 16      | 17
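
   To make the arithmetic concrete, here is a minimal sketch (not part
   of the fd_tower API) of how lockout and expiration derive from a
   vote `vote`, using the fd_tower_vote_t fields (slot, conf) declared
   later in this header:

   ```
   ulong lockout    = 1UL << vote.conf;    // lockout = 2^conf
   ulong expiration = vote.slot + lockout; // vote expires after this slot
   ```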

   Based on this tower, the validator is locked out from voting for any
   slot <= 6 that is on a different fork than slot 4. I'd like to
   emphasize that the expiration is with respect to the vote slot, and
   is _not_ related to the Proof-of-History slot or what the "current"
   slot is. So even if the current slot is now 7, the validator can't
   go back and vote for slot 5 if slot 5 is on a different fork than 4.
   The earliest valid vote slot this validator could submit for a
   different fork from 4 would be slot 7 or later.

   Next let's look at how the tower makes state transitions. Here we
   have the previous example tower, with a before-and-after view with
   respect to a vote for slot 9:

   (before) slot | conf
            -----------
            4    | 1
            3    | 2
            2    | 3
            1    | 4

   (after)  slot | conf
            -----------
            9    | 1
            2    | 3
            1    | 4

   As you can see, we added a vote for slot 9 to the top of the tower.
   But we also removed the votes for slot 4 and slot 3. What happened?
   This is an example of vote expiry in action. When we voted for slot
   9, this exceeded the expirations of vote slots 4 and 3, which were 6
   and 7 respectively. This action of voting triggered the popping of
   the expired votes from the top of the tower.
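
   Here is a minimal sketch of that expiry rule, using an assumed flat
   array for illustration rather than the actual fd_tower internals,
   where votes[0] is the bottom of the tower and votes[cnt-1UL] the
   top:

   ```
   // Pop votes off the top whose expiration slot the new vote slot
   // exceeds. Returns the new tower height. Note the array itself is
   // untouched, so this can also be used to simulate a vote.
   static ulong
   expire_votes_sketch( fd_tower_vote_t const * votes, ulong cnt, ulong slot ) {
     while( cnt ) {
       fd_tower_vote_t top = votes[ cnt-1UL ];
       if( slot <= top.slot + (1UL<<top.conf) ) break; // top not expired
       cnt--;                                          // pop expired vote
     }
     return cnt;
   }
   ```

   Voting for 9 pops the votes for 4 (expiration 6) and 3 (expiration
   7), then stops at the vote for 2 (expiration 10).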

   Next, we add a vote for slot 11:

   (before) slot | conf
            -----------
            9    | 1
            2    | 3
            1    | 4

   (after)  slot | conf
            -----------
            11   | 1
            9    | 2
            2    | 3
            1    | 4

   The next vote for slot 11 doesn't involve expirations, so we just
   add it to the top of the tower. Also note an important property of
   lockouts: the lockout for vote slot 9 doubled (ie. its confirmation
   count increased by 1) but the lockouts of vote slots 2 and 1
   remained unchanged.

   The reason for this is that confirmation counts only increase when
   they are consecutive in the vote tower. Because 4 and 3 were expired
   previously by the vote for 9, that consecutive property was broken.
   In this case, the vote for slot 11 is only consecutive with slot 9,
   but not 2 and 1. Specifically, there is a gap in the before-tower at
   confirmation count 2.

   In the after-tower, all the votes are again consecutive
   (confirmation counts 1, 2, 3, 4 are all accounted for), so the next
   vote will result in all lockouts doubling as long as it doesn't
   result in more expirations.
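
   Here is a minimal sketch of that consecutiveness rule, in the same
   assumed flat-array representation as above (it mirrors the behavior
   described in this comment, not necessarily the actual fd_tower
   internals): after a new vote is pushed, a vote's confirmation count
   increases only if the tower height exceeds the vote's index plus its
   current confirmation count, ie. the votes above it are consecutive.

   ```
   static void
   confirm_votes_sketch( fd_tower_vote_t * votes, ulong cnt ) {
     for( ulong i=0UL; i<cnt; i++ ) {
       if( cnt > i + votes[ i ].conf ) votes[ i ].conf++; // consecutive above
     }
   }
   ```

   Running this after pushing the vote for 11 onto (1:4, 2:3, 9:1)
   bumps only slot 9's confirmation count: at index 2 the height 4
   exceeds 2+1, while for slots 1 and 2 the gap left by the expired
   votes keeps the condition false.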

   One other thing I'd like to point out about this vote for slot 11:
   even though 11 exceeds the expiration slot of vote slot 2, which is
   10, voting for 11 did not expire the vote for 2. This is because
   expiration happens top-down and contiguously. Because vote slot 9
   was not expired, we do not proceed with expiring 2.

   In the Tower rules, once a vote reaches max lockout, ie. a
   confirmation count of 32, it is considered rooted and is popped from
   the bottom of the tower. For example, here a vote for slot 51 pushes
   the bottom vote (slot 1) to a confirmation count of 32, rooting it:

   (before) slot | conf
            -----------
            50   | 1
            ...  | ... (29 votes elided)
            1    | 31

   (after)  slot | conf
            -----------
            51   | 1
            ...  | ... (29 votes elided)
            2    | 31

   So the tower is really a double-ended queue rather than a stack.
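
   A minimal sketch of that rooting rule, in the same assumed
   flat-array representation as above (the real tower stores its votes
   in a deque; see fd_tower_publish below):

   ```
   // After a vote is pushed and confirmation counts are bumped, check
   // whether the bottom vote reached max lockout (conf 32). If so, pop
   // it off the bottom as the new root. memmove is from <string.h>.
   if( cnt && votes[ 0 ].conf==32UL ) {
     root = votes[ 0 ].slot; // the tower's new root slot
     cnt--;
     memmove( votes, votes+1UL, cnt*sizeof(fd_tower_vote_t) );
   }
   ```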

   Rooting has implications beyond the Tower. It's what we use to prune
   our state. Every time the tower makes a new root slot, we prune any
   old state that does not originate from that new root slot. Our
   blockstore will discard blocks below that root, our forks structure
   will discard stale banks, funk (which is our accounts database) will
   discard stale transactions (which in turn track modifications to
   accounts), and ghost (which is our fork select tree) will discard
   stale nodes tracking stake percentages. We call this operation
   publishing.

   Note that the vote slots in the rooting example are not necessarily
   consecutive. I elided the votes sandwiched between the newest and
   oldest votes for brevity.

   Next, let's go over three additional tower checks. These three
   checks further reinforce the consensus algorithm we established with
   intuition, in this case getting a supermajority (ie. 2/3) of stake
   to converge on a fork.

   The first is the threshold check. The threshold check makes sure at
   least 2/3 of stake has voted for the same fork as the vote at depth
   8 in our tower. Essentially, this guards our tower from getting too
   out of sync with the rest of the cluster. If we get too out of sync
   we can't vote for a long time, because we would have to roll back a
   vote we had already confirmed many times and which had accumulated a
   large lockout. This might otherwise happen as the result of a
   network partition where we can only communicate with a subset of
   stake.
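
   A minimal sketch of locating the threshold vote, reusing
   expire_votes_sketch from above (a fragment under the same assumed
   representation; see fd_tower_threshold_check below for the real
   API):

   ```
   // Simulate a vote for `slot` to expire stale votes, then find the
   // vote at depth 8 below the simulated vote. If the tower would be
   // shallower than that, the check passes trivially.
   ulong sim_cnt = expire_votes_sketch( votes, cnt, slot );
   if( sim_cnt < 8UL ) return 1;
   ulong threshold_slot = votes[ sim_cnt-8UL ].slot;
   // ... sum the stake of voters on threshold_slot's fork and compare
   // against the 2/3 threshold ...
   ```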

   Next is the lockout check. We went into detail on this earlier when
   going through the lockout and expiration slot, and as before, the
   rule is that we can only vote for a slot on a different fork from a
   previous vote after that previous vote's expiration slot.

   Given this fork and tower from earlier:

        /-- 3-- 4
   1-- 2
        \-- 5

   slot | conf
   -----------
   4    | 1
   3    | 2
   2    | 3
   1    | 4

   You're locked out from voting for 5 because it's on a different fork
   from 4 and the expiration slot of your previous vote for 4 is 6.

   However, if we introduce a new slot 9:

        /-- 3-- 4
   1-- 2
        \-- 5-- 9

   slot | conf
   -----------
   9    | 1
   2    | 3
   1    | 4

   Here the new slot 9 descends from 5 and, unlike 5, exceeds vote slot
   4's expiration slot of 6 (as well as vote slot 3's expiration slot
   of 7).

   After your lockout expires, the tower rules allow you to vote for
   descendants of the fork slot you wanted to switch to in the first
   place (here, 9 descending from 5). So we eventually switch to the
   fork we wanted, by voting for 9 and expiring 3 and 4.

   Importantly, notice that the fork slots and vote slots are not
   exactly 1-to-1. While conceptually our tower is voting for the fork
   1, 2, 5, 9, the vote for 5 is only implied. Our tower votes
   themselves still can't include 5 due to lockout.

   Finally, the switch check. The switch check is used to safeguard
   optimistic confirmation. Optimistic confirmation is when a slot gets
   2/3 of stake-weighted votes. This is then treated as a signal that
   the slot will eventually get rooted. However, to actually guarantee
   this we need a rule that prevents validators from arbitrarily
   switching forks (even when their vote lockout has expired). This
   rule is the switch check.

   The switch check is additional to the lockout check. Before
   switching forks, we need to make sure at least 38% of stake has
   voted for a different fork than our own. A different fork is defined
   by finding the greatest common ancestor of our last voted fork slot
   and the slot we want to switch to. Any forks descending from the
   greatest common ancestor (which I will subsequently call the GCA)
   that are not our own fork are counted towards the switch check
   stake.

   Here we visualize the switch check:

             /-- 7
        /-- 3-- 4
   1-- 2-- 6
        \-- 5-- 9

   First, we find the GCA of 4 and 9, which is 2. Then we look at all
   the descendants of the GCA that do not share a fork with us, and
   make sure their stake sums to more than 38%.

   I'd like to highlight that 7 here is not counted towards the switch
   proof, even though it is on a different fork from 4. This is because
   it's on the same fork relative to the GCA.
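
   Here is a minimal self-contained sketch of that GCA logic. A toy
   parent array stands in for ghost's fork tree, and all names here are
   hypothetical rather than the fd_ghost/fd_epoch API. Since a slot's
   parent is always a lower slot, we can walk parents until the two
   sides meet:

   ```
   // Toy tree matching the diagram above: parent[ s ] = parent of s.
   static ulong parent[ 16 ] = { [2]=1, [3]=2, [4]=3, [5]=2, [6]=2, [7]=3, [9]=5 };

   // greatest common ancestor of slots a and b
   static ulong
   gca_sketch( ulong a, ulong b ) {
     while( a!=b ) { if( a>b ) a = parent[ a ]; else b = parent[ b ]; }
     return a;
   }

   // Sum the stake voting for forks that descend from the GCA but
   // diverge from ours. A voter's slot s counts when its fork first
   // meets our fork exactly at the GCA, so 7 above (which rejoins our
   // fork at 3) is excluded while 5, 6 and 9 are counted. The real
   // check also requires each counted voter to be locked out from our
   // last vote slot; that is omitted here for brevity.
   static double
   switch_pct_sketch( ulong last_vote, ulong switch_slot,
                      ulong const * voter_slot, double const * voter_pct,
                      ulong voter_cnt ) {
     ulong  gca = gca_sketch( last_vote, switch_slot );
     double pct = 0.0;
     for( ulong v=0UL; v<voter_cnt; v++ ) {
       ulong s = voter_slot[ v ];
       if( s!=gca && gca_sketch( s, last_vote )==gca ) pct += voter_pct[ v ];
     }
     return pct; // compare against the 38% switch threshold
   }
   ```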

   So that covers the checks. Next, there are two additional important
   concepts: "reset slot" and "vote slot". The reset slot is the slot
   you reset PoH to when it's your turn to be leader. Because you are
   responsible for producing a block, you need to decide which fork to
   build your block on. For example, if there are two competing slots 3
   and 4, you would decide whether to build 3 <- 5 or 4 <- 5. In
   general the reset slot is on the same fork as the vote slot, but not
   always. There is an important reason for this. Recall this fork
   graph from earlier:

        /-- 3-- 4 (10%)
   1-- 2
        \-- 5-- 6 (9%)

   In this diagram, 4 is the winner of fork choice. All future leaders
   now want to reset to slot 4. Naively, this makes sense because you
   maximize the chance of your block finalizing (and earning the
   rewards) if you greedily (in the algorithmic, and perhaps also
   literal, sense) pick what's currently the heaviest.

   However, say most validators actually voted for fork 5, even though
   we currently observe the fork ending in 4 as the heavier. For
   whatever reason, those votes for 5 just didn't land (the leader for
   6 missed the votes, network blip, etc.)

   All these validators that voted for 5 are now constrained by the
   switch check (38% of stake), and none of them can actually switch
   their vote to 4 (which only has 10%). But leaders all continue to
   build blocks on top of fork 4, which importantly implies that votes
   for 5 will not be able to propagate. This is because the validators
   that can't switch continue to refresh their votes for 5, but those
   votes never "land", because no one is building blocks on top of
   fork 5 anymore (everyone is building on 4 because that's currently
   the heaviest).

   Therefore, it is important to reset to the same fork as your last
   vote slot, which is usually also the heaviest fork, but not always.

   Note that with both the vote slot and reset slot, the tower uses
   ghost to determine the last vote slot's ancestry. So what happens if
   the last vote slot isn't in the ghost? There are two separate cases
   in which this can happen that tower needs to handle:

   1. Our last vote slot > ghost root slot, but is not a descendant of
      the ghost root. This can happen if we get stuck on a minority
      fork with a long lockout. In the worst case, lockout duration is
      2^{threshold_check_depth} ie. 2^8 = 256 slots. In other words, we
      voted for and confirmed a minority fork 8 times in a row. We
      assume we won't vote past 8 times for the minority fork, because
      the threshold check would have stopped us (recall the threshold
      check requires 2/3 of stake to be on the same fork at depth 8
      before we can keep voting for that fork).

      While waiting for those 256 slots of lockout to expire, it is
      possible that in the meantime a supermajority (ie. >2/3) of the
      cluster actually roots another fork that is not ours. During
      regular execution, we would not publish ghost until we have an
      updated tower root. So as long as the validator stays running
      while it is locked out from the supermajority fork, it keeps
      track of its vote slot's ancestry.

      If the validator were to stop running while locked out though
      (eg. the operator needed to restart the box), the validator
      attempts to repair the ancestry of its last vote slot.

      In the worst case, if we cannot repair that ancestry, then we do
      not vote until replay reaches the expiration slot of that last
      vote slot. We can assume the votes > depth 8 in the tower do not
      violate lockout, because again the threshold check would have
      guarded it.

      TODO CURRENTLY THIS IS UNHANDLED. IF THE VALIDATOR HAS LOST THE
      GHOST ANCESTRY, IT WILL ERROR OUT.

   2. Our last vote slot < ghost root slot. In this case we simply
      cannot determine whether our last vote slot is on the same fork
      as our ghost root slot, because we no longer have ancestry
      information before the ghost root slot. This can happen if the
      validator is not running for a long time and is then started up
      again. It will have to use the snapshot slot for the beginning of
      the ghost ancestry, which could be well past the last vote slot
      in the tower.

      In this case, before the validator votes again, it makes sure
      that the last vote's confirmation count >= THRESHOLD_CHECK_DEPTH
      (stated differently, it makes sure the next time it votes it will
      expire at least the first THRESHOLD_CHECK_DEPTH votes in the
      tower), and then it assumes that the last vote slot is on the
      same fork as the ghost root slot.

      TODO VERIFY AGAVE BEHAVIOR IS THE SAME.

   Now let's switch gears from theory back to practice. What does it
   mean to send a vote?

   As a validator, you aren't sending individual tower votes. Rather,
   you are sending your entire updated tower to the cluster every time.
   Essentially, the validator is continuously syncing their local tower
   with the cluster. That tower state is then stored inside a vote
   account, like any other state on Solana.

   On the flip side, we also must stay in sync the other way, from
   cluster to local. If we have previously voted, we need to make sure
   our tower matches up with what the cluster has last seen. We know
   the most recent tower is in the last vote we sent, so we durably
   store every tower (by checkpointing it to disk) whenever we send a
   vote. In case this checkpointed tower is out-of-date, Funk, our
   accounts database, conveniently stores all the vote accounts
   including our own, so on bootstrap we simply load in our own vote
   account state to initialize our local view of the tower.

   Finally, a note on the difference between the Vote Program and
   TowerBFT. The Vote Program runs during transaction (block)
   execution. It checks that certain invariants about the tower inside
   a vote transaction are upheld (recall a validator sends their entire
   tower as part of a "vote"): otherwise, it fails the transaction. For
   example, it checks that every vote contains a tower in which the
   vote slots are strictly monotonically increasing. As a consequence,
   only valid towers are committed to the ledger. Another important
   detail of the Vote Program is that it only has a view of the current
   fork on which it is executing. Specifically, it can't observe what
   state is on other forks, like what a validator's tower looks like on
   fork A vs. fork B.

   The TowerBFT rules, on the other hand, run after transaction
   execution. Also unlike the Vote Program, the TowerBFT rules do not
   take the vote transactions as inputs: rather the inputs are the
   towers that have already been written to the ledger by the Vote
   Program. As described above, the Vote Program validates every tower,
   and in this way, the TowerBFT rules leverage the validation already
   done by the Vote Program to (mostly) assume each tower is valid.
   Every validator runs TowerBFT to update their own tower with votes
   based on the algorithm documented above. Importantly, TowerBFT has a
   view of all forks, and the validator makes a voting decision based
   on all forks. */

#include "../fd_choreo_base.h"
#include "../epoch/fd_epoch.h"
#include "../ghost/fd_ghost.h"
#include "../voter/fd_voter.h"
#include "../../ballet/pack/fd_microblock.h"
#include "../../flamenco/runtime/fd_blockstore.h"
#include "../../flamenco/runtime/fd_system_ids.h"
#include "../../flamenco/txn/fd_txn_generate.h"
#include "../../funk/fd_funk.h"

/* FD_TOWER_USE_HANDHOLDING: Define this to non-zero at compile time
   to turn on additional runtime checks and logging. */

#ifndef FD_TOWER_USE_HANDHOLDING
#define FD_TOWER_USE_HANDHOLDING 1
#endif

#define FD_TOWER_VOTE_MAX (32UL)

struct fd_tower_vote {
  ulong slot; /* vote slot */
  ulong conf; /* confirmation count */
};
typedef struct fd_tower_vote fd_tower_vote_t;

#define DEQUE_NAME fd_tower_votes
#define DEQUE_T    fd_tower_vote_t
#define DEQUE_MAX  FD_TOWER_VOTE_MAX
#include "../../util/tmpl/fd_deque.c"
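
/* For example, iterating over the votes in the generated deque might
   look like the following sketch, where votes is a local join of the
   deque (assuming the standard iterator API generated by fd_deque.c):

     for( fd_tower_votes_iter_t iter = fd_tower_votes_iter_init( votes );
          !fd_tower_votes_iter_done( votes, iter );
          iter = fd_tower_votes_iter_next( votes, iter ) ) {
       fd_tower_vote_t const * vote = fd_tower_votes_iter_ele_const( votes, iter );
       ... use vote->slot and vote->conf ...
     }
*/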

/* fd_tower implements the TowerBFT algorithm and related consensus
   rules. */

struct __attribute__((aligned(128UL))) fd_tower {

  /* Owned memory */

  /* The votes currently in the tower, ordered from latest to earliest
     vote slot (lowest to highest confirmation count). */

  fd_tower_vote_t * votes;

  /* The root is the most recent vote in the tower to reach max lockout
     (ie. confirmation count 32). It is no longer present in the tower
     votes themselves. */

  ulong root; /* FIXME wire with fseq */

  /* smr is a non-NULL pointer to an fseq that always contains the
     highest observed smr. This value is initialized by the replay
     tile.

     Do not read or modify outside the fseq API. */

  ulong * smr;
};
typedef struct fd_tower fd_tower_t;

/* fd_tower_{align,footprint} return the required alignment and
   footprint of a memory region suitable for use as a tower. align is
   double cache line to mitigate false sharing. */

FD_FN_CONST static inline ulong
fd_tower_align( void ) {
  return alignof(fd_tower_t);
}

FD_FN_CONST static inline ulong
fd_tower_footprint( void ) {
  return FD_LAYOUT_FINI(
         FD_LAYOUT_APPEND(
         FD_LAYOUT_APPEND(
           FD_LAYOUT_INIT,
           alignof(fd_tower_t),    sizeof(fd_tower_t)         ),
           fd_tower_votes_align(), fd_tower_votes_footprint() ),
         alignof(fd_tower_t) );
}

/* fd_tower_new formats an unused memory region for use as a tower. mem
   is a non-NULL pointer to this region in the local address space with
   the required footprint and alignment. */

void *
fd_tower_new( void * mem );

/* fd_tower_join joins the caller to the tower. tower points to the
   first byte of the memory region backing the tower in the caller's
   address space.

   Returns a pointer in the local address space to tower on success. */

fd_tower_t *
fd_tower_join( void * tower );
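
/* Example lifecycle (a sketch; wksp is an assumed already-created
   workspace and tag an arbitrary allocation tag):

     void * mem = fd_wksp_alloc_laddr( wksp, fd_tower_align(),
                                       fd_tower_footprint(), tag );
     fd_tower_t * tower = fd_tower_join( fd_tower_new( mem ) );
     ... use tower ...
     fd_wksp_free_laddr( fd_tower_delete( fd_tower_leave( tower ) ) );
*/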

/* fd_tower_leave leaves a current local join. Returns a pointer to the
   underlying shared memory region on success and NULL on failure (logs
   details). Reasons for failure include tower is NULL. */

void *
fd_tower_leave( fd_tower_t const * tower );

/* fd_tower_delete unformats a memory region used as a tower. Assumes
   only the local process is joined to the region. Returns a pointer to
   the underlying shared memory region or NULL if used obviously in
   error (e.g. tower is obviously not a tower ... logs details). The
   ownership of the memory region is transferred to the caller. */

void *
fd_tower_delete( void * tower );

/* fd_tower_lockout_check checks if we are locked out from voting for
   `slot`. Returns 1 if we can vote for `slot` without violating
   lockout, 0 otherwise.

   After voting for a slot n, we are locked out for 2^k slots, where k
   is the confirmation count of that vote. Once locked out, we cannot
   vote for a different fork until that previously-voted fork expires
   at slot n+2^k. This implies the earliest slot in which we can switch
   from the previously-voted fork is (n+2^k)+1. We use `ghost` to
   determine whether `slot` is on the same or different fork as
   previous vote slots.

   In the case of the tower, every vote has its own expiration slot
   depending on confirmations. The confirmation count is the max number
   of consecutive votes that have been pushed on top of the vote, and
   not necessarily its current height in the tower.

   For example, the following is a diagram of a tower pushing and
   popping with each vote:


   slot | confirmation count
   -----|-------------------
   4    | 1 <- vote
   3    | 2
   2    | 3
   1    | 4


   slot | confirmation count
   -----|-------------------
   9    | 1 <- vote
   2    | 3
   1    | 4


   slot | confirmation count
   -----|-------------------
   10   | 1 <- vote
   9    | 2
   2    | 3
   1    | 4


   slot | confirmation count
   -----|-------------------
   11   | 1 <- vote
   10   | 2
   9    | 3
   2    | 4
   1    | 5


   slot | confirmation count
   -----|-------------------
   18   | 1 <- vote
   2    | 4
   1    | 5


   In the final tower, note the gap in confirmation counts between slot
   18 and slot 2, even though slot 18 is directly above slot 2. */

int
fd_tower_lockout_check( fd_tower_t const * tower,
                        fd_ghost_t const * ghost,
                        ulong              slot );

/* fd_tower_switch_check checks if we can switch to `fork`. Returns 1
   if we can switch, 0 otherwise.

   There are two forks of interest: our last vote fork ("vote fork")
   and the fork we want to switch to ("switch fork"). The switch fork
   is what the caller passes in as `fork`.

   In order to switch, FD_TOWER_SWITCH_PCT of stake must have voted for
   a different descendant of the GCA of vote_fork and switch_fork, and
   also must be locked out from our last vote slot.

   Recall from the lockout check that a validator is locked out from
   voting for our last vote slot when their last vote slot is on a
   different fork, and that vote's expiration slot > our last vote
   slot.

   The following pseudocode describes the algorithm:

   ```
   find the greatest common ancestor (gca) of vote_fork and switch_fork
   for all validators v
      if v is locked out from voting for our latest vote slot
         add v's stake to switch stake
   return switch stake >= FD_TOWER_SWITCH_PCT
   ```

   The switch check is used to safeguard optimistic confirmation.
   Specifically: FD_TOWER_OPT_CONF_PCT + FD_TOWER_SWITCH_PCT >= 1. */

int
fd_tower_switch_check( fd_tower_t const * tower,
                       fd_epoch_t const * epoch,
                       fd_ghost_t const * ghost,
                       ulong              slot );

/* fd_tower_threshold_check checks if we pass the threshold required to
   vote for `slot`. This is only relevant after voting for (and
   confirming) the same fork ie. the tower is FD_TOWER_THRESHOLD_DEPTH
   deep. Returns 1 if we pass the threshold check, 0 otherwise.

   The following pseudocode describes the algorithm:

   ```
   for all vote accounts in the current epoch

      simulate that the validator has voted for `slot`

      pop all votes expired by that simulated vote

      if the validator's latest tower vote after expiry >= our
      threshold slot ie. our vote from FD_TOWER_THRESHOLD_DEPTH back
      (after simulating a vote on our own tower the same way), then
      add the validator's stake to threshold_stake.

   return threshold_stake >= FD_TOWER_THRESHOLD_PCT
   ```

   The threshold check simulates voting for the current slot to expire
   stale votes. This is to prevent validators that haven't voted in a
   long time from counting towards the threshold stake. */

int
fd_tower_threshold_check( fd_tower_t const *    tower,
                          fd_epoch_t const *    epoch,
                          fd_funk_t *           funk,
                          fd_funk_txn_t const * txn,
                          ulong                 slot );

/* fd_tower_reset_slot returns the slot to reset PoH to when building
   the next leader block. Assumes tower and ghost are both valid local
   joins and in-sync ie. every vote slot in tower corresponds to a node
   in ghost. Returns FD_SLOT_NULL if this is not true.

   In general our reset slot is the fork head of our last vote slot,
   but there are 3 cases in which that doesn't apply:

   1. If we haven't voted before, we reset to the ghost head.

   2. If our last vote slot is older than the ghost root, we know we
      don't have ancestry information about our last vote slot anymore,
      so we also reset to the ghost head.

   3. If our last vote slot is newer than the ghost root, but we are
      locked out on a minority fork that does not chain back to the
      ghost root, we know that we should definitely not reset to a slot
      on this fork to build a block, given a supermajority of the
      cluster has already rooted a different fork. So build off the
      best fork instead.

   See the top-level documentation in fd_tower.h for more context. */

ulong
fd_tower_reset_slot( fd_tower_t const * tower,
                     fd_epoch_t const * epoch,
                     fd_ghost_t const * ghost );

/* fd_tower_vote_slot returns the correct vote slot to pick given the
   ghost tree. Returns FD_SLOT_NULL if we cannot vote, because we are
   locked out, do not meet the switch threshold, or fail the threshold
   check.

   Specifically, these are the two scenarios in which we can vote:

   1. the ghost head is on the same fork as our last vote slot, and
      we pass the threshold check.
   2. the ghost head is on a different fork than our last vote slot,
      but we pass both the lockout and switch checks so we can
      switch to the ghost head's fork. */

ulong
fd_tower_vote_slot( fd_tower_t *          tower,
                    fd_epoch_t const *    epoch,
                    fd_funk_t *           funk,
                    fd_funk_txn_t const * txn,
                    fd_ghost_t const *    ghost );

/* fd_tower_simulate_vote simulates a vote on the vote tower for slot,
   returning the new height (cnt) of the tower after popping all the
   votes that the simulated vote would expire. */

ulong
fd_tower_simulate_vote( fd_tower_t const * tower, ulong slot );

/* fd_tower_vote votes for slot. Assumes the caller has already
   performed all the tower checks to ensure this is a valid vote. */

void
fd_tower_vote( fd_tower_t const * tower, ulong slot );

/* fd_tower_is_max_lockout returns 1 if the bottom vote of the tower
   has reached max lockout, 0 otherwise. Max lockout is equivalent to a
   lockout of 1 << FD_TOWER_VOTE_MAX (equivalently, a confirmation
   count of FD_TOWER_VOTE_MAX). So if the tower is at height
   FD_TOWER_VOTE_MAX, then the bottom vote has reached max lockout. */

static inline int
fd_tower_is_max_lockout( fd_tower_t const * tower ) {
  return fd_tower_votes_cnt( tower->votes ) == FD_TOWER_VOTE_MAX;
}

/* fd_tower_publish publishes the tower. Returns the new root. Assumes
   the caller has already checked that the tower is at max lockout (see
   fd_tower_is_max_lockout). */

static inline ulong
fd_tower_publish( fd_tower_t * tower ) {
#if FD_TOWER_USE_HANDHOLDING
  FD_TEST( fd_tower_is_max_lockout( tower ) );
#endif

  ulong root = fd_tower_votes_pop_head( tower->votes ).slot;
  tower->root = root;
  return root;
}

/* fd_tower_from_vote_acc writes the tower saved in the vote account
   keyed by `vote_acc` to the caller-provided `tower`. Assumes `tower`
   is a valid join of an fd_tower that is currently empty. */

void
fd_tower_from_vote_acc( fd_tower_t *              tower,
                        fd_funk_t *               funk,
                        fd_funk_txn_t const *     txn,
                        fd_funk_rec_key_t const * vote_acc );

/* fd_tower_to_vote_txn converts an fd_tower_t into an fd_tower_sync_t
   to be sent out as a vote program ix inside a txn, written to the
   caller-provided `vote_txn`. */

void
fd_tower_to_vote_txn( fd_tower_t const *  tower,
                      fd_hash_t const *   bank_hash,
                      fd_hash_t const *   recent_blockhash,
                      fd_pubkey_t const * validator_identity,
                      fd_pubkey_t const * vote_authority,
                      fd_pubkey_t const * vote_acc,
                      fd_txn_p_t *        vote_txn );

/* fd_tower_print pretty-prints tower as a formatted table.

   Sample output:

        slot | confirmation count
   --------- | ------------------
   279803918 | 1
   279803917 | 2
   279803916 | 3
   279803915 | 4
   279803914 | 5
   279803913 | 6
   279803912 | 7
*/

void
fd_tower_print( fd_tower_t const * tower );

#endif /* HEADER_fd_src_choreo_tower_fd_tower_h */