Introduction
This document is a work in progress and is likely to change somewhat, with some sections marked as “TBC” (to be confirmed). It is a slightly deeper discussion regarding data chains, but does try and stay high level enough to be able to convey the important messages. Along with these desires it is hoped that this is also as brief as possible, allowing discourse and hopefully agreement on the fundamental aspects of the network. This will include:
- data chains
- peer age
- peer penalties (TBC)
- secure message transfer (TBC)
Consensus types
- Remote network consensus: This is the consensus of events relayed securely to our section.
- Local consensus: This is the consensus of events observed by our current section.
- Chain (offline) consensus: This is a sub-set of local consensus which provides a cryptographically verifiable order of events.
Definitions
Complete group: at least GROUP_SIZE
peers with age >4 in a section.
Elders: the GROUP_SIZE
oldest peers in the section. If there is a tie, we use the tie breaker rule for node aging. They are the only ones who have a routing table and will be in routing table.
Tie breaker rule for Node Aging: If there are multiple peers of the same age then XOR their public keys together, hash the result and find the one XOR closest to this hash.
Adult: in a section with a complete group a peer with age >4
. They are only connected to their section Elder
s.
Infant: non elder with age ≤ 4. They are only connected to their section Elder
s.
Vote: a node’s detection of a network event plus its signature of it.
Proof: a node’s id and its signature against a network event.
Block: network event with a vector of proofs collected.
Quorum: Majority of valid voters && > 50% of total age of valid voters.
Group consensus: a block containing a Quorum of proofs of the elders.
Section Membership: All the members of a single section, comprising the elders, adults and infants. All of whom hold the data chain.
Neighbours: sections we are supposed to be connected to differing in exactly one bit from us.
Sibling: A section differing from us in the last bit.
Prefix: The bits that identify the section. It additionally has version that increments by one each time the prefix appears on the network.
Static group and quorum sizes
When a section does not have a complete group, all oldest peers up to GROUP_SIZE
are Elders
and all section members influence node aging.
All peers must get all section relevant data, sign it and provide their votes to be accepted into the section. This replaces/enhances resource proof mechanism.
Newly joining peers enter the section as Infants
with age 1. In order for the number of Infants
in a section not to grow out of control, once our section has a Complete Group
we also stop accepting new Infants
(with age 1) if our section already has such an Infant
.
Adults
and Infants
only store the data for now, and only connect to their section’s elders. Elders store the data and connect to all peers in their section and all elders in neighbouring sections.
There is no attempt in this paper to find other work for adults and infants, but it is very likely they can further reduce load on the elders.
Detailed design
Types
The DataChain
will contain a permanent record of states for all elders in a section (this will be flushed to disk periodically) and a temporary record for adults and infants. The permanent one named blocks
and will be Vec<Block<ElderState>>
. Meanwhile the transient one named valid_peers
and will be Vec<Block<AdultOrInfantState>>
.
enum ElderState {
/// Accepted but not yet lost. May have joined or been relocated here, or may
/// have come back via a merge.
Live(PeerId),
/// Cannot ever become live again
Dead(PeerId),
/// Gone to another section via "split" or become an Adult (Demoted). Can
/// again become live here.
Gone(PeerId),
}
enum AdultOrInfantState {
Live(PeerId),
}
Ordering constraints on ElderState for Complete Groups
The ElderState
type blocks will come in pairs, where Live
immediately follows Dead
or Gone
. DataChain::blocks
should be reordered as these blocks become valid so we have the above pairings. (e.g. if two Dead
blocks become valid, we insert the corresponding Live
block between them once it becomes valid). The ordering of these pairs are defined by the valid voters.
/// Traversing the blocks from reverse (newest to oldest block)
for (elder_state, i) in blocks.iter().enumerate().rev() {
match elder_state {
ElderState::Dead |
ElderState::Gone => assert!(blocks[i-1], ElderState::Live), // if not then reorder
}
}
A Vote
is defined as:
struct Vote<T> {
payload: T,
signature: Signature,
}
The peer ID is omitted here as it is implicit in direct messages. The signature is the sender’s signature of the payload
.
struct PeerId {
age: u8,
pub_key: PublicKey,
}
struct Proof {
peer_id: PeerId,
sig: Signature,
}
The peer_id
of the Vote
’s sender is inserted by the receiving peer to turn the Vote
into a Proof
.
struct Block<T> {
payload: T,
proofs: BTreeSet<Proof>,
}
struct Prefix {
bits: BitVec,
version: u64,
}
Intra Section Voting
A peer voting can be seen as sending either a Vote
or a Block
that contains its proof.
Vote
’s are sent and received only between elders while Blocks
are sent to all the section members.
Peers will only send a Vote
or Block
once.
on local event:
- if block exists with our proof → end
- else: vote to elders and trigger
handle_vote_received(true)
handle_vote_received(swarm: bool)
:
- if the block doesn’t exist
- generate new
- else:
- add proof to existing block
- if it becomes valid
- add your proof to the block if not already there
- if
swarm
- swarm the block
handle_block_received
:
- if the block is not valid based on current section → drop the
Block
. We only use theProof
s inside the givenBlock
to check the validity and not add the extraProof
s that we already know of. - Call
handle_vote_received(false)
for everyProof
that’s inside theBlock
. - Swarm the block (the one formed inside
handle_vote_received
)
Block insertion rule:
Insert the blocks into the chain and action on it (e.g. sever connections or other steps necessary) only if correct pairing is obtained. Until then hold the block but do nothing because of it.
The goal of the insertion rule is to always get to GROUP_SIZE
Elders
.
Block States
Block states are represented as:
pub enum BlockState {
NotYetValid,
Valid,
Full,
}
NotYetValid: Contains Proof
s < Quorum
Valid: Contains Proof
s >= Quorum && != Full
Full: Contains Proof
s == maximum possible Proof
s (will be defined later on)
ElderState (Only for the DataChain::blocks)
Live(PeerId, u8)
A peer in this state is a valid voter.
Trigger for voting
Once we have more than GROUP_SIZE
peers in the section, this should follow the voting of Dead
OR be followed by Gone
if we saw these ourselves OR if we got a valid block containing any of these events.
On consensus
Add the peer to the routing table.
Dead(PeerId)
Never allow the PeerId
to become live again. Peer relocated on churn or voted to be killed because of agreed accusations.
Trigger for voting
As we receive/form a valid block of Live
for non-infant peers, we take the Hash of the event H. Then if H % 2^age == 0
for any peer (sorted by age ascending) in our section, we relocate this node to the neighbour that has the lowest number of peers.
If two peers have the same age, use the tie breaker rule. If all neighbours have the same number of peers we relocate to the section closest to the H
above (that is not us).
OR
When we detect behaviour deemed enough to kill a node. This will form a part of accusations which is intended for later.
OR
Peer has gone offline. Can happen out of order, but forces next state.
On consensus
Remove the peer from the routing table and disconnect if not already disconnected.
Gone(PeerId)
A peer leaves the Elder
group in this section but is allowed to come back to the Elder
group with the same PeerId
. It cannot however join by restarting.
Trigger for voting
- Split where the peer ends up as an elder in the sibling section
- Merge where the peer ends up becoming an
Adult
from anElder
- When an
Elder
is accepted and pushes an existing one out as anAdult
On consensus
Remove peer’s entry from current section in the Routing Table.
Adults and Infants
They are not a part of the chain. Adults
and Infants are stored as Blocks
(Live
) in a separate container and these entries are transient. Once an Adult
or Infant
is no longer a part of our section (either Dead
or Gone
) then we remove its block.
The blocks of Adults
and Infants
are always signed by the current Elders
. If a new Elder
becomes Live in our section then it must provide its proof for all the existing Blocks
of Adults
and Infants
. If an existing Elder
is Dead
or Gone
then its proof must be removed from the all these Blocks
.
By keeping the container separate (from the main Elder
chain), pruning it constanty as these peers disappear and providing Proofs
from only the most recent Elders
we keep the Adults
and Infants
blocks bubbled up to the top instead of embedded deep in the chain where we need to search more expensively to find out about them.
Merge
Data structures
struct SiblingEldersRpc {
/// This is our vote for each of the sibling elders' Ids
sibling_elder_ids: Vec<Vote<PeerId>>,
}
struct MergeInfoRpc {
/// Our Adults and Infants
adults_and_infants: Vec<Vote<PeerId>>,
/// PeerIds of the additional neighbours
new_neighbours: BTreeMap<Prefix, Vec<Vote<PeerId>>>,
}
The algorithm
Say we are an elder in the section S00, merging with S01 into S0.
- We realise that we need to merge, due to one of the two reasons:
- Once we realise that, we send a
SiblingEldersRpc
to everyone in our section. The RPC contains our votes for every elder in the sibling section. - When receiving
SiblingEldersRpc
s from the other elders, we accumulate the votes and create blocks in the cache.- Once a block becomes valid, we pass it to everyone in our section (to make sure that even if some malicious peers voted only to us, everyone will see it)
- Once we accumulate a quorum of valid blocks from
SiblingEldersRpc
s, we start sendingMergeInfoRpc
s to every elder in the sibling section.- note: every PeerId is accumulated separately - we wait for a quorum of those
- If there is any change to our adults or infants, we send the RPC again with the updated data.
- Once we get at least a quorum of
MergeInfoRpc
(note: we should get up to GROUP_SIZE messages initially; there may be updates later, but we don’t wait for them), start doing the following in parallel:- vote for
Live(x)
to everyone in the sibling section if x is in our section and is supposed to be an elder of the merged section; - vote for
Gone(x)
to everyone in our section if x is an elder of our pre-merge section, but shouldn’t be an elder in the merged section; - vote for
Gone(x)
to everyone in the sibling section if x is an elder of that section, but shouldn’t be an elder of the merged section; - if a
SiblingEldersRpc
block accumulated during voting, adjust our view of the post-merge elders and send appropriate votes (ie. if we thought the post-merge elders will be 1,2,3,4 and after the update we think they should be 1,2,3,5, voteGone(4)
andLive(5)
) - the above ensures that every block will have a quorum of current elders, regardless of the order peers put them in the chain
- every vote for
Gone(x)
above means also voting forAdultsAndInfants::Live
, and analogous forLive(x)
- vote for
- If any churn event happened since the time we realised the necessity to merge, send the vote for it to every elder in both sections.
- If a
Dead(x)
accumulated forx
that was supposed to be an elder in the merged section, calculate the next eldery
(based on all the accumulated elders, adults and infants) and voteLive(y)
- If a
- Specific rules for voting:
- If you were an elder before the merge started, you send the votes to appropriate sets of peers (defined by the rules above) for the whole duration of the merge, even if you have been demoted in the meantime
- If you were an elder in a pre-merge section and receive votes from any peer who was an elder in a pre-merge section or is supposed to be an elder of the post-merge section (might be a pre-merge adult if an elder dies during the merge), you consider that vote to be valid (regardless of their elder/adult status) and form a not-yet-valid block. The same applies to received blocks.
- If you are an adult that was a pre-merge elder, you send your votes to all the peers you perceive as current elders.
- If you are an elder during the merge and receive a vote from a pre-merge elder who is now an adult, you forward it to all the pre-merge elders.
- The standard rules for block validity apply - a block is valid if it has a quorum of votes from the members as defined by the previous block. The rules should ensure that we will receive votes from at least a quorum of voters from both pre-merge sections throughout the whole merge, which will allow us to find an order for them that makes them valid. At the same time, the malicious nodes will be too few to be able to form a block that will be considered valid at any point of the chain.
- Append all accumulated adults and infants to the adult/infant chain (except the ones that might have become elders in the meantime).
- Connect to the new neighbours (the ones that were your sibling’s neighbours but not yours before the merge) - the information about them was in the
MergeInfoRpc
struct. - The merge is complete once the target elder group has been formed (all the agreed target elders are either promoted or dead; they may die during the merge).
Example
- S00 decides to merge with S01.
- S00 sends
SiblingEldersRpc
to its own section - S00…
Each elder in S00 fills it up with its votes for what it considers are the elders in the sibling. We will accumulate on per PeerId basis. It’s OK if some members disagree on the complete membership of the sibling, at-least the quorum of them will be agreed upon. The rest will fall into place due to eventual consistency after following the algorithm to the end. - S00 sends
MergeInfoRpc
to S01 to indicate about its adults and infants, afterSiblingEldersRpc
accumulates for it. The accumulation is done on per PeerId being voted on, not the entire set at once. The accumulation of this in S01 will make it understand there is a merge happening and it needs to react if it hasn’t already (which might happen if both simultaneously realise they want to merge - it is symmetrical so shouldn’t matter) - this is when you have a validSiblingElderRpc
accumulated. - S01 will do the same thing - send
SiblingEldersRpc
to its own section, S01, and sendMergeInfoRpc
to the sibling. - Whichever peer gets
PeerId
s fromSiblingEldersRpc
accumulated, starts voting for appropriateLive
andGone
events.
Split
Data structures
We define an additional RPC:
struct SplitRpc;
The only purpose of this structure is to be voted for by the elders, and when it accumulates, everyone will know that they are, in fact, splitting. Otherwise, the following Live
and Gone
events could be mistaken for natural churn.
The algorithm
Assume we are an elder in a section that is about to split.
- We realise that we need to split when both child halves of our section have at least
GROUP_SIZE + SPLIT_BUFFER
peers with age ≥ 4 (the age condition being in force only once the section has a complete group).- A child half is the set of peers that belong to our section and match the prefix one bit longer than our current prefix. There are always two child halves, for
p0
andp1
, wherep
is our current prefix.
- A child half is the set of peers that belong to our section and match the prefix one bit longer than our current prefix. There are always two child halves, for
- When we realise the need to split, we cast a vote on
SplitRpc
. We use the usual voting algorithm. - When
SplitRpc
accumulates, we start the split.- The valid
SplitRpc
block is not recorded in the chain. - We calculate which child half of the section we will belong to. Let’s call this half “our half”, and the other one “sibling half”.
- The valid
- All pre-split and post-split elders cast votes for all
Live
andGone
events that need to happen to bring the set of the elders to the post-split configuration in both sections. The votes are sent to all the pre-split and post-split elders that fit the corresponding prefix.- Say our section is S0 which is splitting into S00 and S01, and we vote for
Gone(1)
. 1 will belong to S0, which means it only needs to be gone in S01 - so we send the vote only to the peers matching the prefix of S01. - This ensures that both branches of the chain will have quora of votes for every block. When the section splits, every elder belongs to one of the children, but still has to vote for a few blocks in the sibling section until they vote for him being
Gone
. - The blocks will generally have more than GROUP_SIZE votes. Once a block is accumulated and inserted into the chain, the rendundant votes are dropped - only the ones signed by peers implied by the previous block to be the elders are retained.
- Say our section is S0 which is splitting into S00 and S01, and we vote for
- The same as 4 happens with the adults and infants. The pre-split and post-split elders vote for
Gone
for every adult and infant and send the votes to the relevant peers. In addition, the post-split elders that weren’t pre-split elders need to voteLive
for every relevant adult and infant.- The
Gone
blocks for adults and infants aren’t saved in the chain, but cause removal of the correspondingLive
blocks from theadults_and_infants
chain.
- The
- All pre-split and post-split elders vote for any other event relevant to their new prefix.
- Example: if peer A that would be a part of S01 died, all pre-split elders matching S01 and all S01’s post-split elders vote for
Dead(A)
- Example: if peer A that would be a part of S01 died, all pre-split elders matching S01 and all S01’s post-split elders vote for
- The split is considered complete when the final set of the elders is promoted.
Example
There are sections S0
, S10
and S11
. S0
currently has 1
, 2
, C
, D
as elders, and is about to split into S00
(will have 1
- 4
as elders) and S01
(will have A
- D
as elders).
The following is the flow:
- A
Live
becoming valid triggersS0
to start split process. - The elders of
S0
vote forSplitRpc
, sending the accumulated blocks to the adults and infants as well - Once the above accumulates:
3.1.1
,2
,C
,D
,3
,4
cast votes forLive(3)
,Live(4)
,Gone(C)
,Gone(D)
to1
,2
,3
and4
(they don’t vote for blocks where it doesn’t make sense, though - as in,3
doesn’t vote forLive(3)
etc.)
3.2.1
,2
,C
,D
,A
,B
cast votes forLive(A)
,Live(B)
,Gone(1)
,Gone(2)
toA
,B
,C
andD
3.3. When the blocks accumulate, they are swarmed to the elders of neighbouring sections and adults/infants of own section - Adults/Infants within
S0
, when they receive new accumulated Live/Gone blocks:- Record those blocks regarding elders that will be in the same new section with us.
- Elders within
S10
orS11
, when they receive new accumulated Live/Gone blocks:- Disconnect from elders that they are no longer supposed to be connected to, and start connecting to new promoted elders.