Data Chains - Deeper Dive

mav · December 20, 2017, 2:08am

The recent research obviously negates some old docs and it takes time to finalize and filter through, but in RFC-0045 Node Ageing (more than one year old) there are some things I’m unclear about.

Mostly it matches extremely closely with the op, which is kinda cool considering how much work has happened since rfc0045 was written.

“Groups will only allow one node with an age of zero.” from rfc0045

The simulation says initial age 1, but rfc0045 says initial age 0. Maybe the simulation is doing a shortcut for this part of the rfc:

“This means the node itself on successful join will be immediately relocated.” and thus have age 1 at the start.

This would seem to affect the disallow rule, making it age==0 rather than the simulated age==1. Is there any difference in reality? Just trying to grasp whether initial age 1 is a shortcut or the reality.

“If there exists more than one node to relocate then the oldest node should be chosen.” from rfc0045

This goes directly against the op, which favours relocating the youngest node. I assume the op is correct and the doc needs changing, but it’s interesting because this simple change seems to have a very big impact on the ageing and splitting algorithm: the new algorithm produces far fewer adults in young networks. I think favouring younger is the correct approach, but the difference in docs is worth noting anyhow.

The relocation tiebreaker has also changed between rfc0045 and op, just worth noting, nothing further to be said.

“If two nodes have the same oldest age then the one that has been present for the most churn events is chosen to relocate.” from rfc0045, now changed as per op.
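To make the difference concrete, here is a minimal sketch of the two selection rules. The `Node` type and its field names are made up for illustration, not taken from the RFC or the simulation. The rfc0045 picker follows the two quotes above (oldest first, ties broken by most churn events seen). The youngest-first picker only captures "favour the youngest"; it does not model the op's tiebreaker, so ties just fall to whichever candidate comes first.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Node:
    # Hypothetical fields, for illustration only.
    name: str
    age: int
    churn_events_seen: int


def pick_rfc0045(candidates: Sequence[Node]) -> Node:
    """Oldest node wins; ties go to the node that has seen the most churn events."""
    return max(candidates, key=lambda n: (n.age, n.churn_events_seen))


def pick_youngest(candidates: Sequence[Node]) -> Node:
    """Youngest node wins; ties are left to input order (tiebreaker not modelled)."""
    return min(candidates, key=lambda n: n.age)


if __name__ == "__main__":
    group = [
        Node("a", age=3, churn_events_seen=5),
        Node("b", age=3, churn_events_seen=9),
        Node("c", age=1, churn_events_seen=2),
    ]
    print("rfc0045 relocates:", pick_rfc0045(group).name)
    print("youngest-first relocates:", pick_youngest(group).name)
```

Under the rfc0045 rule the older nodes keep getting moved (and aged), while youngest-first concentrates relocation on new arrivals, which fits with the observation that adults appear more slowly in young networks.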

“When a churn event does force a node to relocate to a destination group (D)… The relocating node then generates a key that will fit in D and attempts to join D.” from rfc0045

Is this key generation process still applicable? It seems unavoidable but is not mentioned in op (probably it’s out of scope for op).

This process gets exponentially harder as the section prefix length grows, i.e. the work scales roughly linearly with the number of sections in the network. Probably worth simulating how hard it becomes and when this becomes ‘too hard’. It may be a potential issue but also may never become one in the real world; can’t say off the top of my head.
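A quick way to see the growth without touching real key generation is the toy simulation below. It assumes names are uniformly random 256-bit values (a stand-in for "generate a key pair and derive its name"; no ed25519 involved) and that the relocating node simply retries until the leading bits match the target prefix. Function names are my own.

```python
import random


def attempts_to_match(prefix_bits: int, rng: random.Random) -> int:
    """Count random 256-bit draws until the top `prefix_bits` bits equal a target."""
    target = rng.getrandbits(prefix_bits)
    attempts = 0
    while True:
        attempts += 1
        # Stand-in for a freshly generated key's name: a uniform 256-bit value.
        name = rng.getrandbits(256)
        if name >> (256 - prefix_bits) == target:
            return attempts


def mean_attempts(prefix_bits: int, trials: int = 200, seed: int = 1) -> float:
    """Average number of attempts over several independent trials."""
    rng = random.Random(seed)
    total = sum(attempts_to_match(prefix_bits, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for bits in (4, 8, 12):
        print(f"{bits} bits: expected {2 ** bits}, observed mean {mean_attempts(bits):.0f}")
```

The observed mean should hover around 2^bits, so each extra prefix bit (each doubling of the number of sections) doubles the expected work.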

Some rough calcs:

Generating an ed25519 key pair (kp) takes about 68000 cycles on an i7 2.1 GHz machine, which is about 30K kp/s.

This means it takes about 1s to find a key with a prefix of log2(30K) ≈ 15 bits.

A network of 1B adult nodes has about 1B / 12 = 83M sections.

This means a prefix length of log2(83M) ≈ 26 bits.

Assuming finding a kp for a 26 bit prefix takes 2^26 attempts, that’s about 2^(26-15) seconds = 2048 seconds, or about 34 minutes.
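For anyone who wants to poke at the numbers, the same arithmetic as a small script. The constants are the figures quoted above (68000 cycles per key pair, 2.1 GHz, 12 adults per section, 1B adults); the function names are mine. Note that the post rounds to whole bits at each step, so running the script with unrounded values gives a somewhat larger time than 2048 seconds, though still in the tens of minutes.

```python
import math

CYCLES_PER_KEYPAIR = 68_000    # quoted cost of one ed25519 key pair
CLOCK_HZ = 2.1e9               # quoted clock speed
SECTION_SIZE = 12              # adults per section used in the estimate
ADULTS = 1_000_000_000         # network size used in the estimate


def keys_per_second(clock_hz: float = CLOCK_HZ, cycles: int = CYCLES_PER_KEYPAIR) -> float:
    """Key pairs generated per second on one core."""
    return clock_hz / cycles


def prefix_bits(adults: int = ADULTS, section_size: int = SECTION_SIZE) -> float:
    """Prefix length in bits needed to address every section."""
    return math.log2(adults / section_size)


def expected_seconds(bits: float, rate: float) -> float:
    """Expected time to hit a given prefix, assuming 2^bits attempts on average."""
    return 2 ** bits / rate


if __name__ == "__main__":
    rate = keys_per_second()
    bits = prefix_bits()
    print(f"rate: {rate:.0f} kp/s")
    print(f"prefix length: {bits:.2f} bits")
    print(f"rounded estimate: {expected_seconds(26, 2 ** 15):.0f} s")
    print(f"unrounded estimate: {expected_seconds(bits, rate) / 60:.0f} min")
```

Changing `ADULTS` or `SECTION_SIZE` shows how quickly the estimate moves: each doubling of the number of sections doubles the expected search time.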

Reckon this might become an issue? This is all back-of-the-envelope so I am looking for holes in my reasoning.