A trivial attempt at optimizing blockchain networking through incentivization
Abstract
Blockchains are inherently broadcast networks which means that every block published by a block-producer is propagated to all the other nodes in the network. In PoW chains, blocks are required to be propagated fast enough to keep orphan rates low. High orphan rates reduce security and also encourage centralization. On the other hand, PoS chains with 1-2 second block times restrict the size of the network to a few hundred nodes or lesser.
Peer to peer networking, today, follows an unincentivized commons model which isn't incentive compatible for the parties involved. Full nodes in the network which are the backbone of decentralized and censorship-resistant propagation are not incentivized for the same in the absence of which block producers have conflicting interests when forwarding blocks. An unincentivized model further adds uncertainty onto the time by which blocks reach every interested party.
We address these issues by proposing an incentivized and efficient relay networking protocol called Marlin Relay. Marlin Relay creates incentives for nodes in a network to compete with one another to propagate blocks. We believe that a blockchain-agnostic incentive-compatible relay network can increase the network-layer security of individual blockchains while also boosting their throughputs by pooling bandwidth resources and reducing tail latency.
Introduction
Blockchain is a replicated state machine with a consensus protocol for all the nodes to have a common view of the system. Each block added to the replicated state machine is broadcasted across the network to ensure consistent view. Blocks in PoW chains especially, have to be propagated fast enough so that the other miners mine on top of the latest block instead of mining at the same height as already mined. Blocks hop on a number of flaky nodes with unstable internet connections and pass through trans-continental submarine links as they make their way to other nodes spread across the globe. Unsurprisingly, the transmission requires some time (due to packet losses and consequent retransmissions) leading to the occasional discovery and transmission of a competing block at the same height in the midst known as forks. The occurrence of forks compromise open networks in two more ways:
- They reduce the security of the main chain as hash power that could have been spent on securing one dominant chain is now dispersed across a couple of orphan branches that are eventually discarded making 51% attacks on the main chain slightly more easier. This could be considered an unfortunate leak off the security budget.
- They encourage geographical clustering and incentivize miners to centralize. Blocks orphaned in branches that do not make it to the longest chain impose an economic loss on the miner who mined them as the associated coinbase transactions become worthless. Every moment that miners spend mining on top of old blocks while the latest block is being propagated through the network is time lost with respect to the miner who mined the latest block or received it soon enough and is now building on top of it. This violates the fairness property of PoW chains driving miners with a lower proportion of hashpower out of competition.
Previous work
There has been some work to improve the propagation delays but mostly specific to blockchains, notably FIBRE and graphene for Bitcoin.
FIBRE
To address these concerns, Matt Corallo, another Bitcoin core developer and Blockstream co-founder designed and deployed a relay network (homonymous to the protocol, FIBRE) consisting of a few servers strategically located across the globe in a star topology. Miners connect to the closest node of the relay and use it to send and receive blocks from other nodes in the Bitcoin network. The optimizations (also see BIP 152), improvements and years of sysad experience culminated in over a 90% reduction in block propagation time and fork rate (more interesting charts and analysis here and here).
Contributions
Marlin is blockchain-agnostic. As a result, to benchmark performances for various other blockchains, we required a FIBRE-like protocol and relay network for chains that FIBRE did not support. FIBRE’s implementation is currently closely intertwined with the Bitcoin Core software thus making quick modifications for other blockchains strenuous if not impossible. Consequently, we set out on designing an architecture that would not require significant changes to the core network once deployed to add support for new blockchains.
While the FIBRE network is run for free, we understand that not everyone is so altruistic. Moreover, where does altruism fit in systems designed for environments with economically rational players and byzantine actors anyway! Hence we have regarded, incentive compatibility and monetization to be one of the goals of Marlin.
Summary
Marlin Relay is a blockchain-agnostic permissionless relay network with built-in incentivization. Thus, unlike several other relay networks which are geared towards a specific blockchain, Marlin Relay is generic and can be integrated with multiple blockchains simultaneously. Furthermore, an incentive structure ensures that trustless nodes can contribute resources to the network without adversely affecting its security guarantees.
Marlin Relay involves participation of the following actors:
Producers - Set of nodes who introduce blocks into the relay network. Producers are generally miners who wish to gets their blocks propagated to other miners in the network in the shortest time to ensure that the block is not orphaned. However, any user with an incentive to propagate the block fast enough can be a producer as well. Producers are responsible for the validity of the propagated block and spam prevention mechanisms ensure that producers who introduce invalid blocks are slashed appropriately.
Receivers - Set of nodes in the network who wish to receive block as soon as possible. Receivers can be other miners in the network who want to build on top of the latest block to ensure that they are not building on top of an older block which gets orphaned. Receivers can also be exchanges, block explorers or any other node which can benefit from receiving the blocks as soon as possible. Receivers pay a subscription fee to the protocol to join the receiver set.
Relayers - Set of nodes in the network which participate in the relay of blocks from producers to receivers. Relayers are incentivized to propagate blocks quickly as receivers only pay the subset of relayers who participate in the fastest propagation of the block to them.
Clusters - Group of relayers who are collectively responsible for delivering the block from a producer to receivers. Clusters are rewarded as a single entity with the reward being distributed amongst the relayers in the cluster.
Overview
The core network is the backbone of the relay network which propagates blocks to nodes connected to it. The core network is composed of clusters which connect the producers with receivers. Receivers subscribe to blocks from a specific blockchain network for a certain period by paying a subscription fee. Block producers who build a block divide the message into multiple chunks and publish them to the network through the various clusters. The clusters that propagate chunks of a particular block compete with each other to deliver them fastest to the receiver as only the first few clusters to deliver chunks to a particular receiver are rewarded.
Receivers willing to pay a subscription fee can directly subscribe for blocks from the core network and receive blocks reliably within the earliest possible time. Receivers pay different subscription fees based on the reliability bucket they choose, each reliability bucket receives a different number of chunks per block.
The core network is responsible for propagating blocks introduced by producers widely across the globe.
Protocol Description
This section explains how various components of Marlin Relay interact with each other to enable incentivization and reliable data transmission.
Design Principles
Marlin Relay is designed to align with the following principles:
- General - No restrictions based on consensus algorithms to ensure compatibility with a wide range of blockchains.
- Extensible - Cross-platform; adding support for new chains requires minimal effort and resources.
- User-friendly - Marlin SDK makes integration into any codebase easy.
- Incentive compatible - In constrast to present day altruism-based gossip, Marlin Relay is designed for environments with economically rational players and byzantine actors.
Security Model
The security of Marlin Relay refers to the defence the network provides against an attacker who doesn't want blocks to be propagated.
Marlin Relay assumes an "uncoordinated rational majority model" wherein participants act rationally and may collude with upto 1/3 of the participants in the network.
Specifically, attackers/byzantine actors may compromise upto 1/3rd the clusters in the network.
Formation of a cluster
Cluster is formed organically as a result of the reputation calculated for each of the relayers based on their previous performance. Fee Rewards received by the cluster is an indicator for the performance of the cluster as fees are awarded probablistically based on messages delivered. The cluster can be either organized by a central entity or a distributed cluster where various relayers of similar reputation and diverse geographies can come together to form a cluster. A cluster has to be registered onchain by staking LIN. The cluster will be able to deploy it's own cluster management contract for the distribution of rewards.
Reputation Management
Reputations can be individually calculated and maintained offchain by every node in the network based on the rewards distributed in the previous epoch. Every member of the cluster which won the reward will get increase in their reputation. The member of the cluster which has submitted the receipt that won the reward will get increase in their reputation. ensures that every member of a winning cluster sees an increase in reputation for their role in the transmission. ensures that the clusters which are serving most number of receivers has a higher reputation. The reputation decays over time to ensure that recently gained reputation has an higher influence of the overall reputation of the relayer. The following equation shows how reputation is calculated at the end of an epoch.
R_\{i+1} = (R_i + r*R_m + c*R_c)/D
where is the number of times a relayer's receipt won the reward and is the number of times a receipt from the cluster won the reward. is the factor by which the reputation decays over time
New Relayer entry
A relayer who has recently joined the network with a new identity will be given a default base reputation. The relayer can join any cluster that allows relayers with a small reputation. The relayer then gains reputation over time based on the payments received individually and thus even if the reputation received from cluster is not high, the relayer can still progress to high performing clusters by joining a decentralized cluster.
Graceful Exit
Relayer who wants to exit the network can send an exit request during any epoch and will be removed from the active relayer list and the associated stake will be returned after a wait time.
Reputation of each relayer is stored on a smart contract to ensure that a relayer who wants to exit the network should be able to resume the reputation when they reuse the network.
Introducing a block into the Relay
Producers are required to register with the protocol by staking enough LIN to cover any spam-related damage caused by possibly malicious messages transmitted by them. Producers send messages to clusters by dividing the message into multiple chunks using erasure coding. The block header is sent with every chunk to ensure that the block is valid. The redundancy of the erasure coding is kept high enough to ensure that receivers with the highest reliability requirements are also servable while ensuring enough competition (achieved by using a redundancy that's marginally higher than what's demanded by receivers such that the last few do not get paid). Each chunk is attested by the sender to ensure integrity and attribute responsibility in the case of spam. A deterministic function based on blockHash and chunkId is used to determine which clusters should be used to propagate messages for each of the chunks of the message. Producers should ensure that chunks are only sent to designated clusters in an epoch as other clusters drop untoward chunks. The clusters compete with each other to ensure that they receive a receipt from receivers which is only provided till a certain level of redundancy is achieved by the cluster.
C_\{m, i} = SHA256(SHA256(m), i)\pmod\{C_\{total}}
where cluster index for message m and chunkId i is C_\{m, i} and C_\{total} represents the total number of clusters
Receiver-side subscription
Receivers pay a subscription fee to receive blocks from the Marlin Relay for a certain time period. Subscription fees are accumulated in a pot which is used to pay the clusters for propagating blocks. Receipts are sent to cluster nodes by receivers as an acknowledgement that chunks were received. This receipt can be used to claim an acknowledgement if the receipt number falls within the winning range. A subscriber only sends receipts for the number of chunks necessary to maintain the redundancy ratio for its subscription bucket.
Assuming that the demand for reliability is discrete and assuming that a receiver needs a certain reliability, a receiver will not misbehave by accepting bribes to decrease it's own reliability (that is send receipts without receiving chunks allowing the cluster to freeload). We divide the receivers into distinct buckets based on the reliability provided. The pricing for the propagation changes based on the reliability demanded. Hence any receiver can subscribe to an appropriate reliability bucket and pay exactly that fee without paying for a higher reliability. Based on the reliability guarantees of the receiver bucket (n out of m chunks), the clusters propagating the first chunks from the producer are allowed to propagate chunks to the receiver where represents the fraction of extra clusters necessary to ensure competition.
The subscription fee is set by in-protocol governance mechanisms based on the costs to the network and the reliability required by the receiver. As only higher reliability users make use of certain chunks propagated in network, only these users pay for those chunks. Thus, as the reliability provided increases, the fee paid by the receivers also increases.
Integrating with blockchains
One of Marlin Relay's goals is to ensure that integration of a new blockchain requires as less effort as possible. Only the following 4 steps are required for such an integration:
- A smart contract based light client for the blockchain on Ethereum.
- A plugin which can check for transaction and block validity.
- A peer table update to the blockchain client to send blocks/transactions to Marlin Relay in addition to existing peers.
- A ZK circuit to verify block integrity from its chunks' hashes.
Incentivization
Incentives are necessary for economically rational actors to follow the protocol honestly. Incentivization of the network layer especially is a novel feature of Marlin Relay.
Producers' incentives
Producers are anyone with an external incentive to propagate blocks through the network. Generally producers are miners who want to propagate their block to the nodes in the network as soon as possible. They thus have a natural incentive to propagate the block through the Relay.
Spam prevention checks are performed on blocks to ensure that a malicious attacker doesnt't DoS the network with invalid blocks. If an invalid block is sent by a producer, then the producer is appropriately penalized via stake slashing. A miner may also delegate the resposibility of staking to a service provider for a fee.
Relayers' incentives
Relayers are necessary to ensure that message are propagated from producers to receivers. There are 2 types of incentives provided to the relayers:
- Fees - Fees are used to cover the bandwidth and infrastructure costs of relayers for relaying messages in the network. It is paid proportional to the work done by the relayer and is sourced from the pot of subscription fees collected from receivers.
- Network Reward - Network rewards are awarded probabilistically based on the amount of work done by each relayer. This is an incentive for relayers during the early stages of bootstrapping the network. The reward value decreases over time as the organic usage of the network increases and relayers recover a higher portion of their costs through fees.
Fee distribution
The acknowledgements received from receivers act as tickets for claiming the fee. The fee is paid probabilistically so only a few of the tickets which are part of the winning range can claim the fee. To ensure that the tickets are not prone to grinding attack, the tickets are generated deterministically based on the receiver's public key, specific relayer's signature, chunkId and the blockHash of the chunk received. A deterministic but random relayer from the cluster is identified from the hash of the receipt given by the receiver (this is again done to prevent sybil nodes within a cluster). The receipt is signed by the identified relayer and sent back to the relayer who propagated the block to the receiver. Hash of the receipt signed by the identified relayer is compared against a winning range. The ticket cannot be submitted by any other relayer in the cluster as a receipt mentioning the relayer with its signature has to be submitted along with the ticket.
A portion of the fee claimed by the cluster is paid back to the receiver to ensure that receiver has skin in the game and does not falsely deny receiving a message by not sending back an acknowledgement.
An attacker who controls a certain percentage of clusters may try to create sybil clusters as well as sybil receivers such that these receivers only provide receipts to the clusters controlled by the attacker and not provide one to others. The sybil clusters are not benificial for the attacker if they aren't propagating chunks because they would not receive receipts from anyone other than the sybil receivers and the return on investment would be signficantly less given attacker can't compromise a signficant portion of receivers. Sybil receivers providing receipts only to the attacker clusters mean that the clusters controlled by attacker have a higher chance of winning the fee as well as block rewards. This attack is thwarted as the loss of return fee to the sybil receivers due to not providing receipts is more than the gains the attacker incurs by this attack. Because fee claims cannot be influenced by rational attackers, the network reward which is randomly selected from the fee claims is also not affected. Further analysis on the fee returned is shared in the next section.
The winning range is dynamically adjusted based on the difference between number of redemptions in the previous epoch to the redemption target. The pot is distributed equally amongst all the clusters who have receipts in the winning range during the epoch.
Network Rewards
Network rewards are provided to relayers who propagate the block in the previous epoch. To ensure that the network rewards are distributed proportional to the amount of data propagated by the cluster, network rewards are provided to a randomly selected fee redemption among the redemptions that were rewarded fee in the previous epoch.
Analysis
Reliability of reconstruction
In this section we analyse the data propagation mechanism to calculate the reliability of data propagation to receivers in Marlin. As a result of the analysis, we determine the redundancy ratio of erasure coding for a given reliability in Marlin.
Assumptions
- In this analysis we assume that clusters propagate data consistently among receivers. i.e either all receivers of a cluster get the data or not.
- A specified fraction of all clusters are honest.
Definitions
- Let be the total number of clusters that relay messages in Marlin network.
- Let be the minimum number of honest clusters in Marlin network and be the maximum number of malicious clusters in Marlin network.
- Let be the total number of chunks that a block is erasure coded into before it is propagated in the network.
- Let be the minimum required chunks necessary for any receiver to reconstruct the block.
- Let be the probability of clusters being honest out of the total selected clusters required to send chunks into the network.
- Let be the probability of atleast clusters being honest out of the total selected clusters required to send chunks into the network.
Combinatorics
For a message which is divided into chunks to be propagated using the marlin network, clusters are picked at random. For the receivers to reconstruct the block chunks are necessary, so alteast clusters among the clusters selected must be honest. Hence we calculate the probability as follows
P(C_r+) = \sum_\{i=C_r}^\{C} P(i)
We now calculate the probability of picking exactly honest clusters among the selected clusters
P_x = \frac\{Ways\ to\ pick\ x\ honest\ clusters\ and\ C-x\ dishonest\ clusters}\{Ways\ to\ pick\ C\ clusters\ from\ R\ clusters}
=\frac\{Ways\ to\ pick\ x\ honest\ clusters\ *\ Ways\ to\ pick\ C-x\ dishonest\ clusters}\{Ways\ to\ pick\ C\ clusters\ from\ R\ clusters}
=\frac\{\{R_h+x-1 \choose x}\ *\ \{R_d+C-x-1 \choose C-x}}\{R+C-1 \choose C} // using multicombination
Using the above result, we calculate as follows
P(C_r+) = \sum_\{i=C_r}^\{C} \frac\{\{R_h+i-1 \choose i}\ *\ \{R_d+C-i-1 \choose C-i}}\{R+C-1 \choose C}
is the reliability metric for the Marlin network.
Conclusion
Assuming 2^\{-40} probability of failure to select a set of clusters such that receivers will be able to reconstruct the block(reliability) and a total of clusters among which are honest. We observe that the if the message is divided into chunks then the redundency ratio(\frac\{C}\{C_r}) needs to be around times the original data which is significantly lower than the current networks.
For an Ethereum block of size 30 KB if we assume that each chunk is of 1 KB and overhead of the protocol is of 0.2 KB. Assuming 100 clusters the redundency ratio needs to be around 3.3 times the original data. Assuming 1000 clusters the redundancy ratio needs to be around 2.7 times the original data.
For a Bitcoin block of size 1 MB if we assume that each chunk is 30 KB and protocol overhead of 0.2 KB. Assuming 1000 clusters the redundency ratio is around 2.5 times the original data. Assuming 100 clusters the redundancy ratio is around 3 times the original data.
Assuming 100 chunks per block, following plot shows how Probability that receiver will not be able to reconstruct the block(in log scale) as no of clusters vary for various redundancy ratios.
Subscription model
Marlin protocol returns some portion of the subscription fee for receiving blocks to ensure that both receiver and cluster has something to lose if they claim that the message wasn't transmitted. This ensures that receiver doesn't return receipts when message wasn't transmitted. If receiver gives a receipt when message wasn't transmitted then that gives an incentive to the cluster to behave maliciously, hence it is detremental to the receiver. In this post we calculate a lower limit on the amount of fee returned by calculating the minimum return fee necessary that sybil attack on receivers will give a negative outcome for the clusters. This will ensure that the receiver set is sybil resistant.
Notations
- Let be the amount of money repaid per epoch as part of the return from subscription fee.
- Let be the network reward per epoch
- Let be the total number of receivers in the network
- Let be the fraction of clusters colluded to sybil the recievers.
- Let be the fraction of sybil receivers by colluded clusters in the network
- Let be the probability that an acknowledgement is paid the fee.
- Let be the probability that an acknowledgement receives the network reward.
- Let be the total number of clusters in the network.
- Let T be the maximum number of transaction target for claiming fee.
- Let Ack_\{total} be the total number of acknowledgements if all receivers behave correctly
- Let Ack_\{avg} be the average number of acknowledgements produced by a colluded receiver
Finding Limit on Repayment of Fee
For the sybil attack by a collusion of clusters to be financially viable, cost of sybil attack should be lower than the cost of gains from it. So the gains due to increase in probability of the cluster getting network reward should be higher than the money lost by the receivers for claiming to not receive messages from honest clusters not part of collusion.
Money lost by the receiver collusion from not claiming message delivery =(R*L_r)*Ack_\{avg}*(1-L_c)*(Repay*P_A)
where P_A = \frac\{T}\{Ack_\{total}-R*L_r*Ack_\{avg}*\{1-L_c}}
Probability of colluded cluster getting network reward per cluster =\frac\{\frac\{Ack_\{total}}\{C}}\{Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c)}
Probability of clusters not in collusion getting network reward =\frac\{acknowledgements\ received\ by\ non\ collusion\ clusters}\{Total\ acknowledgements}
=\frac\{acknowledgements\ sent\ by\ non\ collusion\ receivers}\{Total\ acknowledgement}
=\frac\{\frac\{Ack_\{total} - R*L_r*Ack_\{avg}}\{C}}\{Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c)}
Increase in Probability due to collusion = =\frac\{\frac\{Ack_\{total}}\{C}}\{Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c)} - \frac\{\frac\{Ack_\{total} - R*L_r*Ack_\{avg}}\{C}}\{Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c)}
=\frac\{R*L_r*Ack_\{avg}}\{C*(Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c))}
To ensure that sybil attack is not viable, the following inequality must hold
Loss due to not claiming message delivery Profit due to increase in network reward
(R*L_r)*Ack_\{avg}*(1-L_c)*(Repay*P_A) \geq B*\frac\{R*L_r*Ack_\{avg}}\{C*(Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c))}
Substituting
(R*L_r)*Ack_\{avg}*(1-L_c)*(Repay*\frac\{T}\{Ack_\{total}-R*L_r*Ack_\{avg}*\{1-L_c}}) \geq B*\frac\{R*L_r*Ack_\{avg}}\{C*(Ack_\{total}-R*L_r*Ack_\{avg}*(1-L_c))}
(1-L_c)*Repay*T \geq \frac\{B}\{C}
Repay \geq \frac\{B}\{T*C*(1-L_c)}
This result above provides the following insights into what should be the lower limit on the repayment
- It is proportional to the Network reward
- It is inversely proportional to the number of transactions allowed on ethereum in a time period
- It is inversely proportional to the number of clusters not part of the collusion.
We can see from above that the total number of clusters in the network doesn't really matter, till we have a certain number of clusters who are honest.
The repayment fee increases the total sunscription fee the user has to pay upfront. Hence it is desirable to have the repayment fee as low as possible. So the actual repayment fee can be set delta over the above repayment fee to ensure that any cluster/group of clusters attempting sybil attack do not economically benefit.
Below is a graph of the fraction of Block reward to be repaid to receiver per winning ticket to make the receiver set sybil resistant.
Potential attacks and counter-measures
Cartelization
As the Marlin Relay can act as a base layer for multiple blockchains, its necessary to avoid centralization. Marlin Relay encourages decentralization by using market competition and randomization at various stages.
Producers are randomly assigned clusters to send chunks, which ensures that all the clusters get a fair chance to propagate blocks avoiding the chances of a select few outcompeting others through discriminatory pricing. Rewards are thus distributed amongst several clusters.
Freeloading
The clusters compete with each another as only the first few clusters that send enough chunks required to reconstruct the block are rewarded by the receiver. This ensures that clusters who consistently deliver messages the fastest earn more. Note that this measure does not contradict the argument against cartelization with appropriate paramterization.
Censorship
Producer Censorship: Producer censorship is exteremely difficult as shown in the analysis above. Assuming that a threshold percentage of the clusters are honest and with clusters being randomly chosen, the probability that a significant percentage of selected clusters for a particular message shall be malicious is certainly negligible.
Receiver Censorship: Based on the above analysis, we calculate the redundancy ratio necessary to ensure that even if a significant portion of clusters want to censor a receiver, the probability of successful censorship is extremely low.
To ensure that custodians of a large supply of LIN are not able to stake LIN to create clusters that doesn't propagate messages, certain percentage of clusters are removed from the network periodically.
Sybil
A collusion of clusters incurs an economic loss when creating sybil entities to get excess rewards from the pot. This is achieved by creating a fee return mechanism for users. The fee return mechanism ensures that the quantum of money lost in return fees doesn't justify the benefits from increase in rewards due to sybil. The analysis of the mechanism is detailed above.
Bribery
Clusters may bribe receivers to name them as the ones who propagated chunks without actually propagating them. It is also possible for clusters to create smart contract to reward any receiver who maliciously names cluster as the one who propagated the chunk. We avoid this attack by dividing the receivers into different buckets based on the reliability needed by them.
We assume that receivers are rational actors. A rational actor will perform a particular action provided it is profitable to the actor. As the subscription fee is substantial, it doesn't make sense for a receiver with no external utility to subscribe to Marlin Relay as receiver will never be earning more than the subscription fee in bribes as it is not economical for the relayer to do. If there exists external utility for the receiver in receiving a block, the receiver chooses a redundancy ratio such that the external utility is higher than the fee spent. Thus, the only reason for a receiver to take a bribe is to receive a higher redundancy ratio at a lower cost. We show below that this is not possible if demand is discrete.
Assume the reliability required by a receiver is n
out of m
chunks. If the cluster subscribes to a higher redundancy bucket of n+k
out of m
chunks, it pays a higher subscription fee for the chunks. This is beneficial for the receiver only when
bribe paid > extra fee to be paid to receive extra chunks
bribe paid > propagation cost + profit margin of all clusters involved in sending the extra chunks
If the receiver receive bribes from the cluster, the bribe cannot be higher than the extra cost receiver pays for subscribing to the higher reliability bucket. So given receiver needs a specific number of chunks, receiver is better off not taking bribes and subscribing to a bucket that matches receiver's needs.
Clusters are, thus, better off by propagating messages rather than providing bribes as bribes have to be higher than the extra cost of subscription which means they should be higher than propagation cost + profit margin. In the absence of such big bribes, it doesn't make sense for receivers to join a higher reliability bucket by paying a higher fee.
Denial-of-Service
Marlin Relay inherits the DoS resistance mechanism of the blockchain by ensuring that only a valid block can be propagated in the network. Any producer who propagates an invalid block is slashed if it is attributable and data which is not attributable will be dropped by relayers. Spam prevention is an important requirement for a broadcast network as the nature of broadcast amplifies the spam attack. As Marlin Relay uses cut-through routing, spam checks are performed only at the edges. We have considered DDoS via spam attacks and counter measures against it in great detail. Various components where spam is possible are listed below:
- Message Integrity
- Attestation integrity
- Message Replay
- Producer Spam Burst
- Invalid chunk propogation
- Chunk Withholding
- Changing attestation source
- Infinite Valid Chunks
- Block Integrity
- Header Verification
Message Integrity
Chunks are propogated from source to various destinations through relayers among which some might be malicious. This section specifies the protections in place to prevent malicious relayers from tampering with the intergrity of the message. Thus avoid spam of message which might be an invalid block.
There are 2 major components to the data that is propogated, which are chunk and attestation. There might exist malicious actors who propogate messages with wrong chunk/attestation in the network resulting in spam of the network by invalid messages. Attestation validity procedure which is performed at cluster exit node and the receiver is detailed below, chunk validity is verified at the receiver when enough chunks are used to reconstruct a block.
Attestation Integrity
Each chunk that is broadcasted to the network needs to be attested by an sender which ensures that the chunk, if found to be spam is attributable. Attestation consists of
- blockHeader: Header of the block which is being transmitted. Hash of which also acts as an unique identifier for the message.
- ChunkId: Sequence number of the erasure coded chunk
- ChannelId: Specifies the blockchain that the block corresponds to.
- ClusterId: The cluster to which message is propagated.
- Timestamp: Timestamp at which message is attested
- Signature: Signature of the attestor who stakes LIN with the protocol
- Chunk: The body of the chunk which contains a part of the erasure coded block.
The attestation along with block header is verified by each cluster and receiver. The block header should have difficulty valid against the current network. Messages with no/invalid attestations are dropped as they might be sent by a malicious actor who doesn't have any stake in the network. Even if the attestation is valid, attestor of the message should have enough stake associated with the network.
Message Replay
A malicious actor can resend a valid message which was previously sent in the network. To ensure that messages cannot be replayed, recent messages are cached for MESSAGE_CACHE_TIMEOUT
and then deleted. If the message is already present in the cache then it is dropped. If the message is not in the cache and the timestamp of the message is older than MESSAGE_CACHE_TIMEOUT
then the message is discarded. This mechanism ensures that both recent and older messages cannot be spammed in the network.
Message replay from different producers will also be not propagated as the clusters are selected based on chunkId and blockHash. So specific chunk of a particular block will only be sent through a fixed cluster. Hence it will be possible for cluster to deduplicate.
Producer Spam Burst
Producer has to send the header of the block along with the chunk. This ensures that the sybil protection mechanism of the blockchain layer can be reused for ensuring that producer doesn't send spam messages. Producer can spam in burst only if the producer can send multiple valid blocks in a short time which is handled by the blockchain layer.
Invalid Chunk
A malicious producer can send invalid chunks which cannot be used to reconstruct the original message. In this case, anyone who received MIN_CHUNKS[message_type]
should be able to reconstuct the message and if they are unable to, then one of the chunks must be invalid. Hence MIN_CHUNKS[message_type]
with which the reconstructed message didn't match the structure of the block for the message_type
can be submitted to the contract and the producer of the message will be slashed.
Chunk Withholding
A malicious producer can withhold chunks from being propogated in the network by not publishing them to the network. The producer needs to propagate atleast one chunk to the network with which the blockhash of the message to be propagated is known.
If any receiver is unable to reconstruct the message, receiver can calculate the clusters through which chunks are to be received. The receiver can intimate the cluster about the block produced and chunk that is supposed to be received through the cluster. Clusters can then intercommunicate with other clusters who were supposed to receive chunks from the producer for the block.
If the cluster is sufficiently confident that the block was withheld, a challenge can be raised by the cluster to release the block in question on a data availability layer and submit a proof. Cluster can check the validity of the released block and challenge with proof in case the block is invalid. In case the block is valid, a small bounty enough to cover the costs to put data on DA layer, set by the challenger(cluster) will be lost to producer. If the block is invalid or producer doesn't submit the proof within specified time period, then the producer is slashed and the cluster that raised the challenge is rewarded with a portion of the slashed stake.
Infinite Valid Chunks
The clusters which will propagate the block will know the total number of chunks being propagated as it is decided by the protocol. Hence the cluster will not accept a block if it isn't supposed to receive one. If a cluster propagates multiple chunks for the same block and chunkId, then the cluster can be slashed. Hence it will not be possible for the producer to submit more chunks than necessary. If a producer propagates different chunks for the same blockId and chunkId, then the producer can be slashed.
Block Integrity
Chunks are propagated to receivers by relayers and once enough chunks are received, receiver can reconstruct the block. If the block is invalid, then the receiver can create a proof for the ZK circuit which reconstructs the block from the chunks and verifies if the block body matches the header data and if the transactions in the body of block are valid.
Note: Only cryptographic validity is verified as part of the ZK Circuit.
Header Verification
Block header that is propagated along with the message is verified against the header structure.
To ensure that messages that aren't relevant or not in line with the state of the blockchain are not transmitted, an onchain light client similar to BTC-relay will be maintained on Ethereum for each of the supported networks to ensure that propagated data can be compared against the state of the blockchain network. To avoid complexities involved due to possible reorg of the chain, a safe limit of CONFIRMATION_BLOCKS
blocks are assumed before acting on the data provided by the onchain light client.
Onchain light client makes difficulty of the blockchain network available to everyone and hence it will be possible to verify the nonce of the block matches the difficulty necessary for the block to be valid.
Eclipse Attack
Eclipse attack on producers is hard as the clusters which will propagate the chunks is randomly selected and without knowing the blockhash which is only known to the producer. Hence to eclipse the producer a significant clusters have to be taken over by the attacker which would be significantly costly due to stake that a cluster has to be lock.
Eclipse attack on receivers is hard without compromising signifiant number of clusters as the receiver knows which clusters are allowed to send the data. It is possible for an AS level adversery to perform routing attacks to eclipse(like Erebus attack) but even in that case, adversery can deny service to the receiver but not mislead the receiver with an invalid chunk as the adversery cannot create a valid signature on behalf of the cluster.
Delay attacks
Marlin Relay is not prone to delay attacks as receivers never wait for someone to respond and we tradeoff redundancy for non-interactivity. Marlin Relay ensures that receivers receive enough chunks to reconstruct blocks and that it need not wait for someone to respond to any request.
Routing attacks
BGP hijacking style routing attacks can partition the network. Some clusters in Marlin Relay can implement suggestions contained in SABRE to make chains resistant to such attacks. Moreover, in cases a Marlin Relay node also runs a blockchain fullnode, we suggest to run the Marlin node and the blockchain client on different IPs, if possible, to ensure that an AS level attacker will have less data to identify blockchain traffic.
BGP attacks are visible at a global level, hence carrying out such attacks for a very long period of time is challanging in itself.
Open Questions
References
- http://ljk.imag.fr/membres/Jean-Guillaume.Dumas/Enseignements/ProjetsCrypto/Ethereum/236.pdf
- https://ethresear.ch/t/incentivizing-a-robust-p2p-network-relay-layer/1438
Appendix
Cheaper Blocks
The fringe network is a network of nodes connected to the receivers of the core networks. Fringe network nodes can connect to the edges of the core network and request blocks at a negotiated cost. Nodes in the fringe network receive the block eventually at a much less cost compared to core network.
There are 2 types of costs associated with relaying the message in fringe node. First is the fee component of cost which is less compared to core network because one core network receiver can distribute the message to multiple fringe nodes exponentially decreasing the fee component of the price per block, for every hop in the fringe network. The bandwidth cost component of the price per block is constant and the total price is delta over this cost as the competition among the leaves of the core network to distribute the block. Thus fringe network allows the full nodes who do not require block with low latency to still receive blocks at a minimal budget. This ensures that home users can operate full nodes at signficantly low costs even if they have to pay for the bandwidth.
Reliability of reconstruction
The probability of reconstruction can be calculated as follows:
from decimal import *
def fac(n):
pdt = 1
for k in range(1, n+1):
pdt = pdt * k
return pdt
def choose(n, k):
return fac(n) / fac(k) / fac(n-k)
def probOne(w, n, k, t):
b = n - w
return Decimal(choose(w+k-1, k)*choose(b+t-k-1, t-k))/choose(n+t-1, t)
def probAll(w, n, t, ht):
return sum([probOne(w, n, i, t) for i in range(ht, t+1)])
reliability = probAll(w, totClusters, totChunks, reqChunks)
print reliability
Receiver distribution
We have assumed that the distribution of receivers among the reliability buckets follows normal distribution. The cummulative number of acknowledgments in the network are calculated as below.
from decimal import *
import math
def fac(n):
pdt = 1
# iteration to avoid reaching max recursion depth
for k in range(1, n+1):
pdt = pdt * k
return pdt
def poisson(u, x):
return Decimal(u**x)/Decimal(fac(x))/Decimal(math.exp(u))
def normal(u, sig, x):
return 1/math.sqrt(2*math.pi)*1/sig*math.exp(-0.5*((x-u)/sig)**2)
def cummulativeNormal(u, sig, s, e):
return sum([normal(u, sig, i) for i in range(s, e+1)])
def totalAcks(r, sig, ack, u, s, e):
return sum([math.ceil(cummulativeNormal(u, sig, i, e)*r)*ack for i in range(s, e+1)])
# r: Total receivers
# sig: Variance of the normal distribution
# ack: Extra chunks per increasing reliability window
# u: mean of the normal distribution
# s: Start window chunks
# e: End window chunks
print totalAcks(r, sig, ack, u, s, e) + r*s
Total number of acknowledgements which is calculated form the above script can be used to set the winning range.