In the blockchain and web3 community, Ethereum is one of the most trusted and reliable projects. I was curious about how things work from a scalability perspective, and how they could be better. This blog is a collection of texts from different sources, put together to make sense of the information and understand things better.
Why Blockchain?
One of the basic reasons for using a blockchain is that people are not trustworthy, but an algorithm is. The implementation of an algorithm may not be fully trustworthy either, but the information needed to verify it is out in the open: all decisions are taken in a public forum, and anyone can go and check them, unlike in a traditional organization.
Trilemma
If you want a fair idea of why every decision and every improvement is the way it is, you need to understand this trilemma. It is as well known in the blockchain world as the CAP theorem is for databases.
The scalability trilemma says that there are three properties that a blockchain try to have, and that, if you stick to "simple" techniques, you can only get two of those three. The three properties are:
Scalability: the chain can process more transactions than a single regular node (think: a consumer laptop) can verify.
Decentralization: the chain can run without any trust dependencies on a small group of large centralized actors. This is typically interpreted to mean that there should not be any trust (or even honest-majority assumption) of a set of nodes that you cannot join with just a consumer laptop.
Security: the chain can resist a large percentage of participating nodes trying to attack it (ideally 50%; anything above 25% is fine, 5% is definitely not fine).
The above is a verbatim copy from one of Vitalik's blog posts. [blog]
What is Blockchain
From a technical standpoint, the ledger of a cryptocurrency such as Bitcoin can be thought of as a state transition system, where there is a "state" consisting of the ownership status of all existing bitcoins and a "state transition function" that takes a state and a transaction and outputs a new state which is the result. [Ref]
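To make the "state transition function" idea concrete, here is a minimal Python sketch. It assumes a simplified account-balance model (Bitcoin actually tracks unspent outputs), and the names `Transaction` and `apply_transaction` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str
    recipient: str
    amount: int


def apply_transaction(state: dict, tx: Transaction) -> dict:
    """Take a state and a transaction, return the new state (or raise)."""
    if tx.amount <= 0:
        raise ValueError("amount must be positive")
    if state.get(tx.sender, 0) < tx.amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # the old state is left untouched
    new_state[tx.sender] -= tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state


genesis = {"alice": 50, "bob": 0}
after = apply_transaction(genesis, Transaction("alice", "bob", 20))
print(after)
```

An invalid transaction (for example, spending more than the balance) raises an error instead of producing a new state, which mirrors the idea that the function is only defined for valid transitions.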
But this state is not just any state; it is a global state, maintained and synchronised by a lot of computers. Can't it be synchronised by algorithms like Raft or Paxos? The answer is no, because when scaling a traditional database you trust that all the computers are working in good faith and that there is no bad actor, which is not the case in Ethereum and Bitcoin.
Ethereum and Bitcoin's consensus algorithm is very different from Raft and Paxos. Ways in which they differ:
In both Raft and Paxos, the systems elect a leader. There is no leader in Ethereum and Bitcoin.
In both Raft and Paxos, all members of the cluster are trusted. In Ethereum and Bitcoin, one needs trust in only 51% of the holders of hash power.
The number of nodes in Ethereum and Bitcoin can fluctuate without notifying other members of the network; in Raft and Paxos, these values are predetermined. This makes it difficult to scale Raft and Paxos; furthermore, with a fixed number of nodes, one probably needs to give up security (by allowing any node to join) or node anonymity (to authenticate).
Ways in which they are the same:
The systems result in a consistent state.
A single system proposes state changes (at present, selected "at random" by mining in Ethereum and Bitcoin, and by election for Raft and Paxos).
Temporary forks can be formed (when there are network propagation delays or competing chains for Ethereum and Bitcoin, and when there are network issues for Paxos -- Raft simply stops working with less than a majority of nodes on-line).
The protocols for all four systems have a rule against deleting past transactions (or, rather, they only have rules for adding new ones).
So how do we do this synchronisation?
Enter, Consensus Protocols.
Proof of work [Ref]
Proof-of-work is the mechanism that allows the decentralized Ethereum network to come to consensus, or agree on things like account balances and the order of transactions. This prevents users from "double spending" their coins and ensures that the Ethereum chain is tremendously difficult to attack or manipulate.
Ethash, Ethereum's proof-of-work algorithm, requires miners to go through an intense race of trial and error to find the nonce for a block. Only blocks with a valid nonce can be added to the chain.
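The trial-and-error race can be illustrated with a toy sketch. This is not Ethash: it assumes plain SHA-256 and a made-up `difficulty_bits` parameter, purely to show the "keep trying nonces until the hash is small enough" loop.

```python
import hashlib


def pow_hash(block_data: bytes, nonce: int) -> int:
    """Hash the block data together with a candidate nonce."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Try nonces one by one until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while pow_hash(block_data, nonce) >= target:
        nonce += 1
    return nonce


def is_valid(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification is a single hash, while mining takes many attempts."""
    return pow_hash(block_data, nonce) < (1 << (256 - difficulty_bits))


nonce = mine(b"block 1", difficulty_bits=12)
print(nonce, is_valid(b"block 1", nonce, 12))
```

The asymmetry is the point: finding the nonce takes thousands of hashes on average here, but checking it takes one.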
Miners who successfully create a block are rewarded with two freshly minted ETH, but they no longer receive all the transaction fees: the base fee is burned, while the tip and the block reward go to the miner. A miner may also get 1.75 ETH for an uncle block. Uncle blocks (ommer blocks) are valid blocks created by a miner at practically the same time as another miner mined the successful block. Uncle blocks usually happen due to network latency.
A transaction has "finality" on Ethereum when it's part of a block that can't change.
Because miners work in a decentralized way, two valid blocks can get mined at the same time. This creates a temporary fork. Eventually, one of these chains will become the accepted chain after a subsequent block has been mined and added, making it longer.
To complicate things further, transactions that were included in the discarded fork may not have been included in the accepted chain, which means they could get reversed. So finality refers to the time you should wait before considering a transaction irreversible. For Ethereum, the recommended time is six blocks, or just over 1 minute. After six blocks, you can say with relative confidence that the transaction was successful. You can wait longer for even greater assurance.
Proof of Stake
Proof-of-stake will arrive after the Merge.
In proof-of-stake, validators explicitly stake capital in the form of ether into a smart contract on Ethereum. This staked ether then acts as collateral that can be destroyed if the validator behaves dishonestly or lazily. The validator is then responsible for checking that new blocks propagated over the network are valid and occasionally creating and propagating new blocks themselves.
To participate as a validator, a user must deposit 32 ETH into the deposit contract and run three separate pieces of software: an execution client, a consensus client, and a validator. On depositing their ether, the user joins an activation queue that limits the rate of new validators joining the network. Once activated, validators receive new blocks from peers on the Ethereum network. The transactions delivered in the block are re-executed, and the block signature is checked to ensure the block is valid. The validator then sends a vote (called an attestation) in favor of that block across the network.
Whereas under proof-of-work, the timing of blocks is determined by the mining difficulty, in proof-of-stake, the tempo is fixed. Time in proof-of-stake Ethereum is divided into slots (12 seconds) and epochs (32 slots). One validator is randomly selected to be a block proposer in every slot. This validator is responsible for creating a new block and sending it out to other nodes on the network. Also in every slot, a committee of validators is randomly chosen, whose votes are used to determine the validity of the block being proposed.
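The fixed tempo boils down to simple integer arithmetic. The sketch below uses the slot and epoch lengths from the text; `genesis_time` and the function names are placeholders chosen for illustration.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32


def slot_at(timestamp: int, genesis_time: int) -> int:
    """Slot number that a given Unix timestamp falls into."""
    return (timestamp - genesis_time) // SECONDS_PER_SLOT


def epoch_of(slot: int) -> int:
    """Epoch that a given slot belongs to."""
    return slot // SLOTS_PER_EPOCH


epoch_seconds = SECONDS_PER_SLOT * SLOTS_PER_EPOCH
print(epoch_seconds)             # 384 seconds, i.e. 6.4 minutes per epoch
print(slot_at(1000, 0))          # 83
print(epoch_of(slot_at(1000, 0)))  # 2
```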
A transaction has "finality" in distributed networks when it's part of a block that can't change without a significant amount of ether getting burned. On proof-of-stake Ethereum, this is managed using "checkpoint" blocks. The first block in each epoch is a checkpoint. Validators vote for pairs of checkpoints that they consider to be valid. If a pair of checkpoints attracts votes representing at least two-thirds of the total staked ether, the checkpoints are upgraded. The more recent of the two (the target) becomes "justified". The earlier of the two is already justified because it was the "target" in the previous epoch, and it is now upgraded to "finalized". Since finality requires a two-thirds majority, an attacker could prevent the network from reaching finality by voting with one-third of the total stake. There is a mechanism to defend against this: the inactivity leak.
When the beacon chain is not finalising it enters a special "inactivity leak" mode.
Attesters receive no rewards. Non-participating validators receive increasingly large penalties based on their track records.
This is designed to eventually restore finality in the event of a permanent failure of large numbers of validators.
This activates whenever the chain fails to finalize for more than four epochs. The inactivity leak bleeds away the staked ether from validators voting against the majority, allowing the majority to regain a two-thirds majority and finalize the chain.
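The two-thirds rule and the effect of the leak can be sketched in a few lines. This is a toy model only: the real penalty schedule grows with each validator's inactivity record, whereas here a hypothetical flat `leak_rate` per epoch is assumed just to show why the online stake eventually regains a two-thirds share.

```python
def has_supermajority(voting_stake: float, total_stake: float) -> bool:
    """True when the votes represent at least two-thirds of the total stake."""
    return 3 * voting_stake >= 2 * total_stake


def epochs_until_finality(online: float, offline: float, leak_rate: float = 0.01) -> int:
    """Count epochs until the online stake is a two-thirds majority again.

    leak_rate is an illustrative assumption, not the protocol's real schedule.
    """
    if online <= 0:
        raise ValueError("some stake must be online")
    epochs = 0
    while not has_supermajority(online, online + offline):
        offline *= 1 - leak_rate  # offline validators bleed stake each epoch
        epochs += 1
    return epochs


print(has_supermajority(66, 100), has_supermajority(67, 100))
print(epochs_until_finality(online=60, offline=40))
```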
State
We now understand how a state is synchronised, but what does a state look like? Enter the Modified Merkle Patricia Trie.
Most of it is pretty standard, except for the extension node. Why add this extra complexity?
An extension node is an optimization of the branch node. In the Ethereum state, quite frequently, there are branch nodes that have only one child node. This is why the MPT compresses branch nodes that contain only one child into extension nodes that have a path and the hash of the child.
Since both the leaf node and the extension node are arrays of two items, there must be a way to distinguish them. To make this distinction, the MPT adds a prefix to the path. If the node is a leaf and the path consists of an even number of nibbles, you add 0x20 as a prefix; if the path consists of an odd number of nibbles, you add the single nibble 0x3. If the node is an extension node and the path consists of an even number of nibbles, you add 0x00 as a prefix; if it consists of an odd number of nibbles, you add the single nibble 0x1. Because a path with an odd number of nibbles gets one nibble as a prefix and a path with an even number of nibbles gets two nibbles as a prefix, the encoded path always fills a whole number of bytes. [Ref]
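The prefix rule is easier to see in code. Here is a small sketch of it in Python; the function name `hex_prefix_encode` is my own, and the path is assumed to be given as a list of nibbles (integers from 0 to 15).

```python
def hex_prefix_encode(nibbles: list, is_leaf: bool) -> bytes:
    """Pack a nibble path into bytes, prefixed with the node type and parity."""
    flag = 2 if is_leaf else 0
    if len(nibbles) % 2 == 1:
        # Odd path: a single prefix nibble (0x1 for extension, 0x3 for leaf).
        prefixed = [flag + 1] + list(nibbles)
    else:
        # Even path: two prefix nibbles (0x00 for extension, 0x20 for leaf).
        prefixed = [flag, 0] + list(nibbles)
    # The prefixed list now always has even length, so nibbles pair up into bytes.
    return bytes(16 * hi + lo for hi, lo in zip(prefixed[0::2], prefixed[1::2]))


print(hex_prefix_encode([1, 2, 3, 4, 5], is_leaf=False).hex())     # 112345
print(hex_prefix_encode([0, 1, 2, 3, 4, 5], is_leaf=False).hex())  # 00012345
print(hex_prefix_encode([0xF, 1, 0xC, 0xB, 8], is_leaf=True).hex())     # 3f1cb8
print(hex_prefix_encode([0, 0xF, 1, 0xC, 0xB, 8], is_leaf=True).hex())  # 200f1cb8
```

Reading the first nibble of an encoded path is enough to tell both the node type and whether the path length is odd or even.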
Block
Explained in the simplest way, a block is used to synchronize state across the various nodes and to distribute ETH to the nodes involved.
Blocks are batches of transactions with a hash of the previous block in the chain. This links blocks together (in a chain) because hashes are cryptographically derived from the block data. This prevents fraud, because one change in any block in history would invalidate all the following blocks, as all subsequent hashes would change and everyone running the blockchain would notice.
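A tiny sketch shows why tampering is noticed. It assumes SHA-256 over a JSON dump as a stand-in for Ethereum's real block hashing, and the block layout (`parent_hash`, `transactions`) is simplified for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash of a block, derived from all of its contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def build_chain(batches: list) -> list:
    """Link each batch of transactions to the hash of the previous block."""
    chain, parent = [], "0" * 64
    for transactions in batches:
        block = {"parent_hash": parent, "transactions": list(transactions)}
        chain.append(block)
        parent = block_hash(block)
    return chain


def is_consistent(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["parent_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = build_chain([["a->b:1"], ["b->c:2"], ["c->a:3"]])
print(is_consistent(chain))                # True
chain[0]["transactions"][0] = "a->b:100"   # rewrite history in the first block
print(is_consistent(chain))                # False
```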
Technically, a block has two parts: a header and a body. [Ref]
timestamp – the time when the block was mined.
blockNumber – the length of the blockchain in blocks.
baseFeePerGas – the minimum fee per gas required for a transaction to be included in the block.
difficulty – the effort required to mine the block.
mixHash – a unique identifier for that block.
parentHash – the unique identifier for the block that came before (this is how blocks are linked in a chain).
transactions – the transactions included in the block.
stateRoot – the entire state of the system: account balances, contract storage, contract code and account nonces are inside.
nonce – a hash that, when combined with the mixHash, proves that the block has gone through proof-of-work.
A final important note is that blocks themselves are bounded in size. Each block has a target size of 15 million gas but the size of blocks will increase or decrease in accordance with network demands, up until the block limit of 30 million gas (2x target block size). The total amount of gas expended by all transactions in the block must be less than the block gas limit. This is important because it ensures that blocks can’t be arbitrarily large. If blocks could be arbitrarily large, then less performant full nodes would gradually stop being able to keep up with the network due to space and speed requirements.
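As a rough illustration of the bound, here is a sketch that fills a block greedily until the gas limit would be exceeded. The greedy selection is my own simplification (real block producers also weigh fees), and transactions are represented only by the gas they consume.

```python
TARGET_GAS = 15_000_000
GAS_LIMIT = 2 * TARGET_GAS  # 30 million gas


def pack_block(pending_gas: list, gas_limit: int = GAS_LIMIT) -> list:
    """Take pending transactions in order, skipping any that would overflow the block."""
    included, used = [], 0
    for gas in pending_gas:
        if used + gas <= gas_limit:
            included.append(gas)
            used += gas
    return included


block = pack_block([20_000_000, 15_000_000, 5_000_000])
print(block, sum(block))  # the 15M transaction does not fit after the 20M one
```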
Algorithm For Rewarding Nodes
You can read more details here: GHOST Implementation
Ethereum implements a simplified version of GHOST which only goes down seven levels. Specifically, it is defined as follows:
A block must specify a parent, and it must specify 0 or more uncles
An uncle included in block B must have the following properties:
It must be a direct child of the kth generation ancestor of B, where 2 <= k <= 7.
It cannot be an ancestor of B
An uncle must be a valid block header, but does not need to be a previously verified or even valid block
An uncle must be different from all uncles included in previous blocks and all other uncles included in the same block (non-double-inclusion)
For every uncle U in block B, the miner of B gets an additional 3.125% added to its coinbase reward and the miner of U gets 93.75% of a standard coinbase reward.
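The uncle rules above can be sketched as follows. The chain is modelled as a plain dictionary from block id to parent id, and the 2 ETH base reward is only an example value; both are assumptions for illustration. Note that this follows the percentages quoted above, while deployed clients may scale the uncle reward by depth, which is not modelled here.

```python
from fractions import Fraction


def kth_ancestor(parents: dict, block: str, k: int):
    """Walk k steps up the parent links; None if the chain is shorter than that."""
    for _ in range(k):
        block = parents.get(block)
        if block is None:
            return None
    return block


def is_valid_uncle(parents: dict, block: str, uncle: str) -> bool:
    """Uncle must be a direct child of the k-th ancestor (2 <= k <= 7), not an ancestor."""
    if uncle in {kth_ancestor(parents, block, k) for k in range(1, 8)}:
        return False
    uncle_parent = parents.get(uncle)
    return uncle_parent is not None and any(
        uncle_parent == kth_ancestor(parents, block, k) for k in range(2, 8)
    )


def rewards(base_reward: Fraction, uncle_count: int):
    """Return (reward for the miner of B, reward for each uncle's miner)."""
    miner = base_reward + uncle_count * base_reward * Fraction(3125, 100000)
    per_uncle = base_reward * Fraction(9375, 10000)
    return miner, per_uncle


# A <- B <- C is the main chain; U is another child of A.
parents = {"A": None, "B": "A", "C": "B", "U": "A"}
print(is_valid_uncle(parents, "C", "U"))  # True: U is a child of C's grandparent
print(is_valid_uncle(parents, "B", "U"))  # False: U is only a sibling of B (k = 1)
print(rewards(Fraction(2), 1))
```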
Transactions
A submitted transaction includes the following information:
recipient – the receiving address (if an externally-owned account, the transaction will transfer value; if a contract account, the transaction will execute the contract code)
value – amount of ETH to transfer from sender to recipient (in wei, a denomination of ETH)
data – optional field to include arbitrary data
gasLimit – the maximum amount of gas units that can be consumed by the transaction. Units of gas represent computational steps
maxPriorityFeePerGas – the maximum price per unit of gas to be included as a tip to the miner
maxFeePerGas – the maximum price per unit of gas the sender is willing to pay for the transaction (inclusive of baseFeePerGas and maxPriorityFeePerGas)
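How the fee fields interact can be sketched like this. The rule that the tip is capped by whatever is left under maxFeePerGas after the base fee comes from the EIP-1559 fee market, not from the text above, and the function name and sample numbers are made up for illustration.

```python
def fee_breakdown(base_fee: int, max_fee: int, max_priority_fee: int, gas_used: int) -> dict:
    """Split what the sender pays (in wei) into the burned part and the miner's tip."""
    if max_fee < base_fee:
        raise ValueError("maxFeePerGas is below the block's baseFeePerGas")
    # The tip per gas can never push the total price above maxFeePerGas.
    tip_per_gas = min(max_priority_fee, max_fee - base_fee)
    return {
        "burned": base_fee * gas_used,
        "tip": tip_per_gas * gas_used,
        "total": (base_fee + tip_per_gas) * gas_used,
    }


print(fee_breakdown(base_fee=100, max_fee=120, max_priority_fee=30, gas_used=21_000))
```

In this example the sender asked for a tip of 30 per gas, but only 20 fits under the cap of 120, so the miner receives 20 per gas.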
raw is the signed transaction in Recursive Length Prefix (RLP) encoded form.
tx is the signed transaction in JSON form.
v, r, s: what are these?
Wallet: it is just a public/private key pair, that's it.
So v, r and s are used by the nodes to recover the wallet's public key (public key recovery) and verify that the transaction actually comes from the account it claims to come from. In short, they are used to verify transactions.
When you send a tx, you sign the transaction, and the signed transaction includes these v, r and s values. You parse them from the signed tx, then pass the v, r and s values and the hash of the transaction back into a function, and it will spit out the public key. This is actually how you get the from address of a transaction.
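Here is a self-contained sketch of that recovery function in pure Python on the secp256k1 curve. It is a teaching toy under several assumptions: SHA-256 stands in for the hash Ethereum actually uses, v is just the parity bit of the signature point (real transactions encode it differently), the nonce k is passed in by hand, and the private key is a made-up number.

```python
import hashlib

# secp256k1 domain parameters (the curve used by Ethereum and Bitcoin).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def sign(msg_hash, private_key, k):
    """Toy ECDSA signing with a caller-supplied nonce k (never reuse k in practice)."""
    rx, ry = scalar_mul(k, G)
    r = rx % N
    s = pow(k, -1, N) * (msg_hash + r * private_key) % N
    return ry & 1, r, s  # v is the parity of the signature point's y coordinate


def recover_public_key(msg_hash, v, r, s):
    """Rebuild the signer's public key from (v, r, s) and the message hash."""
    # Assumes the x coordinate of the signature point was below N (almost always true).
    y = pow((r**3 + 7) % P, (P + 1) // 4, P)  # square root works because P % 4 == 3
    if y & 1 != v:
        y = P - y
    u = point_add(scalar_mul(s, (r, y)), scalar_mul(-msg_hash % N, G))
    return scalar_mul(pow(r, -1, N), u)


private_key = 0x1F2E3D4C5B6A79887766554433221100  # hypothetical key
msg_hash = int.from_bytes(hashlib.sha256(b"send 1 ETH to bob").digest(), "big")
v, r, s = sign(msg_hash, private_key, k=0xC0FFEE1234567890ABCDEF)
print(recover_public_key(msg_hash, v, r, s) == scalar_mul(private_key, G))
```

A node never sees the private key; it only checks that the recovered public key maps to the claimed sender address.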
On Ethereum there are a few different types of transactions:
Regular transactions: a transaction from one wallet to another.
Contract deployment transactions: a transaction without a 'to' address, where the data field is used for the contract code.
Execution of a contract: a transaction that interacts with a deployed smart contract. In this case, 'to' address is the smart contract address.
Accounts and Addresses
Ethereum has two account types:
Externally-owned – controlled by anyone with the private keys
Contract – a smart contract deployed to the network, controlled by code.
Both account types have the ability to:
Receive, hold and send ETH and tokens
Interact with deployed smart contracts
Ethereum accounts have four fields:
nonce – A counter that indicates the number of transactions sent from the account. This ensures transactions are only processed once. In a contract account, this number represents the number of contracts created by the account.
balance – The number of wei owned by this address. Wei is a denomination of ETH and there are 1e+18 wei per ETH.
codeHash – This hash refers to the code of an account on the Ethereum virtual machine (EVM). Contract accounts have code fragments programmed in that can perform different operations.
storageRoot – Sometimes known as a storage hash. A 256-bit hash of the root node of a Merkle Patricia trie that encodes the storage contents of the account (a mapping between 256-bit integer values), encoded into the trie as a mapping from the Keccak 256-bit hash of the 256-bit integer keys to the RLP-encoded 256-bit integer values. This trie encodes the hash of the storage contents of this account, and is empty by default.
Merkle Proofs in Ethereum [Ref]
Every block header in Ethereum contains not just one Merkle tree, but three trees for three kinds of objects:
Transactions
Receipts (essentially, pieces of data showing the effect of each transaction)
State
This allows for a highly advanced light client protocol that allows light clients to easily make and get verifiable answers to many kinds of queries:
Has this transaction been included in a particular block?
Tell me all instances of an event of type X (eg. a crowdfunding contract reaching its goal) emitted by this address in the past 30 days
What is the current balance of my account?
Does this account exist?
Pretend to run this transaction on this contract. What would the output be?
The first is handled by the transaction tree; the third and fourth are handled by the state tree, and the second by the receipt tree. The first four are fairly straightforward to compute; the server simply finds the object, fetches the Merkle branch (the list of hashes going up from the object to the tree root) and replies back to the light client with the branch.
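To get a feel for what "fetching the Merkle branch" means, here is a sketch with a plain binary Merkle tree. Ethereum's trees are Merkle Patricia tries and use a different hash, so SHA-256, the duplicate-last-leaf padding and the function names here are all simplifying assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_branch(leaves: list, index: int):
    """Server side: compute the root and the sibling hashes on the path from leaf to root."""
    level = [h(leaf) for leaf in leaves]
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last hash
        branch.append(level[index ^ 1])  # sibling of the current node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch


def verify_branch(leaf: bytes, index: int, branch: list, root: bytes) -> bool:
    """Light client side: rebuild the root from the leaf and the branch."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


leaves = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, branch = merkle_root_and_branch(leaves, 2)
print(verify_branch(b"tx2", 2, branch, root))   # True
print(verify_branch(b"fake", 2, branch, root))  # False
```

The light client only needs the root from the block header plus a handful of hashes, not the full list of transactions.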
Ethereum Networking [Ref]
The "networking layer" is the stack of protocols that allow those nodes to find each other and exchange information. This includes "gossiping" information (one-to-many communication) over the network as well as swapping requests and responses between specific nodes (one-to-one communication). Each node must adhere to specific networking rules to ensure they are sending and receiving the correct information.
The execution layer's networking protocols are divided into two stacks:
the discovery stack: built on top of UDP and allows a new node to find peers to connect to
the DevP2P stack: sits on top of TCP and enables nodes to exchange information
Both stacks work in parallel. The discovery stack feeds new network participants into the network, and the DevP2P stack enables their interactions.
Discovery
Discovery is the process of finding other nodes in the network. This is bootstrapped using a small set of bootnodes (nodes whose addresses are hardcoded into the client so they can be found immediately and connect the client to peers). These bootnodes only exist to introduce a new node to a set of peers. This is their sole purpose: they do not participate in normal client tasks like syncing the chain, and they are only used the very first time a client is spun up.
The protocol used for the node-bootnode interactions is a modified form of Kademlia which uses a distributed hash table to share lists of nodes. Each node has a version of this table containing the information required to connect to its closest peers. This 'closeness' is not geographical - distance is defined by the similarity of the node's ID. Each node's table is regularly refreshed as a security feature.
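The non-geographical "closeness" in Kademlia is the XOR of two node IDs, read as a number. The sketch below assumes small integer IDs for readability; real clients derive much longer IDs from node keys, which is not modelled here.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: IDs sharing a longer common prefix are closer."""
    return a ^ b


def closest_peers(target: int, known_ids: list, count: int) -> list:
    """Pick the known node IDs nearest to the target under XOR distance."""
    return sorted(known_ids, key=lambda node_id: xor_distance(node_id, target))[:count]


known = [0b1011, 0b0010, 0b1110, 0b0101]
nearest = closest_peers(0b1010, known, 2)
print([bin(node_id) for node_id in nearest])  # ['0b1011', '0b1110']
```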
You can read more about Chord, Kademlia and DHTs here.
DevP2P
Once peers are connected and an RLPx session has been started, the wire protocol defines how peers communicate. There are three main tasks defined by the wire protocol: chain synchronization, block propagation and transaction exchange. Chain synchronization is the process of validating blocks near the head of the chain, checking their proof-of-work data and re-executing their transactions to ensure their root hashes are correct, then cascading back in history via those blocks' parents, grandparents etc until the whole chain has been downloaded and validated. State sync is a faster alternative that only validates block headers. Block propagation is the process of sending and receiving newly mined blocks. Transaction exchange refers to exchanging pending transactions between nodes so that miners can select some of them for inclusion in the next block.
That's all there is to the current Ethereum. There is a lot more to the upcoming Ethereum, and it is a lot more complicated. I have explained some of it here:
[Ethereum Yellow Paper Explanation]
Thanks for reading. I am glad you took time out of your busy life to read this.