How to Build a Discrete Event Simulator with Async Rust
2024-07-03
TL;DR: This post describes a new simulation framework for blockchains, some of its inner workings, and how to write your own simulators.
Over the last few years, I have spent quite some time learning about different variants of blockchain protocols. Three things became clear. First, blockchain protocols (and consensus protocols in general) are hard to learn. Second, a large variety of such protocols exists, and comparing them is not easy. Third, evaluating blockchain protocols is expensive and hard, as they operate in a geo-replicated setting with hundreds of participants.
Existing protocols also consume a lot of energy, even "green" ones that do not rely on mining. Simulation allows evaluating new, more efficient protocols without consuming massive amounts of energy. Further, while there is still significant doubt whether such protocols will ever have a proper use for the average human, they are a great way to learn how distributed systems reach agreement in the presence of failures.
These observations led me to build a blockchain simulation framework. Named SimBA (for "Simulating Byzantine Fault-Tolerant Applications"), the framework can simulate large-scale blockchain networks. It uses discrete event simulation (DES). A DES maintains a queue of timed events. It repeatedly takes the earliest event from the queue and executes it. Events might enqueue new events when they run. This enables simulating complex interacting systems efficiently using only a single thread.
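To make this concrete, here is a minimal, generic sketch of such an event loop in Rust. The names here are made up for illustration and this is not SimBA's actual implementation: a priority queue ordered by timestamp, a clock that jumps to each event's time, and actions that may schedule follow-up events.

use std::collections::BinaryHeap;

type Time = u64;

/// An action scheduled to run at a point in simulated time.
struct Event {
    time: Time,
    action: Box<dyn FnOnce(&mut Simulation)>,
}

// Order events by time only, reversed, so that the max-heap
// `BinaryHeap` pops the *earliest* event first.
impl PartialEq for Event {
    fn eq(&self, other: &Self) -> bool {
        self.time == other.time
    }
}
impl Eq for Event {}
impl PartialOrd for Event {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Event {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        other.time.cmp(&self.time)
    }
}

struct Simulation {
    now: Time,
    queue: BinaryHeap<Event>,
}

impl Simulation {
    /// Schedule an action to run `delay` time units from now.
    fn schedule(&mut self, delay: Time, action: impl FnOnce(&mut Simulation) + 'static) {
        self.queue.push(Event {
            time: self.now + delay,
            action: Box::new(action),
        });
    }

    /// The DES loop: pop the earliest event, jump the clock to its
    /// timestamp, and run it; the action may schedule new events.
    fn run(&mut self) {
        while let Some(event) = self.queue.pop() {
            self.now = event.time;
            (event.action)(self);
        }
    }
}

Note that the clock simply jumps from event to event; no real time passes between them, which is what makes a DES cheap to run on a single thread.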
The asim crate
There are great DES libraries, such as SimPy for Python, but none that I liked for Rust. As a result, I created asim, a DES that is implemented as a Rust asynchronous runtime. Rust allows building your own async runtime, or executor, which made it surprisingly easy to implement asim.
A future in Rust represents an async task and is simply a trait with a single method, poll. You can think of Rust futures as state machines that might update their state each time you poll them. Eventually, a future reaches a terminal "ready" state, meaning it has completed the associated task. This is different from, for example, Goroutines or Java green threads, which implement threads in userspace, each with their own stack. A Rust future consists of two components: the future itself and an associated waker. The waker allows telling the executor when a future wants to continue execution, for example, when an I/O operation has completed. If you want to learn more about how async Rust works internally, I highly recommend starting with this article on how futures and executors work.
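As a toy illustration of this state-machine view, the following future, built only on the standard library's Future trait, stays pending for a fixed number of polls before reaching its "ready" state:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// A future that becomes ready only after it has been polled
/// `remaining` more times. Each poll advances the state machine.
struct CountdownFuture {
    remaining: u32,
}

impl Future for CountdownFuture {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready(())
        } else {
            self.remaining -= 1;
            // Ask the executor to poll us again right away. A real
            // future would instead stash the waker and invoke it
            // later, when the awaited event actually occurs.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

An executor, whether tokio or asim, repeatedly polls such a future whenever its waker fires, until it returns Poll::Ready.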
asim is an executor that runs tasks as simulated discrete events instead of in real time. An async executor generally has some queue of futures that are ready to execute, or to continue execution. The same is true here, except that asim only runs one task at a time, because simulated time can deviate from real time.
A dedicated Timer implements the DES logic. Whenever a future waits for a time event, e.g., a simulated I/O operation, we enqueue its waker in the Timer's queue. asim then repeatedly triggers the next timer event -- which might be one millisecond or one year of simulated time in the future -- and runs all futures that are ready to execute.
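The following is a rough sketch of that idea. The type and field names are assumed for illustration, not asim's real internals: a timer keyed by simulated firing time, and a sleep-style future that parks its waker there until the clock catches up.

use std::cell::RefCell;
use std::collections::BTreeMap;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, Waker};

type Time = u64;

/// Wakers, keyed by the simulated time at which they should fire.
#[derive(Default)]
struct Timer {
    now: Time,
    pending: BTreeMap<Time, Vec<Waker>>,
}

impl Timer {
    /// Jump to the next timer event and wake every future waiting
    /// for it; the executor then polls those futures again.
    fn fire_next(&mut self) -> bool {
        match self.pending.pop_first() {
            Some((time, wakers)) => {
                self.now = time;
                for waker in wakers {
                    waker.wake();
                }
                true
            }
            None => false,
        }
    }
}

/// A future that completes once simulated time reaches `deadline`.
struct Sleep {
    timer: Rc<RefCell<Timer>>,
    deadline: Time,
}

impl Future for Sleep {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut timer = self.timer.borrow_mut();
        if timer.now >= self.deadline {
            Poll::Ready(())
        } else {
            // Not due yet: leave our waker with the timer so it can
            // wake us once simulated time reaches the deadline.
            timer
                .pending
                .entry(self.deadline)
                .or_default()
                .push(cx.waker().clone());
            Poll::Pending
        }
    }
}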
To show how this looks for the user, consider the following example, where a task ("Pong!") waits for another task ("Ping?") to complete. Like in other async runtimes, spawn will create a task, and synchronization primitives, such as oneshot, allow controlling the order in which tasks execute.
// This will always print "Ping? Pong!"
let (ping_sender, ping_receiver) = oneshot::channel();
asim::spawn(async move {
    ping_receiver.await.unwrap();
    println!("Pong!");
});
asim::spawn(async move {
    println!("Ping?");
    ping_sender.send(()).unwrap();
});
A Primer on Consensus Protocols
Consensus protocols enable a set of independent servers (or nodes), potentially located in different data centers, to appear as one synchronized "cloud". Here, the "cloud" is an abstraction -- a virtual server with a single unified state -- that is implemented by replicating an application's state across all nodes. For a client, it is then irrelevant which physical machine they interact with, and failures of individual nodes do not result in the application failing as a whole. Consensus protocols achieve this by coordinating changes to the state so that they are applied in the same order at all (non-faulty) nodes. In the context of blockchains, those state changes are simply transactions.
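The following toy example illustrates the core idea, often called state machine replication (the types here are made up for illustration): as long as every node applies the same changes in the same order, all nodes end up in the same state.

/// A trivially simple "application state": one account balance.
#[derive(Clone, Debug, PartialEq)]
struct AppState {
    balance: i64,
}

/// Apply one state change (here: a signed transfer amount).
fn apply(state: &mut AppState, tx: i64) {
    state.balance += tx;
}

fn main() {
    // The order of transactions the nodes agreed on via consensus.
    let agreed_log = [100, -30, 5];

    let mut node_a = AppState { balance: 0 };
    let mut node_b = AppState { balance: 0 };
    for &tx in &agreed_log {
        apply(&mut node_a, tx);
        apply(&mut node_b, tx);
    }
    // Same initial state + same order of changes = same final state.
    assert_eq!(node_a, node_b);
}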
At first glance, synchronizing state does not seem that hard to achieve, but consensus protocols must address two challenges: network delays and node failures. Network delays mean that, even in the absence of failures, nodes can have an outdated view of the state other nodes are in. Failures can render nodes unresponsive, which could cause other nodes to be stuck waiting for a message from the faulty node. Even worse, it is not possible to tell whether a message has not arrived yet because of a network delay or because the sender crashed.
Many consensus protocols rely on voting among nodes to agree on state changes. Essentially, some node (commonly called the leader) proposes a state change and waits for a majority of nodes to acknowledge and approve that state change. Here, ensuring that all nodes agree on who the leader is is the most difficult part to implement correctly. Other protocols have no explicit voting mechanism. For example, Nakamoto consensus, used by Bitcoin, replaces voting by having miners extend the longest chain of valid blocks they know of, which represents an implicit approval of all blocks within the chain they extend.
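The majority rule itself can be sketched in a few lines (hypothetical types; real protocols such as Raft or PBFT need considerably more machinery, e.g., leader election and multiple voting phases):

/// One node's response to a proposed state change.
enum Vote {
    Approve,
    Reject,
}

/// The leader commits a proposal once a strict majority of all
/// `num_nodes` nodes approved it; unresponsive nodes simply
/// never contribute a vote.
fn is_committed(votes: &[Vote], num_nodes: usize) -> bool {
    let approvals = votes
        .iter()
        .filter(|v| matches!(v, Vote::Approve))
        .count();
    approvals > num_nodes / 2
}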
In some contexts, such as blockchains, we expect nodes to encounter failures other than simple crashes. In particular, a protocol must be able to tolerate software bugs and malicious operators, which can, for example, lead to nodes refusing to communicate, intentionally delaying messages, or creating incorrect or incomplete messages. The literature refers to such arbitrary failures as Byzantine failures.
The reason Bitcoin is particularly interesting for systems researchers is that it introduced Nakamoto consensus, a new class of consensus protocols that can operate on a public network and is comparatively easy to understand. Of course, it does so by, among other things, sacrificing energy efficiency and performance, but let us ignore that for now. But enough about consensus protocols; it is a topic that can fill multiple books by itself.
Implementing Bitcoin with SimBA
Now that we have an ergonomic way of writing simulation code, we can implement consensus protocols, such as those used by blockchains. For the rest of this post, I want to explain Bitcoin and its implementation in SimBA in more detail, and I will start with the core data structure of Nakamoto consensus: the blockchain.
A blockchain encodes a partial order of transactions grouped into blocks. Each block has one predecessor and an arbitrary number of successors, meaning the blocks are partially ordered and form a rooted tree. A hardcoded "Genesis" block, which does not contain any transactions, forms the root of this tree. The transactions within a block are also ordered by some rule; let us just assume they are totally ordered for now.
Simplified, a block looks like the following in SimBA.
pub struct NakamotoBlock {
/// Basic Metadata
identifier: BlockId,
mined_by: AccountId,
creation_time: Time,
/// The predecessor of this block
parent: BlockId,
    /// Height indicates the length of the chain
    /// that has this block as its head
height: u64,
    /// The transactions included in this block
transactions: Vec<TransactionId>,
/// Contains the state of the system
state: FrozenCowTree<AccountState>,
}
Mining, then, is a mechanism to extend this data structure and eventually have the network agree on a common subset of it that we call the canonical chain. For the sake of simplicity, you can view mining as a random mechanism that, at any point in time, allows some node to create a new block. When doing so, that node extends the longest chain of valid blocks it knows of. Earlier, I talked about a "single unified cloud state" that is presented to the users, and the canonical chain is exactly that. Whenever a valid node picks a chain to extend, that chain has the canonical chain as its prefix. While there exist other blocks that nodes keep track of, these can be seen as an artifact of the randomized and decentralized way of creating new blocks.
As mentioned, this data structure is not actually a chain, but a subset of the tree forms what we call the canonical chain.
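To wrap up, here is a simplified sketch of both rules (the types are assumed for illustration, not SimBA's actual API): mining extends the block with the greatest height, and the canonical chain is recovered by walking parent pointers from that block back to the genesis block.

use std::collections::HashMap;

type BlockId = u64;

struct Block {
    id: BlockId,
    parent: Option<BlockId>, // None only for the genesis block
    height: u64,
}

/// The head of the longest known chain of valid blocks.
fn longest_chain_head(blocks: &HashMap<BlockId, Block>) -> &Block {
    blocks
        .values()
        .max_by_key(|block| block.height)
        .expect("at least the genesis block exists")
}

/// Mine a new block: always extend the longest known chain.
fn mine(blocks: &mut HashMap<BlockId, Block>, new_id: BlockId) {
    let head = longest_chain_head(blocks);
    let (parent, height) = (head.id, head.height + 1);
    blocks.insert(new_id, Block { id: new_id, parent: Some(parent), height });
}

/// The canonical chain: walk from the longest chain's head back
/// to the genesis block, then reverse into chain order.
fn canonical_chain(blocks: &HashMap<BlockId, Block>) -> Vec<BlockId> {
    let mut chain = Vec::new();
    let mut current = Some(longest_chain_head(blocks).id);
    while let Some(id) = current {
        chain.push(id);
        current = blocks[&id].parent;
    }
    chain.reverse();
    chain
}

A real implementation also needs tie-breaking rules for chains of equal height, but this captures the essence of the longest-chain rule.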