+ ~ - + - + - + - ~ ~ *
| commonware
* ~ + + - ~ - + - * - +

Grafting Trees to Prove Current State

July 9, 2025

While proving you had a lot of money at some time in the past might go over well at a party, what's often far more useful is proving how much money you have right now. In database terms, this comes down to proving the current state of a key in a database, which is what we take on in this third article from our series on applying the Merkle Mountain Range (MMR) for efficient data authentication.

Recall that our first article described the basics of the MMR and what makes it special: updates are fast (compared to a more standard Merkle Tree) because they only append new data without modifying any historical data. Our second article described how one might use an MMR to authenticate the simplest database you need for state sync and interoperability, even though at first glance the MMR might appear inapplicable to the task. The idea here was to view a database as the result of applying a sequential list of operations, over which the MMR is then constructed. We called this structure adb::any because it allows for a compact proof (from an untrusted server, of course) that a specific key had taken on any one of its historical values at some point in its history.

Log operations and status bits over the key named 'foo'.
Figure 1: Log operations and status bits over a key named "foo". Only the last update of each key has a 1 for its activity status, indicating its current value (6).

Authenticating Current Values

In this post, we show how to extend adb::any to prove that a key currently holds a specific value. If you recall from our second article, the adb::any already tracks an activity status for each operation in the log (figure 1), allowing it to serve only the most recent value (if any) associated with each key. Imagine then pulling out this information into a separate list (or bit vector), where each element is a single bit indicating the corresponding operation's activity status. Unfortunately, unlike the list of operations which changes only through appending new ones, the vector of activity bits requires historical updates. Specifically, each time a key's value is updated, the operation corresponding to its last update flips from active to inactive. Reflecting that state change therefore requires we modify a historical bit rather than simply appending a new bit corresponding to the latest update.

If we wish to make authenticable claims over activity status, we'll (disappointingly) need to use a new data structure to handle these types of updates. Instead of replacing the MMR entirely, we keep the MMR over the operations log as is, and augment it with a separate Merkle tree of identical structure built over just the activity status bits. Conveniently, the only difference between an MMR and this kind of Merkle tree is the ability to modify previous elements and recompute the digests of its ancestors in the tree. By tying these two structures together with a root over the peaks of each (figure 2), a proof derived from only two paths (one to the operation and one to its corresponding activity state) delivers exactly the functionality we are looking for.

A combined MMR + Merkle Tree (MT) of identical structure allows proving which value for a given key is current.
Figure 2: A combined MMR + Merkle Tree (MT) of identical structure allows proving which value for a given key is current. In this figure the siblings along the two paths from root highlighted in red provide the necessary digests for proving “foo currently has value 6”.

Making Updates Fast

We've taken a bit of a step backwards by introducing a Merkle tree, since updates are no longer purely append-only. But, we can limit the performance impact of this tweak by pinning the Merkle tree portion of the structure entirely in RAM (compared to the log of operations, the size of the activity status bit vector is quite compact). Furthermore, we can divide the bit vector into chunks over which we build the tree. The bottom line is, with a suitable choice of N, this structure is small enough to keep entirely memory resident even over very large databases (and worth doing so to avoid random, small writes over disk). We won't review it here but both trees, like with the “single tree” adb::any, can be regularly pruned to keep their sizes in check.

The combined tree structure with the Merkle tree built over chunks of 4 bits each (N=2).
Figure 3: The combined tree structure with the Merkle tree built over chunks of 4 bits each (N=2).

Making Proofs Small

As described above, a current-value proof would be up to 2x larger than an any-value proof because it requires digests from the path of the second tree (the one to the relevant activity status chunk). As long as the number of bits in each chunk is a power of 2, however, the structure of the Merkle tree over activity status remains compatible with the structure of the MMR over database operations. That is, for a chunk size of 2^N, the Merkle tree matches exactly that of the MMR with its bottom N levels removed (figure 3). Instead of maintaining two completely independent trees tied together at a new root, we can instead graft each leaf of the Merkle tree onto its corresponding internal node of the MMR (figure 4).

Grafting each Merkle tree leaf onto its corresponding node in the MMR.
Figure 4: Grafting each Merkle tree leaf onto its corresponding node in the MMR.

Generating a proof over this grafted structure produces a current-value proof that contains the same node digests as a corresponding any-value proof and a bitmap chunk containing an operation's activity bit. Our implementation defaults to a chunk size of 256 bits (the same size as most hash digests) so that a current-value proof includes just one extra digest-worth of data compared to an any-value proof of the same value.

Our grafting technique draws inspiration from QMDB's novel “twig” abstraction but opts for smaller, configurable chunks (to minimize proof size) and, as noted previously, relies exclusively on RAM to store all unpruned chunks (to avoid random writes).

Marching Towards Production

adb::current is available today in the Commonware Library, along with several new storage extensions (that have delivered significant performance gains over the past month). This includes a flexible caching layer that provides write-ahead and random-access caching through a shared buffer pool, parallelized merkleization via batched updates, and an io_uring storage implementation (for operating systems that support it).

We remain on-schedule to release a production-ready version of commonware-storage later this year.