Navigating the Web of Interoperability: A deep dive into Arbitrary Message Passing Protocols
The future is multichain. The quest for scalability has led Ethereum towards roll-ups. The shift towards modular blockchains has reignited attention on app chains. And over the horizon, we hear whispers of application-specific roll-ups, L3s, and sovereign chains.
But this has come at the cost of fragmentation. The first wave of basic bridges was launched to connect these fragmented chains, but they are often limited in functionality and rely on trusted signers for security.
What will the endgame of an interconnected web3 look like? We believe that all bridges will evolve into cross-chain messaging or “Arbitrary Message Passing” (AMP) protocols to unlock new use cases, by allowing applications to pass arbitrary messages from source to destination chain. We will also see a “trust mechanism landscape” emerge, where builders make various tradeoffs in usability, complexity, and security.
Every AMP solution needs two critical capabilities:
- Verification: The ability to verify the validity of the message from the source chain on the destination chain
- Liveness: The ability to relay information from source to destination
100% trustless verification is not achievable: users must trust code, game theory, humans (or entities), or a combination of these, depending on whether the verification is done on-chain or off-chain.
We divide the overall interoperability landscape based on the trust mechanism and integration architecture.
- Trust code/math: For these solutions, on-chain proof exists and can be verified by anyone. These solutions generally rely on a light client to either validate the consensus of a source chain on a destination chain or verify the validity of a state transition for a source chain on a destination chain. Verification through light clients can be made much more efficient through Zero Knowledge proofs to compress arbitrarily long computations offline and provide a simple verification on-chain to prove computations.
- Trust game theory: There is an additional trust assumption when the user/application has to trust a third party or network of third parties for the authenticity of transactions. These mechanisms can be made more secure through permissionless networks coupled with game theoretics such as economic incentives and optimistic security.
- Trust humans: These solutions rely on honesty from the majority of the validators or independence of entities relaying different information. They require trust in third parties in addition to trusting the consensus of the two interacting chains. The only thing at stake here is the reputation of the participating entities. If enough participating entities agree that a transaction is valid, then it is considered valid.
It is important to note that all solutions, to a certain degree, require trust in code as well as humans. Any solution with faulty code can be exploited by hackers and every solution has some human element in the setup, upgrades, or maintenance of the codebase.
Integration architecture:
- Point-to-Point model: A dedicated communication channel needs to be established between every source and every destination.
- Hub and Spoke model: A communication channel needs to be established with a central hub that enables connectivity with all other blockchains connected to that hub.
The Point-to-Point model is relatively difficult to scale, as a pairwise communication channel is required for every pair of connected blockchains. Developing these channels can be challenging for blockchains with different consensus mechanisms and frameworks. However, pairwise bridges provide more flexibility to customize configurations, if needed. A hybrid approach is also possible, for example, by using the Inter-Blockchain Communication protocol (IBC) with multi-hop routing via a hub, which removes the need for direct pairwise communication but reintroduces more complexity in security, latency, and cost considerations.
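To make the scaling argument concrete, here is a minimal sketch that counts the channels each model needs. It assumes one bidirectional channel per pair of chains and a hub that is separate from the connected chains; the function names are illustrative and not taken from any protocol.

```python
def point_to_point_channels(n_chains: int) -> int:
    """Channels needed so that every chain has a direct channel to every other chain."""
    return n_chains * (n_chains - 1) // 2


def hub_and_spoke_channels(n_chains: int) -> int:
    """Channels needed when each chain connects only to a single, separate hub."""
    return n_chains


# Compare how the two models grow as more chains are connected.
for n in (5, 20, 50):
    print(n, point_to_point_channels(n), hub_and_spoke_channels(n))
```

Under these assumptions the pairwise count grows quadratically while the hub count grows linearly, which is the scaling gap described above.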
Trust code/math
How do light clients validate the consensus of a source chain on a destination chain?
A light client (or light node) is a piece of software that connects to full nodes to interact with the blockchain. Light clients on the destination chain normally store the sequential history of the source chain's block headers, which is enough to verify transactions. Off-chain agents like relayers monitor events on the source chain, generate cryptographic inclusion proofs, and forward them, along with the block headers, to the light client on the destination chain. The light client can verify a transaction because it stores the block headers sequentially and each block header contains the Merkle root hash, which can be used to prove the state. The key features of this mechanism are:
- Security: Apart from trust in code, another trust assumption is introduced during the initialization of the light client. When someone creates a new light client, it is initialized with a header from a specific height of the counterparty chain. The supplied header could be incorrect, and the light client could later be tricked with further fake headers. This is a weak trust assumption, however, as anyone can check the initialization, and no further trust assumptions are introduced once the light client has been initialized.
- There is a liveness assumption on the relayer, as it is required to transmit the information.
- Implementation: Depends on the availability of support for the cryptographic primitives required for verification.
- If the same type of chain is being connected (same application framework and consensus algorithm), then the implementation of the light client on both sides will be the same. Example: IBC for all Cosmos SDK-based chains.
- If two different types of chains (different application frameworks or consensus types) are connected, then the implementation of the light client will differ. Example: Composable Finance is working to enable Cosmos SDK chains to be connected via IBC to Substrate (the app framework of the Polkadot ecosystem). This requires a Tendermint light client on the Substrate chain and a so-called BEEFY light client on the Cosmos SDK chain.
- Resource intensiveness: It is expensive to run pairwise light clients on all chains, as writes on blockchains are expensive, and it is not feasible to run them on chains with dynamic validator sets like Ethereum.
- Extensibility: Light client implementation is required for each combination of chains. Given that the implementation varies based on the chain’s architecture, it is difficult to scale and connect different ecosystems.
- Code exploitation: Errors in code can lead to vulnerabilities. The BNB chain exploit in October 2022 uncovered a critical security vulnerability affecting all IBC-enabled chains.
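The verification step of a light client can be sketched as a Merkle inclusion check against a stored header. This is a toy illustration only: it assumes a simple binary SHA-256 tree with the last node duplicated on odd levels and a non-empty leaf list, which is not the tree format of IBC or of any particular chain, and all function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Pad an odd level by duplicating its last node, then hash adjacent pairs."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return level, [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root that a source-chain block header would commit to (toy tree)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        _, level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Built off-chain by a relayer: sibling hashes from the leaf up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        padded, parents = _next_level(level)
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))  # (hash, sibling_is_left)
        level, index = parents, index // 2
    return proof


def verify_inclusion(leaf, proof, trusted_root):
    """Run by the light client: recompute the root and compare with the stored header."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == trusted_root


txs = [b"tx-a", b"tx-b", b"tx-c"]
trusted_headers = {100: merkle_root(txs)}  # height -> root stored by the light client
proof = merkle_proof(txs, 2)
print(verify_inclusion(b"tx-c", proof, trusted_headers[100]))       # True
print(verify_inclusion(b"tx-forged", proof, trusted_headers[100]))  # False
```

The check is only as good as the stored header, which is why the initialization of the light client carries the trust assumption noted in the list above.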
How do ZK proofs verify the validity of a state-transition for the source chain on the destination chain?
Running pairwise light clients on all chains is cost prohibitive and not practical for all the blockchains. Light clients in implementations like IBC are also required to keep track of the validator set of the source chain which is not practical for chains with dynamic validator sets, like Ethereum. ZK proofs provide a solution to reduce gas and verification time. Instead of running the entire computation on-chain, only the verification of proof of computation is done on-chain and the actual computation is done off-chain. Verifying a proof of computation can be done in less time and with less gas than re-running the original computation. The key features of this mechanism are:
- Security: zk-SNARKs depend on elliptic curves for their security, and zk-STARKs depend on hash functions. zk-SNARKs may or may not require a trusted setup. A trusted setup is needed only once, at the start: it is the initial creation event of the keys used to create the proofs and to verify those proofs. If the secrets in the setup event are not destroyed, they could be used to forge transactions through false verifications. No further trust assumptions are introduced once the trusted setup is done.
- Implementation: Different ZK proving schemes like SNARK, STARK, VPD, and SNARG exist today, and SNARK is currently the most widely adopted. Recursive ZK proofs are another recent development that allows the total work of proving to be divided among multiple computers instead of just one. To generate validity proofs, the following core primitives need to be implemented:
- verification of the signature scheme used by the validators
- proof of inclusion of validator public keys in the validator set commitment (which is stored on-chain)
- tracking the set of validators, which can change frequently
- Implementing various signature schemes inside a zkSNARK requires out-of-field arithmetic and complex elliptic curve operations. This is not easy to achieve and might require different implementations for each chain depending on its framework and consensus.
- If the proving time and effort are extremely high, then only specialized teams with specialized hardware will be able to generate proofs, which will lead to centralization. Higher proof generation time can also lead to latency.
- Higher verification time and effort will lead to higher on-chain costs.
- Examples: Polymer ZK-IBC by Polymer Labs, Succinct Labs. Polymer is working on multi-hop enabled IBC to increase connectivity while reducing the number of pairwise connections needed.
Trust game theory
Interoperability protocols that rely on game theoretics can be broadly divided into 2 categories based on how they incentivize honest behavior from participating entities:
- Economic security: Multiple external participants (like validators) reach a consensus on the updated state of the source chain. This is similar to a multi-sig setup but in order to become a validator, participants are required to stake a certain amount of tokens, which can be slashed in case any malicious activity is detected. In permissionless setups, anyone can accumulate stakes and become a validator. There are also block rewards, to act as economic incentives, when the participating validators follow the protocol. The participants are thus economically incentivized to be honest. However, if the amount that can be stolen is much higher than the amount staked, then participants may attempt to collude to steal funds. Examples: Axelar, Celer IM
- Optimistic security: Optimistic security solutions rely on the minority trust assumption, which assumes that only a minority of blockchain participants are live, honest, and follow the rules of the protocol. The solution could require only a single honest participant for the guarantee to hold. The most common example is an optimistic solution where anyone can submit a fraud proof. There is also an economic incentive here, but in practice even an honest watcher can miss a fraudulent transaction. Optimistic roll-ups also leverage this approach. Examples: Nomad, ChainLink CCIP
- In the case of Nomad, watchers can prove fraud. However, Nomad watchers are whitelisted at the time of writing.
- ChainLink CCIP will leverage an Anti-Fraud Network which will consist of decentralized oracle networks with the sole purpose of monitoring for malicious activity. CCIP’s Anti-Fraud Network implementation is yet to be seen.
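The economic security trade-off described above (stake at risk versus funds that can be stolen) can be written as a simple comparison. This is a rough sketch under strong assumptions: colluding validators lose exactly the stake needed to reach the approval threshold, detection is certain, and block rewards are ignored. The function names and numbers are hypothetical.

```python
def cost_of_corruption(total_stake: float, approval_threshold: float) -> float:
    """Slashable stake that colluding validators must control to approve a forged message."""
    return total_stake * approval_threshold


def collusion_is_profitable(total_stake: float, approval_threshold: float,
                            value_at_risk: float) -> bool:
    """True when the funds that can be stolen exceed the stake that would be slashed."""
    return value_at_risk > cost_of_corruption(total_stake, approval_threshold)


# Hypothetical numbers: 90M staked, 2/3 of the stake needed to approve a message.
print(collusion_is_profitable(90e6, 2 / 3, value_at_risk=50e6))   # False
print(collusion_is_profitable(90e6, 2 / 3, value_at_risk=100e6))  # True
```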
The key features of these mechanisms are:
- Security: For both mechanisms, it is essential to have permissionless participation from validators and watchers for the game theory mechanisms to be effective. Under the economic security mechanism, funds may be more at risk if the amount staked is lower than the amount that can be stolen. Under the optimistic security mechanism, the minority trust assumption can be exploited if no one submits the fraud proof, or if permissioned watchers are compromised or removed. Economic security mechanisms do not have the same dependency on liveness for security.
- Middle chain with its own validators: A group of external validators monitor the source chain, reach a consensus on the validity of the transaction whenever a call is detected, and provide attestation on the destination chain if consensus is reached. Validators are usually required to stake a certain amount of tokens which can be slashed if malicious activity is detected. Examples: Axelar Network, Celer IM
- Via off-chain agents: Off-chain agents can be used to implement an optimistic roll-up-like solution in which, during a predefined time window, off-chain agents are allowed to submit a fraud proof and revert the transaction. Example: Nomad relies on independent off-chain agents to relay the header and cryptographic proof. ChainLink CCIP will leverage its existing oracle network for monitoring and attesting cross-chain transactions.
- Trust assumptions can be exploited to steal funds if the majority of the validators collude, which requires countermeasures like quadratic voting and fraud proofs.
- Finality: Optimistic security-based AMP solutions introduce complexity in finality and liveness because users and applications need to wait for the fraud window.
- Resource Optimization: This approach is usually not resource intensive as the verification usually does not happen on-chain.
- Extensibility: This approach is more extensible as the consensus mechanism remains the same for all kinds of chains and can be easily extended to heterogeneous blockchains.
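The fraud window used by optimistic designs can be sketched as a small state machine on the destination side. This is a simplified model, not Nomad's or CCIP's implementation: the class and method names are hypothetical, time is passed in explicitly as seconds, and a challenge is accepted without actually checking a fraud proof.

```python
from dataclasses import dataclass, field


@dataclass
class OptimisticInbox:
    fraud_window: int                             # seconds a message stays challengeable (assumed)
    pending: dict = field(default_factory=dict)   # msg_id -> (payload, submitted_at)

    def submit(self, msg_id, payload, now):
        """A relayer posts a message; it is not executable yet."""
        self.pending[msg_id] = (payload, now)

    def challenge(self, msg_id, now) -> bool:
        """A watcher disputes a message; this only succeeds inside the fraud window."""
        entry = self.pending.get(msg_id)
        if entry is None or now - entry[1] >= self.fraud_window:
            return False
        del self.pending[msg_id]
        return True

    def execute(self, msg_id, now):
        """Return the payload once the window has passed unchallenged, else None."""
        entry = self.pending.get(msg_id)
        if entry is None or now - entry[1] < self.fraud_window:
            return None
        del self.pending[msg_id]
        return entry[0]


inbox = OptimisticInbox(fraud_window=1800)
inbox.submit("m1", b"honest message", now=0)
print(inbox.execute("m1", now=600))    # None: still inside the fraud window
print(inbox.execute("m1", now=1800))   # b'honest message'

inbox.submit("m2", b"forged message", now=0)
print(inbox.challenge("m2", now=100))  # True: a live watcher caught it in time
print(inbox.execute("m2", now=5000))   # None: the message was reverted
```

The sketch also shows both costs noted in the list: every message waits out the window before execution, and a forged message that no watcher challenges in time would execute like any other.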
Trust humans
- Majority honesty assumption: These solutions rely on a multi-sig implementation where multiple entities verify and sign the transactions. Once the minimum threshold is reached, the transaction is considered valid. The assumption here is that the majority of the entities are honest, so a transaction signed by a majority of them is valid. The only thing at stake here is the reputation of the participating entities. Examples: Multichain (Anycall V6), Wormhole. Exploits due to smart contract bugs are still possible, as evidenced by the Wormhole hack in early 2022.
- Independence: These solutions split the entire message passing process into two parts and rely on different independent entities to manage the two processes. The assumption here is that the two entities are independent of each other and are not colluding. Example: LayerZero. The block headers are streamed on demand by decentralized oracles and transaction proofs are sent via relayers. If the proof matches the headers, the transaction is considered valid. While proof matching relies on code/math, participants are required to trust the entities to remain independent. Applications building on LayerZero have the option to choose their Oracle and Relayer (or host their own Oracle/Relayer), thereby limiting the risk to individual oracle/relayer collusion. The end users need to trust that either LayerZero, a third party, or the application itself is running the oracle and relayer independently and without malicious intentions.
In both approaches, the reputation of participating 3rd party entities disincentivizes malicious behavior. These are usually respected entities within the validator and oracle community and they risk reputational consequences and a negative impact on their other business activities if they act maliciously.
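The threshold rule behind the majority honesty assumption can be illustrated in a few lines. This is only a sketch: signature verification is omitted, signers are represented by plain identifiers, and the 3-of-5 configuration is hypothetical rather than the setup of any named protocol.

```python
def message_is_valid(attesting_signers, authorized_signers, threshold: int) -> bool:
    """Accept a message once enough distinct authorized signers have attested to it."""
    # Duplicates and unknown signers do not count towards the threshold.
    distinct_authorized = set(attesting_signers) & set(authorized_signers)
    return len(distinct_authorized) >= threshold


authorized = {"s1", "s2", "s3", "s4", "s5"}                    # hypothetical signer set
print(message_is_valid(["s1", "s2", "s3"], authorized, 3))    # True
print(message_is_valid(["s1", "s1", "s2"], authorized, 3))    # False: duplicate signer
print(message_is_valid(["s1", "s2", "evil"], authorized, 3))  # False: unknown signer
```

Nothing in this check prevents a colluding threshold of signers from approving an invalid message, which is why reputation is the only thing at stake in this model.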
Beyond trust assumptions and the future of interoperability
While considering the security and usability of an AMP solution, we need to also take into account the details beyond basic mechanisms. As these are moving parts that can change over time, we did not include them in the overall comparison.
- Code integrity: A number of recent hacks have exploited bugs in code, which necessitates reliable audits, well-planned bug bounties, and multiple client implementations. If all the validators (in economic/optimistic/reputational security) run the same client (software for verification), it increases the dependency on a single codebase and reduces client diversity. Ethereum, for example, relies on multiple execution clients like Geth, Nethermind, Erigon, Besu, and Akula. Multiple implementations in a variety of languages are likely to increase diversity without any client dominating the network, thereby eliminating a potential single point of failure. Having multiple clients could also help with liveness if a minority of validators/signers/light clients go down due to exploits/bugs in one particular implementation.
- Setup and Upgradability: Users and developers need to be aware of whether validators/watchers can join the network in a permissionless manner; otherwise, trust is hidden in the selection of permissioned entities. Upgrades to smart contracts can also introduce bugs that lead to exploits, or even change the trust assumptions. Different solutions can be implemented to mitigate these risks. For example, in the current instantiation, the Axelar gateways are upgradable subject to approval from an offline committee (4/8 threshold); in the near future, however, Axelar plans to require all validators to collectively approve any upgrades to the gateways. Wormhole’s core contracts are upgradeable and are managed via Wormhole’s on-chain governance system. LayerZero relies on immutable smart contracts and immutable libraries to avoid any upgrades. It can, however, push a new library: dapps with default settings would get the updated version, while dapps with their version manually set would need to set it to the new one.
- MEV: Different blockchains are not synchronized through a common clock and have different times to finality. As a result, the order and time of execution on the destination chain can vary across chains. MEV in a cross-chain world is challenging to clearly define. It introduces a trade-off between liveness and order of execution. An ordered channel will ensure the ordered delivery of messages but the channel will close if one message times out. Another application might prefer a scenario where ordering is not necessary but the delivery of other messages is not impacted.
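The ordering trade-off in the last point can be sketched as follows. The model is deliberately small and hypothetical: each message is a `(sequence, timed_out)` pair, an ordered channel closes at the first timeout, and an unordered channel skips the timed-out message and carries on.

```python
def deliver(messages, ordered: bool):
    """Return (delivered sequence numbers, channel_still_open)."""
    delivered = []
    for sequence, timed_out in messages:
        if timed_out:
            if ordered:
                # Ordered channel: a timeout closes the channel, later messages are stuck.
                return delivered, False
            # Unordered channel: drop this message and keep delivering the rest.
            continue
        delivered.append(sequence)
    return delivered, True


msgs = [(1, False), (2, True), (3, False)]
print(deliver(msgs, ordered=True))   # ([1], False)
print(deliver(msgs, ordered=False))  # ([1, 3], True)
```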
Trends and future outlook:
- Customizable and additive security: To better serve diverse use cases, AMP solutions are incentivized to offer more flexibility to developers. Axelar introduced an approach for the upgradability of message passing and verification, without any changes to application-layer logic. HyperLane V2 introduced modules that allow developers to choose from multiple choices such as economic security, optimistic security, dynamic security, and hybrid security. CelerIM offers additional optimistic security along with economic security. Many solutions wait for a predefined minimum number of block confirmations on the source chain before transmitting the message. LayerZero allows developers to update these parameters. We expect some AMP solutions to continue offering more flexibility but these design choices warrant some discussion. Should the apps be able to configure their security, to what extent, and what happens if the apps adopt sub-par design architecture? User awareness of the basic concepts behind security may become increasingly important. Ultimately, we foresee aggregation and abstraction of AMP solutions, perhaps in some form of combination or “additive” security.
- Growth and maturation of “Trust code/math” mechanisms: In an ideal endgame, all cross-chain messages will be trust minimized by using zero knowledge (ZK) proofs to verify messages and states. We are already witnessing this shift with the emergence of projects like Polymer Labs and Succinct Labs. Multichain also recently published a zkRouter whitepaper to enable interoperability through ZK proofs. With the recently announced Axelar Virtual Machine, developers can leverage the Interchain Amplifier to permissionlessly set up new connections to the Axelar network. For example, once robust light clients and ZK proofs for Ethereum’s state are developed, a developer can easily integrate them into the Axelar network to replace or enhance an existing connection. LayerZero in its documentation talks about the possibility of adding new optimized proof messaging libraries in the future. Newer projects like Lagrange are exploring the aggregation of multiple proofs from multiple source chains, and Herodotus is making storage proofs feasible through ZK proofs. However, this transition will take time, as this approach is difficult to scale among blockchains relying on different consensus mechanisms and frameworks. ZK is a relatively new and complex technology that is challenging to audit, and, currently, verification and proof generation are not cost-optimal. We believe that in the long run, to support highly scalable cross-chain applications on the blockchain, many AMP solutions are likely to complement trusted humans and entities with verifiable software because:
- The possibility of code exploitation can be minimized through audits and bug bounties. With the passage of time, it will be easier to trust these systems as their history will serve as proof of their security.
- The cost of generating ZK proofs will decrease. With more R&D in ZKPs, recursive ZK, proof aggregation, and specialized hardware, we expect the time and cost of proof generation and verification to reduce significantly, making it a more cost-effective approach.
- Blockchains will become more zk-friendly. In the future, zkEVMs will be able to provide a succinct validity proof of execution, and light client-based solutions will be able to easily verify both execution and consensus of a source chain on the destination chain. In Ethereum’s endgame, there are also plans to “zk-SNARK everything” including consensus.
- Proof of humanity/reputation/identity: The security of complex systems like AMP solutions cannot be encapsulated through a single framework and warrants multiple layers of solutions. For example, along with economic incentives, Axelar has implemented quadratic voting to prevent the concentration of voting power among a subset of nodes and promote decentralization. Other proofs of humanity, reputation, and identity can also complement mechanisms for setup and permission.
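The effect of quadratic voting on concentration can be illustrated with a short calculation. This sketch assumes that quadratic voting means voting weight proportional to the square root of stake; the stake figures are hypothetical and the function is not taken from Axelar's codebase.

```python
import math


def voting_shares(stakes: dict, quadratic: bool) -> dict:
    """Share of total voting power per validator, linear or square-root weighted."""
    weights = {v: (math.sqrt(s) if quadratic else s) for v, s in stakes.items()}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


stakes = {"A": 900, "B": 100}  # hypothetical stakes
print(voting_shares(stakes, quadratic=False))  # {'A': 0.9, 'B': 0.1}
print(voting_shares(stakes, quadratic=True))   # {'A': 0.75, 'B': 0.25}
```

With nine times the stake, validator A holds 90% of the linear voting power but 75% under square-root weighting, so a large staker's influence grows more slowly than its stake.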
In the Web3 spirit of openness, we will likely see a plural future where multiple approaches co-exist. In fact, applications may choose to use multiple interoperability solutions, either in a redundant way, or let users mix and match with disclosure of trade-offs. Point-to-point solutions may be prioritized between “high traffic” routes, whereas hub and spoke models may dominate the long tail of chains. In the end, it is up to us, the collective demand of users, builders, and contributors, to shape the topography of the interconnected web3.
We would like to thank Bo Du and Peter Kim from Polymer Labs, Galen Moore from Axelar Network, Uma Roy from Succinct Labs, Max Glassman, and Ryan Zarick from LayerZero for reviewing and providing their valuable feedback.
List of references:
- Axelar: https://docs.axelar.dev/
- CelerIM: https://im-docs.celer.network/developer/celer-im-overview
- ChainLink CCIP: https://blog.chain.link/introducing-the-cross-chain-interoperability-protocol-ccip/
- debridge: https://docs.debridge.finance/
- IBC: https://tutorials.cosmos.network/academy/3-ibc/
- LayerZero: https://layerzero.gitbook.io/docs/
- Multichain: https://docs.multichain.org/getting-started/introduction
- Nomad: https://docs.nomad.xyz/nomad-101/introduction
- Polymer Labs: https://polymerlabs.medium.com/
- Succinct Labs: https://blog.succinct.xyz/
- Wormhole: https://docs.wormhole.com/wormhole/
Additional reading list:
- Navigating Arbitrary Messaging Bridges: A Comparison Framework: https://blog.li.fi/navigating-arbitrary-messaging-bridges-a-comparison-framework-8720f302e2aa
- L2Bridge Risk Framework: https://gov.l2beat.com/t/l2bridge-risk-framework/31
- Horatius at the Bridge: https://maven11.substack.com/p/horatius-at-the-bridge