Analyzing the Base Mainnet September 5th Incident: A Comprehensive Review


TL;DR

The Base mainnet network experienced a brief stall on September 5, 2023. This postmortem explains what happened and how we responded, in keeping with our commitment to building Base in a decentralized, open source manner, and outlines our ongoing efforts to enhance the network’s reliability and resilience.

Root Cause

At approximately 2:25 pm PT, the Base mainnet network ceased producing blocks for 29 minutes. The root cause was traced to a dependency on a set of L1 nodes that ran out of disk space at 2:15 pm, rendering them unavailable to the sequencer.

  • Each new L2 block references an L1 block known as the “L1 origin.”
  • The sequencer refreshes the latest L1 block periodically to ensure L2 blocks reference recent L1 blocks.
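  • If the L1 origin block falls behind by more than the max sequencer drift threshold (currently set to 10 minutes), the sequencer halts new L2 block production. With the L1 nodes unavailable from 2:15 pm, the L1 origin could not advance, and the threshold was reached about ten minutes later, at 2:25 pm.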
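
The drift rule described above can be sketched as a simple timestamp comparison. This is only an illustration under stated assumptions, not the op-node implementation: the names `L1Origin` and `can_produce_l2_block` are hypothetical, timestamps are plain Unix seconds, and treating the 10-minute boundary as inclusive is our assumption.

```python
from dataclasses import dataclass

# 10 minutes, the threshold quoted in this post.
MAX_SEQUENCER_DRIFT_SECONDS = 10 * 60


@dataclass
class L1Origin:
    """Hypothetical stand-in for the L1 block an L2 block references."""
    number: int     # L1 block number
    timestamp: int  # L1 block timestamp, in Unix seconds


def can_produce_l2_block(
    l1_origin: L1Origin,
    next_l2_timestamp: int,
    max_drift: int = MAX_SEQUENCER_DRIFT_SECONDS,
) -> bool:
    """Return True while the next L2 block is within max_drift of its L1 origin.

    Assumption: the boundary is inclusive, so a drift of exactly
    max_drift seconds is still allowed.
    """
    drift = next_l2_timestamp - l1_origin.timestamp
    return drift <= max_drift


# Hypothetical scenario: the L1 origin is stuck at t = 0 (the 2:15 pm outage).
stuck_origin = L1Origin(number=0, timestamp=0)
print(can_produce_l2_block(stuck_origin, 5 * 60))   # 5 minutes of drift
print(can_produce_l2_block(stuck_origin, 11 * 60))  # 11 minutes of drift
```

If the origin cannot be refreshed, the drift grows with every new L2 block until the check fails, which is why the stall began roughly ten minutes after the L1 nodes went down rather than immediately.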

Mitigation

The primary mitigation involved redirecting our sequencer and verifier nodes to alternative L1 nodes:

  • We updated the sequencer with a functioning L1 RPC, restarted op-node, and resumed block sequencing.
  • We also restarted posting batches and proposing state roots to the L1.
  • We then focused on stabilizing our verifier nodes, which are crucial for disseminating new L2 blocks and for serving our public RPC endpoint, mainnet.base.org.

Forward Work

To prevent future failures, we are implementing measures to enhance resilience against L1 RPC disruptions:

  • Introducing a proxy layer ensuring constant availability of healthy L1 nodes to the L2 network.
  • Exploring decentralized sequencing options via the Superchain to eliminate single points of failure like the sequencer.
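
The core idea behind the proxy layer in the first item can be sketched as picking the first healthy L1 endpoint from an ordered list. This is a minimal illustration, not the design we are building: `pick_healthy_endpoint` and the endpoint names are hypothetical, and the health check is passed in as a function so the sketch stays independent of any particular RPC client.

```python
from typing import Callable, Iterable, Optional


def pick_healthy_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint whose health check passes, or None.

    A health check that raises is treated the same as one that
    returns False, so one broken node cannot block the fallback.
    """
    for url in endpoints:
        try:
            if is_healthy(url):
                return url
        except Exception:
            # Assumption: any error means the node is unusable right now.
            continue
    return None


# Hypothetical endpoints; the first one simulates a node that is out of disk space.
L1_ENDPOINTS = ["http://l1-primary.example", "http://l1-backup.example"]


def fake_health_check(url: str) -> bool:
    """Stand-in for a real probe such as fetching the latest L1 block."""
    if url == "http://l1-primary.example":
        raise ConnectionError("node unavailable")
    return True


print(pick_healthy_endpoint(L1_ENDPOINTS, fake_health_check))
```

A real proxy would also need to check that a node is synced and not merely reachable, since a node serving stale L1 blocks would trigger the same drift condition.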

While the sequencer is currently not essential for user interactions (transactions can be included via L1 messenger contracts), we acknowledge the significant impact of block stalls on user experience. Our commitment to decentralization and modular sequencing aims to mitigate such issues.

Base remains dedicated to bolstering the resilience and decentralization of its network in the months and years ahead.

https://base.mirror.xyz/CuMCttLigo7PqPeOU1rFFqTg05UVQ8PTjxbLssrBuhc