
OPERATIONAL GUIDE

How to Organize Rollup Incident Response

A structured framework for development teams to prepare for and manage security incidents, downtime, or critical bugs in rollup environments.

Effective rollup incident response begins with preparation, not reaction. Unlike monolithic blockchains, rollups introduce unique failure modes across the sequencer, data availability layer, bridge contracts, and proving system. Your first step is to define a clear incident severity matrix. Categorize events by impact: SEV-1 for total network halt or fund loss, SEV-2 for degraded performance or high-risk bugs, and SEV-3 for non-critical issues. Assign an on-call rotation from core engineering and DevOps, ensuring 24/7 coverage with defined escalation paths to key stakeholders and external auditors like OpenZeppelin.
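The severity matrix works best when paging and alerting tools share it as a single source of truth. Below is a minimal sketch in Python; the tier definitions, response windows, and escalation targets are illustrative assumptions rather than a standard.

```python
# Minimal severity matrix shared by alerting and paging tooling.
# Tier definitions and escalation targets are illustrative assumptions;
# adapt them to your own on-call structure.
from dataclasses import dataclass


@dataclass(frozen=True)
class Severity:
    name: str
    description: str
    max_response_minutes: int
    escalate_to: tuple[str, ...]


SEVERITY_MATRIX = {
    "SEV-1": Severity("SEV-1", "Total network halt or loss of funds", 15,
                      ("core-engineering", "executive-team", "external-auditors")),
    "SEV-2": Severity("SEV-2", "Degraded performance or high-risk bug", 60,
                      ("on-call-sre", "protocol-leads")),
    "SEV-3": Severity("SEV-3", "Non-critical issue", 240,
                      ("engineering-triage",)),
}


def classify(network_halted: bool, funds_at_risk: bool, degraded: bool) -> Severity:
    """Map observed impact to a severity tier."""
    if network_halted or funds_at_risk:
        return SEVERITY_MATRIX["SEV-1"]
    if degraded:
        return SEVERITY_MATRIX["SEV-2"]
    return SEVERITY_MATRIX["SEV-3"]
```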

Establish a dedicated, private war room channel (e.g., in Slack or Discord) for immediate coordination. This channel should be pre-configured with critical alerts from monitoring tools. Essential monitoring targets include: sequencer health (block production, RPC latency), data availability submission success rates, L1 bridge contract event logs for suspicious withdrawals, and prover status for validity or fraud proofs. Tools like Tenderly, Blocknative, or custom alerting via Prometheus/Grafana are commonly used. Automate alerts to trigger based on predefined thresholds, such as sequencer downtime exceeding 5 minutes.
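The downtime threshold above can be enforced by a small watchdog. The sketch below polls eth_blockNumber over plain JSON-RPC and posts to a Slack incoming webhook; the RPC URL, webhook URL, and 30-second poll interval are placeholder assumptions.

```python
# Watchdog sketch: alert the war room if the sequencer stops producing blocks
# for longer than a threshold. The RPC URL, webhook URL, and poll interval
# are assumptions -- tune them to your own deployment.
import time
import requests

SEQUENCER_RPC = "https://rollup-rpc.example.com"        # placeholder endpoint
SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX"  # placeholder incoming webhook
STALL_THRESHOLD_S = 5 * 60

def latest_block() -> int:
    resp = requests.post(
        SEQUENCER_RPC,
        json={"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1},
        timeout=10,
    )
    resp.raise_for_status()
    return int(resp.json()["result"], 16)

def alert(message: str) -> None:
    requests.post(SLACK_WEBHOOK, json={"text": message}, timeout=10)

def watch() -> None:
    last_height = latest_block()
    last_change = time.monotonic()
    while True:
        time.sleep(30)
        try:
            height = latest_block()
        except requests.RequestException as exc:
            alert(f":rotating_light: Sequencer RPC unreachable: {exc}")
            continue
        if height > last_height:
            last_height, last_change = height, time.monotonic()
        elif time.monotonic() - last_change > STALL_THRESHOLD_S:
            alert(f":rotating_light: No new L2 block for over 5 minutes (stuck at {last_height})")

if __name__ == "__main__":
    watch()
```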

When an incident is declared, the first responder's role is to diagnose and contain. Immediately gather logs and metrics to identify the incident's scope: Is it isolated to the sequencer, or is the L1 bridge affected? For a halted sequencer, the response may involve failover to a backup or manual transaction ordering. If a bug is suspected in a smart contract, the priority is to pause the vulnerable module—most rollup bridges include pause functions controlled by a multisig or timelock. Document every action taken in a shared log, as this will be crucial for post-mortem analysis and communicating with users.
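Because pause functions usually sit behind a multisig, the first responder often prepares the transaction payload rather than sending it directly. The sketch below encodes calldata for a hypothetical parameterless pause() function using web3.py; the bridge address and the function itself are assumptions about your contracts.

```python
# Prepare the calldata for an emergency pause() so it can be proposed through
# the team's multisig. Sketch only: the bridge address and the existence of a
# parameterless pause() function are assumptions about your own contracts.
from web3 import Web3

BRIDGE_PROXY = "0x1111111111111111111111111111111111111111"  # hypothetical bridge address

# Function selector = first 4 bytes of keccak256 of the function signature.
selector = bytes(Web3.keccak(text="pause()"))[:4]

proposal = {
    "to": BRIDGE_PROXY,
    "value": 0,
    "data": "0x" + selector.hex(),   # calldata for pause()
    "description": "Emergency pause of bridge withdrawals (SEV-1)",
}
print(proposal)
```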

Communication is critical. Prepare templated announcements for different severity levels. For a SEV-1 incident, immediately update the public status page and post a concise alert on social media (e.g., "We are investigating an issue with the sequencer. Transactions are temporarily paused."). Provide regular, honest updates even if the root cause is unknown. For developers, maintain a real-time feed in your public Discord or Telegram support channel. Transparency builds trust; obscuring the severity or ETA for resolution often leads to greater community backlash and speculation.
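Templates are easiest to reach for under pressure when they are stored next to the severity matrix. A minimal sketch follows; the wording and placeholders are illustrative, not prescribed copy.

```python
# Pre-written announcement templates keyed by severity.
# Wording is illustrative; adapt to your own voice and channels.
ANNOUNCEMENT_TEMPLATES = {
    "SEV-1": ("We are investigating an issue affecting {component}. "
              "Transactions are temporarily paused. Next update by {next_update} UTC."),
    "SEV-2": ("We are seeing degraded performance on {component}. "
              "We will post an update by {next_update} UTC."),
    "SEV-3": "We are aware of a minor issue with {component} and are working on a fix.",
}

def render_announcement(severity: str, component: str, next_update: str) -> str:
    return ANNOUNCEMENT_TEMPLATES[severity].format(
        component=component, next_update=next_update
    )

print(render_announcement("SEV-1", "the sequencer", "14:30"))
```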

After resolution, conduct a blameless post-mortem within 72 hours. The report should detail the timeline, root cause, impact metrics (downtime, affected users, financial impact), and, most importantly, actionable follow-up items. Examples include: "Add circuit breaker to sequencer batch logic," "Increase test coverage for edge-case withdrawal proofs," or "Implement more granular bridge pausing." Publish a summarized version of this post-mortem publicly. This practice, adopted by teams like Optimism and Arbitrum, demonstrates accountability and helps the entire ecosystem learn from the event.

ROLLUP OPERATIONS

Prerequisites for Effective Incident Response

A structured, pre-defined response plan is critical for minimizing downtime and financial loss during a rollup incident. This guide outlines the essential components to establish before an emergency occurs.

Effective incident response for a rollup begins with clear ownership and communication channels. Designate a primary on-call engineer and establish escalation paths to senior developers or protocol architects. Define communication protocols using tools like Discord war rooms, Telegram groups, or PagerDuty to ensure rapid, coordinated action. Crucially, maintain a public-facing status page (e.g., using Statuspage or a GitHub issue) to provide transparent, real-time updates to users and dApp developers, managing community expectations and reducing panic during an outage.

Technical preparedness requires comprehensive monitoring and alerting. Implement observability stacks that track core sequencer health metrics: batch_submission_latency, L1_gas_spike_detection, state_root_commitment_delay, and RPC_endpoint_availability. Set up alerts for deviations from baseline performance using tools like Prometheus, Grafana, or Datadog. Additionally, maintain ready access to sequencer private keys, multi-sig wallets, and upgrade contracts on the L1. These must be securely stored but instantly accessible to authorized responders to execute emergency pauses, upgrades, or fund recovery.
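The metrics listed above can be exposed for Prometheus scraping with the standard Python client. The sketch below registers gauges matching those names; the collection logic is stubbed and would be wired to your sequencer and L1 node APIs.

```python
# Expose the core sequencer health metrics named above to Prometheus.
# Sketch: the collector is a stub to be wired to your node and L1 APIs.
import random
import time

from prometheus_client import Gauge, start_http_server

batch_submission_latency = Gauge(
    "batch_submission_latency_seconds", "Time since the last batch was submitted to L1"
)
state_root_commitment_delay = Gauge(
    "state_root_commitment_delay_seconds", "Time since the last state root commitment"
)
rpc_endpoint_availability = Gauge(
    "rpc_endpoint_availability", "1 if the public RPC endpoint is responding, else 0"
)

def collect() -> None:
    # Stub values; replace with real measurements from your infrastructure.
    batch_submission_latency.set(random.uniform(10, 120))
    state_root_commitment_delay.set(random.uniform(60, 600))
    rpc_endpoint_availability.set(1)

if __name__ == "__main__":
    start_http_server(9100)  # scrape target for Prometheus
    while True:
        collect()
        time.sleep(15)
```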

Documentation is the backbone of any response. Create and regularly update a runbook with step-by-step procedures for common failure scenarios. This should include:

  • Sequencer halt and restart procedures
  • Forced transaction inclusion via L1
  • Contract upgrade and pausing mechanisms
  • Cross-chain bridge pause/unpause commands

Test these procedures regularly in a testnet or devnet environment that mirrors mainnet configurations. Familiarity with these steps under non-stressful conditions prevents critical mistakes during a live incident.
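A runbook stays testable when the scenario-to-steps mapping is kept in a machine-readable form that drills and tooling can consume. The sketch below is illustrative; the scenario names and steps are placeholders, not real procedures.

```python
# Runbook-as-code sketch: map failure scenarios to ordered response steps so
# drills and tooling can consume them. Scenarios and steps are placeholders.
RUNBOOK = {
    "sequencer_halt": [
        "Confirm halt: compare eth_blockNumber on sequencer vs. replica RPC",
        "Check sequencer process and host metrics (CPU, disk, peers)",
        "Attempt controlled restart; if unhealthy, fail over to standby sequencer",
        "Announce status per SEV-1 template and schedule the next update",
    ],
    "bridge_vulnerability": [
        "Encode pause() calldata and open a multisig proposal",
        "Page signers and collect signatures",
        "Verify paused state on L1 before the public announcement",
    ],
    "forced_inclusion_needed": [
        "Confirm the sequencer is halted or censoring beyond the inclusion window",
        "Submit the transaction through the L1 inbox/portal contract",
        "Verify inclusion on L2 once derivation catches up",
    ],
}

def print_runbook(scenario: str) -> None:
    for i, step in enumerate(RUNBOOK[scenario], start=1):
        print(f"{i}. {step}")

print_runbook("sequencer_halt")
```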

Finally, establish legal and operational protocols. Define decision-making authority for invoking emergency measures, which may involve a decentralized multisig or a pre-authorized committee. Understand the regulatory and contractual implications of pausing a network or rolling back state. Coordinate with key ecosystem partners—such as major bridges (LayerZero, Wormhole), oracles (Chainlink), and DEXs—to ensure their systems are aligned with your response actions, preventing fragmented liquidity or arbitrage attacks during the recovery process.

OPERATIONAL FRAMEWORK

Key Concepts in Rollup Incident Response

A structured incident response plan is critical for rollup operators to minimize downtime and protect user funds during a protocol failure.

Effective rollup incident response begins with preparation. This involves establishing a clear on-call rotation for core engineering and DevOps teams, documented communication channels (e.g., private war rooms, public status pages), and pre-defined severity classifications. For example, a Severity 1 (S1) incident might be a sequencer halt causing a total liveness failure, while an S2 could be a critical bug in a bridge contract. Teams should maintain a runbook with immediate diagnostic commands, such as checking sequencer health via an RPC endpoint (curl -X POST https://rollup-rpc.example.com -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"eth_blockNumber","id":1}').

When an incident is detected, the first phase is identification and declaration. The on-call engineer confirms the issue—such as stalled block production, a spike in failed transactions, or an exploit alert—and formally declares an incident within the designated channel. This triggers the assembly of the incident commander and relevant responders. Clear, time-stamped logs are essential. The team must quickly determine the incident's scope: Is it isolated to the sequencer, a data availability layer, a bridge, or the underlying smart contracts? This assessment dictates the next steps and communication strategy.

The core of the response is containment and mitigation. The goal is to stop the damage and restore basic functionality. For a sequencer failure, this may involve failing over to a backup node or, in extreme cases, coordinating a temporary pause via the upgrade multisig. If a vulnerability is found in a bridge, the team might need to temporarily disable deposits. All actions must be executed through the protocol's governance or emergency security council, following pre-authorized multisig procedures. During this phase, internal communication is paramount to avoid conflicting actions.

Parallel to technical mitigation is stakeholder communication. A status page should be updated with confirmed facts, impacted services, and estimated time to resolution. Transparent, timely updates build trust. For major incidents affecting user funds, a post-mortem must be published, detailing the root cause, response timeline, and corrective actions. This document, following a template like those from Google or GitLab, is a cornerstone of operational maturity and is often required by ecosystem partners and auditors.

Finally, the post-incident review is where long-term improvements are made. The team conducts a blameless retrospective to analyze the response: Were detection times adequate? Were runbooks followed? What new monitoring or circuit breakers could prevent recurrence? Findings are translated into concrete action items, such as deploying additional sequencer health checks, improving alert granularity, or updating contract pausing mechanisms. This cycle of preparation, response, and learning fortifies the rollup's resilience against future incidents.

INCIDENT RESPONSE

Common Rollup Incident Types

Understanding common failure modes in rollups is the first step to building an effective response plan. This guide categorizes the primary technical and economic incidents that can affect L2 networks.

SEVERITY CLASSIFICATION

Incident Severity and Response Matrix

A framework for classifying rollup incidents by impact and defining the required response protocol.

SEV-1: Critical
  • Impact: Total network halt, loss of funds, or critical consensus failure.
  • Example scenarios: Sequencer downtime > 2 hours, invalid state root commitment, bridge exploit.
  • Initial response time: < 15 minutes
  • Communication protocol: Public announcement + dedicated status page + all social channels.
  • Escalation path: Immediate escalation to core engineering and executive team.

SEV-2: High
  • Impact: Major service degradation, partial downtime, or significant performance issues.
  • Example scenarios: Sequencer lag > 30 blocks, RPC endpoint failure, >50% fee spike.
  • Initial response time: < 1 hour
  • Communication protocol: Public status page update + core community channels (Discord/TG).
  • Escalation path: Escalation to on-call SRE and protocol leads within 1 hour.

SEV-3: Medium
  • Impact: Non-critical bug, minor performance degradation, or UI/UX issues.
  • Example scenarios: Explorer displaying stale data, minor RPC latency, non-critical contract bug.
  • Initial response time: < 4 hours
  • Communication protocol: Internal tracking + notification in developer channels.
  • Escalation path: Assigned to engineering team for next-business-day resolution.

SEV-4: Low
  • Impact: Cosmetic issues, documentation errors, or feature requests.
  • Example scenarios: Typos on website, incorrect API docs, non-blocking UI bug.
  • Initial response time: Next business day
  • Communication protocol: Internal ticketing system (e.g., Jira, Linear).
  • Escalation path: Routed to appropriate product or developer relations team.

ROLLUP OPERATIONS

Step-by-Step Incident Response Process

A structured framework for organizing and executing a rapid, effective response to security incidents or critical failures on a rollup network.

A formalized incident response process is critical for rollup operators to minimize downtime, financial loss, and reputational damage. Unlike monolithic blockchains, rollups introduce unique failure modes involving sequencers, data availability layers, and bridge contracts. The primary goals are containment, eradication, and recovery. This process should be documented, rehearsed via tabletop exercises, and integrated with on-chain governance mechanisms for protocol-level upgrades if necessary. Key stakeholders include the core dev team, sequencer operators, validators, and a designated communications lead.

The first phase is Preparation and Detection. This involves establishing monitoring for key metrics: sequencer liveness, transaction inclusion latency, state root divergence, and bridge fund balances. Tools like Prometheus, Grafana, and custom health checks for the rollup node's RPC endpoints are essential. Detection can come from automated alerts, user reports on social channels, or security monitoring services like Forta. Immediately upon detection, the incident commander is activated and a private, dedicated communication channel (e.g., a war room in Slack or Discord) is established to coordinate the response away from public view.
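Bridge fund balances are among the cheapest signals to watch, since a sudden drop in the L1 escrow is often the first visible sign of an exploit. The sketch below polls eth_getBalance over raw JSON-RPC; the bridge address, L1 endpoint, and 5% drop threshold are assumptions, and ERC-20 escrows would need separate balanceOf checks.

```python
# Alert if the L1 bridge escrow's native balance drops sharply between polls.
# The bridge address, L1 RPC URL, and 5% threshold are assumptions; ERC-20
# escrows would need separate balanceOf checks.
import time
import requests

L1_RPC = "https://l1-rpc.example.com"                    # placeholder L1 endpoint
BRIDGE = "0x1111111111111111111111111111111111111111"    # hypothetical bridge escrow
DROP_THRESHOLD = 0.05                                     # alert on a >5% drop

def bridge_balance_wei() -> int:
    resp = requests.post(
        L1_RPC,
        json={"jsonrpc": "2.0", "method": "eth_getBalance",
              "params": [BRIDGE, "latest"], "id": 1},
        timeout=10,
    )
    resp.raise_for_status()
    return int(resp.json()["result"], 16)

def watch() -> None:
    baseline = bridge_balance_wei()
    while True:
        time.sleep(60)
        current = bridge_balance_wei()
        if baseline and (baseline - current) / baseline > DROP_THRESHOLD:
            print(f"ALERT: bridge balance dropped {baseline} -> {current} wei")
        baseline = max(baseline, current)  # naive baseline; legitimate outflows need allowances

if __name__ == "__main__":
    watch()
```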

Next is Containment and Analysis. The immediate action is to assess the scope: Is the sequencer halted? Is the bridge paused? Are funds at risk? Short-term containment may involve using administrative functions, like pausing the L1CrossDomainMessenger contract in Optimism-style rollups, to prevent further malicious transactions. Simultaneously, the team conducts forensic analysis. This includes examining sequencer logs, analyzing the faulty batch or state transition, and tracing the incident's root cause. The analysis must differentiate between a software bug, a malicious exploit, or a failure in the underlying data availability layer (like Celestia or EigenDA).

The Eradication and Recovery phase involves deploying a fix. For a sequencer bug, this may mean patching the node software, restarting with a corrected version, and ensuring it re-syncs correctly. For a smart contract vulnerability, a governance-approved upgrade is required. Recovery must be carefully orchestrated: the fixed sequencer begins producing new blocks, verifiers confirm the new chain is valid, and the bridge is unpaused. A crucial step is ensuring the recovered chain's state is consistent and that no double-spends or incorrect state transitions occurred during the incident window. This often requires manual verification of critical state hashes.
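Part of that manual verification can be automated by comparing the same block across independent nodes. The sketch below checks the stateRoot reported by the recovered sequencer against an independently synced verifier; both endpoints are placeholders.

```python
# Cross-check the recovered sequencer against an independent verifier node by
# comparing state roots for the same block. Both endpoints are placeholders.
import requests

SEQUENCER_RPC = "https://sequencer-rpc.example.com"   # placeholder
VERIFIER_RPC = "https://verifier-rpc.example.com"     # placeholder, independently synced

def state_root(rpc_url: str, block_number: int) -> str:
    resp = requests.post(
        rpc_url,
        json={"jsonrpc": "2.0", "method": "eth_getBlockByNumber",
              "params": [hex(block_number), False], "id": 1},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["result"]["stateRoot"]

def check(block_number: int) -> None:
    a = state_root(SEQUENCER_RPC, block_number)
    b = state_root(VERIFIER_RPC, block_number)
    if a == b:
        print(f"block {block_number}: state roots match ({a})")
    else:
        print(f"block {block_number}: DIVERGENCE sequencer={a} verifier={b}")

if __name__ == "__main__":
    check(1_234_567)  # example block inside the incident window
```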

Finally, conduct Post-Incident Review. Document a detailed timeline, root cause, impact assessment, and corrective actions. This report should be public (following a responsible disclosure period) to maintain transparency. Implement the corrective actions, which may include code audits, improved monitoring rules, or updates to the incident response plan itself. For example, after the 2022 Optimism outage due to a Geth dependency bug, the team published a post-mortem and improved their node health-check systems. This phase closes the loop, turning the incident into a learning opportunity to strengthen the rollup's resilience.

ROLLUP INCIDENT RESPONSE

Essential Monitoring and Tooling

A structured approach to detecting, analyzing, and resolving issues in rollup environments. This framework covers the tools and processes needed for effective incident management.

OPERATIONAL GUIDE

Building an Incident Communication Plan

A structured communication plan is critical for managing security incidents, protocol failures, and network outages in rollup environments. This guide outlines the key components and steps for an effective response.

An incident response plan for a rollup must address the unique technical and social coordination challenges of Layer 2 systems. Unlike monolithic chains, a rollup incident can involve the sequencer, data availability layer, bridge contracts, and the underlying L1 settlement layer. Your plan should define clear severity tiers (e.g., Sev-1: Total network halt, Sev-2: Partial degradation) and assign specific on-call responsibilities for engineering, DevOps, and community teams. The first step is immediate internal triage using monitoring tools like block explorers, sequencer health checks, and bridge fund tracking to confirm the incident's scope.

Internal communication must be swift and structured. Use a dedicated, private channel (e.g., a war room in Slack or Discord) for the core response team. The first message should follow a standard template: Incident Title, Severity, Start Time, Affected Systems (Sequencer/Prover/Bridge), and Initial Impact. Designate a single Incident Commander to coordinate technical mitigation and a Communications Lead to manage external messaging. This separation prevents conflicting information and allows engineers to focus on resolution. All actions and discoveries should be logged in a shared document for post-mortem analysis.
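The first war-room message is easier to get right when it is generated from a fixed template. The sketch below mirrors the fields named above; the example values are hypothetical.

```python
# Generate the first war-room message from a fixed template so no field is
# forgotten under pressure. Field names mirror the template described above.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class IncidentDeclaration:
    title: str
    severity: str                 # e.g. "Sev-1"
    affected_systems: list[str]   # e.g. ["sequencer", "bridge"]
    initial_impact: str
    start_time: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def war_room_message(self) -> str:
        return (
            f"[{self.severity}] {self.title}\n"
            f"Start time: {self.start_time.isoformat(timespec='seconds')}\n"
            f"Affected systems: {', '.join(self.affected_systems)}\n"
            f"Initial impact: {self.initial_impact}\n"
            f"Incident Commander / Communications Lead: TBD"
        )

# Hypothetical example values for illustration only.
print(IncidentDeclaration(
    title="Sequencer stopped producing blocks",
    severity="Sev-1",
    affected_systems=["sequencer"],
    initial_impact="No new L2 blocks for 10 minutes; deposits delayed",
).war_room_message())
```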

External communication requires transparency balanced with precision. For a Sev-1 incident, immediately post a notice on the project's official status page and main social channel (e.g., Discord announcement or Twitter/X). The initial message should acknowledge the issue, state what is being investigated, and indicate when the next update will be provided—even if the root cause is unknown. Avoid technical speculation. Subsequent updates should follow at regular intervals (e.g., every 30-60 minutes) to maintain trust. For developers, use dedicated channels like a Telegram/Signal group for ecosystem partners and major dApp integrators.

The resolution and post-mortem phase is crucial for long-term resilience. Once the incident is mitigated, publish a preliminary "All Clear" notice, followed by a commitment to a detailed post-mortem report within a defined timeframe (typically 3-7 days). The public report should include: a timeline of events, root cause analysis, impact assessment (e.g., funds at risk, downtime duration), and a list of concrete corrective actions. For example, after a sequencer outage, actions might include implementing multi-sequencer failover or improving L1 gas price monitoring. This transparency is essential for maintaining validator, developer, and user trust in the rollup's operational integrity.

ROLLUP OPERATIONS

Frequently Asked Questions

Common questions and solutions for managing incidents in rollup environments, from sequencer downtime to state recovery.

What should we do first when the sequencer goes down?

Immediately verify the scope and cause. Check the sequencer's health endpoint (e.g., http://sequencer:7300/health), review logs for errors, and confirm whether the issue is isolated or network-wide; a minimal triage sketch follows the checklist below. Your primary goal is to prevent a state divergence between L1 and L2.

  1. Activate Monitoring Alerts: Ensure your team is notified via PagerDuty, Slack, or OpsGenie.
  2. Public Communication: Update your status page (e.g., using Statuspage.io) and post to social channels (Twitter, Discord) to inform users of degraded service.
  3. Failover Assessment: Determine if you have a hot standby sequencer ready to take over, or if you need to initiate manual recovery procedures.
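A quick way to answer the "isolated or network-wide" question is to compare the sequencer's health endpoint against an independently operated replica in one script. The health endpoint below is the one from the answer above; the replica URL is a placeholder.

```python
# Quick triage sketch: is the problem isolated to the sequencer, or network-wide?
# The health endpoint comes from the answer above; the replica URL is a placeholder.
import requests

SEQUENCER_HEALTH = "http://sequencer:7300/health"
REPLICA_RPC = "https://replica-rpc.example.com"   # placeholder, independently operated

def sequencer_healthy() -> bool:
    try:
        return requests.get(SEQUENCER_HEALTH, timeout=5).status_code == 200
    except requests.RequestException:
        return False

def replica_block() -> int | None:
    try:
        resp = requests.post(
            REPLICA_RPC,
            json={"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1},
            timeout=5,
        )
        return int(resp.json()["result"], 16)
    except (requests.RequestException, KeyError, ValueError):
        return None

if __name__ == "__main__":
    seq_ok, replica = sequencer_healthy(), replica_block()
    if not seq_ok and replica is not None:
        print("Sequencer unhealthy but replica serving blocks: issue looks isolated.")
    elif not seq_ok:
        print("Sequencer and replica both failing: likely network-wide incident.")
    else:
        print("Sequencer health endpoint OK; investigate other layers.")
```

If the replica is also failing, treat the event as network-wide and escalate according to the severity matrix above.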
INCIDENT RESPONSE

Conclusion and Next Steps

A structured incident response plan is not a one-time document but a living framework that must be tested and refined. This final section outlines how to operationalize your plan and where to focus your ongoing security efforts.

Implementing your rollup incident response plan requires moving from theory to practice. Start by conducting a tabletop exercise with your core team. Simulate a scenario like a sequencer failure or a critical vulnerability in a bridge contract. Walk through each step of your plan: initial detection via your monitoring dashboard, internal communication via your designated Slack channel or PagerDuty, and the execution of your documented mitigation procedures. This dry run will expose gaps in your processes, such as unclear role assignments or missing escalation contacts, before a real crisis occurs.

Continuous improvement is critical. After any exercise or real incident, hold a formal post-mortem analysis. Document the timeline, root cause, impact, and most importantly, the action items for process improvement. Tools like Jira or Linear can track these tasks. Publicly sharing a sanitized version of this analysis, as teams like Optimism and Arbitrum have done, builds community trust and contributes to ecosystem-wide security knowledge. Regularly update your contact lists, runbook procedures, and tool configurations to reflect new team members, upgraded infrastructure, and lessons learned.

Your technical foundation must evolve alongside your processes. Invest in robust monitoring that goes beyond basic uptime. Implement canary deployments for sequencer upgrades and critical smart contracts to detect issues in a controlled environment. Utilize fraud proof or validity proof alerting to detect invalid state transitions. Consider engaging with professional audit firms for regular security reviews and bug bounty platforms like Immunefi to crowdsource vulnerability discovery. The security landscape for rollups is dynamic; staying ahead requires proactive investment in both human processes and automated defenses.

Finally, engage with the broader ecosystem. Participate in security forums and working groups within your rollup stack's community, such as the OP Stack Security Council or the Arbitrum DAO. Sharing threat intelligence and coordinating on cross-chain vulnerability disclosures makes the entire network more resilient. The next step is to begin drafting your plan, starting with the incident severity matrix and communication tree outlined in this guide. For further reading, consult resources like the L2BEAT Risk Framework and the Ethereum Rollup Security Checklist.