Simulation is a local fiction. Tools like Pimlico's Bundler and Alchemy's Simulation API execute a transaction in an isolated sandbox, assuming the blockchain state is static. This ignores the competitive mempool where pending transactions from users and MEV bots constantly alter the execution environment before your transaction lands.
Why Smart Account Simulation Tools Are Fundamentally Flawed
A technical breakdown of the inherent computational intractability in simulating the full ERC-4337 stack (bundler, paymaster, mempool), explaining why current pre-execution estimates for gas and success are unreliable for developers.
Introduction
Current smart account simulation tools fail to model the critical, state-dependent interactions that define real-world user behavior.
User intent is a multi-step dance. A real transaction, like a UniswapX fill routed through 1inch Fusion, involves a series of dependent actions across protocols. Simulation treats these as atomic, missing the state transitions that cause failures when external conditions shift between steps.
The evidence is in failed transactions. On networks like Arbitrum and Base, over 15% of user-initiated smart account transactions revert in production after passing simulation. This gap represents a fundamental architectural flaw, not an edge case.
The Core Flaw: Intractable State
Smart account simulation tools fail because they cannot reliably predict the future state of a permissionless network.
Simulation is not execution. Tools like ERC-4337 bundlers and Pimlico's APIs simulate transactions in a local sandbox. This sandbox is a snapshot of the past, not a live view of the chain. A simulated success does not guarantee execution success.
State is a moving target. Between simulation and the final block, a frontrunner's transaction or a simple Uniswap swap can alter the global state. The account's nonce or balance becomes invalid, causing the entire user operation to revert.
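The gap between a snapshot and the live chain can be shown with a toy model. This is a minimal sketch, not how any real bundler or simulator is implemented: `Transfer`, `apply`, and `simulate` are hypothetical names, and the "chain" is just a dictionary of balances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str
    recipient: str
    amount: int


def apply(balances: dict[str, int], tx: Transfer) -> bool:
    """Apply a transfer in place; return False (a revert) if funds are short."""
    if balances.get(tx.sender, 0) < tx.amount:
        return False
    balances[tx.sender] -= tx.amount
    balances[tx.recipient] = balances.get(tx.recipient, 0) + tx.amount
    return True


def simulate(balances: dict[str, int], tx: Transfer) -> bool:
    """Run the transfer against a copy, the way a sandbox uses a snapshot."""
    return apply(dict(balances), tx)


chain = {"account": 100, "dex": 0, "other": 0}
user_op = Transfer("account", "dex", 80)

# The snapshot says the operation will succeed.
simulated_ok = simulate(chain, user_op)

# Hypothetical earlier operation (for example, from another session key on the
# same account) that lands first and spends part of the same balance.
apply(chain, Transfer("account", "other", 50))

# The live state has moved on, so the same operation now reverts.
executed_ok = apply(chain, user_op)
print(simulated_ok, executed_ok)
```

The simulation was correct for the state it saw. The failure comes entirely from what happened after the snapshot was taken.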
The mempool is adversarial. Public mempools, where 4337 userOps are broadcast, are not private channels. MEV searchers and arbitrage bots constantly scan for profitable opportunities, creating a race condition that no local simulation can model.
Evidence: The Ethereum mainnet produces a new block every ~12 seconds. In that window, hundreds of transactions, many of them touching protocols like Aave and Compound, change the on-chain environment. A simulation is a guess about a system in perpetual motion.
The Simulation Stack's Moving Parts
Current smart account simulation tools are built on a flawed, fragmented foundation that fails to model the real execution environment.
The Static State Fallacy
Simulators treat the blockchain as a static database, ignoring the live mempool. This misses critical race conditions and MEV attacks that occur between simulation and execution.
- Blind to Mempool: Cannot see competing transactions from users or bots.
- Unpredictable Gas: Simulated gas costs diverge from real-time network congestion.
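One part of the gas divergence can be bounded. Under EIP-1559 the base fee rises by at most one eighth (12.5%) per block, so a worst-case base fee a few blocks out can be computed. The sketch below assumes consecutive completely full blocks and ignores priority fees; the function names are illustrative.

```python
def worst_case_base_fee(base_fee_wei: int, blocks_ahead: int) -> int:
    """Upper bound on the base fee after `blocks_ahead` consecutive full blocks.

    Mirrors the EIP-1559 integer update for a block that uses twice the gas
    target: the fee grows by max(fee // 8, 1) per block.
    """
    fee = base_fee_wei
    for _ in range(blocks_ahead):
        fee += max(fee // 8, 1)
    return fee


def fee_cap_is_safe(max_fee_per_gas: int, base_fee_wei: int, blocks_ahead: int) -> bool:
    """True if the signed fee cap still covers the worst-case base fee."""
    return max_fee_per_gas >= worst_case_base_fee(base_fee_wei, blocks_ahead)


GWEI = 10**9
simulated_base_fee = 20 * GWEI
signed_max_fee = 25 * GWEI

print(worst_case_base_fee(simulated_base_fee, 1))  # 22500000000
print(worst_case_base_fee(simulated_base_fee, 2))  # 25312500000
print(fee_cap_is_safe(signed_max_fee, simulated_base_fee, 2))  # False
```

A fee cap with 25% headroom over the simulated base fee is only guaranteed to hold for one block in this worst case. This covers the base fee alone; priority fee competition is not bounded by any such rule.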
The Silos of Pimlico & Alchemy
Major providers like Pimlico and Alchemy operate isolated simulation environments. Their models are not interoperable, forcing developers to choose one flawed abstraction.
- Vendor Lock-in: Each stack has unique blind spots and assumptions.
- No Shared Truth: A tx passing on one simulator can fail on another, creating inconsistent user experiences.
Missing Cross-Chain Context
Simulation is chain-bound, but user intents are not. A UniswapX order or LayerZero message involves multiple state transitions across domains. Current tools see only one link in the chain.
- Bridge Blindness: Cannot simulate the full flow of an intent-based bridge like Across.
- Settlement Risk: Ignores latency and failure modes in cross-chain messaging layers.
The Oracle Problem, Reborn
Simulators rely on external data oracles for prices and states, but these are updated at discrete intervals. This creates a window where simulated swaps on CowSwap or liquidations are economically incorrect.
- Stale Price Risk: A 1-second lag can make a profitable trade insolvent.
- Centralized Point of Failure: Most oracle feeds are not decentralized at the data layer.
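The stale price risk is easy to quantify for a swap with a slippage tolerance. The sketch below is a simplified model under stated assumptions: prices are integer ratios, there are no fees or price impact, and `min_out` and `fills` are hypothetical helper names rather than any protocol's API.

```python
def min_out(amount_in: int, price_num: int, price_den: int, slippage_bps: int) -> int:
    """Minimum acceptable output for a quoted price and a tolerance in basis points."""
    quoted = amount_in * price_num // price_den
    return quoted * (10_000 - slippage_bps) // 10_000


def fills(amount_in: int, exec_price_num: int, exec_price_den: int, minimum: int) -> bool:
    """True if executing at the live price still meets the signed minimum."""
    return amount_in * exec_price_num // exec_price_den >= minimum


amount_in = 1_000
# Price observed at simulation time: 2000 output units per input unit, 0.5% tolerance.
minimum = min_out(amount_in, 2_000, 1, slippage_bps=50)

print(minimum)                             # 1990000
print(fills(amount_in, 1_995, 1, minimum))  # True: price moved 0.25%
print(fills(amount_in, 1_985, 1, minimum))  # False: price moved 0.75%
```

The simulation only certifies the swap at the price it observed. Whether the signed minimum still holds depends on how far the price moves before inclusion, which the simulator cannot know.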
Simulation Failure Modes: EOA vs. ERC-4337
Comparison of transaction simulation reliability for Externally Owned Accounts (EOAs) versus ERC-4337 Smart Accounts, highlighting fundamental limitations of tools like Tenderly and OpenZeppelin Defender.
| Simulation Dimension | Legacy EOA | ERC-4337 Smart Account | Implication for Security |
|---|---|---|---|
| State Dependency Scope | Single-chain, single contract | Multi-chain, multi-contract (Bundler, Paymaster, EntryPoint) | Simulation must model entire cross-contract system state. |
| Gas Estimation Accuracy | | <70% for Paymaster-sponsored ops | UserOps fail at bundler due to incorrect pre-funding. |
| Fee Asset Simulation | Native gas token only | Any ERC-20 via Paymaster | Tools cannot simulate off-chain Paymaster oracle prices. |
| Replay Attack Surface | Chain ID & nonce only | Per EntryPoint, per chain, user-controlled nonce | Invalid simulation if nonce management strategy is unknown. |
| Pre-signature Validation | ECDSA sig verification | Custom Smart Account logic (e.g., multisig, session keys) | Cannot simulate arbitrary |
| Atomic Batch Execution | Not applicable (single tx) | Multiple UserOps in a single handleOps bundle | Simulating one UserOp ignores side-effects on others in the bundle. |
| Time-Based Conditions | Block timestamp only | Deadlines in UserOp, Paymaster validity windows | Simulation snapshot time often mismatches execution time. |
| Bundler Incentive Modeling | Not applicable | Priority fee auctions & MEV extraction | Simulation ignores bundler's economic optimization, changing tx ordering. |
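The Time-Based Conditions row can be made concrete. In ERC-4337, account and paymaster validation return a packed `validationData` value that carries a validity window. The sketch below follows the layout in the reference implementation as I understand it (low 160 bits for the authorizer or failure flag, then a 48-bit `validUntil`, then a 48-bit `validAfter`). Treat the exact boundary comparisons as an assumption and check them against the EntryPoint version you target.

```python
UINT48_MAX = (1 << 48) - 1
ADDRESS_MASK = (1 << 160) - 1


def pack_validation_data(sig_failed: bool, valid_until: int, valid_after: int) -> int:
    """Pack the failure flag and the validity window into one 256-bit integer."""
    return int(sig_failed) | (valid_until << 160) | (valid_after << 208)


def unpack_validation_data(data: int) -> tuple[int, int, int]:
    """Return (authorizer, valid_until, valid_after); valid_until == 0 means no expiry."""
    authorizer = data & ADDRESS_MASK
    valid_until = (data >> 160) & UINT48_MAX
    valid_after = (data >> 208) & UINT48_MAX
    if valid_until == 0:
        valid_until = UINT48_MAX
    return authorizer, valid_until, valid_after


def in_time_range(data: int, block_timestamp: int) -> bool:
    """True if the timestamp falls inside the packed validity window."""
    _, valid_until, valid_after = unpack_validation_data(data)
    return valid_after <= block_timestamp <= valid_until


# Hypothetical paymaster approval valid for a 30-second window.
data = pack_validation_data(sig_failed=False, valid_until=1_030, valid_after=1_000)

print(in_time_range(data, 1_020))  # True: timestamp used by the simulation
print(in_time_range(data, 1_044))  # False: two 12-second blocks later
```

The same signed operation is valid at the simulated timestamp and expired at the executed one, with no change to contract state at all.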
The Bundler Mempool is a Black Box
Current simulation tools fail because they cannot see the private, competitive order flow within the bundler network.
Simulation is fundamentally incomplete. Tools like Alchemy's simulateAssetChanges or Tenderly only model the EVM state transition. They cannot see the bundler mempool, where user operations are ordered and compete for inclusion.
The critical state is off-chain. A user operation's success depends on how bundlers like Etherspot or Stackup order it relative to competing operations, including any frontrunning or MEV extraction along the way. Your simulation passes, but a rival's higher-fee transaction lands first, invalidating your state assumptions.
This creates non-deterministic failures. Your smart account transaction reverts not from a logic error, but from mempool competition you cannot simulate. This is the core flaw of ERC-4337's current infrastructure layer.
Evidence: In a test, two identical user operations were sent to different bundlers. The first succeeded; the second failed due to a state change caused by the first. Standard simulation tools showed both would succeed.
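The scenario described above can be reproduced in a toy model that tracks only the account nonce. This is an illustrative sketch, not the test the article refers to: `Account`, `UserOp`, `validate`, and `execute` are hypothetical names, and real validation checks far more than the nonce.

```python
from dataclasses import dataclass


@dataclass
class Account:
    nonce: int = 0


@dataclass(frozen=True)
class UserOp:
    sender: str
    nonce: int


def validate(account: Account, op: UserOp) -> bool:
    """Read-only check, as a simulator would run it against a snapshot."""
    return op.nonce == account.nonce


def execute(account: Account, op: UserOp) -> bool:
    """On-chain execution: validate, then consume the nonce."""
    if not validate(account, op):
        return False
    account.nonce += 1
    return True


snapshot = Account(nonce=7)
op_a = UserOp(sender="0xSender", nonce=7)  # sent to bundler A
op_b = UserOp(sender="0xSender", nonce=7)  # identical copy sent to bundler B

# Each bundler simulates independently against the same snapshot.
simulated = [validate(snapshot, op) for op in (op_a, op_b)]

# On chain, the two operations execute one after the other.
live = Account(nonce=7)
executed = [execute(live, op) for op in (op_a, op_b)]

print(simulated)  # [True, True]
print(executed)   # [True, False]
```

Neither simulation is wrong in isolation. Each one is simply unaware of the other operation.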
The Optimist's Rebuttal (And Why It's Wrong)
Proponents of smart account simulation tools fundamentally misunderstand the adversarial environment of public blockchains.
Simulation is not execution. A successful simulation in a local sandbox like Foundry's forge or Tenderly does not guarantee on-chain success. The mempool is adversarial, where frontrunners and MEV bots exploit any delta between simulation and final state.
State is a moving target. Tools like Alchemy's simulateAssetChanges or OpenZeppelin Defender snapshot a single block. In reality, the lifecycle of an account abstraction transaction is not atomic: state can change between simulation, signing, and inclusion.
Intent architectures break simulation. Systems like UniswapX or CowSwap rely on solvers. The user's declarative intent is fulfilled off-chain, making pre-signature simulation of the final execution path impossible for the user.
Evidence: The ERC-4337 bundler market proves this. Bundlers run their own simulation and reject userOps that pass client-side checks but fail their stricter, real-time validation, creating a simulation reliability gap users cannot close.
Developer & User Risks from Broken Simulation
Smart account simulation tools fail to model the live execution environment, creating systemic risk for developers and users.
The State Synchronization Gap
Simulators query a stale RPC node state, not the pending mempool. This creates a race condition where a user's transaction fails because the simulated state (e.g., token balance, allowance) is already invalid by the time it's broadcast.
- Real-time state changes from other users are invisible.
- Front-running bots can deliberately invalidate assumptions before your tx lands.
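One partial mitigation is to record the state values a simulation relied on and re-read them immediately before submission. The sketch below assumes a caller-supplied `read_state` function (hypothetical, standing in for whatever RPC reads an application uses). It narrows the window but cannot close it, because state can still change after the final read.

```python
from typing import Callable, Mapping


def changed_since_simulation(
    recorded: Mapping[str, int],
    read_state: Callable[[str], int],
) -> list[str]:
    """Return the keys whose live value no longer matches the simulated value."""
    return [key for key, value in recorded.items() if read_state(key) != value]


# Values observed while simulating (hypothetical keys).
recorded = {"token.balanceOf(account)": 100, "token.allowance(account, router)": 50}

# Live state just before broadcast; the allowance has been revoked in between.
live_state = {"token.balanceOf(account)": 100, "token.allowance(account, router)": 0}

stale = changed_since_simulation(recorded, live_state.__getitem__)
if stale:
    print("re-simulate before sending; changed:", stale)
```

A non-empty result means the earlier simulation result should be discarded and the operation simulated again.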
The Gas Estimation Mirage
Simulated gas costs are a best-case estimate that ignores network congestion spikes and MEV searcher activity. Users sign transactions with approved gas limits that are insufficient during execution, leading to costly reverts.
- Priority fee auctions can 10x gas costs in seconds.
- Complex smart account logic (e.g., multi-call bundles) has unpredictable on-chain overhead.
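The amount an account or paymaster must be able to cover follows directly from the signed gas fields. The sketch below mirrors the prefund calculation in the v0.6 EntryPoint reference implementation as I recall it, where the verification gas limit is counted three times for paymaster-sponsored operations. Later EntryPoint versions use different fields, so treat the formula as an assumption to verify. `pad` is a hypothetical helper for adding headroom to an estimate.

```python
def required_prefund(
    call_gas_limit: int,
    verification_gas_limit: int,
    pre_verification_gas: int,
    max_fee_per_gas: int,
    has_paymaster: bool,
) -> int:
    """Maximum wei that must be available up front for one UserOperation."""
    multiplier = 3 if has_paymaster else 1
    required_gas = (
        call_gas_limit
        + verification_gas_limit * multiplier
        + pre_verification_gas
    )
    return required_gas * max_fee_per_gas


def pad(estimate: int, percent: int) -> int:
    """Add a percentage buffer on top of a simulated gas estimate."""
    return estimate * (100 + percent) // 100


GWEI = 10**9
limits = dict(
    call_gas_limit=pad(100_000, 20),  # 120000 after a 20% buffer
    verification_gas_limit=50_000,
    pre_verification_gas=21_000,
    max_fee_per_gas=10 * GWEI,
)

self_funded = required_prefund(**limits, has_paymaster=False)
sponsored = required_prefund(**limits, has_paymaster=True)
print(self_funded, sponsored)
```

Every buffer added to a limit raises the prefund requirement, so padding estimates trades one failure mode (out of gas) for another (insufficient balance or deposit).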
The Permission Oracle Problem
ERC-4337 paymasters and session keys create conditional sponsorship logic that simulators cannot evaluate. A transaction simulates successfully but fails because the off-chain paymaster service rejects it based on real-time risk or liquidity checks.
- Dynamic policies (e.g., dYdX trading limits) are opaque to simulation.
- Cross-chain intent solvers (like Across, Socket) have final approval authority.
The Bundler Black Box
UserOperations are not executed directly; they are packaged by a competitive bundler market. The chosen bundler's implementation (e.g., Stackup, Alchemy, Pimlico) can introduce subtle differences in validation or ordering that break simulation assumptions.
- Bundler-specific hooks and pre-checks vary.
- Mempools on different networks (Ethereum, alternative Layer 2s) apply different inclusion rules.
Beyond Simulation: The Path Forward
Current smart account simulation tools fail because they treat the blockchain as a closed system, ignoring the critical execution dependencies that determine real-world transaction success.
Simulation is inherently incomplete. It models a transaction in a vacuum, assuming a static blockchain state. In reality, MEV searchers and generalized intent solvers like UniswapX and CowSwap create a dynamic environment where your transaction's outcome depends on others' actions.
The off-chain dependency problem is fatal. A simulated swap succeeds, but the actual execution fails because the intent fulfillment path relied on a third-party solver's liquidity that vanished. This is the core failure mode for account abstraction (AA) wallets like Safe and Biconomy.
State is a moving target. Simulation tools from Tenderly or OpenZeppelin snapshot a block. They cannot model the cross-domain atomicity required for a bridge+swap on Across or LayerZero, where success on one chain depends on contingent execution on another.
The evidence is in failed transactions. Analysis of ERC-4337 UserOperation bundles shows a >15% divergence between simulated and on-chain outcomes for complex intents, primarily due to unmodeled oracle price updates and solver competition.
TL;DR for Protocol Architects
Current simulation tools for smart accounts (ERC-4337) create a false sense of security by failing to model the full execution environment.
The Static State Fallacy
Simulators treat the blockchain as a static database, ignoring the atomic competition of the public mempool. Your user's bundle can be front-run, sandwiched, or invalidated by a single block's state changes, rendering pre-execution results useless.
- Key Flaw: Assumes a vacuum, not a live auction.
- Real Impact: Gas auctions and MEV bots make simulated success a probabilistic guess.
Bundler Black Box
Simulation occurs in a client, but execution depends on the bundler's proprietary logic (e.g., Stackup, Alchemy, Biconomy). Their bundling strategies, fee logic, and transaction ordering are opaque and can cause your simulated transaction to fail or be dropped.
- Key Flaw: Decouples simulation from execution agent.
- Real Impact: Inconsistent results between testing and production, leading to stranded user ops.
Paymaster Liquidity Blindspot
Tools simulate token approvals and swaps but cannot guarantee the paymaster's (e.g., Pimlico, Biconomy) on-chain liquidity at execution time. A depleted sponsor wallet or volatile gas price can cause a bundle to revert, even with a 'successful' simulation.
- Key Flaw: Assumes infinite, static liquidity.
- Real Impact: User experience breaks at the final, most critical step—fee payment.
The Intent-Based Future
The solution is to bypass simulation's limitations entirely. Adopt an intent-centric architecture where users submit declarative goals (e.g., 'swap X for Y'). A solver network (like UniswapX, CowSwap) competes to fulfill it, abstracting away execution complexity and guaranteeing outcome.
- Key Shift: From simulating transactions to verifying fulfilled intents.
- Real Benefit: Deterministic user experience and native MEV protection.
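The shift from simulating transactions to verifying fulfilled intents can be sketched as a constraint check on the outcome. This is an illustration only: `SwapIntent`, `Fill`, and `fill_satisfies` are hypothetical types, not the UniswapX or CowSwap order formats, and a real system would enforce these constraints on-chain in a settlement contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwapIntent:
    sell_token: str
    buy_token: str
    sell_amount: int      # most the user is willing to give up
    min_buy_amount: int   # least the user is willing to receive
    deadline: int         # unix timestamp after which the intent is void


@dataclass(frozen=True)
class Fill:
    sold: int
    bought: int
    timestamp: int


def fill_satisfies(intent: SwapIntent, fill: Fill) -> bool:
    """Accept any execution path, as long as the declared outcome holds."""
    return (
        fill.sold <= intent.sell_amount
        and fill.bought >= intent.min_buy_amount
        and fill.timestamp <= intent.deadline
    )


intent = SwapIntent("X", "Y", sell_amount=1_000, min_buy_amount=1_990, deadline=1_700_000_060)

print(fill_satisfies(intent, Fill(sold=1_000, bought=1_995, timestamp=1_700_000_030)))  # True
print(fill_satisfies(intent, Fill(sold=1_000, bought=1_980, timestamp=1_700_000_030)))  # False
print(fill_satisfies(intent, Fill(sold=1_000, bought=1_995, timestamp=1_700_000_090)))  # False
```

The check does not care how the solver produced the fill. It only rejects outcomes that violate what the user signed.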