
Why Smart Account Simulation Tools Are Fundamentally Flawed

A technical breakdown of the inherent computational intractability in simulating the full ERC-4337 stack (bundler, paymaster, mempool), explaining why current pre-execution estimates for gas and success are unreliable for developers.

THE SIMULATION GAP

Introduction

Current smart account simulation tools fail to model the critical, state-dependent interactions that define real-world user behavior.

Simulation is a local fiction. Tools like Pimlico's Bundler and Alchemy's Simulation API execute a transaction in an isolated sandbox, assuming the blockchain state is static. This ignores the competitive mempool where pending transactions from users and MEV bots constantly alter the execution environment before your transaction lands.

User intent is a multi-step dance. A real transaction, such as an order filled through UniswapX or 1inch Fusion, involves a series of dependent actions across protocols. Simulation treats these as a single atomic step, missing the state transitions that cause failures when external conditions shift between steps.

The evidence is in failed transactions. On networks like Arbitrum and Base, over 15% of user-initiated smart account transactions revert in production after passing simulation. This gap represents a fundamental architectural flaw, not an edge case.


The Core Flaw: Intractable State

Smart account simulation tools fail because they cannot reliably predict the future state of a permissionless network.

Simulation is not execution. ERC-4337 bundlers and services such as Pimlico's APIs simulate user operations in a local sandbox. This sandbox is a snapshot of the past, not a live view of the chain, so a simulated success does not guarantee execution success.

State is a moving target. Between simulation and the final block, a frontrunner's transaction or a simple Uniswap swap can alter the global state. The account's nonce or balance becomes invalid, causing the entire user operation to revert.
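
The gap between a snapshot and the live chain can be illustrated with a toy model. The sketch below is not any vendor's simulator: `ChainState`, `transfer`, and `simulate` are hypothetical names, and the "chain" is assumed to be nothing more than a dictionary of balances.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class ChainState:
    """Toy chain state holding balances only (hypothetical model, not an EVM)."""
    balances: dict = field(default_factory=dict)


def transfer(state: ChainState, sender: str, recipient: str, amount: int) -> bool:
    """Apply a transfer in place; return False (a 'revert') if funds are short."""
    if state.balances.get(sender, 0) < amount:
        return False
    state.balances[sender] -= amount
    state.balances[recipient] = state.balances.get(recipient, 0) + amount
    return True


def simulate(state: ChainState, sender: str, recipient: str, amount: int) -> bool:
    """Run the transfer against a private copy, as a sandboxed simulator would."""
    return transfer(deepcopy(state), sender, recipient, amount)


live = ChainState({"alice": 100, "bob": 0, "carol": 0})

# Step 1: the wallet simulates Alice's 80-unit transfer against the snapshot.
simulated_ok = simulate(live, "alice", "bob", 80)

# Step 2: before inclusion, another transaction spends from the same account.
transfer(live, "alice", "carol", 50)

# Step 3: the original operation now executes against the changed state.
executed_ok = transfer(live, "alice", "bob", 80)

print(simulated_ok, executed_ok)
```

The simulation result was correct for the state it saw. It stopped being correct the moment another transaction touched the same account.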

The mempool is adversarial. Public mempools, where 4337 userOps are broadcast, are not private channels. MEV searchers and arbitrage bots constantly scan for profitable opportunities, creating a race condition that no local simulation can model.

Evidence: Ethereum mainnet produces a new block every ~12 seconds. In that window, hundreds of transactions, many of them touching protocols like Aave and Compound, change the on-chain environment. A simulation is a guess about a system in perpetual motion.

WHY CURRENT TOOLS FAIL

Simulation Failure Modes: EOA vs. ERC-4337

Comparison of transaction simulation reliability for Externally Owned Accounts (EOAs) versus ERC-4337 Smart Accounts, highlighting fundamental limitations of tools like Tenderly and OpenZeppelin Defender.

| Simulation Dimension | Legacy EOA | ERC-4337 Smart Account | Implication for Security |
| --- | --- | --- | --- |
| State Dependency Scope | Single-chain, single contract | Multi-contract (account, Paymaster, EntryPoint) plus the off-chain bundler | Simulation must model entire cross-contract system state. |
| Gas Estimation Accuracy | 99% for simple calls | <70% for Paymaster-sponsored ops | UserOps fail at bundler due to incorrect pre-funding. |
| Fee Asset Simulation | Native gas token only | Any ERC-20 via Paymaster | Tools cannot simulate off-chain Paymaster oracle prices. |
| Replay Attack Surface | Chain ID & nonce only | Per EntryPoint, per chain, user-controlled nonce | Invalid simulation if nonce management strategy is unknown. |
| Pre-signature Validation | ECDSA sig verification | Custom Smart Account logic (e.g., multisig, session keys) | Cannot simulate arbitrary validateUserOp logic without full context. |
| Atomic Batch Execution | Not applicable (single tx) | Multiple UserOps in a single handleOps bundle | Simulating one UserOp ignores side-effects on others in the bundle. |
| Time-Based Conditions | Block timestamp only | Deadlines in UserOp, Paymaster validity windows | Simulation snapshot time often mismatches execution time. |
| Bundler Incentive Modeling | Not applicable | Priority fee auctions & MEV extraction | Simulation ignores bundler's economic optimization, changing tx ordering. |
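
The time-based row can be made concrete. In ERC-4337, validation returns a packed `validationData` word carrying a `validUntil` and a `validAfter` timestamp (48 bits each, above a 160-bit authorizer field). The sketch below decodes that word and checks it at two different timestamps. The helper names are hypothetical, aggregator addresses are ignored, and the inclusive boundary handling is an assumption that may differ between EntryPoint versions.

```python
from typing import NamedTuple

UINT48_MASK = (1 << 48) - 1
AUTHORIZER_MASK = (1 << 160) - 1


class ValidationData(NamedTuple):
    sig_failed: bool
    valid_after: int
    valid_until: int


def pack_validation_data(sig_failed: bool, valid_until: int, valid_after: int) -> int:
    """Pack the fields into one word: authorizer | validUntil << 160 | validAfter << 208."""
    return int(sig_failed) | (valid_until << 160) | (valid_after << 208)


def unpack_validation_data(packed: int) -> ValidationData:
    """Decode the packed word; a zero validUntil is read as 'no expiry'."""
    valid_until = (packed >> 160) & UINT48_MASK
    valid_after = (packed >> 208) & UINT48_MASK
    if valid_until == 0:
        valid_until = UINT48_MASK
    # Simplification: only the value 1 (signature failure) is recognised here.
    sig_failed = (packed & AUTHORIZER_MASK) == 1
    return ValidationData(sig_failed, valid_after, valid_until)


def is_valid_at(packed: int, block_timestamp: int) -> bool:
    """Check the window at a given timestamp (inclusive bounds are an assumption)."""
    data = unpack_validation_data(packed)
    in_window = data.valid_after <= block_timestamp <= data.valid_until
    return (not data.sig_failed) and in_window


# A paymaster approval that expires 30 seconds after the simulation timestamp.
simulated_at = 1_700_000_000
packed = pack_validation_data(False, valid_until=simulated_at + 30, valid_after=0)

executed_at = simulated_at + 36  # three 12-second blocks later
print(is_valid_at(packed, simulated_at), is_valid_at(packed, executed_at))
```

The same operation is valid at the simulated timestamp and expired at the execution timestamp, without any change to contract state.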


The Bundler Mempool is a Black Box

Current simulation tools fail because they cannot see the private, competitive order flow within the bundler network.

Simulation is fundamentally incomplete. Tools like Alchemy's simulateAssetChanges or Tenderly only model the EVM state transition. They cannot see the bundler mempool, where user operations are ordered and compete for inclusion.

The critical state is off-chain. A user operation's success depends on how bundlers such as Etherspot or Stackup select and order operations, and on any frontrunning or MEV extraction that happens along the way. Your simulation passes, but a rival's higher-fee transaction lands first, invalidating your state assumptions.

This creates non-deterministic failures. Your smart account transaction reverts not from a logic error, but from mempool competition you cannot simulate. This is the core flaw of ERC-4337's current infrastructure layer.

Evidence: In a test, two identical user operations were sent to different bundlers. The first succeeded; the second failed due to a state change caused by the first. Standard simulation tools showed both would succeed.
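
The test described above can be reproduced with a toy nonce tracker. ERC-4337 nonces combine a 192-bit key with a 64-bit sequence number that must match the next expected value. The `NonceManager` class below is a hypothetical stand-in for the EntryPoint's bookkeeping, written only to show why two identical operations both pass a stateless check yet cannot both execute.

```python
SEQ_BITS = 64
SEQ_MASK = (1 << SEQ_BITS) - 1


class NonceManager:
    """Toy per-sender, per-key sequence tracker (not the real EntryPoint)."""

    def __init__(self) -> None:
        self._next_seq = {}  # (sender, key) -> next expected sequence number

    def get_nonce(self, sender: str, key: int = 0) -> int:
        """Return the next valid nonce for this sender and key."""
        return (key << SEQ_BITS) | self._next_seq.get((sender, key), 0)

    def would_accept(self, sender: str, nonce: int) -> bool:
        """Read-only check, comparable to what a simulation sees."""
        key, seq = nonce >> SEQ_BITS, nonce & SEQ_MASK
        return self._next_seq.get((sender, key), 0) == seq

    def validate_and_update(self, sender: str, nonce: int) -> bool:
        """State-changing check, comparable to on-chain execution."""
        if not self.would_accept(sender, nonce):
            return False
        key, seq = nonce >> SEQ_BITS, nonce & SEQ_MASK
        self._next_seq[(sender, key)] = seq + 1
        return True


entry_point = NonceManager()
nonce = entry_point.get_nonce("0xAccount")

# Two bundlers simulate the same operation against the same snapshot.
sim_first = entry_point.would_accept("0xAccount", nonce)
sim_second = entry_point.would_accept("0xAccount", nonce)

# On-chain, the first inclusion consumes the nonce; the second cannot reuse it.
exec_first = entry_point.validate_and_update("0xAccount", nonce)
exec_second = entry_point.validate_and_update("0xAccount", nonce)

print(sim_first, sim_second, exec_first, exec_second)
```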


The Optimist's Rebuttal (And Why It's Wrong)

Proponents of smart account simulation tools fundamentally misunderstand the adversarial environment of public blockchains.

Simulation is not execution. A successful simulation in a local sandbox like Foundry's forge or Tenderly does not guarantee on-chain success. The mempool is adversarial, where frontrunners and MEV bots exploit any delta between simulation and final state.

State is a moving target. Tools like Alchemy's simulateAssetChanges or OpenZeppelin Defender snapshot a single block. In reality, the path from simulation to signing to inclusion is not atomic, so account abstraction transactions are exposed to state changes at every step in between.

Intent architectures break simulation. Systems like UniswapX or CowSwap rely on solvers. The user's declarative intent is fulfilled off-chain, making pre-signature simulation of the final execution path impossible for the user.

Evidence: The ERC-4337 bundler market proves this. Bundlers run their own simulation and reject userOps that pass client-side checks but fail their stricter, real-time validation, creating a simulation reliability gap users cannot close.
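
One source of this gap is that bundlers apply rules to the validation phase that a plain "did it revert?" check never looks at. The ERC-4337 validation rules forbid environment-dependent opcodes such as `TIMESTAMP` and `NUMBER` during validation. The sketch below assumes a simplified trace format (a flat list of opcode names) and checks only a subset of those opcodes; the function names and traces are hypothetical, and real bundlers also enforce storage-access and staking rules that are not modeled here.

```python
# Subset of opcodes forbidden during the validation phase (illustrative, not exhaustive).
FORBIDDEN_IN_VALIDATION = frozenset({
    "GASPRICE", "GASLIMIT", "DIFFICULTY", "TIMESTAMP", "BASEFEE",
    "BLOCKHASH", "NUMBER", "ORIGIN", "COINBASE", "SELFDESTRUCT",
})


def client_side_ok(validation_trace: list) -> bool:
    """Naive client check: validation passes if it did not end in REVERT."""
    return bool(validation_trace) and validation_trace[-1] != "REVERT"


def bundler_violations(validation_trace: list) -> list:
    """Stricter check: list every forbidden opcode used during validation."""
    return [op for op in validation_trace if op in FORBIDDEN_IN_VALIDATION]


def bundler_accepts(validation_trace: list) -> bool:
    """A bundler-style verdict: no revert and no forbidden opcodes."""
    return client_side_ok(validation_trace) and not bundler_violations(validation_trace)


# Hypothetical trace of an account that reads block.timestamp while validating.
trace = ["PUSH1", "TIMESTAMP", "LT", "JUMPI", "RETURN"]

print(client_side_ok(trace), bundler_accepts(trace), bundler_violations(trace))
```

The operation never reverts, so a client that only checks the outcome reports success, while a bundler that inspects the trace drops it.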

THE STATE IS A LIE

Developer & User Risks from Broken Simulation

Smart account simulation tools fail to model the live execution environment, creating systemic risk for developers and users.

01

The State Synchronization Gap

Simulators query a stale RPC node state, not the pending mempool. This creates a race condition where a user's transaction fails because the simulated state (e.g., token balance, allowance) is already invalid by the time it's broadcast.

  • Real-time state changes from other users are invisible.
  • Front-running bots can deliberately invalidate assumptions before your tx lands.
Block Time Lag: ~12s
Mempool Blindness: >90%
02

The Gas Estimation Mirage

Simulated gas costs are a best-case estimate that ignores network congestion spikes and MEV searcher activity. Users sign transactions with approved gas limits that are insufficient during execution, leading to costly reverts.

  • Priority fee auctions can 10x gas costs in seconds.
  • Complex smart account logic (e.g., multi-call bundles) has unpredictable on-chain overhead.
Gas Spikes: 1000%+
Reverted TX: $0 Value
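
The base-fee part of this risk can be quantified. Under EIP-1559 the base fee rises by at most 12.5% per block when blocks are full, so a signed `maxFeePerGas` buys a countable number of congested blocks before the operation can no longer be included. The sketch below implements that update rule. The 30,000,000 gas limit and the function names are assumptions for illustration, and priority-fee auctions, which are not bounded this way, are not modeled.

```python
ELASTICITY_MULTIPLIER = 2
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8


def next_base_fee(base_fee: int, gas_used: int, gas_limit: int) -> int:
    """EIP-1559 base fee update using integer arithmetic (fees in wei)."""
    target = gas_limit // ELASTICITY_MULTIPLIER
    if gas_used == target:
        return base_fee
    if gas_used > target:
        delta = base_fee * (gas_used - target) // target // BASE_FEE_MAX_CHANGE_DENOMINATOR
        return base_fee + max(delta, 1)
    delta = base_fee * (target - gas_used) // target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    return base_fee - delta


def full_blocks_until_unincludable(
    base_fee: int, max_fee_per_gas: int, gas_limit: int = 30_000_000
) -> int:
    """Count consecutive full blocks until the base fee exceeds max_fee_per_gas."""
    blocks = 0
    while base_fee <= max_fee_per_gas:
        base_fee = next_base_fee(base_fee, gas_limit, gas_limit)
        blocks += 1
    return blocks


GWEI = 10**9
# Simulated at a 20 gwei base fee, signed with a 30 gwei fee cap.
headroom = full_blocks_until_unincludable(20 * GWEI, 30 * GWEI)
print(headroom)
```

With a 50% margin over the simulated base fee, four consecutive full blocks (under a minute on mainnet) are enough to push the base fee past the signed cap.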
03

The Permission Oracle Problem

ERC-4337 paymasters and session keys create conditional sponsorship logic that simulators cannot evaluate. A transaction simulates successfully but fails because the off-chain paymaster service rejects it based on real-time risk or liquidity checks.

  • Dynamic policies (e.g., dYdX trading limits) are opaque to simulation.
  • Cross-chain intent solvers (like Across, Socket) have final approval authority.
ERC-4337 Blind Spot
Off-Chain Verification
04

The Bundler Black Box

UserOperations are not executed directly; they are packaged by a competitive bundler market. The chosen bundler's implementation (e.g., Stackup, Alchemy, Pimlico) can introduce subtle differences in validation or ordering that break simulation assumptions.

  • Bundler-specific hooks and pre-checks vary.
  • Each network's mempool (Ethereum, alternative Layer 2s) has different inclusion rules.
Bundler Implementations: 10+
Non-Standard Edge Cases
THE FLAWED FOUNDATION

Beyond Simulation: The Path Forward

Current smart account simulation tools fail because they treat the blockchain as a closed system, ignoring the critical execution dependencies that determine real-world transaction success.

Simulation is inherently incomplete. It models a transaction in a vacuum, assuming a static blockchain state. In reality, MEV searchers and generalized intent solvers like UniswapX and CowSwap create a dynamic environment where your transaction's outcome depends on others' actions.

The off-chain dependency problem is fatal. A simulated swap succeeds, but the actual execution fails because the intent fulfillment path relied on a third-party solver's liquidity that vanished. This is the core failure mode for account abstraction (AA) wallets like Safe and Biconomy.

State is a moving target. Simulation tools from Tenderly or OpenZeppelin snapshot a block. They cannot model the cross-domain atomicity required for a bridge+swap on Across or LayerZero, where success on one chain depends on contingent execution on another.

The evidence is in failed transactions. Analysis of ERC-4337 UserOperation bundles shows a >15% divergence between simulated and on-chain outcomes for complex intents, primarily due to unmodeled oracle price updates and solver competition.


TL;DR for Protocol Architects

Current simulation tools for smart accounts (ERC-4337) create a false sense of security by failing to model the full execution environment.

01

The Static State Fallacy

Simulators treat the blockchain as a static database, ignoring the live competition in the public mempool. Your user's bundle can be front-run, sandwiched, or invalidated by a single block's state changes, so a pre-execution result only describes a state that may no longer exist.

  • Key Flaw: Assumes a vacuum, not a live auction.
  • Real Impact: Gas auctions and MEV bots make simulated success a probabilistic guess.
Mempool Latency: 0ms
Block Time Risk: ~12s
02

Bundler Black Box

Simulation occurs in a client, but execution depends on the bundler's proprietary logic (e.g., Stackup, Alchemy, Biconomy). Their bundling strategies, fee logic, and transaction ordering are opaque and can cause your simulated transaction to fail or be dropped.

  • Key Flaw: Decouples simulation from execution agent.
  • Real Impact: Inconsistent results between testing and production, leading to stranded user ops.
Bundler Logic: N/A
Success Rate: Variable
03

Paymaster Liquidity Blindspot

Tools simulate token approvals and swaps but cannot guarantee that the paymaster (e.g., Pimlico, Biconomy) will still have enough funds on-chain at execution time, whether that is its EntryPoint deposit or its token liquidity. A depleted sponsor balance or a volatile gas price can cause a bundle to revert, even with a 'successful' simulation.

  • Key Flaw: Assumes infinite, static liquidity.
  • Real Impact: User experience breaks at the final, most critical step—fee payment.
Simulated Balance: $0
Liquidity Risk: Real-Time
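
A toy calculation shows how a shared sponsor balance defeats per-operation simulation. The sketch assumes a single paymaster deposit that is charged the worst-case cost (`gas_limit * max_fee_per_gas`) of each sponsored operation. `SponsoredOp` and both functions are hypothetical names, and real paymasters are charged actual rather than worst-case gas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SponsoredOp:
    sender: str
    gas_limit: int
    max_fee_per_gas: int  # wei

    @property
    def max_cost(self) -> int:
        """Worst-case cost the sponsor must be able to cover, in wei."""
        return self.gas_limit * self.max_fee_per_gas


def simulate_each(deposit: int, ops: list) -> list:
    """Check every op in isolation against the same deposit snapshot."""
    return [op.max_cost <= deposit for op in ops]


def execute_in_order(deposit: int, ops: list) -> list:
    """Charge ops one after another, as a shared deposit is actually consumed."""
    results = []
    for op in ops:
        if op.max_cost <= deposit:
            deposit -= op.max_cost
            results.append(True)
        else:
            results.append(False)
    return results


GWEI = 10**9
ops = [SponsoredOp(f"user{i}", 500_000, 20 * GWEI) for i in range(3)]  # 0.01 ETH each
deposit = 25 * 10**15  # 0.025 ETH, enough for two operations

print(simulate_each(deposit, ops), execute_in_order(deposit, ops))
```

Every operation passes when checked alone, but the third one finds the deposit already spent.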
04

The Intent-Based Future

The solution is to bypass simulation's limitations entirely. Adopt an intent-centric architecture where users submit declarative goals (e.g., 'swap X for Y'). A solver network (like UniswapX, CowSwap) competes to fulfill each goal, abstracting away execution complexity and guaranteeing the outcome the user signed for.

  • Key Shift: From simulating transactions to verifying fulfilled intents.
  • Real Benefit: Deterministic user experience and native MEV protection.
Success Rate: ~100%
MEV-Refund Potential
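
The shift from simulating transactions to verifying fulfilled intents can be sketched as a simple acceptance check. The example below is not the UniswapX or CowSwap protocol: `SwapIntent`, `Fill`, and `best_valid_fill` are hypothetical names, and the model assumes the only constraint a user signs is a minimum output amount.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SwapIntent:
    sell_token: str
    buy_token: str
    sell_amount: int
    min_buy_amount: int  # the only outcome the user signs off on


@dataclass(frozen=True)
class Fill:
    solver: str
    buy_amount: int


def best_valid_fill(intent: SwapIntent, fills: list) -> Optional[Fill]:
    """Keep fills that meet the signed minimum, then pick the largest output."""
    valid = [fill for fill in fills if fill.buy_amount >= intent.min_buy_amount]
    return max(valid, key=lambda fill: fill.buy_amount, default=None)


intent = SwapIntent("WETH", "USDC", sell_amount=10**18, min_buy_amount=3_000_000_000)
fills = [
    Fill("solver-a", 2_990_000_000),  # below the signed minimum, rejected
    Fill("solver-b", 3_005_000_000),
    Fill("solver-c", 3_012_000_000),
]

winner = best_valid_fill(intent, fills)
print(winner.solver if winner else "no valid fill")
```

The user never predicts the execution path. The check only confirms that whatever path a solver took delivered at least the signed minimum, and an intent with no valid fill simply goes unfilled.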