How to Understand EVM Opcodes Conceptually

CORE CONCEPTS

Introduction to EVM Opcodes

A conceptual guide to the fundamental instructions that power the Ethereum Virtual Machine and smart contract execution.

The Ethereum Virtual Machine (EVM) is the global, sandboxed runtime environment that executes all smart contracts on Ethereum and compatible chains. At its heart, the EVM is a stack-based, quasi-Turing-complete machine. Its operation is defined by a set of low-level instructions called opcodes. Each opcode is a 1-byte value that represents a specific atomic operation, such as arithmetic, logical comparison, memory access, or blockchain state manipulation. When you deploy a Solidity contract, it is compiled down to EVM bytecode, which is essentially a sequence of these opcodes.

Conceptually, you can think of opcodes as the assembly language for the blockchain. While high-level languages like Solidity or Vyper provide developer-friendly abstractions, opcodes are what the EVM actually processes. The EVM executes opcodes sequentially within an execution context defined by the current transaction. It manages computation using three primary data areas: the stack (a last-in-first-out structure holding 256-bit words), memory (a volatile byte array), and storage (a persistent key-value store tied to the contract). Most opcodes interact with these areas.

Opcodes are categorized by their function. Key categories include:

Arithmetic & Logic: ADD, MUL, LT (less than), AND.
Stack Operations: PUSH1, POP, SWAP1.
Memory & Storage: MSTORE (write to memory), SLOAD (read from storage).
Control Flow: JUMP, JUMPI (jump if).
System Operations: CALL, CREATE, SELFDESTRUCT.

Each opcode consumes a specific amount of gas, which measures computational cost. For example, ADD costs 3 gas, while an SSTORE that sets a previously empty slot to a nonzero value costs 20,000 gas or more.

Understanding opcodes is crucial for advanced development and security auditing. It allows developers to:

Optimize gas usage by writing more efficient code that compiles to cheaper opcode sequences.
Debug complex issues by examining the low-level execution trace.
Conduct deeper security reviews to spot vulnerabilities like reentrancy or integer overflows at the bytecode level.

Tools like the EVM Playground let you experiment with opcodes interactively, while debuggers in Foundry or Hardhat can show you the opcode-level execution step-by-step.

Let's examine a simple example. The Solidity expression uint256 c = a + b; compiles down to a sequence resembling:

PUSH1 0x0a // Push value of 'a' onto stack
PUSH1 0x14 // Push value of 'b' onto stack
ADD        // Pop two items, add them, push result

The ADD opcode pops the top two 256-bit values from the stack, computes their sum modulo 2²⁵⁶, and pushes the result back onto the stack. This stack-based paradigm is fundamental to all EVM operations.
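
The three-instruction sequence above can be traced with a toy interpreter. The sketch below is only an illustration, not a real EVM: the function name run_add_example is hypothetical, and it supports just PUSH1 (0x60) and ADD (0x01).

```python
WORD_MOD = 2**256  # EVM words are 256 bits wide


def run_add_example(bytecode: bytes) -> list[int]:
    """Execute a tiny EVM subset: PUSH1 (0x60) and ADD (0x01)."""
    stack: list[int] = []
    pc = 0  # program counter: index of the next byte to execute
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == 0x60:  # PUSH1: the next byte is the immediate value
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op == 0x01:  # ADD: pop two words, push their sum mod 2^256
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % WORD_MOD)
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


# PUSH1 0x0a, PUSH1 0x14, ADD
print(run_add_example(bytes.fromhex("600a601401")))  # [30]
```

The bytes 60 0a 60 14 01 are the raw encoding of the listing: each PUSH1 is followed by its one-byte immediate, and ADD has no immediate.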

Mastering EVM opcodes moves you from simply writing contracts to understanding precisely how they execute on-chain. This knowledge is foundational for building highly optimized DeFi protocols, conducting professional smart contract audits, and contributing to EVM client development. The official Ethereum Yellow Paper provides the formal specification for all opcodes and their behavior.

FOUNDATIONAL CONCEPTS

Prerequisites for Understanding EVM Opcodes

Before diving into the specifics of individual EVM opcodes, you need a solid grasp of the underlying architecture and data structures that make them work.

The Ethereum Virtual Machine (EVM) is a stack-based, quasi-Turing-complete state machine. Understanding opcodes requires familiarity with its core components: the execution stack, memory, and storage. The stack is a last-in-first-out (LIFO) data structure with a maximum depth of 1024 items, where most opcodes pop their inputs and push their results. Memory is a volatile byte array used for temporary data during a transaction, while storage is a persistent key-value store tied to a contract's address. Grasping this separation is fundamental, as opcodes like MSTORE, SLOAD, and SSTORE manipulate these different contexts.

You must also understand the concept of gas. Every opcode has a predefined gas cost (e.g., ADD costs 3 gas, SSTORE can cost 20,000+ gas). Gas is the computational fee paid for execution, and the EVM halts if a transaction runs out. This economic model directly influences opcode usage and contract optimization. For example, writing to storage is expensive, so efficient contracts minimize SSTORE operations. The official Ethereum Yellow Paper Appendix G provides the canonical gas cost table.

A working knowledge of low-level data representation is crucial. The EVM operates on 256-bit words (32 bytes). Opcodes handle data in hexadecimal format and often treat addresses and values identically as 32-byte words. Understanding two's complement for signed integers, big-endian byte ordering, and bitwise operations (AND, OR, XOR, NOT, SHL, SHR) is essential. For instance, the CALLDATALOAD opcode loads 32 bytes from call data onto the stack, which you then must parse correctly using masking and shifting operations to extract function arguments.
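
As a rough illustration of the masking and shifting described above, the following sketch mimics CALLDATALOAD in plain Python. The helper names are hypothetical, and the calldata is a made-up example that uses the well-known transfer(address,uint256) selector 0xa9059cbb followed by a single 32-byte argument.

```python
def calldataload(calldata: bytes, offset: int) -> int:
    """Mimic CALLDATALOAD: read 32 bytes at offset (zero-padded), big-endian."""
    chunk = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(chunk, "big")


def to_signed(word: int) -> int:
    """Interpret a 256-bit word as a two's complement signed integer."""
    return word - 2**256 if word >= 2**255 else word


# Hypothetical calldata: 4-byte selector followed by one 32-byte argument
calldata = bytes.fromhex("a9059cbb") + (0x1234).to_bytes(32, "big")

first_word = calldataload(calldata, 0)
selector = first_word >> 224                  # like SHR: keep the top 4 bytes
address_mask = (1 << 160) - 1
arg_low_160 = calldataload(calldata, 4) & address_mask  # like AND: keep 20 bytes

print(hex(selector))            # 0xa9059cbb
print(hex(arg_low_160))         # 0x1234
print(to_signed(2**256 - 1))    # -1
```

The shift by 224 bits works because the selector occupies the most significant 4 bytes of a big-endian 32-byte word (256 - 32 = 224).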

Finally, you should be comfortable reading EVM assembly or bytecode. Opcodes are the 1-byte instructions that comprise this bytecode. Tools like the evm.codes Playground allow you to write low-level instructions and see the resulting stack, memory, and storage changes in real time. Start by disassembling simple Solidity functions (e.g., a pure adder) using solc --opcodes to see how high-level logic maps to sequences of fundamental opcodes like PUSH1, ADD, and RETURN. This concrete mapping bridges the conceptual and the practical.

CORE CONCEPTS OF THE EVM

How to Understand EVM Opcodes Conceptually

Ethereum Virtual Machine opcodes are the fundamental, low-level instructions that execute smart contracts. This guide explains what they are, how they work, and how to interpret them.

An EVM opcode is a single, atomic instruction for the Ethereum Virtual Machine, analogous to an assembly instruction for a physical CPU. Each opcode is represented by a one-byte hexadecimal value (e.g., 0x60 is PUSH1, 0x01 is ADD). When you compile a Solidity contract, the high-level code is transformed into a sequence of these opcodes, forming the contract's bytecode. The EVM is a stack-based machine, meaning most operations pop their inputs from and push their results onto a last-in, first-out (LIFO) data stack.

Opcodes can be categorized by their function. Stack manipulation opcodes like PUSH, POP, DUP, and SWAP manage the data stack. Arithmetic and logic opcodes (ADD, SUB, MUL, LT, GT, AND, OR) perform calculations. Control flow opcodes (JUMP, JUMPI, PC) direct execution, while storage and memory opcodes (SLOAD/SSTORE, MLOAD/MSTORE) handle data persistence. Environmental opcodes (CALLER, CALLVALUE, NUMBER) access blockchain context, and halt opcodes (STOP, RETURN, REVERT) end execution.

To see opcodes in action, examine a simple function. If c is a state variable, the Solidity statement c = a + b; compiles to bytecode that conceptually executes: PUSH1 <value_of_b>, PUSH1 <value_of_a>, ADD, PUSH1 <storage_slot_for_c>, SSTORE. The PUSH opcodes load the values onto the stack; ADD pops the top two items, adds them, and pushes the result. Finally, SSTORE pops the storage slot and then the result, and writes the result to that slot. (A local variable declared as uint256 c would stay on the stack and need no SSTORE.) Every smart contract interaction is a sequence of such steps, with each step consuming gas.
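
A PUSH1, ADD, SSTORE sequence like the one above can be recovered from raw bytes with a few lines of code. This is a minimal sketch with a deliberately tiny, hypothetical opcode table (OPCODES); a real disassembler would cover the full instruction set.

```python
# Hypothetical, partial opcode table for illustration only
OPCODES = {0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x55: "SSTORE", 0xF3: "RETURN"}


def disassemble(bytecode: bytes) -> list[str]:
    """Turn raw bytecode into mnemonics, one entry per instruction."""
    out: list[str] = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            n = op - 0x5F
            data = bytecode[pc + 1:pc + 1 + n]
            out.append(f"PUSH{n} 0x{data.hex()}")
            pc += 1 + n
        else:
            out.append(OPCODES.get(op, f"UNKNOWN(0x{op:02x})"))
            pc += 1
    return out


# b = 0x14, a = 0x0a, storage slot 0x00 (all values chosen for the example)
for line in disassemble(bytes.fromhex("6014600a01600055")):
    print(line)
```

The only subtlety is that PUSH instructions are followed by immediate data, so the decoder must skip those bytes instead of treating them as opcodes.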

Understanding opcodes is crucial for advanced development and security auditing. It allows you to:

Optimize gas costs by knowing which operations are expensive (e.g., SSTORE can cost 20,000 gas for a new value).
Decode complex transactions using tools like the EVM Playground or evm command-line disassemblers.
Write more efficient Yul or inline assembly, which provides direct access to opcodes.
Analyze contract security by identifying low-level patterns that could lead to vulnerabilities like reentrancy or integer overflows.

To practice, disassemble a simple contract. Compile a contract with solc --opcodes, or use Etherscan's 'Switch to Opcodes View' on a verified contract. Start by tracing the flow for a constructor or a simple view function. Compare the opcode sequence against the original Solidity to build intuition. Remember, the EVM has no registers; all temporary values live on the stack or in memory, which is a key conceptual difference from traditional processors.

CONCEPTUAL GUIDE

EVM Opcode Categories

EVM opcodes are the fundamental instructions that power smart contract execution. Understanding their categories is key to analyzing gas costs, security, and contract logic.

Arithmetic & Logic Operations

These opcodes perform mathematical and logical computations on the stack.

ADD, SUB, MUL, DIV, MOD for basic arithmetic.
LT, GT, EQ, ISZERO for comparisons.
AND, OR, XOR, NOT for bitwise logic.

These are the building blocks for all contract calculations and conditionals.

Stack, Memory, and Storage

Opcodes that manage the EVM's three primary data areas.

Stack: PUSH1, POP, SWAP1, DUP1 manipulate the 1024-item stack.
Memory: MSTORE, MLOAD handle volatile, expandable byte arrays.
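Storage: SSTORE, SLOAD interact with persistent contract state. Writes are the costly part: setting a previously empty slot to a nonzero value costs about 20,000 gas, while updating an already nonzero slot costs considerably less.

Understanding access costs here is critical for optimization.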

Control Flow & Halting

Opcodes that direct execution flow and terminate the program.

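JUMP, JUMPI enable unconditional and conditional branching, respectively. The destination must be a position in the bytecode marked with JUMPDEST.
PC gets the current program counter.
STOP, RETURN, REVERT, INVALID halt execution. REVERT undoes the state changes made in the current call and returns unused gas to the caller, whereas INVALID consumes all remaining gas; this makes REVERT essential for safe error handling.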

Environmental & Block Information

Opcodes that provide context about the transaction and blockchain state.

CALLER, ORIGIN, ADDRESS identify transaction participants.
CALLVALUE, GASPRICE access transaction details.
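NUMBER, TIMESTAMP, DIFFICULTY read from the current block header. Since the Merge, opcode 0x44 (formerly DIFFICULTY) is named PREVRANDAO and returns the beacon chain's randomness value instead.

These are used for access control and time-based logic.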

System Operations

High-level operations for contract interaction and creation.

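CALL, DELEGATECALL, STATICCALL, CREATE enable inter-contract communication and deployment. DELEGATECALL runs the target's code against the calling contract's storage while keeping the original msg.sender and msg.value, a common source of security vulnerabilities in proxy patterns.
SELFDESTRUCT sends the contract's remaining ether to a target address. Since EIP-6780 (Cancun upgrade), it deletes the contract's code and storage only when called in the same transaction that created the contract.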

Keccak256 & Cryptographic Ops

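Opcodes and precompiles for cryptographic hashing and signature recovery.

SHA3 (now called KECCAK256) computes the 256-bit hash of a memory region, essential for Merkle proofs and storage key derivation.
ADDRESS is not a cryptographic operation: it pushes the address of the currently executing contract. Deriving an address from a public key is done by hashing the key with KECCAK256 and keeping the low 20 bytes.
ECRECOVER is a precompiled contract at address 0x01 rather than an opcode. It recovers the signer's address from an ECDSA signature and is reached through CALL or STATICCALL; Solidity exposes it as the built-in ecrecover function.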

CORE OPERATIONS

Common EVM Opcodes and Their Functions

A reference for the most frequently used opcodes in the Ethereum Virtual Machine, categorized by function.

Opcode (Hex)	Mnemonic	Gas Cost	Description	Stack Input -> Output
0x00	STOP	0	Halts execution of the contract.	[] -> []
0x01	ADD	3	Addition modulo 2^256.	[a, b] -> [a + b]
0x02	MUL	5	Multiplication modulo 2^256.	[a, b] -> [a * b]
0x20	SHA3	30 + dynamic	Computes the Keccak-256 hash of a memory region.	[offset, size] -> [hash]
0x31	BALANCE	2600 (Cold) / 100 (Warm)	Gets the balance (in wei) of the given account.	[address] -> [balance]
0x35	CALLDATALOAD	3	Gets 32 bytes of input data starting at a byte offset.	[offset] -> [data]
0x51	MLOAD	3	Loads a 32-byte word from memory.	[offset] -> [value]
0x52	MSTORE	3	Stores a 32-byte word to memory.	[offset, value] -> []
0x54	SLOAD	2100 (Cold) / 100 (Warm)	Loads a 32-byte word from storage.	[key] -> [value]
0x55	SSTORE	Dynamic (20k+ for new)	Stores a 32-byte word to storage.	[key, value] -> []
0x56	JUMP	8	Alters the program counter to a new location.	[dest] -> []
0xf3	RETURN	0	Halts execution and returns data from memory.	[offset, size] -> []
0xfd	REVERT	0	Halts execution, reverts state, and returns data.	[offset, size] -> []

GAS AND THE EXECUTION MODEL

How to Understand EVM Opcodes Conceptually

EVM opcodes are the fundamental instructions that define all computation on Ethereum. This guide explains their role in gas metering and state transitions.

The Ethereum Virtual Machine (EVM) is a stack-based, quasi-Turing-complete state machine. Its behavior is defined by a set of 140+ low-level operations called opcodes. Each opcode, such as ADD (0x01) or SSTORE (0x55), performs a specific atomic action on the EVM's execution context—manipulating the stack, memory, or persistent storage. Conceptually, you can think of a smart contract's bytecode as a sequence of these opcodes, which the EVM fetches and executes one by one to transition the blockchain's global state.

Every opcode has a predefined gas cost, which measures the computational and storage resources required to execute it. Simple arithmetic opcodes like ADD cost 3 gas, while state-modifying operations like SSTORE can cost 20,000 gas or more for a new storage slot. This gas system prevents infinite loops and allocates block space efficiently. When you send a transaction, you set a gas limit; the EVM halts execution with an "out of gas" error if the cumulative cost of opcodes exceeds this limit, reverting all state changes.
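
The metering rule can be sketched as a running balance that is charged before each opcode executes. The example below is a simplification under stated assumptions: the names meter, OutOfGas, and GAS_COST are hypothetical, only static costs are modeled, and the 21,000 gas intrinsic transaction cost and all dynamic costs are ignored.

```python
class OutOfGas(Exception):
    """Raised when the next opcode costs more than the remaining gas."""


# Static costs for a few opcodes (illustrative subset)
GAS_COST = {"PUSH1": 3, "ADD": 3, "MUL": 5, "JUMP": 8}


def meter(ops: list[str], gas_limit: int) -> int:
    """Charge each opcode in order and return the gas left at the end."""
    gas = gas_limit
    for op in ops:
        cost = GAS_COST[op]
        if cost > gas:
            raise OutOfGas(f"{op} needs {cost} gas, only {gas} left")
        gas -= cost
    return gas


program = ["PUSH1", "PUSH1", "ADD"]
print(meter(program, gas_limit=10))  # 1

try:
    meter(program, gas_limit=8)
except OutOfGas as err:
    print("halted:", err)
```

In the second call the two PUSH1 instructions consume 6 gas, leaving 2, which is not enough for ADD. In a real EVM the call would halt at that point and its state changes would be reverted.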

Opcodes interact with key EVM data structures: the stack (a LIFO structure for temporary values, 1024 items deep), memory (a volatile byte array for the current execution), and storage (a persistent key-value store tied to the contract). For example, PUSH1 0x80 places the value 0x80 on the stack. MSTORE pops two items from the stack: an offset and a value, then writes the value to memory at that offset. SSTORE pops a key and value to write to permanent contract storage.

Understanding opcode categories helps conceptualize contract execution. Key groups include: arithmetic and logic (ADD, MUL, LT), environmental info (CALLER, NUMBER), control flow (JUMP, JUMPI), and state operations (SLOAD, SSTORE). A JUMPI opcode implements conditional logic by popping a jump destination and a condition from the stack; it only jumps if the condition is non-zero. This low-level control flow is what higher-level Solidity constructs like if/else statements compile down to.
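
To make the JUMPI behavior concrete, here is a small interpreter sketch that supports only PUSH1, JUMPI, JUMPDEST, and STOP. The names run_branch and program are hypothetical, and the byte layout is a hand-assembled example rather than compiler output.

```python
def run_branch(code: bytes) -> list[int]:
    """Run a tiny EVM subset: PUSH1, JUMPI, JUMPDEST, STOP."""
    # Pre-scan: only JUMPDEST bytes outside PUSH immediates are valid targets.
    valid_dests = set()
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0x5B:
            valid_dests.add(i)
        i += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)

    stack: list[int] = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == 0x00:            # STOP
            break
        elif op == 0x60:          # PUSH1 <byte>
            stack.append(code[pc + 1])
            pc += 2
        elif op == 0x5B:          # JUMPDEST: a marker, does nothing
            pc += 1
        elif op == 0x57:          # JUMPI: pop destination, then condition
            dest, cond = stack.pop(), stack.pop()
            if cond != 0:
                if dest not in valid_dests:
                    raise ValueError(f"invalid jump destination {dest}")
                pc = dest
            else:
                pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


def program(cond: int) -> bytes:
    # PUSH1 cond, PUSH1 0x08, JUMPI, PUSH1 0xaa, STOP, JUMPDEST, PUSH1 0xbb, STOP
    return bytes([0x60, cond, 0x60, 0x08, 0x57, 0x60, 0xAA, 0x00, 0x5B, 0x60, 0xBB, 0x00])


print(run_branch(program(1)))  # [187]  (0xbb: the jump was taken)
print(run_branch(program(0)))  # [170]  (0xaa: execution fell through)
```

This is the shape an if/else takes at the bytecode level: the fall-through path and the jump target are two blocks of code, and JUMPI selects between them.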

To analyze opcodes directly, you can use the evm command-line tool from go-ethereum. Compile a Solidity contract and disassemble its runtime bytecode: evm disasm bytecode.bin. You can also debug step-by-step execution using evm --code $(cat bytecode.bin) --debug run (flag placement has changed between go-ethereum releases, so check evm --help for your version). Watching the stack, memory, and program counter change for each opcode is the best way to build a concrete mental model of the EVM's deterministic state machine.

GUIDE

How to Understand EVM Opcodes Conceptually

A conceptual breakdown of the Ethereum Virtual Machine's core data structures: the stack, memory, and storage, which are fundamental to understanding smart contract execution.

The Ethereum Virtual Machine (EVM) is a stack-based, quasi-Turing-complete machine that executes smart contract bytecode. To understand how it processes operations (opcodes), you must first grasp its three primary data areas: the stack, memory, and storage. Each serves a distinct purpose with specific cost, persistence, and accessibility characteristics. The stack is used for immediate computations, memory is for temporary data during a transaction, and storage is for persistent state on the blockchain. Opcodes like ADD, MSTORE, and SSTORE interact directly with these areas.

The stack is the EVM's primary workspace, operating as a last-in, first-out (LIFO) data structure with a maximum depth of 1024 items, each 32 bytes (256 bits) wide. It holds the immediate operands and results for arithmetic, logical, and control flow operations. For example, the ADD opcode pops the top two 32-byte values from the stack, adds them, and pushes the 32-byte result back. This design makes the EVM simple and deterministic but limits complex data handling, necessitating the use of memory and storage for larger or persistent data.
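
The stack rules above (LIFO order, 256-bit items, 1024-item limit) fit in a short class. This is an illustrative model only; EvmStack and StackError are hypothetical names, and dup and swap follow the DUPn and SWAPn conventions of counting from the top of the stack.

```python
class StackError(Exception):
    """Raised on stack overflow or underflow."""


class EvmStack:
    LIMIT = 1024               # maximum stack depth
    WORD_MASK = 2**256 - 1     # items are truncated to 256 bits

    def __init__(self) -> None:
        self._items: list[int] = []

    def push(self, value: int) -> None:
        if len(self._items) >= self.LIMIT:
            raise StackError("stack overflow")
        self._items.append(value & self.WORD_MASK)

    def pop(self) -> int:
        if not self._items:
            raise StackError("stack underflow")
        return self._items.pop()

    def dup(self, n: int) -> None:
        """DUPn: copy the n-th item from the top onto the top."""
        if n > len(self._items):
            raise StackError("stack underflow")
        self.push(self._items[-n])

    def swap(self, n: int) -> None:
        """SWAPn: exchange the top item with the (n+1)-th item from the top."""
        if n + 1 > len(self._items):
            raise StackError("stack underflow")
        self._items[-1], self._items[-1 - n] = self._items[-1 - n], self._items[-1]


stack = EvmStack()
stack.push(1)
stack.push(2)
stack.swap(1)       # top is now 1, below it 2
print(stack.pop())  # 1
stack.dup(1)        # duplicate the remaining 2
print(stack.pop(), stack.pop())  # 2 2
```

Because an opcode can only reach a limited number of items from the top, compilers emit many DUP and SWAP instructions to bring the right operands into position.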

Memory is a byte-addressable, volatile scratchpad that is freshly allocated for each message call, so every call frame within a transaction starts with empty memory. Small amounts are cheap, but expansion is paid for with gas, and the cost grows quadratically with size. You can think of it as RAM for a single contract execution. Opcodes like MSTORE(offset, value) write a 32-byte word to a given memory offset, and MLOAD(offset) reads one. Memory is linear and can be expanded as needed, but its contents are discarded once the call returns. It's ideal for temporary arrays, strings, or intermediate computation results that don't need to live on-chain permanently.
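
Memory gas is charged per 32-byte word, with an additional quadratic term. The sketch below assumes the Yellow Paper fee schedule (3 gas per word plus words squared divided by 512, rounded down); the function names are hypothetical.

```python
def memory_cost(size_bytes: int) -> int:
    """Total gas attributed to a memory of the given size in bytes."""
    words = (size_bytes + 31) // 32          # round up to whole 32-byte words
    return 3 * words + words * words // 512  # linear part + quadratic part


def expansion_cost(old_size: int, new_size: int) -> int:
    """Gas charged when an access grows memory from old_size to new_size."""
    return max(0, memory_cost(new_size) - memory_cost(old_size))


print(expansion_cost(0, 32))         # 3
print(expansion_cost(0, 1024))       # 98
print(expansion_cost(0, 32 * 1024))  # 5120
```

Only the difference between the new and old totals is charged, so touching memory that is already allocated costs nothing extra, while writing to a very high offset pays for every word below it.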

In contrast, storage is a persistent, key-value database that is part of the global Ethereum state. Each contract has its own storage, a mapping from 256-bit keys to 256-bit values. Modifying storage with SSTORE is one of the most expensive operations in terms of gas because it permanently alters the blockchain's state. Reading storage with SLOAD is also relatively costly. Storage is used for a contract's long-term variables, like token balances in an ERC-20 contract. Its state persists across transactions and is accessible to anyone, forming the contract's public, on-chain record.
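
A contract's storage can be pictured as a dictionary whose missing keys read as zero, plus a per-transaction record of which slots have already been touched. The sketch below models only the SLOAD cold/warm pricing (2,100 and 100 gas); ContractStorage and its methods are hypothetical names, and SSTORE pricing is deliberately left out because its rules are more involved.

```python
class ContractStorage:
    COLD_SLOAD_GAS = 2100  # first access to a slot in a transaction
    WARM_SLOAD_GAS = 100   # later accesses to the same slot

    def __init__(self) -> None:
        self.slots: dict[int, int] = {}  # persists across transactions
        self.accessed: set[int] = set()  # reset for every transaction

    def sload(self, key: int) -> tuple[int, int]:
        """Return (value, gas charged); unset slots read as zero."""
        gas = self.WARM_SLOAD_GAS if key in self.accessed else self.COLD_SLOAD_GAS
        self.accessed.add(key)
        return self.slots.get(key, 0), gas

    def sstore(self, key: int, value: int) -> None:
        """Write a value; storing zero clears the slot. Gas is not modeled."""
        self.accessed.add(key)
        if value == 0:
            self.slots.pop(key, None)
        else:
            self.slots[key] = value

    def end_transaction(self) -> None:
        """Values persist, but every slot becomes cold again."""
        self.accessed.clear()


storage = ContractStorage()
print(storage.sload(7))   # (0, 2100)
print(storage.sload(7))   # (0, 100)
storage.sstore(7, 42)
storage.end_transaction()
print(storage.sload(7))   # (42, 2100)
```

The last line shows both properties at once: the value written in one transaction is still there in the next, but the first read pays the cold price again.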

Conceptually, you can visualize opcode execution as moving data between these areas. A typical pattern might involve: 1) loading a value from storage (SLOAD) onto the stack, 2) performing calculations on the stack (ADD, MUL), 3) storing an intermediate result in memory (MSTORE) for later use, and 4) finally writing a final result back to storage (SSTORE). Understanding the cost and lifecycle of each area—ephemeral stack, temporary memory, permanent storage—is crucial for writing efficient, gas-optimized smart contracts.

UNDERSTANDING THE EVM

From Solidity to Bytecode

This guide explains how your Solidity code is transformed into the low-level EVM opcodes that execute on the blockchain, providing a conceptual map of the compilation process.

When you write a contract in Solidity, you are creating a high-level abstraction. The Solidity compiler (solc) translates this human-readable code into EVM bytecode, a sequence of bytes stored on-chain. This bytecode is not directly human-readable, but it can be disassembled into a list of EVM opcodes, the fundamental instructions the Ethereum Virtual Machine understands. Each opcode, like PUSH1, ADD, or SSTORE, performs a single, atomic operation on the EVM's stack, memory, or storage.

The compilation process involves several key stages. First, your Solidity syntax is parsed and converted into an Abstract Syntax Tree (AST). The compiler then performs optimizations and generates an intermediate representation before finally outputting the raw bytecode. This bytecode includes two main parts: the initialization bytecode, which runs only once at deployment to set up the contract, and the runtime bytecode, which is the permanent logic stored at the contract's address and executed on every call.

To see this in action, consider a simple function: function add(uint a, uint b) public pure returns (uint) { return a + b; }. The compiled bytecode for this function would include opcodes to read the input parameters a and b from calldata onto the stack (CALLDATALOAD, PUSH, DUP), perform the addition (ADD), and then handle the return operation (MSTORE, RETURN). With Solidity 0.8 and later, the compiler also emits overflow checks around the ADD. You can inspect this using tools like the evm.codes Playground or by compiling with solc --opcodes.

Conceptually, the EVM is a stack-based machine. Most opcodes consume values from the top of the stack and push results back onto it. For example, the ADD opcode pops the top two items, adds them, and pushes the result. Other components include volatile memory (a byte array for temporary data) and persistent storage (a key-value store on-chain). Understanding this model—stack, memory, storage, and the program counter—is essential for debugging gas costs and low-level vulnerabilities.

Analyzing opcodes is crucial for advanced development and security. It allows you to optimize gas usage by understanding the cost of each operation (e.g., SSTORE is expensive). Security auditors read opcodes to find hidden logic or vulnerabilities that may be obscured in the high-level source. Tools like Etherscan's "Bytecode Decompiler" or dedicated disassemblers can help bridge this gap, showing you the exact instructions your financial logic will execute.

DEVELOPER GUIDES

Tools and Resources

These tools and references help developers understand EVM opcodes at a conceptual level, from stack mechanics to gas accounting. Each card focuses on a different learning angle so you can connect low-level execution with Solidity and protocol behavior.

evm.codes Opcode Reference

The evm.codes site is the most practical way to build intuition for how individual opcodes behave.

Key ways to use it conceptually:

Browse opcodes by category: Arithmetic, Stack, Memory, Storage, Control Flow, System
For each opcode, review:
- Stack input and output sizes
- Gas cost and when it changes
- Formal description aligned with the Yellow Paper
Use the interactive stack diagrams to see how values are consumed and produced

Example insight:

Comparing SLOAD (2,100 gas for a cold slot, 100 gas once warm) vs SSTORE (20,000 gas or more for a new value) explains why storage writes dominate execution cost
Seeing CALL, DELEGATECALL, and STATICCALL side by side clarifies execution context differences better than Solidity docs

This resource works best when you already know Solidity and want to understand what the compiler generates and why certain patterns are expensive or unsafe.

Ethereum Yellow Paper (Execution Model)

The Ethereum Yellow Paper defines the EVM as a formal state transition system. While dense, focusing on specific sections helps build a mental model of opcode execution.

Recommended sections to read conceptually:

Section 9: Execution Model
Section 9.1: Basics of the EVM stack, memory, and storage
Section 9.2 and Appendix G: Fee overview and gas calculation rules

What you gain:

A precise definition of how the stack (1024 items, 256-bit words) constrains execution
Why memory is byte-addressable but word-costed
How gas is deducted per opcode, not per Solidity line

Use the Yellow Paper as a reference, not a tutorial. Pair it with evm.codes or a debugger, and look up rules only after observing behavior in practice. This approach makes the formalism manageable and useful.

Remix EVM Debugger

The Remix IDE debugger lets you step through opcode execution without setting up a local node.

How to use it for conceptual understanding:

Compile a small Solidity function
Run a transaction in Remix VM
Switch the debugger view from Solidity to Opcode level

What to focus on:

How high-level statements expand into many low-level opcodes
The dominance of PUSH, DUP, and SWAP instructions generated by the compiler
Where JUMP and JUMPI appear in conditional logic

Concrete example:

A simple require(x > 0) introduces multiple stack operations, comparison opcodes, and conditional jumps

Remix is especially useful for understanding control flow and stack discipline without reading raw bytecode manually.

ETHervm.io Interactive Playground

ETHervm.io provides a browser-based EVM playground where you can execute raw opcodes step by step.

Why it matters conceptually:

You write opcode sequences directly, not Solidity
You see stack, memory, and gas change after each step
Errors like stack underflow or invalid jump become obvious

Exercises worth trying:

Implement addition using only PUSH, ADD, and STOP
Trigger a revert using REVERT vs INVALID and observe gas behavior
Compare memory expansion costs by increasing offsets

This tool is ideal for internalizing the EVM as a stack machine, which is difficult to grasp from compiled Solidity alone.

Solidity Compiler Output (solc --asm)

Using solc --asm bridges the gap between Solidity and EVM opcodes without jumping straight to raw bytecode.

How to use it:

Run solc --asm Contract.sol
Inspect the generated assembly blocks
Map assembly instructions to their corresponding opcodes

What this teaches:

How the compiler uses stack shuffling aggressively
Why certain Solidity patterns generate repeated SLOAD calls
How function dispatch is implemented as a chain of selector comparisons with JUMPI into JUMPDEST-marked function bodies

Example insight:

Internal function calls compile to jumps, not CALL opcodes
External calls introduce full message call semantics and context switching

This approach is especially useful for understanding gas optimization techniques and why some micro-optimizations actually matter at the opcode level.

EVM OPCODES

Frequently Asked Questions

Common developer questions about Ethereum Virtual Machine opcodes, addressing conceptual understanding, debugging, and optimization.

What is the difference between an opcode and an instruction?

In the EVM context, opcode and instruction are often used interchangeably, but they have a specific relationship. An opcode is the one-byte numerical value (e.g., 0x60 for PUSH1) that the EVM reads directly from the bytecode. An instruction is the human-readable mnemonic (like PUSH1, ADD, SSTORE) that corresponds to that opcode. The compiler (solc) translates your Solidity code into these instructions and then emits their final opcode bytes. When you debug a transaction, tools like Tenderly or the Remix debugger show you the executing instructions, not the raw byte values.

KEY TAKEAWAYS

Conclusion and Next Steps

Understanding EVM opcodes conceptually provides a foundational view of smart contract execution. This guide has outlined the core principles, from the stack machine model to gas costs and security implications.

Conceptualizing the EVM as a stack-based virtual machine is the most critical step. Every operation, from simple addition (ADD) to cryptographic hashing (SHA3), takes its operands from this stack and pushes its result back. This model explains why each operation carries a specific gas cost and why stack underflow or overflow (exceeding 1024 items) halts execution with an error. References like the Ethereum Yellow Paper define this process formally, and EVM playgrounds let you visualize it step-by-step for any contract bytecode.

To deepen your understanding, analyze real contract bytecode. Disassemble a simple contract (like an ERC-20 token) using cast disassemble in Foundry or an online disassembler. Trace through the opcodes, noting how storage slots are accessed with SSTORE/SLOAD, how function selectors are matched, and how control flow jumps with JUMP/JUMPI. Comparing the opcode execution of a standard transfer versus an optimized one reveals the direct impact of code efficiency on gas fees.

The next step is applying this knowledge practically. When writing Solidity or Vyper, you can predict the opcodes your code will compile to, enabling you to write more gas-efficient and secure contracts. For example, knowing that EXTCODESIZE costs 2,600 gas on the first (cold) access to an address informs design decisions around contract interactions. Furthermore, this conceptual framework is essential for advanced areas like EVM-compatible L2 development, building debugging tools, or conducting smart contract security audits.