Halmos symbolic testing tutorial: prove your properties
Halmos symbolic testing tutorial: prove your properties
Fuzzing is great at finding bugs through random exploration. But there's a catch — it can't tell you a property always holds. It can only tell you it didn't break during the campaign. Halmos takes a different approach. It uses symbolic execution to mathematically prove that your properties hold for all possible inputs. Or it finds a counterexample that breaks them.
That's a big difference. Let me show you how to use it.
What Halmos actually does
Halmos takes your Solidity test functions and replaces concrete inputs with symbolic values (variables that represent every possible value simultaneously). It then runs your code symbolically, exploring all execution paths, and checks whether your assertions can ever fail.
If it can't find a way to break the assertion, your property is proven. If it can, you get a concrete counterexample showing exactly which inputs cause the failure.
This is formal verification without leaving your Foundry workflow. No new language to learn. No separate spec files. Just write Solidity tests with a check_ prefix instead of test_.
Installation
Halmos is a Python tool. Install it with pip:
pip install halmos
Or if you want the latest development version:
pip install git+https://github.com/a16z/halmos.git
Verify the install:
halmos --version
You'll also need Foundry installed since Halmos uses forge build for compilation. If you don't have it:
curl -L https://foundry.sh | bash
foundryup
That's it. No Haskell, no Docker. Just Python and Foundry.
Your first symbolic test
Let's start with something simple, proving that an ERC-20 transfer is correct. Create a Foundry project:
mkdir halmos-tutorial && cd halmos-tutorial
forge init --no-commit
Here's our token contract, src/SimpleToken.sol:
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
contract SimpleToken {
mapping(address => uint256) public balanceOf;
uint256 public totalSupply;
constructor(uint256 _initialSupply) {
balanceOf[msg.sender] = _initialSupply;
totalSupply = _initialSupply;
}
function transfer(address to, uint256 amount) external returns (bool) {
require(balanceOf[msg.sender] >= amount, "Insufficient balance");
balanceOf[msg.sender] -= amount;
balanceOf[to] += amount;
return true;
}
function mint(address to, uint256 amount) external {
balanceOf[to] += amount;
totalSupply += amount;
}
}
Now the symbolic test, test/SimpleToken.t.sol:
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
import "forge-std/Test.sol";
import "../src/SimpleToken.sol";
contract SimpleTokenSymTest is Test {
SimpleToken token;
function setUp() public {
token = new SimpleToken(1_000_000e18);
}
// Symbolic test: transfer preserves total balance between sender and receiver
function check_transfer_preserves_balance(
address sender,
address receiver,
uint256 amount
) public {
// Assumptions -- constrain the symbolic values
vm.assume(sender != receiver);
vm.assume(sender != address(0));
vm.assume(receiver != address(0));
// Give sender some tokens
uint256 senderInitial = amount; // Use amount as initial balance too
vm.assume(senderInitial >= amount);
deal(address(token), sender, senderInitial);
uint256 receiverInitial = token.balanceOf(receiver);
uint256 totalBefore = senderInitial + receiverInitial;
// Perform transfer
vm.prank(sender);
token.transfer(receiver, amount);
uint256 totalAfter = token.balanceOf(sender) + token.balanceOf(receiver);
// This must hold for ALL possible inputs
assert(totalAfter == totalBefore);
}
// Symbolic test: transfer never increases sender balance
function check_transfer_decreases_sender(
address sender,
address receiver,
uint256 amount
) public {
vm.assume(sender != receiver);
vm.assume(sender != address(0));
deal(address(token), sender, amount);
uint256 before = token.balanceOf(sender);
vm.prank(sender);
token.transfer(receiver, amount);
assert(token.balanceOf(sender) <= before);
}
}
Notice the check_ prefix. That's how Halmos identifies symbolic tests. Everything else looks like a normal Foundry test.
Running Halmos
halmos
Halmos automatically finds all check_ functions and runs symbolic analysis:
[PASS] check_transfer_preserves_balance(address,address,uint256)
[PASS] check_transfer_decreases_sender(address,address,uint256)
Tested: 2, Passed: 2, Failed: 0
Both properties are proven — not just tested against random inputs, but verified for every possible combination of sender, receiver, and amount. That's the power of symbolic execution.
Understanding counterexamples
What happens when a property doesn't hold? Let's add a buggy function to our token:
// Bug: no overflow check on mint (well, Solidity 0.8 has it,
// but let's test totalSupply consistency)
function check_totalSupply_equals_sum(uint256 mintAmount, address user) public {
vm.assume(user != address(0));
vm.assume(user != address(this));
uint256 supplyBefore = token.totalSupply();
token.mint(user, mintAmount);
// This SHOULD hold... but does it?
assert(token.totalSupply() == supplyBefore + mintAmount);
}
This one actually passes because Solidity 0.8 reverts on overflow. But if the contract used unchecked blocks or was compiled with 0.7.x, Halmos would find the exact mintAmount that causes overflow and present it as a counterexample:
[FAIL] check_totalSupply_equals_sum(uint256,address)
Counterexample:
mintAmount = 115792089237316195423570985008687907853269984665640564039457584007913129639936
user = 0x0000000000000000000000000000000000010000
The counterexample gives you concrete values. You can plug them into a regular Foundry test to reproduce the issue.
Loop bounds
Halmos needs to know how many times to unroll loops. By default, it uses a conservative bound. If your contract has loops, you'll want to configure this:
halmos --loop 10
The --loop flag sets the maximum loop iteration count. If your code has a for loop that iterates up to n times, Halmos will analyze up to 10 iterations.
Be careful with this. Higher bounds = exponentially more paths = longer analysis time. Start with --loop 2 and increase only if Halmos warns about bounded loops being hit.
# If you get "loop bound reached" warnings:
halmos --loop 5 --solver-timeout-assertion 60000
Solver timeout
The SMT solver (Z3) does the heavy lifting. Complex properties on complex contracts can take a while. Configure the timeout:
halmos --solver-timeout-assertion 30000 # 30 seconds per assertion
halmos --solver-timeout-branching 10000 # 10 seconds per branch
If you hit timeouts, it doesn't mean the property is broken. It means the solver couldn't decide within the time limit. You might need to:
- Simplify the property
- Add more
vm.assume()constraints to reduce the search space - Break one complex check into multiple simpler ones
Storage assumptions
Halmos handles storage symbolically, which is powerful but can lead to path explosion. For complex contracts, you'll want to constrain initial storage:
function check_withdraw_correctness(uint256 depositAmt, uint256 withdrawAmt) public {
// Constrain to realistic values
vm.assume(depositAmt > 0 && depositAmt <= 1e30);
vm.assume(withdrawAmt > 0 && withdrawAmt <= depositAmt);
// Set up known state
deal(address(token), address(this), depositAmt);
token.approve(address(vault), depositAmt);
vault.deposit(depositAmt);
uint256 sharesBefore = vault.balanceOf(address(this));
vault.withdraw(withdrawAmt);
uint256 sharesAfter = vault.balanceOf(address(this));
assert(sharesAfter <= sharesBefore);
}
The vm.assume() calls are critical. Without them, Halmos explores every possible uint256 value, including ones that don't make sense (like depositing more tokens than exist). Assumptions narrow the search space without weakening the proof.
When to use Halmos vs fuzzers
This is the question everyone asks. Here's my honest take.
Use Halmos when:
- You need mathematical certainty that a property holds
- The property is about a single function or a short call sequence
- You're verifying arithmetic correctness (no off-by-one errors, no overflow)
- The contract has bounded complexity (few loops, limited storage)
Use fuzzers when:
- You're testing stateful properties across many transactions
- The contract has deep state spaces that symbolic execution can't traverse
- You want broad coverage quickly
- The property involves timing, multiple actors, or complex DeFi interactions
Use both when:
- It's a critical contract handling real money (and it usually is)
- Halmos proves arithmetic properties, fuzzers explore state-dependent ones
Check the Halmos vs Echidna comparison for a deeper look at the tradeoffs.
A real-world example: proving ERC-20 correctness
Let's do something more complete. Here's a suite of symbolic tests that prove key ERC-20 properties:
contract ERC20SymbolicSuite is Test {
SimpleToken token;
function setUp() public {
token = new SimpleToken(1_000_000e18);
}
// Transfer to self doesn't change balance
function check_self_transfer(uint256 amount) public {
address user = address(0xBEEF);
deal(address(token), user, amount);
uint256 before = token.balanceOf(user);
vm.prank(user);
token.transfer(user, amount);
assert(token.balanceOf(user) == before);
}
// Zero transfer is a no-op
function check_zero_transfer(address from, address to) public {
vm.assume(from != address(0));
uint256 fromBefore = token.balanceOf(from);
uint256 toBefore = token.balanceOf(to);
vm.prank(from);
token.transfer(to, 0);
assert(token.balanceOf(from) == fromBefore);
assert(token.balanceOf(to) == toBefore);
}
// Mint increases totalSupply by exact amount
function check_mint_increases_supply(address to, uint256 amount) public {
vm.assume(to != address(0));
uint256 supplyBefore = token.totalSupply();
vm.assume(amount <= type(uint256).max - supplyBefore); // no overflow
token.mint(to, amount);
assert(token.totalSupply() == supplyBefore + amount);
}
// Transfer doesn't change totalSupply
function check_transfer_constant_supply(
address from, address to, uint256 amount
) public {
vm.assume(from != address(0));
deal(address(token), from, amount);
uint256 supplyBefore = token.totalSupply();
vm.prank(from);
token.transfer(to, amount);
assert(token.totalSupply() == supplyBefore);
}
}
Run the full suite:
halmos --function check_
Each passing test is a mathematical proof. That's something no amount of fuzzing can give you.
Debugging tips
Path explosion. If Halmos hangs, your contract probably has too many branches. Use --solver-timeout-branching 5000 to fail fast on expensive paths and identify which functions cause blowup.
Unsupported opcodes. Halmos doesn't support every EVM opcode yet. If you hit "unsupported opcode" errors, check which cheatcodes or precompiles your contract uses. Sometimes you can work around it by mocking the problematic function.
Memory vs time. Halmos can eat a lot of memory on complex contracts. Monitor your system resources. If it's using 16GB+, you probably need to simplify your test.
# Verbose output helps debug issues
halmos -vvv --function check_specific_property
Integrating with your workflow
I like to run Halmos alongside fuzz tests in CI:
# Run both
forge test # Regular + fuzz tests
halmos # Symbolic tests
Name your files clearly: *.t.sol for Foundry tests, and prefix symbolic test contracts with Sym (like SimpleTokenSymTest). It keeps things organized.
Next steps
You've got the foundations. From here:
- Read the Halmos symbolic execution guide for more advanced techniques
- Explore how formal verification fits into a complete security workflow
- Learn about property-based testing principles that apply to both fuzzing and symbolic execution
- Try combining Halmos proofs with Echidna or Medusa campaigns for defense in depth
Symbolic testing won't replace fuzzing. Fuzzing won't replace symbolic testing. You want both. Start with Halmos for the critical arithmetic, add fuzzers for the stateful exploration, and sleep better knowing your properties actually hold.
Get a professional security audit
Try Recon Pro
Related Posts
Certora CVL tutorial: a practitioner's guide to writing specs
A practical guide to writing Certora CVL specs. Covers the language basics, real DeFi patterns, comm...
Slither beyond defaults: writing custom detectors for your protocol
Default Slither detectors catch generic bugs. Custom detectors catch your protocol's bugs. Here's ho...