Fuzzing as a Service vs DIY: When to Hire and When to Do It Yourself
By alex — Strategy & Research at Recon
Every protocol team building on-chain eventually asks the same question: should we run our own fuzzing, or should we pay someone to do it?
It's not a trivial decision. DIY fuzzing means learning the tools, writing the properties, maintaining the infrastructure, and interpreting the results. Hiring a fuzzing service means spending money but getting expert-written tests and someone else handling the complexity.
Neither option is universally better. The right choice depends on your team's size, expertise, budget, and how complex your protocol is. Let me lay out both sides honestly so you can make the call.
What DIY fuzzing looks like
Running your own fuzzing means your team writes property tests, runs the fuzzer, triages results, and maintains everything as the codebase changes.
The tools
You've got solid options available today:
Foundry fuzz tests — The easiest entry point. Write a test function with parameters, Foundry generates random inputs. Great for stateless property testing. If you're already using Foundry for your test suite, you can add fuzz tests with minimal overhead.
```solidity
// Basic Foundry fuzz test -- your team writes these
function testFuzz_depositWithdrawRoundTrip(uint256 amount) public {
    amount = bound(amount, 1, 1_000_000e18);
    uint256 balBefore = token.balanceOf(address(this));
    vault.deposit(amount, address(this));
    uint256 shares = vault.balanceOf(address(this));
    vault.redeem(shares, address(this), address(this));
    // Should end up with approximately the balance we started with
    uint256 balAfter = token.balanceOf(address(this));
    assertApproxEqAbs(balAfter, balBefore, 1, "Lost more than dust");
}
```
Echidna — Property-based fuzzer from Trail of Bits. It supports stateful fuzzing: it generates sequences of transactions and checks invariants after each one. More powerful than single-input fuzz tests, but harder to set up.
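A minimal Echidna setup might look like the sketch below (Vault and Token are hypothetical contracts under test). In Echidna's property mode, any public function prefixed `echidna_` that returns bool is treated as an invariant, and the fuzzer searches for a transaction sequence that makes it return false:

```solidity
// Sketch of an Echidna property test -- Vault/Token are hypothetical.
// Echidna calls this contract's public functions in random sequences
// and checks every `echidna_`-prefixed property after each call.
contract VaultEchidnaTest {
    Token token = new Token();
    Vault vault = new Vault(address(token));

    // Invariant: shares are never outstanding without backing assets
    function echidna_sharesBacked() public view returns (bool) {
        return vault.totalSupply() == 0
            || token.balanceOf(address(vault)) > 0;
    }
}
```

You point the echidna binary at this contract and let it run; by default it stops after 50,000 transactions, which is why campaign length matters (more on that below).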
Medusa — Similar to Echidna but built in Go. Often faster on large codebases thanks to parallelization. It uses a compatible property format, so you can often switch between the two.
Halmos — Symbolic execution. Instead of random inputs, it reasons about all possible inputs. Slower, but it can prove properties rather than just test them, catching edge cases that random fuzzing might miss.
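Halmos piggybacks on Foundry-style tests: by default it picks up functions prefixed `check_` and treats their arguments as symbolic values, so a pass means the assertion holds for every input in the type's range, not just a random sample. A sketch, assuming a hypothetical `vault` under test:

```solidity
// Halmos symbolic test -- `amount` is symbolic, not random, so a
// pass covers all uint128 values (vault is a hypothetical contract).
function check_depositNeverDecreasesShares(uint128 amount) public {
    vm.assume(amount > 0);
    uint256 sharesBefore = vault.balanceOf(address(this));
    vault.deposit(amount, address(this));
    // Depositing must never reduce the depositor's share balance
    assertGe(vault.balanceOf(address(this)), sharesBefore);
}
```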
The learning curve
Here's the honest truth: writing good fuzz tests is harder than writing unit tests. Anyone can write testFuzz_add(uint256 a, uint256 b). Writing properties that actually catch real bugs takes practice.
Week 1-2: Your team gets Foundry fuzz tests running. Mostly stateless tests, "this function shouldn't revert with valid inputs" type stuff. Useful but shallow.
Month 1-2: Someone on the team starts learning invariant testing. They write handler contracts, define actor sets, and run stateful campaigns. This is where it gets powerful, and where most teams stall.
Month 3+: The team can write properties that test multi-step attack scenarios, cross-contract interactions, and economic invariants. This is where DIY fuzzing actually catches the bugs that matter.
Most teams I've seen get through the first phase but don't push into the second and third. They end up with fuzz tests that are only marginally better than unit tests: lots of random inputs, but properties that aren't checking what matters.
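For reference, the handler pattern from the month 1-2 phase looks roughly like this in Foundry (a sketch; Vault and the ghost variable are illustrative, and the invariant assumes the vault mints at most one share per asset deposited). The handler bounds inputs so the fuzzer spends its budget on valid states instead of reverts, the test registers it with targetContract, and every `invariant_` function is checked after each generated call sequence:

```solidity
// Handler: wraps protocol entry points with bounded, valid inputs.
contract VaultHandler is Test {
    Vault public vault;
    uint256 public ghost_totalDeposited; // ghost variable: bookkeeping the fuzzer can't see on-chain

    constructor(Vault _vault) { vault = _vault; }

    function deposit(uint256 amount) public {
        amount = bound(amount, 1, 1_000_000e18);
        vault.deposit(amount, address(this));
        ghost_totalDeposited += amount;
    }
}

// Test: Foundry calls random handler functions in random order,
// then checks every `invariant_` function after each sequence.
contract VaultInvariantTest is Test {
    VaultHandler handler;

    function setUp() public {
        handler = new VaultHandler(new Vault());
        targetContract(address(handler));
    }

    // Assumes <= 1 share minted per asset deposited (illustrative)
    function invariant_sharesBackedByDeposits() public view {
        assertLe(handler.vault().totalSupply(), handler.ghost_totalDeposited());
    }
}
```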
Maintenance burden
This is the part people underestimate. Fuzz tests aren't write-once-and-forget:
- Code changes break handlers. Refactor a function signature? Update every handler that calls it.
- New features need new properties. Ship a new lending market? You need properties for it.
- Corpus management. Long-running campaigns build up a corpus of interesting inputs. Someone needs to manage, prune, and re-run against new code.
- Triage time. When a fuzzer reports a failure, someone needs to determine: is this a real bug, a test issue, or an expected behavior? This takes experience.
For a team of 5 developers, expect 1 person spending 20-30% of their time on fuzz test maintenance. For larger codebases, it could be a full-time role.
What it costs (DIY)
The tools are mostly free:
- Foundry: free
- Echidna: free
- Medusa: free
- Halmos: free
The real cost is developer time. At $150-200K/year fully loaded for a Solidity engineer:
- Learning phase: 1-2 months of partial allocation = $12-25K
- Ongoing maintenance: 20-30% of one engineer = $30-60K/year
- Compute costs for CI: $500-2,000/month depending on scale
Total first year: roughly $50-110K in opportunity cost.
That's not nothing. And the output quality depends entirely on who's writing the properties.
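One way to keep those CI compute costs predictable is to split fuzzing effort across Foundry profiles: cheap runs on every PR, heavy runs on a schedule. A sketch (the numbers are illustrative; tune them to your suite):

```toml
# foundry.toml -- illustrative numbers
[profile.default.fuzz]
runs = 256             # quick feedback on every PR

[profile.ci.fuzz]
runs = 10000           # nightly job: FOUNDRY_PROFILE=ci forge test

[profile.ci.invariant]
runs = 512             # number of call sequences per invariant
depth = 100            # calls per sequence
```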
What fuzzing as a service looks like
Hiring a service means an external team writes properties, runs campaigns, and delivers results. Here's what that typically involves.
The service model
A typical FaaS engagement looks like this:
- Scoping: The service reviews your codebase and identifies which properties to test.
- Property writing: Expert auditors write invariant properties based on your protocol's logic.
- Campaign execution: They run the fuzzer (usually with more compute than you'd use internally) and tune the campaign.
- Triage and reporting: Results are triaged, false positives filtered, real bugs documented with PoCs.
- Deliverables: You get the test suite (yours to keep), the results, and usually fix-review.
Recon Pro: cloud fuzzing
Recon Pro is our cloud fuzzing platform. It runs coverage-guided fuzzing campaigns at scale with managed infrastructure. You push your test suite (written by you, or by us during an audit), and Pro handles:
- Parallel fuzzing across many cores
- Extended campaign durations (hours to days, not minutes)
- Coverage tracking and corpus management
- Result dashboards
The value is that you get the compute and infrastructure without managing them. Write properties locally, push them up, get results back.
Expert property writing
The bigger differentiator isn't infrastructure, it's who's writing the properties. An experienced security researcher writes different properties than a developer learning fuzzing:
```solidity
// Developer-written property (common, but shallow)
function invariant_totalSupplyNotZeroAfterDeposit() public {
    if (ghost_totalDeposited > 0) {
        assertGt(vault.totalSupply(), 0);
    }
}
```

```solidity
// Expert-written property (catches real bugs)
function invariant_noUserCanExtractMoreThanDeposited() public {
    for (uint256 i = 0; i < actors.length; i++) {
        uint256 currentValue = vault.convertToAssets(
            vault.balanceOf(actors[i])
        );
        uint256 totalIn = ghost_deposited[actors[i]];
        uint256 totalOut = ghost_withdrawn[actors[i]];
        // Net extraction should never exceed deposits + yield.
        // This catches donation attacks, rounding exploits,
        // share inflation, and withdrawal ordering bugs.
        assertLe(
            totalOut + currentValue,
            totalIn + ghost_totalYieldAccrued + DUST,
            "Value extraction exceeded deposits + yield"
        );
    }
}
```
The first property is fine. It checks a basic fact. The second property encodes an economic invariant that catches entire categories of attacks: donation attacks, rounding errors, share manipulation, and more.
Writing the second kind requires auditing experience. You need to have seen the attacks to know which properties prevent them.
What it costs (FaaS)
Pricing varies by engagement type:
| Service | Typical cost | What you get |
|---|---|---|
| One-time fuzzing campaign | $20-50K | Property suite + campaign results |
| Full audit with fuzzing | $80-200K | Manual review + invariant tests + report |
| Ongoing Recon Pro subscription | $2-10K/month | Cloud infrastructure + dashboard |
| Retainer (expert on call) | $5-15K/month | Ongoing property updates + triage |
A one-time engagement gets you a property suite you own forever. The ongoing subscription keeps the campaigns running against new code. The retainer means someone updates properties when your protocol evolves.
The decision matrix
Here's how to decide. Be honest about where your team is:
Go DIY if:
- Your team has fuzzing experience. Someone on the team has written invariant tests before and understands property-based testing. Not "read a tutorial", actually shipped fuzz tests that found bugs.
- Your protocol is relatively simple. A single-purpose contract (token, simple vault, staking) has fewer interaction points. The property surface is manageable for a small team.
- You're early stage. Pre-launch, code is changing fast. Writing properties in-house means they evolve with the code. An external engagement gets stale quickly during rapid development.
- Budget is tight. If you genuinely can't afford a service, DIY with Foundry fuzz tests is infinitely better than no fuzzing at all. Start with the basics and grow.
Hire a service if:
- No one on the team has done it before. The learning curve is real, and the first few months of DIY fuzzing catch shallow bugs at best. An expert gets you deep coverage from day one.
- Your protocol is complex. Multi-contract DeFi protocols, lending markets, AMMs with concentrated liquidity, and cross-chain bridges all have huge state spaces. Properties that cover the real attack surface require experience writing them.
- You're approaching launch. If you need results in weeks, not months, a service delivers faster than training your team. After launch, you can bring fuzzing in-house using the property suite they built.
- You want security guarantees. A service with an audit report, test suite, and fix review provides a package that investors and users recognize. "We wrote our own fuzz tests" doesn't carry the same weight with auditors and insurance providers.
- Your TVL justifies it. If you're securing $50M+, the $50-200K for expert fuzzing is insurance. The cost of not auditing is measured in millions.
The hybrid approach (often best)
Here's what I recommend most often:
1. Start with DIY. Your developers write basic Foundry fuzz tests during development. This costs nothing extra and catches simple bugs early.
2. Hire for the launch audit. An expert team writes the deep invariant properties. You get the test suite, and they train your team on how it works.
3. Maintain in-house. Post-launch, your team owns the property suite. They update it as the protocol evolves. If they get stuck, they have the service on retainer for questions.
4. Run campaigns in the cloud. Use Recon Pro for extended fuzzing campaigns. Write properties locally, push to the cloud for heavy compute.
This gets you expert-quality properties (step 2) with in-house ownership (step 3) and scalable infrastructure (step 4). You pay for expertise when it matters most and build internal capability over time.
Break-even analysis
Let's get concrete. When does each option make financial sense?
Scenario A: small team (3-5 devs), simple protocol
| | DIY | FaaS |
|---|---|---|
| First year cost | ~$40K (developer time) | ~$60K (one-time + Pro sub) |
| Year 2+ cost | ~$30K/year (maintenance) | ~$24K/year (Pro sub + occasional retainer) |
| Quality of coverage | Medium (limited by team experience) | High (expert properties) |
| Break-even point | Never; DIY is cheaper if the team is capable | If the team lacks experience, FaaS pays for itself by finding one bug that DIY wouldn't |
Scenario B: medium team (5-15 devs), complex DeFi protocol
| | DIY | FaaS |
|---|---|---|
| First year cost | ~$80K (senior dev time + learning) | ~$150K (full audit + Pro) |
| Year 2+ cost | ~$60K/year | ~$36K/year |
| Quality of coverage | Medium-High (if you hire/train right) | High from day one |
| Break-even point | Year 3, if the team ramps successfully | Immediate, if the audit catches a critical bug before launch |
Scenario C: large protocol ($100M+ TVL)
At this scale, the question isn't DIY vs FaaS; it's both. You should have internal fuzzing expertise AND external audits AND ongoing cloud campaigns. The $200K for a full-scope engagement is 0.2% of a $100M TVL. One prevented exploit pays for a decade of auditing.
Common mistakes
Mistake 1: Thinking Foundry fuzz tests are enough. Foundry's built-in fuzzing is great for stateless properties but limited for stateful testing. If your protocol's bugs require specific transaction sequences (most DeFi bugs do), you need stateful fuzzing with a tool like Echidna, Medusa, or Chimera.
Mistake 2: Writing properties that test the implementation, not the specification. "Function X returns the same value as before" isn't a useful property; you're just testing that the code does what it does. "No user's withdrawable value ever decreases without a corresponding action" tests what should be true.
Mistake 3: Running fuzzing campaigns for 5 minutes. Short campaigns only find surface-level issues. Meaningful stateful fuzzing needs hours to explore deep state spaces. This is where cloud infrastructure matters.
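For Echidna, campaign length lives in the config file; an overnight run might bump testLimit two orders of magnitude above the default (values illustrative, tune to your state space):

```yaml
# echidna.yaml -- longer campaigns explore deeper state
testLimit: 5000000     # total transactions (default is 50000)
seqLen: 100            # transactions per generated sequence
corpusDir: "corpus"    # persist interesting inputs between runs
workers: 8             # parallel workers (recent Echidna versions)
```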
Mistake 4: Ignoring results because "it's probably a false positive." Every failure deserves investigation. If you're getting too many false positives, your properties are wrong. Fix the properties; don't ignore the results.
Mistake 5: Treating fuzzing as a checkbox. "We run fuzz tests in CI" means nothing if the properties are shallow. Depth matters more than existence.
The bottom line
DIY fuzzing works if your team is committed and has the right experience. FaaS works if you need expert-quality coverage fast or your protocol's complexity exceeds your team's testing experience.
For most teams, the best path is: hire experts for the launch audit, learn from the property suite they deliver, and maintain it in-house going forward with cloud infrastructure for heavy campaigns.
The worst option? No fuzzing at all. Even basic Foundry fuzz tests catch bugs that unit tests miss. Start somewhere and grow.
Ready to get started? Try Recon Pro for cloud fuzzing campaigns, or request an audit to get an expert-written property suite for your protocol.