Repricing Methodology
The proposed gas parameters come from the actual time each execution client takes to run
a targeted set of benchmarks: runtimes are collected from the Benchmarkoor API by
benchmarkoor-fetch and turned into gas numbers by evm-gasfit.
Five stages turn a wall-clock measurement into a gas number:
- Benchmark data collection — synthetic blocks stress one operation at a time; Benchmarkoor records each block's runtime and opcode counts.
- Runtime model — an NNLS regression isolates the per-unit runtime of each opcode, per client.
- Glue adjustment — the overhead of the supporting opcodes that surround the target in each fixture is netted out.
- Runtime → gas — the adjusted runtime is converted to gas at a fixed anchor rate.
- Worst-case selection — the proposal takes the slowest client for each parameter, then evaluates any derived parameters.
Sections 1–5 below detail those stages. A further analysis builds on the resulting gas numbers and is documented after them: the throughput-loss analysis, which measures how much throughput each repricing choice — keeping today's prices, rounding, or fractional gas — wastes per op and over real mainnet traffic. The companion Bottlenecks Methodology covers the separate question of which operations are already too slow to reprice cheaper.
1. Benchmark data collection
The proposed gas costs are grounded in the actual time each execution client takes to run a specific set of benchmarks. Similar to the methodology of the Gas Cost Estimator project, we generate synthetic blocks that isolate and stress individual EVM operations and use them to derive the various gas parameters.
Concretely, to benchmark a single operation, different blocks are created by varying the number of times the target operation is executed and by changing the parameter values passed to it. These test blocks come from the EEST benchmark suite. The raw data that feeds this analysis was queried from the following two Benchmarkoor suites:
3182dda7b93dee61a11611f320a39015
The Benchmarkoor tool then runs each block and collects the metrics we need: the block's
total execution time and the number of times each operation was executed. Each block is
run multiple times on every client to account for variability in execution time. Runtimes
are pulled from the Benchmarkoor API by benchmarkoor-fetch; the benchmarks
ran on the amsterdam fork and runs are selected by the
.*-full.* run-id pattern. The compute suite drives the target-opcode fits,
while the stateful suite feeds the glue-opcode analysis. Each block contributes one row to
the model: a measured test_runtime_ms alongside the count of every opcode it
executed.
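As a rough illustration of the row shape described above, the following sketch filters run IDs with the `.*-full.*` pattern and assembles one model row per block. The run IDs, the helper names `select_full_runs` and `to_row`, and the dictionary layout are assumptions made for this example; they are not taken from benchmarkoor-fetch.

```python
import re

# Pattern quoted in the text; the run IDs used below are invented for illustration.
FULL_RUN = re.compile(r".*-full.*")


def select_full_runs(run_ids):
    """Keep only the run IDs that match the .*-full.* pattern."""
    return [rid for rid in run_ids if FULL_RUN.fullmatch(rid)]


def to_row(test_runtime_ms, opcode_counts):
    """Build one model row: the measured runtime plus the count of every opcode executed."""
    row = {"test_runtime_ms": float(test_runtime_ms)}
    row.update(opcode_counts)
    return row


runs = ["geth-full-001", "geth-quick-001", "reth-full-007"]  # hypothetical IDs
print(select_full_runs(runs))
print(to_row(12.5, {"ADD": 40000, "PUSH1": 80000, "POP": 40000}))
```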
2. Runtime model (NNLS)
evm-gasfit solves one non-negative least squares (NNLS) regression per
(opcode spec, model variant, client) combination. It models the measured
runtime of each fixture as a linear function of how many times the target opcode runs:
test_runtime_ms = intercept + target_coef · opcount + Σ (param_i · opcount · value_i)
The target_coef — the per-unit runtime of the opcode under study — is the
quantity we care about. The trailing sum carries variable-cost terms: opcodes whose cost
scales with an operand (MOD/SMOD/ADDMOD/MULMOD by bit-width, KECCAK256 base + per-word,
several precompiles) get their own fixture parameters, while constant-cost opcodes
(DIV, SDIV) collapse to a single coefficient. Parameters that are constant across all
fixtures are dropped automatically — their column is indistinguishable from
target_coef — and a fit is skipped entirely if there are too few
observations or the opcount never varies.
The non-negativity constraint is the point of NNLS: gas costs cannot be negative, and an unconstrained fit would let correlated features take opposite signs to cancel noise. NNLS instead drives such a coefficient to exactly zero or spreads the signal across the remaining features.
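A minimal sketch of the constrained fit, assuming the simplest model with only an intercept and a target coefficient (no variable-cost terms). It is not the evm-gasfit implementation: it relies on the fact that an NNLS optimum equals the ordinary least-squares solution restricted to its non-zero coefficients, so with two unknowns every possible support can simply be tried. The function name `nnls_line` is hypothetical.

```python
def nnls_line(opcounts, runtimes_ms):
    """Return (intercept, target_coef), both >= 0, minimising the squared error."""
    n = len(opcounts)
    if n < 2:
        raise ValueError("too few observations")
    sx, sy = sum(opcounts), sum(runtimes_ms)
    sxx = sum(x * x for x in opcounts)
    sxy = sum(x * y for x, y in zip(opcounts, runtimes_ms))

    def rss(a, b):
        return sum((y - a - b * x) ** 2 for x, y in zip(opcounts, runtimes_ms))

    candidates = [(0.0, 0.0)]                # both coefficients pinned at zero
    candidates.append((sy / n, 0.0))         # intercept only
    if sxx > 0:
        candidates.append((0.0, sxy / sxx))  # target coefficient only
    det = n * sxx - sx * sx
    if det > 0:                              # the opcount actually varies
        b = (n * sxy - sx * sy) / det
        candidates.append(((sy - b * sx) / n, b))

    # Discard sign-violating candidates, then keep the best remaining fit.
    feasible = [(a, b) for a, b in candidates if a >= 0 and b >= 0]
    return min(feasible, key=lambda ab: rss(*ab))


opcounts = [1000, 2000, 4000, 8000]
print(nnls_line(opcounts, [2.5, 4.5, 8.5, 16.5]))  # unconstrained fit is already feasible
print(nnls_line(opcounts, [1.5, 3.5, 7.5, 15.5]))  # intercept would be negative, so it is pinned at 0
```

In the second call the unconstrained intercept would be negative, so the constraint drives it to exactly zero and the remaining signal lands on the target coefficient.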
Inference is a non-parametric bootstrap over the rows of the design
matrix (1000 iterations). Each coefficient gets a p-value — the bootstrap share of
near-zero estimates — and a 2.5 / 97.5 percentile confidence interval. A fit is flagged
as a poor-fit selection when its p-value exceeds 0.05 or its R²
falls below 0.5. Fits cover 6 clients (all shown here
and on the runtime/glue pages; ethrex is fitted too but held out of the
worst-case selection in step 5):
besu, erigon, ethrex, geth, nethermind, reth
See the runtime model page for every per-fit regression summary, or the evm-gasfit NNLS docs for the full derivation.
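The bootstrap described above can be sketched as follows. This is an illustration under stated assumptions, not the evm-gasfit code: the refit inside each iteration is a simplified least-squares slope clipped at zero, the `near_zero` threshold and the percentile indexing are choices made for this example, and the sample rows are invented.

```python
import random


def slope_nonneg(rows):
    """Least-squares slope of runtime on opcount, clipped at zero (simplified refit)."""
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    det = n * sxx - sx * sx
    if det <= 0:  # the resample contains a single distinct opcount: no slope
        return 0.0
    return max(0.0, (n * sxy - sx * sy) / det)


def bootstrap(rows, iterations=1000, near_zero=1e-12, seed=0):
    """Resample rows with replacement; return (p_value, (ci_low, ci_high))."""
    rng = random.Random(seed)
    estimates = sorted(
        slope_nonneg([rng.choice(rows) for _ in rows]) for _ in range(iterations)
    )
    # p-value: share of bootstrap estimates that are (near) zero.
    p_value = sum(e <= near_zero for e in estimates) / iterations
    ci_low = estimates[int(0.025 * (iterations - 1))]
    ci_high = estimates[int(0.975 * (iterations - 1))]
    return p_value, (ci_low, ci_high)


def is_poor_fit(p_value, r_squared):
    """Gates quoted in the text: p-value above 0.05 or R² below 0.5."""
    return p_value > 0.05 or r_squared < 0.5


noise = [0.03, -0.02, 0.01, -0.04, 0.05, -0.01, 0.02, -0.03]  # invented jitter, in ms
rows = [(1000 * k, 0.5 + 0.002 * 1000 * k + e) for k, e in zip(range(1, 9), noise)]
p_value, ci = bootstrap(rows)
print(p_value, ci)
```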
3. Glue adjustment
Glue adjustment is enabled.
A benchmark never measures an opcode in isolation: the recorded runtime also includes the
glue opcodes that set up arguments, advance the program counter, and clean the
stack (pushes, dups, loads, control flow). Because NNLS attributes the entire
intercept + target_coef · opcount budget to the target, glue that scales with
the loop body inflates the estimate. Glue adjustment removes that overhead before the
runtime is converted to gas.
evm-gasfit works from a fixed set of canonical glue opcodes, fit in
dependency tiers (pure opcodes first, then cycle, then mixed tiers that lean on the
already-priced ones). The adjustment runs in three steps:
- Detect which glue opcodes actually contaminated each measurement, by correlating each glue opcode's per-fixture count against the target opcount and recording the count ratio Δglue / Δopcount.
- Estimate a per-unit runtime (glue_runtime_ms) for every (client, glue opcode) with its own tiered NNLS fit.
- Subtract the total glue overhead from the target coefficient:
adjusted_target_coef = max(0, target_coef − Σ_g ratio_g · glue_runtime_ms_g)
A glue partner only contributes to the sum when its own fit clears the same
p-value (< 0.05) and R² (≥ 0.5) gates; the confidence interval shifts with it and
clips at zero. Skipped fits — targets left without a glue correction — are listed in the
generated new_gas_proposal.md warnings. See the
glue page for the per-fit detail and the
evm-gasfit glue docs
for the tier structure.
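The subtraction step can be sketched in a few lines. The dictionary keys, the function name `adjust_target_coef`, and the numbers in the example are assumptions for illustration; only the formula and the p-value / R² gates come from the text above.

```python
def adjust_target_coef(target_coef, glue_partners, p_max=0.05, r2_min=0.5):
    """Subtract gated glue overhead from the target coefficient, clipping at zero.

    glue_partners: list of dicts with keys ratio, glue_runtime_ms, p_value, r_squared.
    """
    overhead = sum(
        g["ratio"] * g["glue_runtime_ms"]
        for g in glue_partners
        # A partner only counts when its own fit clears both gates.
        if g["p_value"] < p_max and g["r_squared"] >= r2_min
    )
    return max(0.0, target_coef - overhead)


partners = [  # invented values, runtimes in ms per execution
    {"opcode": "PUSH1", "ratio": 2.0, "glue_runtime_ms": 4e-6, "p_value": 0.0, "r_squared": 0.99},
    {"opcode": "POP", "ratio": 1.0, "glue_runtime_ms": 3e-6, "p_value": 0.0, "r_squared": 0.97},
    {"opcode": "JUMPDEST", "ratio": 1.0, "glue_runtime_ms": 9e-6, "p_value": 0.40, "r_squared": 0.90},
]
print(adjust_target_coef(5e-5, partners))  # JUMPDEST fails the p-value gate and is ignored
print(adjust_target_coef(1e-6, partners))  # overhead exceeds the target, so the result clips at 0.0
```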
4. Runtime → gas
Each adjusted runtime coefficient is converted to gas at a throughput
anchor — the gas/second a client should be able to sustain on the worst
case. With runtime_ms in milliseconds:
new_gas = ⌈ anchor_rate · runtime_ms / 1000 ⌉
Because the proposed cost is exactly linear in the anchor — and the worst-case client doesn't change with it — the Repricing page recomputes every cost live in the browser as you move the anchor; no re-fit is needed. The committed baseline is 100,000,000 gas/s (100 Mgas/s). Any op whose measured cost implies a slower throughput than the anchor lands above its current price and is flagged as needing an increase; the ceiling keeps the result an integer.
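As a worked example of the conversion, the sketch below applies the formula at the 100 Mgas/s baseline and at a doubled anchor. The function names and the sample runtimes are assumptions for illustration.

```python
import math

ANCHOR_RATE = 100_000_000  # gas per second, the committed baseline quoted in the text


def exact_gas(runtime_ms, anchor_rate=ANCHOR_RATE):
    """Fractional gas implied by a per-unit runtime given in milliseconds."""
    return anchor_rate * runtime_ms / 1000


def new_gas(runtime_ms, anchor_rate=ANCHOR_RATE):
    """Integer gas: the ceiling of the fractional value.

    Note: with floats, a fair price that is mathematically an exact integer can
    land a hair above it and be rounded up by one; a production version may
    prefer decimal or rational arithmetic.
    """
    return math.ceil(exact_gas(runtime_ms, anchor_rate))


print(new_gas(2.3e-5))               # about 2.3 gas, rounded up
print(new_gas(4.1e-6))               # about 0.41 gas, rounded up
print(new_gas(2.3e-5, 200_000_000))  # doubling the anchor doubles the fractional cost
```

Because `exact_gas` is linear in `anchor_rate`, moving the anchor only rescales the fitted runtimes, which is why no re-fit is needed.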
5. Worst-case selection & derived parameters
A single gas parameter is fit many times — once per client, sometimes across several fixtures. The proposal consolidates those in two passes, always erring toward the slowest observation so the cost holds on every implementation:
- Per client: among the rows that pass the p-value and R² gates, take the one with the largest runtime; if none qualify, the best available is kept and flagged poor_fit.
- Across clients: take the client with the largest runtime, among the clients eligible for the worst case. ethrex is fitted and shown on the runtime and glue pages but is excluded from this step, so it never sets a proposed gas cost (the bottleneck analysis excludes it the same way).
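The two passes can be sketched as below. Several details are assumptions rather than documented behaviour: "best available" is taken to mean the lowest p-value, a poor_fit pick stays eligible in the cross-client pass, and the row layout and numbers are invented.

```python
EXCLUDED_FROM_WORST_CASE = {"ethrex"}


def passes_gates(row):
    """A row passes when it is not a poor fit (p-value <= 0.05 and R² >= 0.5)."""
    return row["p_value"] <= 0.05 and row["r_squared"] >= 0.5


def pick_per_client(rows):
    """Largest runtime among gated rows; otherwise a flagged fallback."""
    good = [r for r in rows if passes_gates(r)]
    if good:
        return {**max(good, key=lambda r: r["runtime_ms"]), "poor_fit": False}
    # Assumption: "best available" = lowest p-value; the source does not define it.
    return {**min(rows, key=lambda r: r["p_value"]), "poor_fit": True}


def pick_worst_client(rows_by_client):
    """Slowest eligible client after the per-client pass."""
    picks = {
        client: pick_per_client(rows)
        for client, rows in rows_by_client.items()
        if client not in EXCLUDED_FROM_WORST_CASE and rows
    }
    client = max(picks, key=lambda c: picks[c]["runtime_ms"])
    return client, picks[client]


fits = {  # invented numbers, runtimes in ms per execution
    "geth": [
        {"runtime_ms": 2.0e-5, "p_value": 0.00, "r_squared": 0.95},
        {"runtime_ms": 2.6e-5, "p_value": 0.30, "r_squared": 0.90},  # fails the p-value gate
    ],
    "reth": [{"runtime_ms": 1.5e-5, "p_value": 0.00, "r_squared": 0.98}],
    "ethrex": [{"runtime_ms": 9.0e-5, "p_value": 0.00, "r_squared": 0.99}],  # held out
}
print(pick_worst_client(fits))
```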
Finally, derived parameters (those defined in terms of others) are evaluated in
declaration order against the fitted values, the fork baseline, and any
new_params integer baselines, using a small arithmetic-only expression
language. The generated new_gas_proposal.md client-comparison section shows how
far the worst client sits from the rest of the field. The
evm-gasfit gas-params docs
cover the selection and derivation rules in full.
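To illustrate what evaluating derived parameters in declaration order means, here is a small arithmetic-only evaluator. It is a sketch only: the expression syntax (Python-style `+ - * /`), the precedence of fitted values over new_params over the fork baseline, and the parameter names are all assumptions, and the real expression language in evm-gasfit may differ.

```python
import ast
import operator

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node, env):
    """Evaluate numbers, names and + - * / only; reject everything else."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, env), _eval(node.right, env))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, env)
    raise ValueError("unsupported expression")


def evaluate_derived(derived, fitted, fork_baseline, new_params):
    """derived: list of (name, expression) pairs, evaluated in declaration order."""
    env = {**fork_baseline, **new_params, **fitted}  # assumed precedence
    for name, expr in derived:
        # Later entries may refer to earlier derived names.
        env[name] = _eval(ast.parse(expr, mode="eval").body, env)
    return env


result = evaluate_derived(
    derived=[("D1", "X + Y * Z"), ("D2", "D1 * 2")],  # hypothetical parameters
    fitted={"X": 3},
    fork_baseline={"X": 5, "Y": 2},
    new_params={"Z": 4},
)
print(result["D1"], result["D2"])
```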
Throughput loss of repricing
At 100 Mgas/s the fair price for an op is its exact fractional gas — the cost at which a block of that op runs at exactly the anchor's Mgas/s. The Throughput-loss page compares three ways to price each op against that ideal: keep today's integer cost (no reprice), round the fair price up to an integer ≥ 1 (reprice + round), or charge the fair price directly (reprice + fractional, zero loss). Per fitted parameter:
exact = anchor_rate · runtime_ms / 1000
loss(no reprice) = max(current − exact, 0)
loss(round) = ⌈exact⌉ − exact
The analysis focuses on the operations repricing makes cheaper. Throughput is
wasted only by overcharging: an op whose fair price is below today's cost
(exact < current) is overpriced, so the gas limit binds before the real-time
budget and capacity sits idle. An op that repricing would make more expensive
(underpriced today, exact > current) is undercharged — it isn't wasting
throughput — so its no-reprice loss is floored at zero rather than counted as a
negative offset. The no-reprice metric is therefore a pure sum of the wasted throughput on the
cost-decrease ops, with no undercharge netting against it.
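The three scenarios can be computed side by side for a single parameter. The function name and the sample runtimes are assumptions for illustration; the formulas are the ones given above.

```python
import math


def losses(runtime_ms, current_gas, anchor_rate=100_000_000):
    """Per-execution gas lost under each pricing scenario for one fitted parameter."""
    exact = anchor_rate * runtime_ms / 1000
    return {
        "exact": exact,
        # Floored at zero: an underpriced op wastes no throughput.
        "no_reprice": max(current_gas - exact, 0.0),
        "round": math.ceil(exact) - exact,
        "fractional": 0.0,
    }


overpriced = losses(runtime_ms=1.25e-5, current_gas=3)  # fair price about 1.25 gas
underpriced = losses(runtime_ms=5e-4, current_gas=10)   # fair price about 50 gas
print(overpriced)
print(underpriced["no_reprice"])  # 0.0, not a negative offset
```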
Opcode-level loss is the unweighted per-op loss / charged — the share
of the gas you pay that is waste. Mainnet-traffic loss sums each per-execution
op's loss weighted by how often it runs on mainnet, divided by G = total block
gas_used over the same window — all gas the chain burns, including the 21k intrinsic
and calldata, so it reads as a share of real block gas. Opcode counts come from Xatu's CBT
fct_opcode_gas_by_opcode_hourly table and the denominator from
fct_execution_gas_used_hourly (make composition → committed
data/tx_composition*.csv + data/tx_block_gas_hourly.csv). Per-word /
per-round coefficients are shown per-op but excluded from the weighted aggregate (their weight
depends on data sizes not modelled here). Rounding discards roughly half a gas per cheap-op
execution at any anchor, so its loss is nearly anchor-flat; the no-reprice loss shrinks as the
anchor rises (today's fixed cost becomes a smaller overcharge) and an op drops out of the loss
once the anchor pushes it underpriced.
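A sketch of the two aggregates, assuming per-op losses have already been computed and that mainnet counts and total gas used cover the same window. The function names and all numbers are invented for illustration; nothing here queries Xatu.

```python
def opcode_level_loss(loss, charged):
    """Share of the gas charged for one op that is waste."""
    return loss / charged


def mainnet_traffic_loss(per_op_loss, mainnet_counts, total_block_gas_used):
    """Count-weighted loss as a share of all block gas used in the window."""
    wasted = sum(
        loss * mainnet_counts.get(op, 0) for op, loss in per_op_loss.items()
    )
    return wasted / total_block_gas_used


per_op_loss = {"ADD": 1.75, "MUL": 2.5}  # gas lost per execution (hypothetical)
counts = {"ADD": 1000, "MUL": 200}       # executions in the window (hypothetical)
print(opcode_level_loss(1.75, 3))
print(mainnet_traffic_loss(per_op_loss, counts, total_block_gas_used=1_000_000))
```

The denominator is total block gas rather than the gas charged for the listed ops, so the second figure is typically much smaller than any single opcode-level share.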
Mapping gas parameters to loss — static vs. dynamic costs
Loss is computed per fitted gas coefficient, not per opcode. An op whose cost has
a static and a size-dependent part is fit as two separate parameters and appears as two rows. For
example KECCAK256 splits into OPCODE_KECCAK256_BASE (the per-execution cost) and
OPCODE_KECCAK256_PER_WORD (the marginal cost per 32-byte word); the copy opcodes and
the hash precompiles split the same way. Each coefficient gets its own modelled
runtime_ms (hence its own exact) and is compared against the matching
current cost parsed from the proposal table — base against today's base gas, per-word against
today's per-word gas.
The two metrics treat the dynamic coefficients differently:
- Opcode-level loss includes every coefficient, both base and per-word/round/point, since loss / charged is well-defined for each without any external weight.
- Mainnet-traffic loss weights each coefficient by how often it runs, and Xatu gives us execution counts but not operand sizes. A per-word cost would need the distribution of words hashed or copied per call, which we don't model, so only the base (static, per-execution) coefficients are weighted in, each by its opcode's execution count. The per-word / per-round / per-point coefficients (KECCAK256, SHA256, RIPEMD160, IDENTITY, BLAKE2F, ECPAIRING) are excluded from the weighted aggregate. The traffic number therefore captures base-cost rounding only and understates the total for size-heavy ops; the per-op table still shows their loss individually.
PUSH/DUP/SWAP variants are normalized to one parameter (PUSH2→PUSH) and
their mainnet counts summed, so the single base coefficient carries the combined frequency.
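The normalization can be sketched with a regular expression and a counter. Leaving PUSH0 as its own opcode is an assumption of this example (the text only gives PUSH2→PUSH), and the counts are invented.

```python
import re
from collections import Counter

# Matches PUSH1..PUSH32, DUP1..DUP16, SWAP1..SWAP16 style names; PUSH0 is left alone (assumption).
_VARIANT = re.compile(r"^(PUSH|DUP|SWAP)[1-9]\d*$")


def normalize(opcode):
    """Map a numbered variant to its family name, e.g. PUSH2 -> PUSH."""
    match = _VARIANT.match(opcode)
    return match.group(1) if match else opcode


def combine_counts(counts):
    """Sum mainnet execution counts per normalized parameter."""
    combined = Counter()
    for opcode, n in counts.items():
        combined[normalize(opcode)] += n
    return dict(combined)


print(combine_counts({"PUSH1": 10, "PUSH2": 5, "DUP3": 2, "ADD": 7, "PUSH0": 1}))
```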
Ops that need a price increase are held at zero loss. When an op is
underpriced today — its fair cost at the anchor exceeds the current gas
(exact > current, e.g. EXP, whose modelled cost is far above today's
base) — repricing would raise its cost, not cut it. A block heavy in that op runs
slower than the anchor implies; it is undercharged, not wasting throughput. That is the
underpricing that the bottleneck analysis and a repricing increase exist
to fix. Because this page focuses on the cost-decrease ops, the no-reprice loss is floored
at zero for these (max(current − exact, 0)) rather than counted as a negative
undercharge. So no undercharge offsets the overcharges, and the no-reprice headline is a pure sum
of wasted throughput over the ops repricing makes cheaper — not a signed net. The round and
fractional scenarios still price the underpriced ops correctly (⌈exact⌉ ≥ exact raises
the cost). Coefficients with a zero or unparseable current cost are dropped from the no-reprice
metric only.
Scope & coverage
This analysis covers the EVM's compute operations: arithmetic, bitwise,
comparison, stack, control-flow, hashing (KECCAK256), memory, block/transaction and call
context, and every precompile (including BLS12-381). The genuinely state- and IO-bound
families — storage (SLOAD/SSTORE), account access, transient storage, CREATE, and BLOCKHASH
— are deliberately excluded, as their runtime is dominated by state access rather than
compute. The op set consists of the corresponding evm-gasfit presets, declared in
fit.yaml.
A few coverage gaps are worth noting: the per-word coefficients for CALLDATACOPY, RETURNDATACOPY and MCOPY aren't fit because the benchmark fixtures don't vary copy size (only their base cost is priced), the BLS12-381 G1/G2 multi-scalar-multiplication precompiles had no matching fixtures in this run, and the MODEXP precompile has no preset upstream. These params simply don't appear in the table.
Gas-cost baseline
Proposed values are compared against the osaka
gas-cost table — distinct from the amsterdam fork the benchmarks ran on. The
baseline is the osaka GasCosts patched with any config
overrides and new_params baselines, and it is the cost reference each proposed
value diffs against.
Reproducibility
Re-run the full pipeline end to end:
make fetch # → data/raw/ (needs a Benchmarkoor token in secrets.json)
make gasfit # → data/gasfit/
make bottlenecks # → data/bottlenecks.csv (worst-case Mgas/s per op)
make composition # → data/tx_composition*.csv + tx_block_gas_hourly.csv (Xatu; needs xatu creds)
make fracgas # → data/fracgas*.csv (repricing throughput loss at the anchor)
make site # renders this site into docs/
The committed data/ directory makes the published site self-contained:
every figure and table on these pages traces back to those artifacts.