Reading the outputs

A run of evm-gasfit writes its artifacts under the directory you pass to write_reports(out_dir) (CLI: --out). This page is a file-by-file tour of what's there and how to read it.

out_dir/
├── results.csv
├── new_gas_all_params.csv
├── new_gas.csv
├── runtime_estimation_autogenerated_report.md
├── new_gas_proposal.md
├── meta.json
├── figs/                                       # only if output.plots: true
├── glue_results.csv                            # only if glue_adjustment.enabled
├── glue_opcodes_by_test.csv                    # only if glue_adjustment.enabled
└── glue_opcodes_autogenerated_report.md        # only if glue_adjustment.enabled
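As a quick sanity check after a run, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of evm-gasfit: the `check_outputs` helper and its `plots` / `glue` arguments are hypothetical names, and only the file names come from the tree above.

```python
from pathlib import Path

# File names are taken from the tree above; the helper itself is hypothetical.
ALWAYS = [
    "results.csv",
    "new_gas_all_params.csv",
    "new_gas.csv",
    "runtime_estimation_autogenerated_report.md",
    "new_gas_proposal.md",
    "meta.json",
]
GLUE_ONLY = [
    "glue_results.csv",
    "glue_opcodes_by_test.csv",
    "glue_opcodes_autogenerated_report.md",
]


def check_outputs(out_dir, plots=False, glue=False):
    """Return the expected artifact names that are missing from out_dir."""
    root = Path(out_dir)
    expected = list(ALWAYS)
    if glue:  # mirrors glue_adjustment.enabled
        expected += GLUE_ONLY
    missing = [name for name in expected if not (root / name).is_file()]
    if plots and not (root / "figs").is_dir():  # mirrors output.plots: true
        missing.append("figs/")
    return missing
```

An empty return value means every artifact expected for that configuration is present.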

CSV artifacts

`results.csv`

One row per successful NNLS fit — i.e. per (spec, model_by-combo, client). Columns include:

| Group | Columns |
| --- | --- |
| Keys | `test_name`, `target_opcode`, `client_name`, one column per `model_by` axis. |
| Fit | `nobs`, `rsquared`, `rsquared_adj`. |
| Intercept | `intercept_runtime_ms`, `intercept_pvalue`. |
| Target | `target_coef_runtime_ms`, `target_coef_pvalue`, `target_coef_conf_int_low`, `target_coef_conf_int_high`. |
| Per extra | `<extra>_runtime_ms`, `<extra>_pvalue`, `<extra>_conf_int_low`, `<extra>_conf_int_high` (for each surviving `model_params` entry beyond `target_coef`). |

Skipped fits (insufficient data, constant opcount, solver failure) do not produce a row — only the log carries the warning. See NNLS modeling for skip reasons.
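To scan results.csv for questionable fits, something like the following may help. It is a sketch under stated assumptions: the column names come from the table above, but the sample rows are made up and the `min_r2` / `max_pvalue` cut-offs are illustrative values, not the thresholds the tool applies.

```python
import csv
import io

# Made-up sample in the documented column layout (values are illustrative only).
SAMPLE = """test_name,target_opcode,client_name,nobs,rsquared,target_coef_runtime_ms,target_coef_pvalue
test_add,ADD,client_a,40,0.98,0.0021,0.0001
test_add,ADD,client_b,40,0.41,0.0035,0.2000
"""


def weak_fits(csv_file, min_r2=0.8, max_pvalue=0.05):
    """Yield (test, opcode, client) keys of fits failing either assumed threshold."""
    for row in csv.DictReader(csv_file):
        r2 = float(row["rsquared"])
        pvalue = float(row["target_coef_pvalue"])
        if r2 < min_r2 or pvalue > max_pvalue:
            yield row["test_name"], row["target_opcode"], row["client_name"]


# With a real run, pass open("out_dir/results.csv", newline="") instead.
flagged = list(weak_fits(io.StringIO(SAMPLE)))
```

Only fits that produced a row can be flagged this way; skipped fits have to be found in the log.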

`new_gas_all_params.csv`

The aggregator's per-(gas_param, client, candidate) expansion of results.csv. It has one row for every model_params entry of every fit that could contribute to a gas param, with winning and losing candidates together. Each row carries gas_param and client_name; the source test_name, target_opcode, model_coef_name, and model_by-combo; and the source_label of the producing model spec (presets[<name>] or models.custom[<i>]). The source_label disambiguates candidates that are otherwise identical, e.g. two specs differing only in filter_by. The remaining columns are the runtime with its CI and p-value, the glue_adjustment applied (zero when no glue row matched), new_gas_decimal (raw conversion), new_gas_rounded (ceil-rounded), and two booleans:

is_winner — set on the row picked by the per-client worst-case selector.
poor_fit — set on every candidate that failed the p-value / R² thresholds, not just winners. A row that is both is_winner and poor_fit is a fallback winner: the whole (gas_param, client) group had no qualifying row and the selector fell back to the best-effort pick.
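The four combinations of the two flags can be spelled out as a small classifier. This is a sketch only: how the booleans are serialized in the CSV is an assumption (`"True"` / `"False"` or `"1"` / `"0"`), and the `classify` and `as_bool` helpers are hypothetical names.

```python
def as_bool(text):
    # Assumption: booleans appear as "True"/"False" (any case) or "1"/"0".
    return text.strip().lower() in {"true", "1"}


def classify(row):
    """Label one new_gas_all_params.csv row from its two boolean flags."""
    winner = as_bool(row["is_winner"])
    poor = as_bool(row["poor_fit"])
    if winner and poor:
        # No row in the (gas_param, client) group qualified; best-effort pick.
        return "fallback winner"
    if winner:
        return "winner"
    if poor:
        return "weak candidate"
    return "losing candidate"
```

Rows labelled "fallback winner" are the ones worth checking first, since the selected value rests on a fit that failed the quality thresholds.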

`new_gas.csv`

The final proposal: one row per gas param, taken as the across-client worst case of the per-client winners. Columns are gas_param, client_name (the worst-case client), runtime_ms, CI bounds, the contributing selected_test / selected_opcode / selected_model_coef_name, glue_adjustment, new_gas_decimal, new_gas_rounded, and model_by-combo columns. This is what's diffed against the patched fork baseline in the report.
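The across-client step can be pictured roughly as below. This is a sketch, not the tool's selector: it assumes "worst case" means the largest new_gas_decimal and that ceil-rounding is to the next integer; the `worst_case` function, the parameter names, and the sample values are all hypothetical.

```python
import math

# Hypothetical per-client winners (one row per gas param and client).
winners = [
    {"gas_param": "GAS_X", "client_name": "client_a", "new_gas_decimal": 2.3},
    {"gas_param": "GAS_X", "client_name": "client_b", "new_gas_decimal": 4.1},
    {"gas_param": "GAS_Y", "client_name": "client_a", "new_gas_decimal": 3.0},
]


def worst_case(rows):
    """Keep, per gas param, the row with the largest new_gas_decimal."""
    worst = {}
    for row in rows:
        current = worst.get(row["gas_param"])
        if current is None or row["new_gas_decimal"] > current["new_gas_decimal"]:
            worst[row["gas_param"]] = row
    # Assumed rounding: ceil to the next integer gas unit.
    return {
        param: {**row, "new_gas_rounded": math.ceil(row["new_gas_decimal"])}
        for param, row in worst.items()
    }


proposal = worst_case(winners)
```

In this sample, client_b determines GAS_X, so its name is what would appear in the client_name column of that row.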

`meta.json`

Run metadata: package version, fork name, anchor rate, config hash, and timestamps. Useful as an audit pointer when comparing runs.
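When comparing two runs, diffing their metadata is a cheap first step. The sketch below assumes nothing about the real schema: the key names (`fork`, `anchor_rate`, `config_hash`) and values are placeholders for whatever meta.json actually contains, and `meta_diff` is a hypothetical helper.

```python
import json


def meta_diff(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Placeholder documents; with real runs, use json.load(open(path)) instead.
run_a = json.loads('{"fork": "example_fork", "anchor_rate": 100.0, "config_hash": "abc"}')
run_b = json.loads('{"fork": "example_fork", "anchor_rate": 100.0, "config_hash": "def"}')

changed = meta_diff(run_a, run_b)
```

Timestamps will always differ between runs, so in practice they are best dropped from the comparison before reading the result.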

Markdown reports

`runtime_estimation_autogenerated_report.md`

Per-spec NNLS summary. For each (test, target, model_by-combo):

Per-client coefficient table with point estimate, CI, p-value.
R² and nobs.
Diagnostic plots inlined when output.plots: true.

This is the place to inspect fit quality — the proposal report only surfaces poor-fit selections, not every poor fit.

`new_gas_proposal.md`

The headline artifact. Sections, in order:

1. Run metadata: timestamp, fork, anchor rate.
2. Summary: single line counting proposed / increased / decreased / new / unresolved params and warning counts.
3. Contents: TOC with anchor links.
4. Proposed gas parameters: the diff table for fitted rows (gas_param, current_gas, proposed_gas, diff, pct), with <no fit> rows for unresolved params kept in a separate block.
5. Client comparison: for each fitted param, the worst-client and second-worst-client values plus worst / second-worst ratio. Large ratios (≳ 2×) flag the worst client as a likely outlier. The per-client overview is rendered either as:
   - A log2(proposed / current) heatmap when output.plots: true (red = more expensive than current, green = cheaper, blank rows = new_params declared without a baseline). The cell the per-client selector picked for each client is outlined.
   - A markdown table fallback when plots are off; the winning cell is bolded.
6. Worst-case provenance: one collapsible <details> block per gas param, showing every (test_name, target_opcode, model_coef_name, model_by) contender × every client. This is where you go when a proposed value looks surprising and you want to see what the other candidates said.
7. Warnings, in four subsections:
   - Missing parameters: proposed names that produced no value.
   - Incomplete client coverage: proposed for some clients, absent on others.
   - Missing glue adjustments: glue contributions skipped due to fit-quality gates (see Glue adjustment).
   - Other: anything else.
8. Poor-fit selections: the same poor-fit winners the new_gas.csv row flags, plus a separate Other weak candidates subsection listing losing candidates that also failed the thresholds. Use this to decide whether the worst-case pick is solid or whether you should split / refit the spec.
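The two numbers behind the Client comparison section are simple to reproduce by hand. The following is a sketch with hypothetical helper names (`client_ratio`, `heat`); the 2× cut-off mirrors the rule of thumb above and is not a configured threshold, and at least two clients are assumed.

```python
import math


def client_ratio(values, outlier_ratio=2.0):
    """values maps client name to proposed gas; needs at least two clients.

    Returns (worst_client, second_worst_client, ratio, looks_like_outlier).
    """
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    (worst, worst_value), (second, second_value) = ranked[0], ranked[1]
    ratio = worst_value / second_value
    return worst, second, ratio, ratio >= outlier_ratio


def heat(proposed, current):
    """Heatmap cell value: positive means more expensive than current."""
    return math.log2(proposed / current)
```

For example, `client_ratio({"a": 10, "b": 4, "c": 3})` ranks client a first with a ratio of 2.5, which would mark it as a likely outlier. `heat(8, 4)` is 1.0 (a doubling) and `heat(2, 4)` is -1.0 (a halving), which is why the log2 scale reads symmetrically around zero.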

`glue_opcodes_autogenerated_report.md` (glue runs only)

Per-glue-opcode summary: tier, driver fixture, per-client coefficient with quality stats, and plots when enabled. See Glue adjustment for what the tiers mean.

Figures (`figs/`)

When output.plots: true:

figs/runtime/ — per-fit scatter plus regression line, one PNG per (spec, model_by-combo, client).
figs/glue/ — per-priced-glue diagnostic plots (glue runs only).
The proposal report's heatmaps are inlined directly as base64 data URIs so the .md is self-contained.

Turning plots off (output.plots: false) skips figs/ entirely and falls back to markdown tables in the report.
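For readers unfamiliar with inlined images, this is roughly how a figure ends up inside a markdown file as a base64 data URI. It is a generic sketch using the standard library, not the report generator's code; the PNG MIME type and the helper names are assumptions.

```python
import base64


def png_data_uri(png_bytes):
    """Encode raw PNG bytes as a data URI (MIME type assumed to be image/png)."""
    payload = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{payload}"


def markdown_image(alt_text, png_bytes):
    """Build a markdown image tag whose source is the inlined data URI."""
    return f"![{alt_text}]({png_data_uri(png_bytes)})"
```

The trade-off is file size: base64 adds about a third to the image bytes, in exchange for a report that can be moved or attached without its figs/ directory.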
