# Read Results

> Understand the benchmark results report and what the metrics mean.

After a benchmark run completes, Eval generates a structured results report. This page explains what each section means and how to use it.

***

## Results Report Structure

### Summary metrics

| Metric          | What it means                                                                   |
| --------------- | ------------------------------------------------------------------------------- |
| **Accuracy**    | Percentage of test cases where the model's response matched the expected answer |
| **Avg latency** | Mean response time per question, in milliseconds                                |
| **Avg cost**    | Mean token cost per question (API models only)                                  |
| **Pass / Fail** | Count of passed and failed cases                                                |
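To make the definitions above concrete, here is a minimal sketch of how these summary metrics could be derived from per-case results. The `CaseResult` record and its field names are hypothetical, chosen for illustration; they are not Eval's actual export schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseResult:
    # Hypothetical per-case record; field names are assumptions.
    passed: bool
    latency_ms: float
    cost: Optional[float] = None  # None when no token cost applies (non-API models)


def summarize(results):
    """Compute the summary metrics from a list of CaseResult records."""
    if not results:
        raise ValueError("no results to summarize")
    passed = sum(1 for r in results if r.passed)
    # Average cost only over cases that actually report a cost.
    costs = [r.cost for r in results if r.cost is not None]
    return {
        "accuracy_pct": 100.0 * passed / len(results),
        "avg_latency_ms": sum(r.latency_ms for r in results) / len(results),
        "avg_cost": sum(costs) / len(costs) if costs else None,
        "passed": passed,
        "failed": len(results) - passed,
    }
```

Note that in this sketch the average cost ignores cases without a cost value rather than counting them as zero, which keeps the mean meaningful when a run mixes API and non-API models.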

### Per-question breakdown

Each test case shows:

* The input prompt
* The model's response
* The expected answer
* Pass / Fail status
* Latency and token usage

### Failure analysis

Eval groups failing cases by error pattern (wrong format, factual error, refusal, hallucination) to help you identify the most impactful issues to fix.
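The grouping idea can be sketched as a simple count of failing cases per error pattern, sorted so the most frequent pattern comes first. This assumes each case is a dictionary with hypothetical `passed` and `error_pattern` keys; how Eval assigns a pattern to a case is not shown here.

```python
from collections import Counter


def rank_failure_patterns(cases):
    """Return (pattern, count) pairs for failing cases, most frequent first."""
    # Only failing cases contribute; "error_pattern" is an assumed field name.
    counts = Counter(c["error_pattern"] for c in cases if not c["passed"])
    return counts.most_common()
```

Starting with the pattern at the top of this list is one reasonable way to prioritize fixes, since it affects the largest number of cases.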

***

## Comparing Runs

Select two or more runs from the run history to view a side-by-side diff. This is useful for measuring improvement after a fine-tune.
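If you want to reason about a two-run comparison outside the UI, the following minimal sketch shows one way to do it. It assumes each run is available as a mapping from a hypothetical case ID to its pass status; this is an illustrative format, not a documented Eval export.

```python
def diff_runs(baseline, candidate):
    """Compare two runs given as {case_id: passed} mappings."""
    # Only cases present in both runs can be compared.
    shared = baseline.keys() & candidate.keys()
    fixed = sorted(c for c in shared if not baseline[c] and candidate[c])
    regressed = sorted(c for c in shared if baseline[c] and not candidate[c])
    return {
        "fixed": fixed,          # failed before, pass now
        "regressed": regressed,  # passed before, fail now
        "unchanged": len(shared) - len(fixed) - len(regressed),
    }
```

Looking at regressions alongside fixes matters because a fine-tune can raise overall accuracy while still breaking cases that previously passed.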

***

## Next Steps

* [Export failing cases to Train](/eval/export-datasets)