# YAML Reference

> Every field in competition.yaml for a BenchGen custom environment bundle.

`competition.yaml` is the entry point for every custom environment bundle. BenchGen reads this file to understand the environment's structure, which data and programs to use, and how to display results.

***

## Top-level fields

```yaml
title: My Evaluation Environment
description: A short description shown in the Environments Hub.
image: logo.png
terms: terms.md
pages:
  - title: Overview
    file: overview.md
  - title: Evaluation
    file: evaluation.md
phases:
  - ...
tasks:
  - ...
leaderboard:
  - ...
```

| Field         | Required | Description                                                          |
| ------------- | -------- | -------------------------------------------------------------------- |
| `title`       | Yes      | Display name shown in the hub and run UI                             |
| `description` | No       | One-line summary shown on the environment card                       |
| `image`       | No       | Path to a logo image inside the bundle                               |
| `terms`       | No       | Path to a Markdown file with terms of use                            |
| `pages`       | No       | Additional Markdown pages rendered as environment documentation tabs |

The remaining top-level keys (`phases`, `tasks`, `solutions`, and `leaderboard`) each have their own section below.

***

## `pages`

Optional documentation tabs displayed alongside the environment. Each entry needs a `title` and a `file` path relative to the bundle root.

```yaml
pages:
  - title: Overview
    file: overview.md
  - title: Data format
    file: data.md
```

***

## `phases`

Phases define the active evaluation windows. Most environments have a single phase. Each phase references one or more tasks by their `index`.

```yaml
phases:
  - index: 0
    name: Evaluation
    description: Main evaluation phase
    start: 2025-01-01
    end: 2027-12-31
    tasks:
      - 0
```

| Field         | Required | Description                                        |
| ------------- | -------- | -------------------------------------------------- |
| `index`       | Yes      | Zero-based integer identifier for this phase       |
| `name`        | Yes      | Display name                                       |
| `description` | No       | Short description of the phase                     |
| `start`       | No       | ISO 8601 date when the phase opens (`YYYY-MM-DD`)  |
| `end`         | No       | ISO 8601 date when the phase closes (`YYYY-MM-DD`) |
| `tasks`       | Yes      | List of task indices active in this phase          |
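
Because phases point at tasks by `index`, a dangling reference is an easy mistake to make. The sketch below is a minimal pre-upload check, not BenchGen's own validation. It assumes the YAML has already been parsed into plain Python dictionaries, and the names `check_phases` and `_as_date` are hypothetical. The check that `start` does not come after `end` is an assumption about what a sensible phase looks like, not a documented rule.

```python
from datetime import date


def check_phases(config):
    """Return a list of problems found in the `phases` section.

    `config` is assumed to be competition.yaml parsed into a dict.
    """
    problems = []
    known_tasks = {task["index"] for task in config.get("tasks", [])}

    for phase in config.get("phases", []):
        label = f"phase {phase.get('index')}"

        # Every task index listed by the phase should exist under `tasks`.
        for task_index in phase.get("tasks", []):
            if task_index not in known_tasks:
                problems.append(f"{label}: unknown task index {task_index}")

        # `start` and `end` are optional; compare them only when both are set.
        start, end = phase.get("start"), phase.get("end")
        if start is not None and end is not None:
            if _as_date(start) > _as_date(end):
                problems.append(f"{label}: start is after end")

    return problems


def _as_date(value):
    # YAML parsers may yield date objects or leave the value as a string.
    return value if isinstance(value, date) else date.fromisoformat(str(value))


# Illustrative config: the phase references task 1, which is not defined.
config = {
    "phases": [
        {"index": 0, "name": "Evaluation", "start": "2025-01-01",
         "end": "2027-12-31", "tasks": [0, 1]},
    ],
    "tasks": [{"index": 0, "name": "Main task"}],
}
print(check_phases(config))
```

An empty list means every phase refers only to tasks that exist.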

***

## `tasks`

Each task defines a single evaluation problem: the data and programs needed to score one type of submission.

```yaml
tasks:
  - index: 0
    name: Main task
    description: Evaluate model accuracy on the test set
    scoring_program: scoring_program.zip
    reference_data: reference_data.zip
    ingestion_program: ingestion_program.zip   # optional
    input_data: input_data.zip                 # optional
```

| Field               | Required | Description                                         |
| ------------------- | -------- | --------------------------------------------------- |
| `index`             | Yes      | Zero-based integer identifier, referenced by phases |
| `name`              | Yes      | Display name                                        |
| `description`       | No       | What the task is evaluating                         |
| `scoring_program`   | Yes      | Path to the scoring program zip inside the bundle   |
| `reference_data`    | Yes      | Path to the reference data zip                      |
| `ingestion_program` | No       | Path to the ingestion program zip                   |
| `input_data`        | No       | Path to the input data zip                          |
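
Each of these fields is a path inside the bundle, so one way to catch typos before uploading is to compare them with the archive's contents. This is a minimal sketch, assuming the bundle is a `.zip` whose entries sit at exactly the paths given in `competition.yaml`. `missing_task_files` is a hypothetical helper, and the in-memory archive only stands in for a real bundle.

```python
import io
import zipfile

# Task fields that hold a path to a file inside the bundle.
PATH_FIELDS = ("scoring_program", "reference_data", "ingestion_program", "input_data")
REQUIRED_FIELDS = ("scoring_program", "reference_data")


def missing_task_files(task, bundle):
    """Return (field, reason) pairs for paths that are unset or not in the bundle."""
    names = set(bundle.namelist())
    problems = []
    for field in PATH_FIELDS:
        path = task.get(field)
        if path is None:
            # Optional fields may be left out; required ones may not.
            if field in REQUIRED_FIELDS:
                problems.append((field, "required field is not set"))
        elif path not in names:
            problems.append((field, f"{path} not found in bundle"))
    return problems


# Build a small in-memory archive that stands in for a real bundle.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("competition.yaml", "title: Demo\n")
    archive.writestr("scoring_program.zip", b"")

task = {
    "index": 0,
    "name": "Main task",
    "scoring_program": "scoring_program.zip",
    "reference_data": "reference_data.zip",
}

with zipfile.ZipFile(buffer) as bundle:
    print(missing_task_files(task, bundle))
```

Here the check reports `reference_data`, because the stand-in archive contains only the scoring program.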

***

## `solutions`

Optional reference solutions included with the bundle. BenchGen uses these to verify that the scoring program works correctly before the environment is published.

```yaml
solutions:
  - index: 0
    path: example_solution.zip
    tasks:
      - 0
```

| Field   | Required | Description                                |
| ------- | -------- | ------------------------------------------ |
| `index` | Yes      | Zero-based identifier                      |
| `path`  | Yes      | Path to the solution zip inside the bundle |
| `tasks` | Yes      | Task indices this solution applies to      |

***

## `leaderboard`

Defines the columns displayed in the results table. Each column `key` must match a key that your scoring program writes to `scores.json`.

```yaml
leaderboard:
  - title: Results
    key: main
    columns:
      - title: Accuracy
        key: accuracy
        index: 0
        sorting: desc
      - title: F1
        key: f1
        index: 1
        sorting: desc
```
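
For the configuration above, the scoring program would need to emit `accuracy` and `f1`. The sketch below shows that last step under stated assumptions: `scores.json` is taken to be a flat JSON object mapping metric names to numbers, written into an output directory handed to the program. The directory handling, the `write_scores` name, and the metric values are all illustrative, so check the scoring program documentation for the actual contract.

```python
import json
import tempfile
from pathlib import Path

# Column keys copied from the `leaderboard` section of competition.yaml.
COLUMN_KEYS = ["accuracy", "f1"]


def write_scores(scores, output_dir):
    """Write scores.json after checking it covers every leaderboard column key."""
    missing = [key for key in COLUMN_KEYS if key not in scores]
    if missing:
        raise KeyError(f"scores missing leaderboard keys: {missing}")
    path = Path(output_dir) / "scores.json"
    path.write_text(json.dumps(scores, indent=2))
    return path


# Placeholder metric values; a real scoring program would compute these
# by comparing the submission against the reference data.
with tempfile.TemporaryDirectory() as output_dir:
    written = write_scores({"accuracy": 0.91, "f1": 0.88}, output_dir)
    print(written.read_text())
```

Failing early on a missing key is a design choice for this sketch: it surfaces a mismatch in the scoring program itself instead of leaving it to show up later in the results table.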

### Leaderboard group fields

| Field     | Required | Description                                  |
| --------- | -------- | -------------------------------------------- |
| `title`   | Yes      | Section heading in the results table         |
| `key`     | Yes      | Unique identifier for this leaderboard group |
| `columns` | Yes      | List of column definitions (see below)       |

### Column fields

| Field     | Required | Description                                                              |
| --------- | -------- | ------------------------------------------------------------------------ |
| `title`   | Yes      | Column heading                                                           |
| `key`     | Yes      | Must match the key in `scores.json` output by your scoring program       |
| `index`   | Yes      | Display order (zero-based)                                               |
| `sorting` | No       | `asc` or `desc` — direction used to rank submissions. Defaults to `desc` |
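
To make the `sorting` direction concrete, the sketch below ranks a few made-up submissions by a single column. It only illustrates what `asc` and `desc` mean for ordering. How BenchGen breaks ties or combines several columns is not covered here, and `rank_submissions` is a hypothetical helper.

```python
def rank_submissions(submissions, column):
    """Order submissions best-first according to a column definition.

    `desc` puts the highest score first (for example accuracy);
    `asc` puts the lowest first (for example an error rate).
    """
    direction = column.get("sorting", "desc")  # documented default
    if direction not in ("asc", "desc"):
        raise ValueError(f"sorting must be 'asc' or 'desc', got {direction!r}")
    return sorted(
        submissions,
        key=lambda entry: entry["scores"][column["key"]],
        reverse=(direction == "desc"),
    )


# Made-up submissions, each carrying the contents of its scores.json.
submissions = [
    {"name": "model-a", "scores": {"accuracy": 0.82, "error": 0.18}},
    {"name": "model-b", "scores": {"accuracy": 0.91, "error": 0.09}},
    {"name": "model-c", "scores": {"accuracy": 0.77, "error": 0.23}},
]

by_accuracy = rank_submissions(submissions, {"key": "accuracy"})
by_error = rank_submissions(submissions, {"key": "error", "sorting": "asc"})
print([entry["name"] for entry in by_accuracy])
print([entry["name"] for entry in by_error])
```

Both rankings put `model-b` first: it has the highest accuracy under the default `desc` and the lowest error under `asc`.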

***

## Complete example

```yaml
title: Text Classification Benchmark
description: Evaluates model accuracy on a multi-class text classification task.
image: logo.png

pages:
  - title: Overview
    file: overview.md
  - title: Data format
    file: data.md

phases:
  - index: 0
    name: Evaluation
    start: 2025-01-01
    end: 2027-12-31
    tasks:
      - 0

tasks:
  - index: 0
    name: Classification
    description: Predict the correct category for each input text
    scoring_program: scoring_program.zip
    reference_data: reference_data.zip
    input_data: input_data.zip
    ingestion_program: ingestion_program.zip

solutions:
  - index: 0
    path: example_solution.zip
    tasks:
      - 0

leaderboard:
  - title: Results
    key: main
    columns:
      - title: Accuracy
        key: accuracy
        index: 0
        sorting: desc
      - title: F1
        key: f1
        index: 1
        sorting: desc
```

***

## Next steps

* [Bundle structure](/eval/bundle-structure) — the files inside the `.zip` and what each one does
* [Create a custom environment](/eval/create-environment) — end-to-end upload walkthrough
