
# Add a Dataset

> Register a training dataset by importing it from HuggingFace or uploading your own file.

Every training run needs a dataset. Adding one is a quick two-part flow: give the dataset a name, then choose where its data comes from. You can import a public dataset from HuggingFace or upload your own file.

***

## Two ways to add a dataset

| Source               | Use when                                | What you provide                                       |
| -------------------- | --------------------------------------- | ------------------------------------------------------ |
| **From HuggingFace** | You want a public dataset from the Hub. | A search term, then pick the dataset from the results. |
| **Upload File**      | You have your own data.                 | Your dataset file.                                     |

<Info>
  Datasets exported from an Eval benchmark run show up automatically under the **Fine-tune** filter, so you don't need to add those by hand. See [Export datasets to Train](/eval/export-datasets).
</Info>

***

## Steps

### 1. Open the Datasets page and click Add Dataset

In the **Train** tab, click **Datasets** in the left sidebar. The **AI Datasets** page lists your datasets, filterable by **Public Library**, **My Datasets**, and **Fine-tune**. Click **+ Add Dataset** in the top right.

<img src="https://mintcdn.com/benchgen-8fc81371/FddC5uLEIMRz8cT0/images/train/dataset/01-datasets-list.jpg?fit=max&auto=format&n=FddC5uLEIMRz8cT0&q=85&s=a392d3e2c6c341c567b8e1a9f016f060" alt="The AI Datasets page with the Add Dataset button" width="1478" height="941" data-path="images/train/dataset/01-datasets-list.jpg" />

### 2. Enter the basic details

The **Add Dataset** panel slides in. Give the dataset a **name** (for example `my-math-dataset`) and, optionally, a short **description**. You can edit the description later.

Click **Add Dataset** to continue. You'll choose where the data comes from in the next step.

<img src="https://mintcdn.com/benchgen-8fc81371/FddC5uLEIMRz8cT0/images/train/dataset/02-basic-details.jpg?fit=max&auto=format&n=FddC5uLEIMRz8cT0&q=85&s=4f3153d9f01885c92674bb5d5776aff7" alt="The Add Dataset panel with the dataset name and description fields" width="1478" height="941" data-path="images/train/dataset/02-basic-details.jpg" />

<Tip>
  Use a name you'll recognize later when selecting a dataset for a training run. Avoid throwaway names like `test1`.
</Tip>

### 3. Choose where the data comes from

The dataset is created in a **Draft** state, and its card opens. The **Add Dataset** card prompts you to choose a source: pick either the **From HuggingFace** tab or the **Upload File** tab.

#### Option A — Import from HuggingFace

On the **From HuggingFace** tab, type a dataset name into **Search HuggingFace datasets**. Matching datasets appear with their download count, language, and license tags. Click the one you want.

<img src="https://mintcdn.com/benchgen-8fc81371/FddC5uLEIMRz8cT0/images/train/dataset/03-source-huggingface.jpg?fit=max&auto=format&n=FddC5uLEIMRz8cT0&q=85&s=da6258cbac41bd88db6f5f92d2480b5c" alt="Searching HuggingFace for a dataset" width="1478" height="941" data-path="images/train/dataset/03-source-huggingface.jpg" />

The selected dataset shows as a chip, and the **Add dataset** button becomes active. Click **Add dataset** to import it.

<img src="https://mintcdn.com/benchgen-8fc81371/FddC5uLEIMRz8cT0/images/train/dataset/04-dataset-selected.jpg?fit=max&auto=format&n=FddC5uLEIMRz8cT0&q=85&s=9f0125d37fdcae7746015dd88e22d592" alt="A HuggingFace dataset selected with the Add dataset button enabled" width="1478" height="941" data-path="images/train/dataset/04-dataset-selected.jpg" />

#### Option B — Upload a file

On the **Upload File** tab, upload your own dataset file, then click **Add dataset**.
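This page doesn't list the accepted file formats, so treat the following as a sketch only. Assuming JSON Lines (one JSON object per line) is an accepted upload format, which you should confirm on the **Upload File** tab, a short Python script can produce such a file. The field names `prompt` and `completion` and the file name `my-math-dataset.jsonl` are hypothetical, not a BenchGen schema.

```python
import json
from pathlib import Path

# Hypothetical records; the field names are illustrative, not a BenchGen schema.
records = [
    {"prompt": "What is 2 + 2?", "completion": "4"},
    {"prompt": "What is 3 * 5?", "completion": "15"},
]


def write_jsonl(rows, path):
    """Write one JSON object per line and return the number of rows written."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows)


if __name__ == "__main__":
    count = write_jsonl(records, "my-math-dataset.jsonl")
    print(f"wrote {count} rows")
```

Writing one record per line keeps the file easy to inspect and append to. If the upload tab expects a different format, only the body of `write_jsonl` needs to change.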

### 4. Confirm the dataset is ready

BenchGen registers the dataset and fills in its card with details such as **Rows**, **Columns**, **Splits**, download size, and update dates. The status badge reflects the source (for example **HuggingFace**).

<img src="https://mintcdn.com/benchgen-8fc81371/FddC5uLEIMRz8cT0/images/train/dataset/05-dataset-ready.jpg?fit=max&auto=format&n=FddC5uLEIMRz8cT0&q=85&s=1ea40da9d5e16b1cccb94c8a4bf6eab2" alt="The dataset card after import, showing row, column, and split details" width="1478" height="941" data-path="images/train/dataset/05-dataset-ready.jpg" />
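For an uploaded file, it can help to know roughly what **Rows** and **Columns** should show before you check the card. The sketch below counts records and collects field names from JSON Lines text. It is a local sanity check only: how BenchGen computes these values isn't documented here, and both the JSON Lines format and the `summarize_jsonl` helper are assumptions made for illustration.

```python
import json


def summarize_jsonl(lines):
    """Return (row_count, sorted column names) for an iterable of JSONL lines."""
    rows = 0
    columns = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines rather than counting them as rows
        record = json.loads(line)
        rows += 1
        columns.update(record.keys())
    return rows, sorted(columns)


# Hypothetical sample data; the field names are illustrative only.
sample = [
    '{"prompt": "What is 2 + 2?", "completion": "4"}',
    '{"prompt": "What is 3 * 5?", "completion": "15"}',
]
rows, columns = summarize_jsonl(sample)
print(rows, columns)  # 2 ['completion', 'prompt']
```

To check a file on disk, pass an open file handle instead of a list, for example `with open(path, encoding="utf-8") as f: summarize_jsonl(f)`.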

Your dataset now appears in the **Datasets** list and is available to select when you configure a training run.

***

## Next Steps

* [Fine-tune a model](/train/fine-tune-a-model) using your dataset.
* [Export datasets from Eval](/eval/export-datasets) to turn benchmark failures into training data.
