---
title: Supervised LLM Fine-Tuning
type: templates
category: LLM Fine-tuning
cat: llm-fine-tuning
order: 903
is_new: t
meta_title: Create dataset for supervised LLM fine-tuning
meta_description: Template for creating dataset for supervised LLM fine-tuning with Label Studio for your machine learning and data science projects.
---
This template helps you get started with supervised LLM fine-tuning.
The goal of supervised LLM fine-tuning is to optimize your large language model (LLM) so that, given user-defined prompts, it generates responses that are more context-specific than those of the original foundation model.
There are multiple scenarios where you might want to use this approach:
- Document classification
- General Question-Answering (QA) systems
- Information Retrieval
- Customer Support services
Typically, you need this dataset collection template when you want to:
- train a large foundation model to follow your instructions;
- create any type of Natural Language Processing (NLP) model, such as sentiment analysis, text summarization, question answering, or named-entity recognition, tailored to your domain;
- fix the model's mistakes in its generated responses;
- generate responses in a specific style;
- switch from few-shot generation to zero-shot mode, with no need to provide any examples;
- reduce model generation costs.
All these tasks require a dataset of prompts and corresponding responses, or completions, for example:
```json
[
{
"prompt": "",
"response": ""
},
{
"prompt": "",
"response": ""
},
{
"prompt": "",
"response": ""
}
]
```
## How to collect the dataset
Start your project by collecting an initial set of prompts:
```json
[
{
"prompt": ""
},
{
"prompt": ""
},
{
"prompt": ""
}
]
```
Each JSON item is rendered as a separate task in Label Studio, where an annotator writes the response.
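If your prompts start out as plain strings, a small script can turn them into this task format. The following is a minimal sketch using only the Python standard library; the helper names `build_prompt_tasks` and `write_prompt_file` are hypothetical and not part of Label Studio, and the blank and duplicate filtering is an assumption about what you might want, not a requirement.

```python
import json


def build_prompt_tasks(prompts):
    """Turn raw prompt strings into task dicts with a single "prompt" key.

    Blank prompts and exact duplicates are skipped (an illustrative choice).
    """
    seen = set()
    tasks = []
    for text in prompts:
        cleaned = text.strip()
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        tasks.append({"prompt": cleaned})
    return tasks


def write_prompt_file(prompts, path="prompts.json"):
    """Write the tasks as a JSON list, one item per prompt."""
    tasks = build_prompt_tasks(prompts)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(tasks, f, ensure_ascii=False, indent=2)
    return tasks


if __name__ == "__main__":
    # Example prompts are placeholders; replace them with your own.
    write_prompt_file([
        "Summarize this support ticket.",
        "Classify this email as spam or not spam.",
    ])
```

The resulting file is a JSON list in which every item has only a `prompt` key, matching the structure shown above.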
## Starting your labeling project
*Need a hand getting started with Label Studio? Check out our [Zero to One Tutorial](https://labelstud.io/blog/zero-to-one-getting-started-with-label-studio/).*
1. Create a new project in Label Studio
2. Go to **Settings > Labeling Interface > Browse Templates > Generative AI > Supervised LLM Fine-tuning**
3. Save
Alternatively, you can create a new project by using our Python SDK:
```python
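import label_studio_sdk

# Connect to your Label Studio instance
ls = label_studio_sdk.Client('YOUR_LABEL_STUDIO_URL', 'YOUR_API_KEY')
# Replace '...' with the template's labeling configuration XML
project = ls.create_project(title='Supervised LLM Fine-Tuning', label_config='...')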
```
## Import the dataset
Using the Python SDK you can import the dataset with input prompts into Label Studio. With the `PROJECT_ID` of the project
you've just created, run the following code:
```python
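from label_studio_sdk import Client

# Fill in your Label Studio URL and API key
ls = Client(url='YOUR_LABEL_STUDIO_URL', api_key='YOUR_API_KEY')
# PROJECT_ID is the ID of the project you created earlier
project = ls.get_project(id=PROJECT_ID)
# Each JSON item in prompts.json becomes one task
project.import_tasks('prompts.json')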
```
Then you can start annotating the dataset by creating the responses.
## Export the dataset
You need anywhere from hundreds to thousands of labeled tasks to fine-tune your LLM, depending on the complexity of your problem.
After you've labeled enough tasks, you can export the dataset in the following raw Label Studio JSON format:
```json
[
{
"id": 1,
"data": {
"prompt": "Generate a Python function that takes a list of integers as input and returns the sum of all even numbers in the list."
},
"annotations": [
{
"id": 1,
"created_at": "2021-03-03T14:00:00.000000Z",
"result": [
{
"from_name": "instruction",
"to_name": "prompt",
"type": "textarea",
"value": {
"text": [
"def sum_even_numbers(numbers):\n return sum([n for n in numbers if n % 2 == 0])"
]
}
}
],
// other fields
```
The export is a list of tasks with their annotations. Each task has a `data.prompt` field with the input prompt, and each item in `annotations` contains the response under the `result.value.text` field.
You can create more than one annotation per task.
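To get from this export back to the flat prompt/response pairs described at the top of the page, you can walk the nested structure. Below is a minimal sketch using only the standard library; `export_to_pairs` is a hypothetical helper, not an SDK function, and it assumes the field names shown in the example export (`from_name` of `instruction`, `type` of `textarea`). A task with several annotations yields one pair per annotation text.

```python
def export_to_pairs(tasks, from_name="instruction"):
    """Flatten exported tasks into a list of {"prompt", "response"} dicts."""
    pairs = []
    for task in tasks:
        prompt = task["data"]["prompt"]
        for annotation in task.get("annotations", []):
            for result in annotation.get("result", []):
                # Keep only the textarea results written for this control tag.
                if result.get("type") != "textarea":
                    continue
                if result.get("from_name") != from_name:
                    continue
                for text in result["value"].get("text", []):
                    pairs.append({"prompt": prompt, "response": text})
    return pairs


# A tiny in-memory example with the same shape as the export above.
sample_export = [
    {
        "id": 1,
        "data": {"prompt": "Say hello."},
        "annotations": [
            {
                "id": 1,
                "result": [
                    {
                        "from_name": "instruction",
                        "to_name": "prompt",
                        "type": "textarea",
                        "value": {"text": ["Hello!"]},
                    }
                ],
            }
        ],
    }
]

pairs = export_to_pairs(sample_export)
```

In practice you would load the exported file with `json.load` and pass the resulting list to the helper instead of `sample_export`.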
Alternatively, you can download the same data in CSV format:
```csv
prompt,instruction
"Generate...","def sum..."
```
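The CSV export can be read into the same prompt/response shape. This is a minimal sketch with the standard `csv` module; `csv_to_pairs` is a hypothetical helper, and it assumes the two column names shown above (`prompt` and `instruction`).

```python
import csv
import io


def csv_to_pairs(csv_text):
    """Parse CSV text with prompt,instruction columns into prompt/response dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"prompt": row["prompt"], "response": row["instruction"]}
        for row in reader
    ]


# The same two lines as the CSV example above.
sample_csv = 'prompt,instruction\n"Generate...","def sum..."\n'
csv_pairs = csv_to_pairs(sample_csv)
```

For a file on disk, read it with `open(path, newline="", encoding="utf-8")` and pass its contents to the helper; quoted fields that contain line breaks, such as multi-line code responses, are handled by the `csv` module.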
## How to configure the labeling interface
The `Supervised Language Model Fine-tuning` template includes the following labeling interface in XML format:
```xml
```
Here, it takes the input prompt from the `"$prompt"` variable and renders it as a text block with a blue background defined
by `