---
title: Get data into Label Studio
short: Import data
type: guide
tier: all
order: 157
order_enterprise: 157
meta_title: Import Data into Label Studio
meta_description: Label and annotate data for your machine learning and data science projects using common file formats or the Label Studio JSON format.
section: "Import & Export"
---
Get data into Label Studio by importing files, referencing URLs, or syncing with cloud or database storage.
- If your data is stored in a cloud storage bucket, see [Sync data from cloud or database storage](storage.html).
- If your data is stored in a Redis database, see [Sync data from cloud or database storage](storage.html).
- If your data is stored at internet-accessible URLs, in files, or directories, [import it from the Label Studio UI](#Import-data-from-the-Label-Studio-UI).
- If your data is stored locally, [import it into Label Studio](#Import-data-from-a-local-directory).
- If your data contains predictions or pre-annotations, see [Import pre-annotated data into Label Studio](predictions.html).
## General guidelines for importing data
* For optimal performance, keep each project to about 100k tasks and 100k annotations.
* Avoid frequent imports, because each new import triggers lengthy background operations. Importing no more than once every 30 seconds avoids overloading the system.
!!! attention
For large or business-critical projects, do not [upload media files through the Label Studio interface](#Import-data-from-the-Label-Studio-UI). This applies especially to media such as images, audio, video, and time series files.
Uploading data through the Label Studio UI works fine for proof of concept projects, but it is not recommended for larger projects. Label Studio is not designed as a hosting service at scale and does not have backups for imported media resources.
**Risks when uploading through the UI**:
You will face challenges when attempting to do the following:
* Importing tasks with predictions
* Exporting your data
* Moving your data to another Label Studio instance
* Redeploying Label Studio
We ***strongly*** recommend that you configure [source storage](storage) instead.
## Types of data you can import into Label Studio
You can import many types of data, including text, timeseries, audio, and image data. The file types supported depend on the type of data.
| Data type | Supported file types |
| --- | --- |
| Audio | .flac, .m4a, .mp3, .ogg, .wav |
| [HyperText (HTML)](#Import-HTML-data) | .html, .htm, .xml |
| Images | .bmp, .gif, .jpg, .png, .svg, .webp |
| Paragraphs (Dialogue) | .json |
| Structured data | .csv, .tsv |
| [Text](#Plain-text) | .txt, .json |
| [Time series](#Import-CSV-or-TSV-data) | .csv, .tsv, .json |
| [Tasks with multiple data types](#Basic-Label-Studio-JSON-format) | .csv, .tsv, .json, .jsonl*, .parquet*+ |
| Video | .mp4, .webm |
\* *Cloud storage only*
\+ *Label Studio Enterprise and Starter Cloud only*
If you don't see a supported data or file type that you want to import, please let us know by submitting an issue to the Label Studio Repository.
### How to import your data
The most secure and reliable method to import your data is to store the data outside of Label Studio and import references to the data using URLs. You can import a list of URLs in a TXT, CSV, or TSV file, or reference the URLs in [JSON task format](#Basic-Label-Studio-JSON-format).
If you're importing audio, image, or video data, you must use URLs to refer to those data types.
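As a minimal sketch of the URL-list approach, the following Python snippet builds the contents of a CSV file with one URL per row. The column name `image` and the example URLs are assumptions for illustration; the column name would need to match the variable used in your labeling configuration.

```python
import csv
import io

def urls_to_csv(urls, column="image"):
    """Return CSV text with a header row and one URL per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])  # the header names the task field
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()

# Hypothetical URLs; replace them with references to your own hosted files.
example_urls = [
    "https://example.com/images/cat.jpg",
    "https://example.com/images/dog.jpg",
]
csv_text = urls_to_csv(example_urls)
print(csv_text)
```

Saving `csv_text` to a `.csv` file gives you a file that you can import in place of the media files themselves.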
If you're importing HTML, text, dialogue, or timeseries data using the `<HyperText>`, `<Text>`, `<Paragraphs>`, or `<TimeSeries>` tags in your labeling configuration, you can either load data directly, or load data from a URL.
- To load data from a URL, specify `valueType="url"` in your labeling configuration.
- To load data directly into the Label Studio database, specify `valueType="text"` for `HyperText` or `Text` data, or `valueType="json"` for `Paragraph` or `TimeSeries` data.
!!! note
If you load data from a URL, the data is not saved in Label Studio. If you want an annotated task export to include the data that you annotated, you must import the data into the Label Studio database without using URL references, or combine the data with the annotations after exporting.
{% details Click to expand example configurations with each valueType %}
#### Example with valueType="text"
Labeling configuration:
{% codeblock lang:xml %}
{% endcodeblock %}
JSON file to import:
{% codeblock lang:json %}
{
"text": "My awesome opossum"
}
{% endcodeblock %}
CSV file to import:
{% codeblock lang:csv %}
text
My awesome opossum
{% endcodeblock %}
{% enddetails %}
## How to retrieve data
There are several steps to retrieve the data to display in the `Object` tag. The data retrieval is also used in [dynamic choices](/templates/serp_ranking.html) and [labels](/templates/inventory_tracking.html). Use the following parameters in the `Object` tag.
### `value` (required)
The `value` parameter represents the source of the data. It can be plain text or one step in a more complex data retrieval process. It can take the following forms:
#### Variables
In most cases, the `value` of the `Object` tag contains a single variable (prefixed with a $).
For example, an `Audio` tag with `value="$audio"` seeks the "audio" field in the imported JSON object:
```json
{
"data": {
"audio": "https://host.name/myaudio.wav"
}
}
```
#### Plain text
The `value` parameter can be a plain string. This is useful for `Header` and `Text`.
You can also use the content of the tag as the value. This is useful for descriptive text tags and applies to `Label` and `Choice`.
For example:
```xml
Label audio:other
```
#### Other cases
1. The `value` parameter can be a text containing variables prefixed by $.
For example:
```xml
```
2. The `value` parameter can also refer to nested data in arrays and dicts (`$texts[2]` and `$audio.url`).
For example:
```xml
```
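To make the lookup rules above concrete, here is a minimal Python sketch of how a `value` string containing `$` variables could be resolved against task data. It only illustrates the semantics described in this section and is not Label Studio's implementation; the function names `lookup` and `resolve_value` and the sample task data are hypothetical.

```python
import re

# Matches "$name" followed by any number of ".key" or "[index]" steps.
_VARIABLE = re.compile(r"\$(\w+(?:\.\w+|\[\d+\])*)")

def lookup(data, path):
    """Follow a path such as 'texts[2]' or 'audio.url' through task data."""
    current = data
    for key, index in re.findall(r"(\w+)|\[(\d+)\]", path):
        # An "[n]" step indexes into a list; any other step is a dict key.
        current = current[int(index)] if index else current[key]
    return current

def resolve_value(value, data):
    """Replace every $variable in `value` with the matching task data."""
    return _VARIABLE.sub(lambda m: str(lookup(data, m.group(1))), value)

# Hypothetical task data for illustration.
task_data = {
    "texts": ["first", "second", "third"],
    "audio": {"url": "https://host.name/myaudio.wav"},
    "speaker": "Alice",
}
print(resolve_value("$texts[2]", task_data))
print(resolve_value("$audio.url", task_data))
print(resolve_value("Speaker: $speaker", task_data))
```

The three calls cover an array index, a nested dictionary key, and a variable embedded in surrounding text.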
### `valueType` (optional)
The `valueType` parameter defines how to treat the data retrieved from the previous steps.
There are two options: "url" and raw data. Currently, raw data input can be "text" or "json". Use "text" for `HyperText` and `Text` tags, and "json" for the `TimeSeries` tag.
For example:
- Using "url": a tag with `valueType="url"` displays the text loaded from the URL.
- Using "text": a tag with `valueType="text"` displays the URL itself without loading the text.
### `resolver` (optional)
Use this parameter to retrieve data from a multi-column CSV file on [S3 or other cloud storage](/guide/storage.html). Label Studio retrieves the data only at run time, so it's secure.
If you import a file with a list of tasks, and every task in the list is a link to another file in storage, you can use the `resolver` parameter to retrieve the content of those files.
#### Use Case
There is a list of tasks, where the "remote" field of every task is a link to a CSV file in storage. Every CSV file has a "text" column with text to be labeled. For example:
Tasks:
```json
[
{ "remote": "s3://bucket/text1.csv" },
{ "remote": "s3://bucket/text2.csv" }
]
```
CSV file:
```csv
id;text
12;The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates.
```
#### Solution
To retrieve the file, use the following parameters:
1. `value="$remote"`: The URL of the CSV file on S3 is in the "remote" field of the task data. If you use the `resolver` parameter, the `value` is always treated as a URL, so you don't need to set `valueType`.
2. `resolver="csv|separator=;|column=text"`: Load this file at run time, parse it as CSV, and get the "text" column from the first row.
3. Display the result.
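The parsing step can be pictured with a short Python sketch that reads CSV text with a `;` separator and returns the "text" column from the first row. This only mirrors the behavior described above using the standard `csv` module; it is not how Label Studio implements the resolver, and the function name `first_row_column` is hypothetical.

```python
import csv
import io

def first_row_column(csv_text, column, separator=","):
    """Parse CSV text and return `column` from the first data row."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    first_row = next(reader)  # the header row is consumed by DictReader
    return first_row[column]

# Same content as the example CSV file above.
csv_text = (
    "id;text\n"
    "12;The most flexible data annotation tool. Quickly installable. "
    "Build custom UIs or use pre-built labeling templates.\n"
)
print(first_row_column(csv_text, column="text", separator=";"))
```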
#### Syntax
The syntax for the `resolver` parameter consists of a list of options separated by a `|` symbol.
The first option is the type of file.
!!! note
Currently, only CSV files are supported.
The remaining options are parameters of the specified file type with optional values. The parameters for CSV files are:
- `headless`: The CSV file does not have a header row (this parameter is boolean and can't have a value).
- `separator=;`: The CSV separator. It can usually be detected automatically.
- `column=1`: The column to read. In `headless` mode, use a zero-based index; otherwise, use the column name.
For example, `resolver="csv|headless|separator=;|column=1"`
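As an illustration of this syntax, the following Python sketch splits a resolver string into its file type and options. It is an assumption-based reading of the format described above, not Label Studio's parser; the function name `parse_resolver` is hypothetical, and the sketch does not handle a separator that is itself a `|` character.

```python
def parse_resolver(resolver):
    """Split a resolver string into its file type and a dict of options."""
    file_type, *options = resolver.split("|")
    parsed = {}
    for option in options:
        name, has_value, value = option.partition("=")
        # Options without "=" (such as `headless`) are boolean flags.
        parsed[name] = value if has_value else True
    return file_type, parsed

file_type, options = parse_resolver("csv|headless|separator=;|column=1")
print(file_type)
print(options)
```

Option values stay as strings here, so a caller would still need to convert `column` to an integer when `headless` is set.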
## How to format your data to import it
Label Studio treats different file types in different ways.
If you want to import multiple types of data to label at the same time, for example, images with captions or audio recordings with transcripts, you must use the [basic Label Studio JSON format](#Basic-Label-Studio-JSON-format).
[You can also use a CSV file or a JSON list of tasks to point to URLs with the data](#How-to-import-your-data), rather than directly importing the data if you need to import thousands of files. You can import files containing up to 250,000 tasks or up to 50MB in size into Label Studio.
If you're specifying data in a cloud storage bucket or container, and you don't want to [sync cloud storage](storage.html), create and specify [presigned URLs for Amazon S3 storage](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html), [signed URLs for Google Cloud Storage](https://cloud.google.com/storage/docs/access-control/signed-urls), or [shared access signatures for Microsoft Azure](https://docs.microsoft.com/en-us/azure/storage/common/storage-sas-overview) in a JSON, CSV, TSV or TXT file.
### Basic Label Studio JSON format
The best way to import data into Label Studio is to use a JSON-formatted list of tasks. The `data` key of the JSON file references each task as an entry in a JSON dictionary. If there is no `data` key, Label Studio interprets the entire JSON file as one task.
In the `data` JSON dictionary, use key-value pairs that correspond to the source key expected by the object tag in the [label configuration](setup.html#Customize-the-labeling-interface-for-your-project) that you set up for your project.
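As a rough sketch of this format, the following Python snippet wraps a few records in task dictionaries under the `data` key and serializes them as a JSON list. The keys `image` and `caption`, the example URLs, and the function name `build_tasks` are assumptions for illustration; in a real project the keys would need to match the variables (for example `$image` and `$caption`) used by the object tags in your labeling configuration.

```python
import json

def build_tasks(records):
    """Wrap each record in a task dictionary under the `data` key."""
    return [{"data": dict(record)} for record in records]

# Hypothetical records that combine two data types per task.
records = [
    {"image": "https://example.com/images/cat.jpg", "caption": "A cat on a sofa"},
    {"image": "https://example.com/images/dog.jpg", "caption": "A dog in a park"},
]
tasks = build_tasks(records)
tasks_json = json.dumps(tasks, indent=2)
print(tasks_json)
```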
Depending on the type of object tag, Label Studio interprets field values differently:
- `<Text>`: `value` is interpreted as plain text.
- `<HyperText>`: `value` is interpreted as HTML markup.
- `<HyperText encoding="base64">`: `value` is interpreted as base64-encoded HTML markup.
- `