---
title: Multi-Turn Chat Evaluation
type: templates
category: LLM Evaluations
cat: llm-evaluations
order: 972
meta_title: Multi-Turn Chat Evaluation Template
meta_description: Use the SDK to create a dynamic template for evaluating multi-turn chats.
date: 2025-01-21 10:49:57
---
This template uses the example available here: [Multi-turn Chat Labeling: Evaluating Virtual Assistant Conversations](https://github.com/HumanSignal/label-studio-examples/blob/main/multi-turn-chat/Readme.md).
You can use this example to evaluate multi-turn chat conversations in Label Studio, identifying areas to enhance your virtual assistant’s performance and user experience.
For this example, you will need the following:
- Label Studio instance
- Label Studio SDK (`pip install label-studio-sdk`)
- Python 3.8+ with pandas
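Before labeling, each chat must be turned into a Label Studio task. As a minimal sketch of that preparation step (the field names `messages` and `turn1`, `turn2`, … are illustrative assumptions; the companion notebook defines the actual task schema), each task can carry the full conversation plus one slice per turn:

```python
# Sketch: build a Label Studio task from one multi-turn chat.
# The field names ("messages", "turn1", ...) are illustrative
# assumptions; the companion notebook defines the actual schema.

def build_task(chat):
    """chat: list of {"role": ..., "content": ...} dicts, alternating
    user question and assistant response."""
    task = {"messages": chat}  # full conversation for the left column
    # One "turn" = a user message plus the assistant reply that follows.
    # Here each slice holds just that pair; you could also include
    # prior context if annotators need it.
    n_turns = len(chat) // 2
    for i in range(1, n_turns + 1):
        task[f"turn{i}"] = chat[2 * (i - 1) : 2 * i]
    return task

chat = [
    {"role": "user", "content": "Hi, I need to reset my password."},
    {"role": "assistant", "content": "Sure, I can send you a reset link."},
    {"role": "user", "content": "Please do."},
    {"role": "assistant", "content": "Done! Check your inbox."},
]
task = build_task(chat)
```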
## Labeling configuration
In this example, the labeling configuration is dynamically generated. This is necessary because each chat has a different number of turns (questions and responses).
To build your own template XML, you will need to follow the steps outlined in the following notebook: [**Evaluating Virtual Assistant Conversations.ipynb**](https://github.com/HumanSignal/label-studio-examples/blob/main/multi-turn-chat/Evaluating%20Virtual%20Assistant%20Conversations.ipynb)
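The notebook walks through the full generation logic. As a minimal sketch of the idea (the tag names follow Label Studio's Paragraphs, Collapse, Panel, and Choices tags, while the question wording, choice values, and task field names are placeholders), the per-turn panels can be assembled in a loop:

```python
# Minimal sketch of dynamic config generation: one collapsible panel
# per turn. Choice values and field names are placeholders; the
# companion notebook builds the complete configuration.

def turn_panel(i):
    return f"""
    <Panel value="Turn {i}">
      <Paragraphs name="turn{i}_prg" value="$turn{i}" layout="dialogue"
                  nameKey="role" textKey="content" />
      <Choices name="turn{i}_intent" toName="turn{i}_prg" choice="multiple">
        <Choice value="Ask a question" />
        <Choice value="Request an action" />
      </Choices>
    </Panel>"""

def build_config(n_turns):
    panels = "".join(turn_panel(i) for i in range(1, n_turns + 1))
    return f"""
<View>
  <Header value="Full Conversation" />
  <Paragraphs name="conversation" value="$messages" layout="dialogue"
              nameKey="role" textKey="content" />
  <Collapse>{panels}
  </Collapse>
</View>"""

config = build_config(5)
```

Generating the XML as a string keeps the template in lockstep with the data: however many turns a chat has, the config grows to match.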
However, here is an abbreviated example of the labeling configuration for a 5-turn chat (only the first turn's panel is shown, and the question wording and choice values are illustrative — the notebook generates the actual configuration):

```xml
<View style="display: flex;">
  <View style="width: 50%; padding-right: 1em;">
    <Header value="Full Conversation" />
    <Paragraphs name="conversation" value="$messages" layout="dialogue"
                nameKey="role" textKey="content" />
  </View>
  <View style="width: 50%;">
    <Collapse>
      <Panel value="Turn 1">
        <Paragraphs name="turn1_prg" value="$turn1" layout="dialogue"
                    nameKey="role" textKey="content" />
        <Header value="What is the user's intent in this turn?" />
        <Choices name="turn1_intent" toName="turn1_prg" choice="multiple">
          <Choice value="Ask a question" />
          <Choice value="Request an action" />
          <Choice value="Provide information" />
        </Choices>
        <Header value="Does the assistant's response address the intent?" />
        <Choices name="turn1_addresses" toName="turn1_prg" choice="single">
          <Choice value="Yes" />
          <Choice value="Partially" />
          <Choice value="No" />
        </Choices>
        <Header value="Is the assistant's response accurate and helpful?" />
        <Choices name="turn1_quality" toName="turn1_prg" choice="single">
          <Choice value="Yes" />
          <Choice value="Partially" />
          <Choice value="No" />
        </Choices>
        <Header value="What action does the assistant's response imply?" />
        <Choices name="turn1_action" toName="turn1_prg" choice="multiple">
          <Choice value="Answer provided" />
          <Choice value="Clarification requested" />
          <Choice value="Escalation suggested" />
        </Choices>
      </Panel>
      <!-- ...panels for turns 2 through 5 follow the same pattern... -->
    </Collapse>
  </View>
</View>
```
## About the labeling configuration
#### Paragraphs
```xml
<!-- Illustrative: the value key ($messages) must match your task data -->
<Paragraphs name="conversation" value="$messages" layout="dialogue"
            nameKey="role" textKey="content" />
```
This displays the entire conversation in one column under “Full Conversation” using a Paragraphs tag, rendering each message (with its role and content) as a dialogue.
In the other column, it organizes the annotation questions by turn. Each “Turn” sits inside a collapsible `<Collapse>`/`<Panel>` component and has its own `<Paragraphs>` tag. For example:
```xml
<Collapse>
  <Panel value="Turn 1">
    <Paragraphs name="turn1_prg" value="$turn1" layout="dialogue"
                nameKey="role" textKey="content" />
    <!-- the Choices blocks for this turn go here -->
  </Panel>
</Collapse>
```
This lets you see only the subset of the conversation relevant to that turn.
#### Choices
For each turn, there are multiple `<Choices>` blocks, each focusing on a different question:
1. User’s intent in this turn (multiple choice).
2. Whether the assistant’s response addresses that intent (single choice).
3. Whether the assistant’s response is accurate/helpful (single choice).
4. The implied “action” of the assistant’s response (multiple choice).
The `toName` attributes (for instance, `toName="turn1_prg"`) tie each set of choices to that turn’s Paragraphs object, so each question is specifically linked to the text of that turn.
## Related tags
- [Paragraphs](/tags/paragraphs.html)
- [Choices](/tags/choices.html)