---
title: Zero-shot object detection and image segmentation with Grounding DINO and SAM
type: guide
tier: all
order: 15
hide_menu: true
hide_frontmatter_title: true
meta_title: Image segmentation in Label Studio using a Grounding DINO backend and SAM
meta_description: Label Studio tutorial for using Grounding DINO and SAM for zero-shot object detection in images
categories:
- Computer Vision
- Image Annotation
- Object Detection
- Zero-shot Image Segmentation
- Grounding DINO
- Segment Anything Model
---
This integration will allow you to:
See here for more details about the pre-trained Grounding DINO model.
Before you begin, you must install the Label Studio ML backend.
This tutorial uses the grounding_sam example.
Edit `docker-compose.yml` to include the following:
- `LABEL_STUDIO_HOST` sets the endpoint of the Label Studio host. It must begin with `http://`.
- `LABEL_STUDIO_ACCESS_TOKEN` sets the API access token for the Label Studio host. You can find it by logging into Label Studio and going to the Account & Settings page.
Example:
- `LABEL_STUDIO_HOST=http://123.456.7.8:8080`
- `LABEL_STUDIO_ACCESS_TOKEN=your-api-key`

Run `docker compose up`, then check the address of your backend with `docker ps`. You will use this URL when connecting the backend to a Label Studio project. Usually this is `http://localhost:9090`.

Create a project and edit the labeling config (an example is provided below). When editing the labeling config, add all rectangle labels under the `RectangleLabels` tag and all corresponding brush labels under the `BrushLabels` tag.
<View>
  <Image name="image" value="$image"/>
  <Style>
    .lsf-main-content.lsf-requesting .prompt::before { content: ' loading...'; color: #808080; }
  </Style>
  <View className="prompt">
    <TextArea name="prompt" toName="image" editable="true" rows="2" maxSubmissions="1" showSubmitButton="true"/>
  </View>
  <RectangleLabels name="label" toName="image">
    <Label value="cats" background="yellow"/>
    <Label value="house" background="blue"/>
  </RectangleLabels>
  <BrushLabels name="label2" toName="image">
    <Label value="cats" background="yellow"/>
    <Label value="house" background="blue"/>
  </BrushLabels>
</View>
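
Because rectangle and brush labels have to mirror each other, it can help to check a config before pasting it into a project. The following is a minimal sketch of such a check using only the Python standard library. The helper names `label_values` and `unmatched_labels` are hypothetical and not part of Label Studio or the backend, and the sample config deliberately omits the `house` brush label to show what a mismatch looks like.

```python
import xml.etree.ElementTree as ET


def label_values(root, tag):
    """Collect the `value` of every <Label> nested under elements named `tag`."""
    return {
        label.get("value")
        for parent in root.iter(tag)
        for label in parent.iter("Label")
    }


def unmatched_labels(config_xml):
    """Return (rectangle-only, brush-only) label values for a labeling config."""
    root = ET.fromstring(config_xml)
    rectangles = label_values(root, "RectangleLabels")
    brushes = label_values(root, "BrushLabels")
    return rectangles - brushes, brushes - rectangles


# Trimmed-down config for illustration; the "house" brush label is left out
# on purpose so the check has something to report.
EXAMPLE_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="cats" background="yellow"/>
    <Label value="house" background="blue"/>
  </RectangleLabels>
  <BrushLabels name="label2" toName="image">
    <Label value="cats" background="yellow"/>
  </BrushLabels>
</View>
"""

missing_brush, missing_rectangle = unmatched_labels(EXAMPLE_CONFIG)
print("No brush label for:", sorted(missing_brush))
print("No rectangle label for:", sorted(missing_rectangle))
```

If both sets come back empty, every rectangle label has a brush label with the same value and vice versa.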
For the best user experience, use a GPU. To enable it, add the following lines to the `docker-compose.yml` file:
environment:
  - NVIDIA_VISIBLE_DEVICES=all
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
Combine the Segment Anything Model with your text input to automatically generate mask predictions!
To do this, set USE_SAM=true before running.
Warning: Using GroundingSAM without a GPU may result in slow performance and is not recommended. If you must use a CPU-only machine, and experience slow performance or don't see any predictions on the labeling screen, consider one of the following:
- Increase the memory allocated to the Docker container (e.g. `memory: 16G` in `docker-compose.yml`).
- Increase the prediction timeout on the Label Studio instance with the `ML_TIMEOUT_PREDICT=100` environment variable.
- Use "MobileSAM" as a lightweight alternative to "SAM".
If you want to use a more efficient version of SAM, set USE_MOBILE_SAM=true.
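
To make the interaction of the two flags concrete, here is a small sketch of how a backend might turn `USE_SAM` and `USE_MOBILE_SAM` into a choice of mask model. This is an illustration only: the helper names are hypothetical, the accepted truthy spellings are an assumption, and so is the rule that MobileSAM takes priority when both flags are set. Check the backend source for the actual behavior.

```python
def env_flag(env, name):
    """Interpret an environment value such as "true" or "1" as a boolean."""
    return env.get(name, "").strip().lower() in {"1", "true", "yes"}


def choose_mask_model(env):
    """Return which mask generator the flags ask for, or None for boxes only.

    Assumption: MobileSAM wins when both flags are set. The real backend
    may resolve that case differently.
    """
    if env_flag(env, "USE_MOBILE_SAM"):
        return "mobile_sam"
    if env_flag(env, "USE_SAM"):
        return "sam"
    return None


print(choose_mask_model({"USE_SAM": "true"}))
print(choose_mask_model({"USE_SAM": "true", "USE_MOBILE_SAM": "true"}))
print(choose_mask_model({}))
```

In a running container you would pass `os.environ` instead of a literal dictionary.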
Note: This is an experimental feature.
Clone the Label Studio feature branch that includes the experimental batching functionality.
git clone -b feature/dino-support https://github.com/HumanSignal/label-studio.git
Run this branch with `docker compose up`.
Note: If your prompt differs from the label values in your labeling config, append an underscore followed by the label value to assign that label to the prompt's outputs. For example, to select all brown cats but still give them the label value "cats" from your labeling config, use the prompt "brown cat_cats".
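
The underscore convention can be pictured as a simple string split. The sketch below is not the backend's code; `split_prompt` is a hypothetical helper, and it assumes the label value follows the last underscore and that a prompt without an underscore is used as its own label.

```python
def split_prompt(prompt):
    """Split "brown cat_cats" into the detection text and the label value."""
    text, separator, label = prompt.rpartition("_")
    if not separator:
        # No underscore: the prompt doubles as the label value.
        return prompt.strip(), prompt.strip()
    return text.strip(), label.strip()


print(split_prompt("brown cat_cats"))  # ('brown cat', 'cats')
print(split_prompt("house"))           # ('house', 'house')
```

The first element is what the detector would search for, and the second is the label value that must exist in the labeling config.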
If you are experimenting, set `BOX_THRESHOLD` and `TEXT_THRESHOLD` in the Dockerfile to a number between 0 and 1. Defaults are set in `dino.py`. For more information about these values, click here.
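
When experimenting with these values, a range check catches typos before they reach the model. The sketch below shows one way to read and validate a threshold from the environment. `read_threshold` is a hypothetical helper, and the fallback of `0.3` used in the example is a placeholder, not the default from `dino.py`.

```python
def read_threshold(env, name, default):
    """Read a detection threshold from `env` and check it lies in [0, 1]."""
    raw = env.get(name)
    if raw is None:
        return default
    value = float(raw)  # raises ValueError for non-numeric input
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return value


# Placeholder fallback value for illustration only.
settings = {"BOX_THRESHOLD": "0.4"}
print(read_threshold(settings, "BOX_THRESHOLD", 0.3))   # 0.4
print(read_threshold(settings, "TEXT_THRESHOLD", 0.3))  # 0.3 (falls back)
```

Passing `os.environ` as `env` applies the same check to the values the container was started with.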
To load SAM or MobileSAM models saved in a local directory, set `SAM_CHECKPOINT` or `MOBILESAM_CHECKPOINT` as shown in the Dockerfile.