Zhi Cai1,2, Yingjie Gao1,2, Yaoyan Zheng1,2, Nan Zhou1,2 and Di Huang1,2
- Aug-1-24: We open-source the code and models. 🔥🔥
- Jul-20-24: The Crowd-SAM paper is released: arXiv link. 🔥🔥
- Jul-1-24: Crowd-SAM has been accepted to ECCV-24 🎉.
Crowd-SAM is a novel few-shot object detection and segmentation method designed for crowded scenes. Object detection generally requires extensive labels for training, which are time-consuming to collect, especially in crowded scenes. In this work, we combine SAM with a specifically designed efficient prompt sampler and a mask selection network (PWD-Net) to achieve fast and accurate pedestrian detection. Crowd-SAM achieves 78.4% AP on the CrowdHuman benchmark with only 10 support images, which is comparable to supervised detectors.
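At a high level, Crowd-SAM densely prompts SAM with candidate points and then keeps the best mask per pedestrian. The snippet below is only a conceptual sketch built on the official `segment_anything` API: a uniform point grid stands in for the efficient prompt sampler and SAM's own IoU score stands in for PWD-Net, so it illustrates the pipeline shape rather than reproducing our method (the checkpoint file name is also just the standard SAM release name).

```python
# Conceptual sketch of the Crowd-SAM pipeline, NOT the full method:
# dense point prompts -> SAM mask decoder -> per-mask scoring -> filtering.
# A uniform grid stands in for the efficient prompt sampler, and SAM's
# predicted IoU stands in for PWD-Net; the checkpoint file name is assumed.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_l"](checkpoint="weights/sam_vit_l_0b3195.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("demo.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

h, w = image.shape[:2]
ys, xs = np.meshgrid(np.linspace(0, h - 1, 16), np.linspace(0, w - 1, 16), indexing="ij")
grid = np.stack([xs.ravel(), ys.ravel()], axis=1)  # (256, 2) candidate point prompts

masks, scores = [], []
for point in grid:
    m, s, _ = predictor.predict(
        point_coords=point[None, :],
        point_labels=np.ones(1),
        multimask_output=True,
    )
    masks.append(m[s.argmax()])
    scores.append(s.max())

# Keep confident masks; the real method ranks and de-duplicates them with PWD-Net.
keep = [m for m, s in zip(masks, scores) if s > 0.8]
print(f"{len(keep)} candidate pedestrian masks")
```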
We recommend using a virtual environment, e.g., Conda, for installation:
- Create a virtual environment:
```bash
conda create -n crowdsam python=3.8
```
- Clone this repository and install the dependencies:
```bash
git clone https://github.com/yourusername/crowd-sam.git crowdsam
cd crowdsam
pip install -r requirements.txt
git submodule update --init --recursive
pip install .
```
- Download the DINOv2 (ViT-L) checkpoint and the SAM (ViT-L) checkpoint. Place the downloaded weights in the weights directory. If it does not exist, create it with:
```bash
mkdir weights
```
Download the CrowdHuman dataset from the official website. Note that we only need CrowdHuman_val.zip and annotation_val.odgt. The training data is already prepared in the crowdhuman_train directory; please copy those files into ./dataset/crowdhuman before training.
Extract the downloaded zip files and place them in the dataset directory so that it looks like this:
```
crowdsam/
├── dataset/
│   └── crowdhuman/
│       ├── annotation_val.odgt
│       ├── Images
│       └── ...
```
Run the script to convert the odgt annotation file into a COCO-style JSON file:
```bash
python tools/crowdhuman2coco.py -o annotation_val.odgt -v -s val_visible.json -d dataset/crowdhuman
```
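If you are curious what the converter does, the sketch below shows the essence of the odgt → COCO conversion: each line of the odgt file is one JSON record per image, and only visible, non-ignored person boxes are kept. The field handling here is simplified (e.g., image width/height are omitted); tools/crowdhuman2coco.py is the reference implementation.

```python
# Minimal sketch of the odgt -> COCO conversion (the shipped
# tools/crowdhuman2coco.py may filter and record fields slightly differently).
import json

images, annotations = [], []
ann_id = 0
with open("dataset/crowdhuman/annotation_val.odgt") as f:
    for img_id, line in enumerate(f):
        record = json.loads(line)  # one JSON object per line
        images.append({"id": img_id, "file_name": record["ID"] + ".jpg"})
        for box in record.get("gtboxes", []):
            if box["tag"] != "person" or box.get("extra", {}).get("ignore", 0):
                continue
            x, y, w, h = box["vbox"]  # visible-region box
            annotations.append({
                "id": ann_id, "image_id": img_id, "category_id": 1,
                "bbox": [x, y, w, h], "area": w * h, "iscrowd": 0,
            })
            ann_id += 1

coco = {"images": images, "annotations": annotations,
        "categories": [{"id": 1, "name": "person"}]}
with open("dataset/crowdhuman/val_visible.json", "w") as f:
    json.dump(coco, f)
```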
- To start training the model, run the following command:
```bash
python train.py --config_file ./configs/config.yaml
```
Our model configs are written in YAML and stored in the configs directory. Make sure to update the config.yaml file with the appropriate paths and parameters as needed.
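Before launching training, it can help to load the config and verify that the dataset and weight paths point at your local setup; a minimal sketch (the key names depend on your config.yaml) is:

```python
# Load and inspect the training config; key names depend on your config.yaml.
import yaml

with open("configs/config.yaml") as f:
    cfg = yaml.safe_load(f)

print(yaml.dump(cfg, sort_keys=False))  # check dataset and weight paths before training
```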
We provide pretrained adapter weights for CrowdHuman here.
- To evaluate the model, we recommend using the following command for batch evaluation:
```bash
python tools/batch_eval.py --config_file ./configs/config.yaml -n num_gpus
```
This will run the evaluation script on the test dataset and output the results.
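If you prefer to compute AP yourself from a dumped detection file, a standard COCO-style evaluation against the converted val_visible.json looks like the sketch below; the result file path and its COCO result format are assumptions, since tools/batch_eval.py already handles this end-to-end.

```python
# Standalone COCO-style AP computation (result path/format are assumptions;
# tools/batch_eval.py already reports these metrics end-to-end).
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO("dataset/crowdhuman/val_visible.json")
dt = gt.loadRes("outputs/results.json")  # list of {"image_id", "category_id", "bbox", "score"}

evaluator = COCOeval(gt, dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP, AP50, AP75, ...
```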
- To visualize the outputs, use the following command:
```bash
python tools/test.py --config_file ./configs/config.yaml --visualize
```
- To run the demo on your own images, use the following command:
```bash
python tools/demo.py --config_file ./configs/config.yaml --input target_directory
```
We build our project on top of segment-anything and dinov2.
You can cite our paper with the following BibTeX:
@inproceedings{cai2024crowd,
title={Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes},
author={Cai, Zhi and Gao, Yingjie and Zheng, Yaoyan and Zhou, Nan and Huang, Di},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2024}
}