AutoMM Detection - Convert VOC Format Dataset to COCO Format¶
Pascal VOC is a collection of datasets for object detection. VOC format refers to the specific annotation format (stored in .xml files) that the Pascal VOC dataset uses.
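To make the .xml layout concrete, here is a minimal sketch that parses a VOC-style annotation with Python's standard library. The sample XML is hand-written for illustration (its values are not taken from the dataset), and `parse_voc_annotation` is a hypothetical helper, not part of AutoGluon.

```python
import xml.etree.ElementTree as ET

# Hand-written VOC-style annotation; values are illustrative only.
SAMPLE_VOC_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Hypothetical helper: extract file name, image size, and boxes."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append(
            {
                "name": obj.findtext("name"),
                # VOC stores corner coordinates: xmin, ymin, xmax, ymax.
                "bbox": [
                    int(float(box.findtext(tag)))
                    for tag in ("xmin", "ymin", "xmax", "ymax")
                ],
            }
        )
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


parsed = parse_voc_annotation(SAMPLE_VOC_XML)
print(parsed)
```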
In this tutorial, we will convert the VOC2007 dataset from VOC format to COCO format. See AutoMM Detection - Prepare Pascal VOC Dataset for how to download it.
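One core difference between the two formats is how a bounding box is stored: VOC uses corner coordinates (xmin, ymin, xmax, ymax), while COCO uses (x, y, width, height). The sketch below shows only that arithmetic. `voc_box_to_coco` is a hypothetical helper, and a real converter may apply further adjustments (for example, a pixel-offset correction) that are not modeled here.

```python
def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Hypothetical helper: corner box -> [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_box_area(box):
    """Area of a COCO-style [x, y, width, height] box."""
    return box[2] * box[3]


coco_box = voc_box_to_coco(48, 240, 195, 371)
print(coco_box, coco_box_area(coco_box))
```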
We will use our tool voc2coco. This Python script is part of our code as voc2coco.py, and you can also run it from the command line: python3 -m autogluon.multimodal.cli.voc2coco.
Note: In AutoGluon MultiModalPredictor, we strongly recommend using COCO as your data format. However, for quick proof-of-concept testing we also have limited support for VOC format.
Convert Existing Splits¶
Under the root path of a VOC format dataset, we have the following folders:
Annotations ImageSets JPEGImages
Normally there are some predefined split files under ImageSets/Main/:
train.txt
val.txt
test.txt
...
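Each of these split files is a plain-text list of image IDs, one per line. As a minimal sketch, the hypothetical helper below reads the IDs from the text of such a file; it keeps only the first whitespace-separated token on each line, so it also tolerates files that carry an extra column after the ID.

```python
def read_split_ids(text):
    """Hypothetical helper: return image IDs from split-file text."""
    ids = []
    for line in text.splitlines():
        parts = line.split()
        if parts:  # skip blank lines
            ids.append(parts[0])
    return ids


# Illustrative content; a real file would be read with open(...).read().
sample_split_text = "000005\n000007\n\n000009\n"
split_ids = read_split_ids(sample_split_text)
print(split_ids)
```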
We can convert those splits into COCO format by running the following command with the root directory, e.g. ./VOCdevkit/VOC2007:
python3 -m autogluon.multimodal.cli.voc2coco --root_dir ./VOCdevkit/VOC2007
The command line output will show the progress:
Start converting !
17%|█████████████████▍ | 841/4952 [00:00<00:00, 15571.88it/s
Now those splits are converted to COCO format in the Annotations folder under the root directory:
train_cocoformat.json
val_cocoformat.json
test_cocoformat.json
...
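A quick sanity check on a converted file is to count its images and annotations. The sketch below assumes the output follows the usual COCO layout with top-level "images", "annotations", and "categories" keys. `summarize_coco` is a hypothetical helper, and the in-memory sample is illustrative only.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Hypothetical helper: count images, annotations, and boxes per category."""
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    per_category = Counter(
        id_to_name.get(a["category_id"], "unknown")
        for a in coco.get("annotations", [])
    )
    return {
        "num_images": len(coco.get("images", [])),
        "num_annotations": len(coco.get("annotations", [])),
        "per_category": dict(per_category),
    }


# Tiny illustrative COCO-style dictionary.
sample_coco = {
    "images": [{"id": 1, "file_name": "000001.jpg", "width": 353, "height": 500}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [48, 240, 147, 131]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [8, 12, 344, 486]},
    ],
    "categories": [{"id": 1, "name": "dog"}, {"id": 2, "name": "person"}],
}
summary = summarize_coco(sample_coco)
print(summary)

# For a converted file on disk (path assumed from the command above):
# with open("./VOCdevkit/VOC2007/Annotations/train_cocoformat.json") as f:
#     print(summarize_coco(json.load(f)))
```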
Convert Without Existing Splits¶
Instead of using predefined splits, you can also split the data with the train/validation/test ratio you want. Note that this does not require any pre-existing split files. To split train/validation/test by 0.6/0.2/0.2, run:
python3 -m autogluon.multimodal.cli.voc2coco --root_dir ./VOCdevkit/VOC2007 --train_ratio 0.6 --val_ratio 0.2
The command line output will show the progress:
Start converting !
17%|█████████████████▍ | 841/4952 [00:00<00:00, 15571.88it/s
This will generate the user-split COCO format files in the Annotations folder under the root directory:
usersplit_train_cocoformat.json
usersplit_val_cocoformat.json
usersplit_test_cocoformat.json
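To illustrate what a 0.6/0.2/0.2 split means, here is a minimal sketch of ratio-based splitting, where the test set receives whatever remains after the train and validation portions. This is not the implementation used by voc2coco. `split_by_ratio`, the shuffling, and the `seed` parameter are assumptions made for illustration.

```python
import random


def split_by_ratio(ids, train_ratio, val_ratio, seed=0):
    """Hypothetical helper: shuffle IDs and cut them into train/val/test."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder
    return train, val, test


image_ids = [f"{i:06d}" for i in range(1, 11)]
train_ids, val_ids, test_ids = split_by_ratio(image_ids, 0.6, 0.2)
print(len(train_ids), len(val_ids), len(test_ids))
```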
Other Examples¶
You may go to AutoMM Examples to explore other examples about AutoMM.
Customization¶
To learn how to customize AutoMM, please refer to Customize AutoMM.