COCO, LVIS, Open Images V4 classes mapping

Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo

doi:10.5281/zenodo.7194300

Published October 13, 2022 | Version v1

Dataset Open

1. ISTI-CNR

Contributors

Data curators:

1. ISTI-CNR

This repository contains a mapping of the classes of the COCO, LVIS, and Open Images V4 datasets into a single unified set of 1460 classes.

COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

We built the mapping with a semi-automatic procedure so that all three datasets share one final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

This repository contains the following files:

coco_classes_map.txt: the mapping for the 80 COCO classes
lvis_classes_map.txt: the mapping for the 1460 LVIS classes
openimages_classes_map.txt: the mapping for the 601 Open Images V4 classes
classname_hyperset_definition.csv: the final set of 1460 classes, with their definitions and hierarchy
all-classnames.xlsx: a side-by-side view of all classes considered
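
The sketch below shows one way to read a mapping file into a dictionary. The line layout is an assumption, since the file format is not described here: it assumes one `source_class,unified_class` pair per line. The function name `load_class_map`, the `delimiter` parameter, and the sample class names are hypothetical and are not taken from the dataset. Inspect a few lines of the real file before relying on it.

```python
import csv
import io
from typing import Dict, TextIO


def load_class_map(handle: TextIO, delimiter: str = ",") -> Dict[str, str]:
    """Read `source_class<delimiter>unified_class` pairs, one per line.

    ASSUMPTION: the real *_classes_map.txt files may use a different layout;
    adjust `delimiter` or the parsing after checking the file.
    """
    mapping: Dict[str, str] = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        source, target = row[0].strip(), row[1].strip()
        mapping[source] = target
    return mapping


# Hypothetical sample content, not copied from the dataset.
sample = io.StringIO("tv,television_set\ncell phone,cellular_telephone\n\n")
coco_map = load_class_map(sample)
print(coco_map["tv"])
```

To read a downloaded file instead of the in-memory sample, pass an open file handle, for example `open("coco_classes_map.txt", newline="", encoding="utf-8")`.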

This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based retrieval system that supports several search functionalities (text search, object- and color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021] (trained on COCO), Mask R-CNN [He et al. 2017] (trained on LVIS), and a Faster R-CNN+Inception ResNet (trained on Open Images V4).

This repository is released under a Creative Commons Attribution license. Please cite the following paper if you use it in your work in any form:
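
As an illustration of why a shared class list helps when several detectors are combined, the sketch below relabels detections from two detectors into one vocabulary and keeps the best score per class. This is not VISIONE's actual fusion procedure, which is not described here. The `Detection` type, the `unify` and `merge_best` helpers, the class names, and the scores are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Detection(NamedTuple):
    label: str    # class name in the detector's own vocabulary
    score: float  # detector confidence in [0, 1]


def unify(detections: List[Detection], class_map: Dict[str, str]) -> List[Detection]:
    """Relabel detections with unified class names; drop unmapped labels."""
    return [
        Detection(class_map[d.label], d.score)
        for d in detections
        if d.label in class_map
    ]


def merge_best(*groups: List[Detection]) -> Dict[str, float]:
    """Keep the highest score seen for each unified class across detectors."""
    best: Dict[str, float] = {}
    for group in groups:
        for det in group:
            best[det.label] = max(det.score, best.get(det.label, 0.0))
    return best


# Hypothetical mappings and detector outputs for a single frame.
coco_map = {"tv": "television_set"}
oiv4_map = {"Television": "television_set", "Cat": "cat"}
coco_dets = [Detection("tv", 0.62)]
oiv4_dets = [Detection("Television", 0.81), Detection("Cat", 0.40)]

merged = merge_best(unify(coco_dets, coco_map), unify(oiv4_dets, oiv4_map))
print(merged)
```

Taking the maximum score is only one possible choice; averaging or detector-specific weighting would fit the same structure.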

@article{amato2021visione,
  title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
  author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
  journal={Journal of Imaging},
  volume={7},
  number={5},
  pages={76},
  year={2021},
  publisher={Multidisciplinary Digital Publishing Institute}
}

References:

[Amato et al. 2022] Amato, G. et al., 2022. VISIONE at Video Browser Showdown 2022. In MultiMedia Modeling (MMM 2022). Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

[Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

[Gupta et al. 2019] Gupta, A., Dollár, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

[He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

[Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

[Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

[Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).

Files

Files (291.3 kB)

Name	MD5	Size
all-classnames.xlsx	ebe9de7ccbb6b37da95e2b9c07bc236e	121.2 kB
classname_hyperset_definition.csv	7c3029931bae0dca2d4cc4ecd90e3ea9	127.8 kB
coco_classes_map.txt	64ed4bffa15c874612445e34b57ebb86	1.4 kB
lvis_classes_map.txt	4f4ea26c41871e5d81e3f3fdab59a430	28.8 kB
openimages_classes_map.txt	9f3cdcd25d468fe02387319e8548b3d4	12.2 kB
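
The MD5 checksums listed above can be used to check downloaded copies of the files. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of` and `verify` are hypothetical, and the code assumes the files sit in one local directory under their original names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "all-classnames.xlsx": "ebe9de7ccbb6b37da95e2b9c07bc236e",
    "classname_hyperset_definition.csv": "7c3029931bae0dca2d4cc4ecd90e3ea9",
    "coco_classes_map.txt": "64ed4bffa15c874612445e34b57ebb86",
    "lvis_classes_map.txt": "4f4ea26c41871e5d81e3f3fdab59a430",
    "openimages_classes_map.txt": "9f3cdcd25d468fe02387319e8548b3d4",
}


def md5_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory: Path) -> Dict[str, Optional[bool]]:
    """Map each expected file to True (match), False (mismatch) or None (missing)."""
    results: Dict[str, Optional[bool]] = {}
    for name, expected in EXPECTED_MD5.items():
        path = directory / name
        results[name] = (md5_of(path) == expected) if path.exists() else None
    return results


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, ok in verify(Path(".")).items():
        status = {True: "ok", False: "MISMATCH", None: "missing"}[ok]
        print(f"{name}: {status}")
```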

Additional details

Is supplement to: Conference paper: 10.1007/978-3-030-98355-0_52 (DOI); Journal article: 10.3390/jimaging7050076 (DOI)

European Commission
AI4Media – A European Excellence Centre for Media, Society and Democracy 951911

