COCO, LVIS, Open Images V4 classes mapping

Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo

doi:10.5281/zenodo.7194300

Published October 13, 2022 | Version v1

Dataset Open

1. ISTI-CNR

Contributors

Data curator (2):

1. ISTI-CNR

This repository contains a mapping of the classes of the COCO, LVIS, and Open Images V4 datasets into a single unified set of 1460 classes.

COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

We built the mapping with a semi-automatic procedure so as to obtain a single final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

This repository contains the following files:

coco_classes_map.txt, contains the mapping for the 80 COCO classes
lvis_classes_map.txt, contains the mapping for the 1460 LVIS classes
openimages_classes_map.txt, contains the mapping for the 601 Open Images V4 classes
classname_hyperset_definition.csv, contains the final set of 1460 classes, with their definitions and hierarchy
all-classnames.xlsx, contains a side-by-side view of all classes considered
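The layout of the *_classes_map.txt files is not described on this page, so the following is only a minimal sketch of how such a file might be read. It assumes that each non-empty line holds a source class name and the corresponding unified class name separated by a delimiter. The tab default, the function names, and the line layout are all assumptions; inspect the files and adjust `sep` before relying on it.

```python
from typing import Dict, Iterable


def parse_class_map(lines: Iterable[str], sep: str = "\t") -> Dict[str, str]:
    """Parse 'source<sep>unified' lines into a dict.

    ASSUMPTION: one mapping per line, two fields split by `sep`.
    Lines that are empty or lack the separator are skipped.
    """
    mapping: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or sep not in line:
            continue
        source, unified = line.split(sep, 1)
        mapping[source.strip()] = unified.strip()
    return mapping


def load_class_map(path: str, sep: str = "\t") -> Dict[str, str]:
    """Read a mapping file from disk using the assumed layout above."""
    with open(path, encoding="utf-8") as fh:
        return parse_class_map(fh, sep)
```

For example, `load_class_map("coco_classes_map.txt")` would return a dictionary keyed by COCO class name, provided the file really follows the assumed layout.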

This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based video retrieval system that supports several search functionalities (text search, object- and color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021] (trained on COCO), Mask R-CNN [He et al. 2017] (trained on LVIS), and a Faster R-CNN+Inception ResNet (trained on Open Images V4).
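To illustrate why a unified class list helps when several detectors are combined, here is a small sketch that relabels detections from different models into one shared vocabulary. It is not VISIONE's actual code: the `Detection` type, the `unify` function, and the example class names in the usage note are hypothetical and are not taken from the mapping files.

```python
from typing import Dict, List, NamedTuple


class Detection(NamedTuple):
    detector: str  # which model produced it, e.g. "coco", "lvis", "openimages"
    label: str     # class name in that detector's own vocabulary
    score: float   # detector confidence


def unify(
    detections: List[Detection],
    class_maps: Dict[str, Dict[str, str]],
) -> List[Detection]:
    """Relabel each detection with its unified class name.

    `class_maps` maps a detector name to its {source class: unified class}
    dictionary. Detections whose label has no mapping are dropped.
    """
    unified: List[Detection] = []
    for det in detections:
        target = class_maps.get(det.detector, {}).get(det.label)
        if target is None:
            continue
        unified.append(det._replace(label=target))
    return unified
```

With invented maps such as `{"coco": {"tv": "television"}, "openimages": {"Television": "television"}}`, a COCO detection labelled `tv` and an Open Images detection labelled `Television` would both come out with the label `television`, so they can be indexed and queried under a single class name.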

This repository is released under a Creative Commons Attribution license. Please cite the following paper if you use it in your work in any form:

@article{amato2021visione,
  title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
  author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
  journal={Journal of Imaging},
  volume={7},
  number={5},
  pages={76},
  year={2021},
  publisher={Multidisciplinary Digital Publishing Institute}
}

References:

[Amato et al. 2022] Amato, G. et al., 2022. VISIONE at Video Browser Showdown 2022. In MultiMedia Modeling, MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

[Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

[Gupta et al. 2019] Gupta, A., Dollár, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

[He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

[Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

[Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

[Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).

Files (291.3 kB)

Name	MD5	Size
all-classnames.xlsx	ebe9de7ccbb6b37da95e2b9c07bc236e	121.2 kB
classname_hyperset_definition.csv	7c3029931bae0dca2d4cc4ecd90e3ea9	127.8 kB
coco_classes_map.txt	64ed4bffa15c874612445e34b57ebb86	1.4 kB
lvis_classes_map.txt	4f4ea26c41871e5d81e3f3fdab59a430	28.8 kB
openimages_classes_map.txt	9f3cdcd25d468fe02387319e8548b3d4	12.2 kB
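The MD5 checksums listed above can be used to check that a downloaded file is intact. The sketch below uses only Python's standard `hashlib`. The helper names and the `EXPECTED_MD5` dictionary are illustrative, with the checksum values copied from the file listing.

```python
import hashlib

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "all-classnames.xlsx": "ebe9de7ccbb6b37da95e2b9c07bc236e",
    "classname_hyperset_definition.csv": "7c3029931bae0dca2d4cc4ecd90e3ea9",
    "coco_classes_map.txt": "64ed4bffa15c874612445e34b57ebb86",
    "lvis_classes_map.txt": "4f4ea26c41871e5d81e3f3fdab59a430",
    "openimages_classes_map.txt": "9f3cdcd25d468fe02387319e8548b3d4",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, name: str) -> bool:
    """Compare a local file's MD5 with the published value for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For instance, `is_intact("coco_classes_map.txt", "coco_classes_map.txt")` should return `True` for an unmodified download placed in the current directory.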

Additional details

Is supplement to: Conference paper: 10.1007/978-3-030-98355-0_52 (DOI); Journal article: 10.3390/jimaging7050076 (DOI)

European Commission
AI4Media - A European Excellence Centre for Media, Society and Democracy 951911
