- research-article, June 2024
Towards Retrieval-Augmented Architectures for Image Captioning
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 8, Article No.: 242, Pages 1–22, https://doi.org/10.1145/3663667
The objective of image captioning models is to bridge the gap between the visual and linguistic modalities by generating natural language descriptions that accurately reflect the content of input images. In recent years, researchers have leveraged deep ...
- research-article, March 2024
Video Surveillance and Privacy: A Solvable Paradox?
Through experiments on action recognition and natural language description, we show that the paradox of surveillance and privacy can be solved by artificial intelligence and that respect for human rights is not a chimera.
- research-article, October 2022
Retrieval-Augmented Transformer for Image Captioning
CBMI '22: Proceedings of the 19th International Conference on Content-based Multimedia Indexing, Pages 1–7, https://doi.org/10.1145/3549555.3549585
Image captioning models aim at connecting Vision and Language by providing natural language descriptions of input images. In the past few years, the task has been tackled by learning parametric models and proposing visual feature extraction ...