Mar 17, 2023 · A diffusion-based text-video retrieval framework (DiffusionRet), which models the retrieval task as a process of gradually generating joint distribution from ...
In this paper, we propose a novel diffusion-based text-video retrieval framework, called DiffusionRet, which addresses the limitations of current ...
To this end, we propose a novel diffusion-based text-video retrieval framework, called DiffusionRet, which addresses the limitations of current ...
We compare the proposed DiffusionRet with other methods on five benchmark text-video retrieval datasets, including MSRVTT [37], LSMDC [33], MSVD [5], ...
A diffusion-based text-video retrieval framework (DiffusionRet), which models the retrieval task as a process of gradually generating joint distribution from ...
This work creatively tackles the text-video retrieval task from a generative viewpoint and models the correlation between the text and the video as their joint ...
Recent work (Wan et al., 2024) has shown that visual and textual tokens exhibit distinct attention patterns in multi-head attention.
This is accomplished through a diffusion-based text-video retrieval framework (DiffusionRet), which models the retrieval task as a process of gradually ...