
Code for the paper AttnGrounder: Talking to Cars with Attention


vk-mittal14/AttnGrounder


AttnGrounder: Talking to Cars with Attention

AttnGrounder: Talking to Cars with Attention by Vivek Mittal.

Accepted at ECCV'20 C4AV Workshop. Talk2Car dataset used for this paper is available at https://talk2car.github.io/.

Model Overview

complete_model

Abstract:

We propose Attention Grounder (AttnGrounder), a single-stage end-to-end trainable model for the task of visual grounding. Visual grounding aims to localize a specific object in an image based on a given natural language text query. Unlike previous methods that use the same text representation for every image region, we use a visual-text attention module that relates each word in the given query with every region in the corresponding image for constructing a region-dependent text representation. Furthermore, for improving the localization ability of our model, we use our visual-text attention module to generate an attention mask around the referred object. The attention mask is trained as an auxiliary task using a rectangular mask generated with the provided ground-truth coordinates. We evaluate AttnGrounder on the Talk2Car dataset and show an improvement of 3.26% over the existing methods.
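To make the idea of a region-dependent text representation more concrete, here is a minimal sketch in plain Python. It is not the code from this repository: the dot-product similarity, the softmax normalization, and the function names (`softmax`, `dot`, `region_dependent_text`) are all assumptions chosen for illustration, and the paper's actual formulation may differ.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def region_dependent_text(regions, words):
    """Hypothetical helper (not from the repository).

    regions: list of R region feature vectors, each of length D
    words:   list of T word feature vectors, each of length D
    Returns (text_per_region, weights): one attention-weighted text
    vector per region, plus each region's weights over the words.
    """
    dim = len(words[0])
    text_per_region, weights = [], []
    for region in regions:
        # Assumption: similarity between a region and a word is a dot product.
        w = softmax([dot(region, word) for word in words])
        # Weighted sum of word vectors gives this region's own text vector.
        mixed = [sum(wt * word[d] for wt, word in zip(w, words)) for d in range(dim)]
        weights.append(w)
        text_per_region.append(mixed)
    return text_per_region, weights

# Toy example: two regions, two words, two feature dimensions.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[5.0, 0.0], [0.0, 5.0]]
text_per_region, weights = region_dependent_text(regions, words)
print(weights)
```

In this toy input, each region puts most of its weight on the word whose features align with it, so the two regions end up with different text vectors even though the query is the same.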

Attention Map in Action

attention_map
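The abstract mentions that the attention mask is supervised with a rectangular mask built from the ground-truth box coordinates. A minimal sketch of how such a target could be built is shown below. The grid resolution, the overlap rule (a cell is marked if it intersects the box), and the function name `rectangular_mask` are assumptions for illustration; the repository may construct the mask differently.

```python
def rectangular_mask(box, image_size, grid_size):
    """Hypothetical helper (not from the repository).

    box:        (x1, y1, x2, y2) ground-truth corners in pixels
    image_size: (width, height) in pixels
    grid_size:  (cols, rows) of the attention map
    Returns a rows x cols list of lists with 1.0 inside the box, 0.0 elsewhere.
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    cols, rows = grid_size
    cell_w = width / cols
    cell_h = height / rows
    mask = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Pixel extent of this grid cell.
            cx1, cy1 = c * cell_w, r * cell_h
            cx2, cy2 = cx1 + cell_w, cy1 + cell_h
            # Assumption: mark the cell if it overlaps the box at all.
            if cx1 < x2 and cx2 > x1 and cy1 < y2 and cy2 > y1:
                mask[r][c] = 1.0
    return mask

# A 400x200 image, a 4x2 attention grid, and a box covering the second column.
mask = rectangular_mask((100, 50, 200, 150), (400, 200), (4, 2))
for row in mask:
    print(row)
```

A target like this can be compared against the predicted attention map with a per-cell loss, which is one common way to train an auxiliary mask; the specific loss used in the paper is not stated here.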

Usage

Preprocessed Talk2Car data is available at this link; extract it under the ln_data folder. Download the images by following the instructions given at this link, and extract all of them into the ln_data\images folder. All hyperparameters are already set, so just run the following command in the working directory (if you run into memory issues, try decreasing the batch size).

python train_yolo.py --batch_size 14
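Before launching training, it can help to confirm that the data folders described above are in place. The snippet below is a small optional check, not part of the repository; the helper name `missing_data_dirs` is hypothetical, and it only checks that the ln_data and ln_data/images folders exist, not what they contain.

```python
from pathlib import Path

def missing_data_dirs(root="."):
    """Hypothetical helper: list expected data folders that do not exist.

    Only the two folders named in the usage notes are checked.
    """
    base = Path(root)
    expected = [base / "ln_data", base / "ln_data" / "images"]
    return [str(p) for p in expected if not p.is_dir()]

missing = missing_data_dirs()
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("Data folders found; ready to run train_yolo.py")
```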

Credits

Parts of the code and models are taken from DMS, MAttNet, Yolov3, Pytorch-yolov3 and One Stage Grounding.
