Local relation network with multilevel attention for visual question answering.

Scholarly articles for Local relation network with multilevel attention for visual question answering.

scholar.google.com › citations

… with multilevel attention for visual question answering
Sun · Cited by 10

Local relation network with multilevel attention for visual question ...

We proposed LRNs, which provide deeper semantic information, and a multilevel attention mechanism for VQA tasks. • We comprehensively evaluated the COCO-QA ...

Local relation network with multilevel attention for visual question ...

dl.acm.org › doi › j.jvcir.2020.102762

Highlights • We proposed LRNs, which provide deeper semantic information, and a multilevel attention mechanism for VQA tasks. • We comprehensively evaluated ...

Local relation network with multilevel attention for visual question ...

www.semanticscholar.org › paper › Loca...

An image captioning method based on local relation network using a multilevel attention approach with graph neural network that not only fully explores the ...

[PDF] Multi-level Attention Networks for Visual Question Answering - Microsoft

www.microsoft.com › 2017/06 › M...

Inspired by the recent success of text-based question answering, visual question answering (VQA) is proposed to automatically answer natural language ...

Multilevel attention and relation network based image captioning ...

dl.acm.org › doi

Also, a multilevel attention approach is used to focus on a given image region and its related image regions, thus enhancing the image representation capability ...

Multilevel attention and relation network based image captioning ...

www.researchgate.net › ... › Images

Sep 16, 2022 · In this paper, a Local Relation Network (LRN) is designed over the objects and image regions which not only discovers the relationship between ...

[PDF] Cross-Modal Relational Reasoning Network for Visual Question ...

openaccess.thecvf.com › papers › C...

In this paper, to align the relation-consistent pairs and integrate the interpretability of VQA systems, we propose a Cross-modal Relational Reasoning Network ...

Path-Wise Attention Memory Network for Visual Question Answering - MDPI

www.mdpi.com › ...

Sep 7, 2022 · We propose a path attention memory network (PAM) to construct a more robust composite attention model.

(PDF) Multi-modal co-attention relation networks for visual question ...

www.researchgate.net › publication › 36...

Oct 4, 2022 · The existing research on the visual question answering model mainly focuses on the point of view of attention mechanism and multi-modal fusion.

Multi-modal adaptive gated mechanism for visual question ...

www.ncbi.nlm.nih.gov › PMC10306234

Jun 28, 2023 · Visual Question Answering (VQA) is a multimodal task that uses natural language to ask and answer questions based on image content.