Guest Editorial: Image and Language Understanding

Published: 01 May 2017

Abstract

No abstract available.


Cited By

  • (2018) Towards a Fair Evaluation of Zero-Shot Action Recognition Using External Data. Computer Vision – ECCV 2018 Workshops, pp. 97-105. https://doi.org/10.1007/978-3-030-11018-5_8. Online publication date: 8-Sep-2018.


    Published In

    International Journal of Computer Vision, Volume 123, Issue 1
    May 2017
    120 pages

    Publisher

    Kluwer Academic Publishers

    United States
