
Visualizing Natural Language Descriptions: A Survey

Published: 29 June 2016

    Abstract

    A natural language interface exploits the conceptual simplicity and naturalness of language to create a high-level, user-friendly communication channel between humans and machines. One promising application of such interfaces is generating visual interpretations of the semantic content of a given natural language description, which can then be rendered either as a static scene or as a dynamic animation. This survey discusses the requirements and challenges of developing such systems and reports on 26 graphical systems that exploit natural language interfaces, addressing both artificial intelligence and visualization aspects. This work serves as a frame of reference for researchers and aims to enable further advances in the field.

    Published In

    ACM Computing Surveys, Volume 49, Issue 1
    March 2017
    705 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/2911992
    Editor: Sartaj Sahni
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 29 June 2016
    Accepted: 01 April 2016
    Revised: 01 February 2016
    Received: 01 October 2015
    Published in CSUR Volume 49, Issue 1

    Author Tags

    1. Text-to-picture conversion
    2. natural language understanding
    3. symbol grounding
    4. text-to-animation conversion
    5. text-to-scene conversion

    Qualifiers

    • Survey
    • Research
    • Refereed

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 54
    • Downloads (last 6 weeks): 4
    Reflects downloads up to 14 Aug 2024

    Citations

    Cited By

    • (2024) Reversal of the Word Sense Disambiguation Task Using a Deep Learning Model. Applied Sciences 14:13 (5550). DOI: 10.3390/app14135550. Online publication date: 26-Jun-2024
    • (2023) Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-20). DOI: 10.1145/3544548.3581566. Online publication date: 19-Apr-2023
    • (2023) Fine-Grained Feature Generation for Generalized Zero-Shot Video Classification. IEEE Transactions on Image Processing 32 (1599-1612). DOI: 10.1109/TIP.2023.3247167. Online publication date: 2023
    • (2023) Evaluating the usage of Text to3D scene generation methods in Game-Based Learning. 2023 24th International Conference on Control Systems and Computer Science (CSCS) (633-640). DOI: 10.1109/CSCS59211.2023.00105. Online publication date: May-2023
    • (2023) A Review of Text-to-Animation Systems. IEEE Access 11 (86071-86087). DOI: 10.1109/ACCESS.2023.3304903. Online publication date: 2023
    • (2023) Semantic Scene Builder: Towards a Context Sensitive Text-to-3D Scene Framework. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management (461-479). DOI: 10.1007/978-3-031-35748-0_32. Online publication date: 9-Jul-2023
    • (2022) Text to 3D,2D scene generation systems, Frameworks and approaches: a survey. 2022 4th International Conference on Pattern Analysis and Intelligent Systems (PAIS) (1-6). DOI: 10.1109/PAIS56586.2022.9946876. Online publication date: 12-Oct-2022
    • (2022) Recognition of visual scene elements from a story text in Persian natural language. Natural Language Engineering (1-27). DOI: 10.1017/S1351324922000390. Online publication date: 24-Aug-2022
    • (2022) Automatic and intelligent content visualization system based on deep learning and genetic algorithm. Neural Computing and Applications 34:3 (2473-2493). DOI: 10.1007/s00521-022-06887-1. Online publication date: 1-Feb-2022
    • (2021) Deep Semantic Parsing with Upper Ontologies. Applied Sciences 11:20 (9423). DOI: 10.3390/app11209423. Online publication date: 11-Oct-2021
