Abstract
Our state of mind is based on our own experiences and on what other people tell us, which may result in conflicting information, uncertainty, and alternative facts. We present a robot that models the relativity of knowledge and perception within social interaction, following principles of the theory of mind. We use the vision and speech capabilities of a Pepper robot to build an interaction model that stores interpretations of perceptions and conversations together with provenance of their sources. The robot learns directly from what people tell it, possibly in relation to what it perceives. We demonstrate how the robot’s communication is driven by a hunger to acquire more knowledge from and about people and objects, to resolve uncertainties and conflicts, and to share awareness of the perceived environment. Likewise, the robot can refer to the world, to its knowledge about the world, and to the encounters with people that yielded this knowledge.
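The paper publishes no code, but the interaction model the abstract describes can be illustrated with a minimal sketch: claims are stored together with every perspective a source has expressed on them, and a claim counts as conflicting when those perspectives disagree. All names here (`Claim`, `Perspective`, `Memory`) are our own hypothetical constructions, not the authors’ implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A single statement, e.g. ("lenka", "likes", "pizza")."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Perspective:
    """Who expressed the claim, in which utterance, and with what polarity."""
    source: str     # speaker, e.g. "lenka"
    utterance: str  # identifier of the utterance that carried the claim
    polarity: bool  # True = affirmed, False = denied


class Memory:
    """Stores every perspective ever heard; nothing is overwritten."""

    def __init__(self) -> None:
        self._perspectives: dict[Claim, list[Perspective]] = {}

    def store(self, claim: Claim, perspective: Perspective) -> None:
        self._perspectives.setdefault(claim, []).append(perspective)

    def is_conflicting(self, claim: Claim) -> bool:
        """A claim conflicts when sources, or one source over time, disagree."""
        polarities = {p.polarity for p in self._perspectives.get(claim, [])}
        return len(polarities) > 1


memory = Memory()
claim = Claim("lenka", "likes", "pizza")
memory.store(claim, Perspective("lenka", "utterance-1", True))
memory.store(claim, Perspective("lenka", "utterance-2", False))  # she changed her mind
assert memory.is_conflicting(claim)
```

Keeping all perspectives rather than overwriting them is what allows the robot to reason about conflicting information and changed minds, as in Note 2 below.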
Notes
1. Where possible, we follow the PROV-O model: https://www.w3.org/TR/prov-o/ (a minimal sketch follows these notes).
2. There are now two perspectives from Lenka on the same claim (she changed her mind), expressed in two different utterances.
3. The robot continuously detects objects, but these are only stored in memory when they are referenced by humans in the communication.
Acknowledgement
This research was funded by the VU University Amsterdam and the Netherlands Organization for Scientific Research via the Spinoza grant awarded to Piek Vossen. We also thank Bob van der Graft for his support.
Appendix: Dialogues
In the dialogues, L preceding an utterance stands for Leolani; other letters preceding utterances stand for various people. The robot’s perceptions of people and objects are marked with square brackets, e.g. [Sees a new face].
![figure g](https://media.springernature.com/lw685/springer-static/image/chp%3A10.1007%2F978-3-030-00794-2_2/MediaObjects/473828_1_En_2_Figg_HTML.gif)
![figure h](https://media.springernature.com/lw685/springer-static/image/chp%3A10.1007%2F978-3-030-00794-2_2/MediaObjects/473828_1_En_2_Figh_HTML.gif)
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Vossen, P., Baez, S., Bajčetić, L., Kraaijeveld, B. (2018). Leolani: A Reference Machine with a Theory of Mind for Social Communication. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2018. Lecture Notes in Computer Science, vol 11107. Springer, Cham. https://doi.org/10.1007/978-3-030-00794-2_2
DOI: https://doi.org/10.1007/978-3-030-00794-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00793-5
Online ISBN: 978-3-030-00794-2
eBook Packages: Computer Science (R0)