Abstract
This article describes the data collection system and methods adopted in the METALOGUE (Multiperspective Multimodal Dialogue System with Metacognitive Abilities) project. The ultimate goal of the METALOGUE project is to develop a multimodal dialogue system able to deliver instructional advice while interacting with humans in a natural way. The data we are collecting will facilitate the development of a dialogue system that exploits metacognitive reasoning to deliver feedback on the user’s performance in debates and negotiations. The initial data collection scenario consists of debates in which two students exchange views and arguments on a social issue, such as a proposed ban on smoking in public areas, and deliver their presentations in front of an audience. Approximately 3 hours of data has been recorded to date, and all recorded streams have been precisely synchronized and pre-processed for statistical learning. The data consists of audio, video and three-dimensional skeletal movement information for the participants. This data will be used in the development of cognitive dialogue and discourse models to underpin educational interventions in public speaking training.
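To illustrate the kind of stream synchronisation mentioned above, the following is a minimal sketch, not the project's actual pipeline, of resampling independently recorded feature streams onto a shared time base. The stream names, sampling rates and the use of linear interpolation are assumptions made purely for illustration.

```python
# Minimal sketch (assumed, not the METALOGUE pipeline): align independently
# recorded streams to a common clock by linear interpolation of their features.
import numpy as np

def align_stream(timestamps, values, target_times):
    """Resample a 1-D feature stream onto a shared time base (seconds)."""
    return np.interp(target_times, timestamps, values)

# Hypothetical streams: an audio feature at 100 Hz and a skeletal feature at 30 Hz.
audio_t = np.arange(0, 10, 0.01)
audio_v = np.random.rand(audio_t.size)   # placeholder audio feature values
skel_t = np.arange(0, 10, 1 / 30)
skel_v = np.random.rand(skel_t.size)     # placeholder skeletal feature values

# Assumed common time base: a 25 fps video frame clock.
common_t = np.arange(0, 10, 1 / 25)
aligned = np.column_stack([
    align_stream(audio_t, audio_v, common_t),
    align_stream(skel_t, skel_v, common_t),
])
print(aligned.shape)  # (250, 2): one synchronized feature row per video frame
```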
Acknowledgements
This research is supported by the EU FP7 METALOGUE project (Grant No. 611073) at the School of Computer Science and Statistics, Trinity College Dublin.
Copyright information
© 2017 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Haider, F., Luz, S., Campbell, N. (2017). Data Collection and Synchronisation: Towards a Multiperspective Multimodal Dialogue System with Metacognitive Abilities. In: Jokinen, K., Wilcock, G. (eds) Dialogues with Social Robots. Lecture Notes in Electrical Engineering, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-10-2585-3_19
DOI: https://doi.org/10.1007/978-981-10-2585-3_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2584-6
Online ISBN: 978-981-10-2585-3
eBook Packages: Engineering (R0)