A Generic Machine Learning Based Approach for Addressee Detection In Multiparty Interaction

Authors: Mukesh Barange, Julien Saunier, Alexandre Pauchet

IVA '19: Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents

Pages 119 - 126

https://doi.org/10.1145/3308532.3329462

Published: 01 July 2019

Abstract

Addressee detection is a fundamental task for seamless dialogue management and turn taking in human-agent interaction. Whereas the addressee is implicit in dyadic interaction, detecting it becomes challenging in multiparty interactions, where more than two participants are involved. Existing work employs either rule-based or statistical approaches for addressee detection. However, most of these approaches have either been tested on a single data set or support only a fixed number of participants. In this article, we propose a model based on generic features to predict the addressee in data sets with a varying number of participants. Results on two different corpora show that the proposed model outperforms existing baselines.

Published In

IVA '19: Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents

July 2019

282 pages

ISBN:9781450366724

DOI:10.1145/3308532

General Chairs:
Catherine Pelachaud, CNRS-ISIR, Sorbonne Universite, France
Jean-Claude Martin, CNRS-LIMSI, Universite Paris Saclay, France

Program Chairs:
Hendrik Buschmeier, Bielefeld University, Germany
Gale Lucas, University of Southern California, USA
Stefan Kopp, Bielefeld University, Germany

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

IVA '19

Sponsor:

SIGAI

IVA '19: ACM International Conference on Intelligent Virtual Agents

July 2 - 5, 2019

Paris, France

Acceptance Rates

IVA '19 Paper Acceptance Rate 15 of 63 submissions, 24%

Overall Acceptance Rate 53 of 196 submissions, 27%
